views:

188

answers:

7

I want to store some objects and then be able to retrieve them later as efficiently as possible. I will also remove some of them under certain conditions. It seems a hash map would be the right choice.

But, from what I've seen, hash maps always associate a value with another? For example, "john" and "555-5555", his phone number.

Now, my situation. Suppose I have a bunch of people, and each person is connected to other people. So, I need each person to store its contacts.

What I'm doing is have each person have a hashmap, and then I'd add to the hash otherPerson, otherPerson. Basically, the key is the value. Am I doing it wrong?

EDIT I don't think the HashSet would solve my problem because I have to retrieve the value to update it and there is no get method. Remove returns a boolean, so I can't even remove it to put it back again, which would probably be a bad idea anyway.

+2  A: 

Well, the data structure you'd be looking for here, would be a HashSet (or some other kind of set), I think (if your framework/library offers it). A set just says "I have the following items" instead of "I have the following items mapped to the following values". Which would be what you're modeling here.

As for HashSet vs. other implementations (if present): That all depends on what you're doing. If you need fast lookup, i. e. "is this element in the set?" questions, then hashing is a good thing. Other underlying data structures are perhaps better optimized for other set operations, such as union, intersection, etc.

Joey
+2  A: 

A hash table/map simply requires that you have a way to get the values you're interested in looking up later; that's what the key is for.

However, in your specific case, it sounds like you're looking for a way to store relationships between people, and what you're keeping track of is whether or not person A has a relationship with person B. A better representation for that sort of thing is an adjacency list.

John Feminella
A: 

Am I missing something or don't you simply need an ArrayList<Person>?

JRL
The array list uses the element index to find it. The index is not important to me and has no relation to the element.I could use indexOf(Object) and then use the result with a get, but wouldn't that be pretty inefficient? Wouldn't it have to go through the data structure twice? That seems pretty bad. I'll be doing this a lot, so I thought it would be better to use the hash map.
they changed my name
It might be, then again, you'd have to profile it to really be sure. At this point, you're storing a list of contacts, so an ArrayList seems the most straightforward. I'd worry about a bottleneck when you measure it. But you might have prior experience that tells you this would be a bottleneck, I don't know.
JRL
+3  A: 

If all you need is checking if A is one of B's contacts, then Set is choice. It has contains() for that purpose.

Otherwise, the most suitable might be Map, as you need efficient retrieval operation. You said currently you use same object as key and value, but I'm not sure how you get the the key in the first place. Say you'd like to get contact A from B's contacts, and you use something like 'B.contacts.get(A)', where do you get A from? If you already have A, what's for to get it from the map again? (maybe there are multiple instances of the same person?)

Unless there are multiple instances of the same person, I'd say for each Person, define a ID like unique attribute, and use that as the key for the contacts map. Also, do you define equal()/hashCode() for person class? Map/Set uses hashCode() and equal() for finding the match. Depending on your usage, you might need to consider rewrite them for efficiency.

bryantsai
A: 

I would just store the contacts in a List<Person>. E.g.

public class Person {
    private List<Person> contacts;
}

With regard to editing the individual contact, it is really not the parent Person's responsibility to do that. It should at highest add/remove contacts. You can perfectly do that by contacts.add(otherPerson) or contacts.remove(otherPerson).

When you want to edit an individual Person, which may be one of the contacts, just get a handle to it independently, e.g. personDAO.find(personId) and then update it accordingly. It's actually also the Person's own responsibility to edit own details. With a good ORM under the hood, the changes will be reflected in the contact list of other Persons.

BalusC
+2  A: 

I don't think the HashSet would solve my problem because I have to retrieve the value to update it and there is no get method.

This is a puzzling statement. Why would you want to retrieve a value using a get method to update it? Surely, if you know which object you need to retrieve from the set/map, you don't need to retrieve it.

For example:

HashSet<Person> relations = ...
Person p = ...
if (relations.remove(p)) {
    // we removed an object such that p.equals(obj) is true.
}

Now if you are worried that the object that was removed was equal to, but not identical to p, it seems to me that something is wrong with your design. Either:

  • you should not be creating multiple Person instances that are equal, or
  • you should not be caring that Person instances are not identical, or
  • you should not have overridden equals(Object).

In short, the problem is that you are not managing object identity properly.

Stephen C
A: 

If you need to iterate through the people, or require them to have ordering, consider TreeMap or TreeSet instead of hashing.

ZoFreX