views:

121

answers:

5

Hi all,

This is a trivial programming question. I am not an expert in Java. Say I use objects of custom classes Company and Employee, in a manner similar to what many RDBMS examples do:

class Employee
{
    Company company;
}

class Company
{
    String name;
}

I need to guarantee that different Company objects have unique names - i.e. no two such objects may have the same name, because from my point of view it makes no sense, and also simply eats memory - if two employees work at IBM, then there is a single Company object with that name, period.

My thoughts right now go along of making Company constructor private - so that the job of allocating Company objects with arbitrary names is delegated to a trusted method - which, suppose, will reject any subsequent attempt to create an object with a name that already exists or return an existing or new object (creating one if necessary).

The problem is, I am not sure how to accomplish this elegantly. One thing that would be nice is not having to do a O(n) lookup every time an Company object with a name is requested - so maybe a hash map or a binary tree is there for my convenience? I would also like to override the way the Company objects are identified - which leads me to this: will I be overriding Object.equals and/or Object.hashCode methods?

+4  A: 

You could override equals and hashCode and store them in a HashMap or HashSet.

Fabian Steeg
how would you look them up?
Michael Borgwardt
@Michael: see the flyweight pattern mentioned below.
aperkins
@Michael My basic idea was it would allow all sorts of usages that might be natural, like `map.get(company)` or `set.contains(company)`.
Fabian Steeg
@Michael: which isn't too difficult to implement either, since you can simply implement hashCode as `return name.hashCode();` and .equals() as `this == other`.
Lie Ryan
@Fabian: except that those usages will work *without* overriding anything, but they will not help in keeping values unique.
Michael Borgwardt
@Lie: or you could, you know, just not do anything at all, with the same effect.
Michael Borgwardt
@Michael: except that overriding `equals` and `hashCode` properly will allow `hashMap` and `hashSet` to eliminate duplicate values for you. Doing nothing does not give the same effect.
Lie Ryan
@Lie: except it does, if duplicate values are prevented at creation as amn intends to do. It really helps to read the question...
Michael Borgwardt
@Michael Borgwardt: except you've misunderstood the context. The HashSet we're talking about is used inside the factory method to prevent the construction of duplicate values, just as amn intended to do. How do you think the factory method can prevent the construction of duplicates if it doesn't have a list/set of all existing instances? It really helps to admit you have misunderstood.
Lie Ryan
@Lie: It is you who still misunderstands. hashCode and equals within the domain class are only useful to recognize *existing* duplicates, but you want to *avoid* creating duplicates. The only way to do that is to have a Map of instances keyed to the name - which makes hashCode and equals within the domain class unnecessary.
Michael Borgwardt
@Michael True, I see your point - I guess the answer is more to the title of the question than to the actual question :-)
Fabian Steeg
+6  A: 

Take a look at the flyweight pattern.

What I would do is something like:

// incomplete, but you get the idea hopefully
CompanyFactory
{
    private Map<String, Company> companies;

    public getCompany(final String name)
    {
        Company company;

        company = compaines.get(name);

        if(company == null)
        { 
            company = new Company(name);
            companies.put(name, company);
        }

        return (company);
    }
}
TofuBeer
+1 - This is the exact problem that the flyweight pattern was designed to solve. :)
aperkins
Yes, that's what I started doing. However, I am concerned about references to Company objects - it looks like that this map, even though useful here, will retain the references to all Company objects ever created, even though there may be no more "good" references left to these objects. I am thinking of a WeakHashMap, but the weak refers to the keys, not values :'-(
amn
I believe when the key is removed from the WeakHashMap the value is also removed, and is thus eligible for garbage collection.
TofuBeer
Yes you are right, but my keys are Strings - when are these removed? So, I am not entirely sure how a WeakHashMap<String, Company> will even behave, considering it will use `String`s as keys...
amn
If you allow Companies to be collected with a WeakHashMap, then you can't guarantee your requirement: "I need to guarantee that different Company objects have unique names". Unless you have a point where you know that a company has been deleted, and your Factory object can be informed when it's deleted. No need for a WeakHashMap here, I think.
nojo
I can't follow your logic where you state I cannot guarantee the requirement? I can't see how the fact that I have a weak-key based hash map affects uniqueness of names? Also, in effect, notifying Factory object of deleted Companies is accomplished with so called reference queues, which is again a form of dealing with weak references. Or I simply misunderstood your point...
amn
+1  A: 

One thing that would be nice is not having to do a O(n) lookup every time an Company object with a name is requested - so maybe a hash map or a binary tree is there for my convenience?

That sounds right, yes.

I would also like to override the way the Company objects are identified - which leads me to this: will I be overriding Object.equals and/or Object.hashCode methods?

If you ensure that there are never two instances with the same key value, you actually don't have to do that.

Michael Borgwardt
A: 

When I want to create a collection of objects that can be looked by by unique names, I store a Map of Strings (names) to Objects. I can then look up the object related to a name.

Strictly speaking, you don't need to touch equals() and hashCode() to do that, as you are not storing your objects as keys. Implementing equals() and hashCode() correctly can be difficult to get right, and keys in a Map implementation like HashMap (which can give you efficient lookups) is sensitive to these methods. Using the existing (and crucially immutable) String class as keys helps you get this lookup functionality right.


Update: If you make your constructor private, as you mentioned, you can prevent the creation of new Company instances. Providing some other method of creating Company instances (Factory pattern) allows you ensure that 'new' Company instances are only really new if they are not already stored by name, otherwise the existing instance for a given name is returned (example of Singleton)

Brabster
Actually, storing a String as the key isn't mathematically sufficient. It is mathematically possible for 2 different Strings to hash to the same value.
John Engelman
...which puts them in the same bucket, and the String equals() method takes over to ensure uniqueness...?
Brabster
@John Actually it is sufficient.
EJP
A: 

If the name of an object can change over time, consider including a generated unique ID for each object. Your mapping would be from name to unique ID, then from unique ID to the actual object.

By the way, if the objects were all known at compile time, you could use an Enum.

Andy Thomas-Cramer