Better to use a list of pairs, or two lists?

+2 Q:

I'm writing a method that forms part of the public interface of a Java class. It broadly allows the caller to specify the values to assign to a number of database entities - so they must supply both the IDs of the entities themselves, and the values to assign to them.

I'm wavering between implementing this as a List<Pair<Integer, Integer>> or just two List<Integer> arguments. Both would obviously work, and neither would cause any implementation or efficiency problems within my method. It's basically the same information in any case (a 2xn array), just striped differently.

So I'd like some opinions on which one you think would be better, and why.

Advantages I see so far for the list of pairs:

More accurately reflects the actual relationship between the entities
Eliminates some classes of dynamic error (e.g. mismatched list lengths)

Advantages for the pair of lists:

Does not rely on any non-JDK classes (simple as Pair is to grasp as a concept)
Does not require construction of any auxiliary objects just to carry the data around
The callers are more likely to have the arguments in separate lists anyway, so they don't need to realign the data before calling the method

Both cases have identical type-safety, as well as the same possibility for argument mismatch (e.g. entering values first and IDs second when it should be the other way around). This latter problem can be avoided by creating a trivial wrapper around Integer called something like PrimaryKey, which has its own pros and cons and is orthogonal to this issue anyway as this can be used just as well in both cases.

However there's a middle ground that could feasibly be a third option - a trivial container class with integer fields for objectId and value. This doesn't enlist the compiler's help in ensuring the objects are correct through typing, but it does provide an extra layer of security in the assignments. I don't think I'd go for this, though, as I don't like the idea of polluting a public interface with a trivial class like this.

+18 A:

I would strongly recommend tying that data together, possibly as a Pair, but preferably as a specific container. That way you're at liberty to introduce extra functionality related to those two objects.

Keeping the two lists separate is going to be a pain. You'll have to pass both around together and keep them in sync. The overhead of creating a new object for this is negligible, and it is precisely what OOP is designed for (you'll be creating non-JDK classes just by writing new Java code for your application, don't forget).

I create minor classes like this all the time. They enforce strong type safety and provide the scope for enhancement and refactoring. IDEs can more readily identify and perform refactorings.
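A minimal sketch of what such a specific container might look like. The names EntityValue, EntityUpdater and describeAssignments are hypothetical, not taken from the question, and the method only builds descriptions rather than touching a database:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container: class and field names are illustrative only.
final class EntityValue {
    private final int entityId;
    private final int value;

    EntityValue(int entityId, int value) {
        this.entityId = entityId;
        this.value = value;
    }

    int getEntityId() { return entityId; }
    int getValue() { return value; }
}

// Hypothetical stand-in for the class exposing the public method.
class EntityUpdater {
    // Describes each assignment instead of writing to a real database.
    List<String> describeAssignments(List<EntityValue> assignments) {
        List<String> out = new ArrayList<String>();
        for (EntityValue a : assignments) {
            out.add("entity " + a.getEntityId() + " -> " + a.getValue());
        }
        return out;
    }
}
```

Because the ID and value travel in one object, there is a single list to pass around, and extra fields or validation can later be added to EntityValue without changing the method signature.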

Brian Agnew 2009-09-22 08:36:23

All good points about the point of OO, and options for refactoring - thanks.

Andrzej Doyle 2009-09-22 10:04:37

As an additional bonus: if you have a list of pairs, you can easily make a list of triples out of it, should you need to. Requirements change, you know.

Ula Krukar 2009-09-22 13:51:38

Which is why I advocate not using a Pair object explicitly :-)

Brian Agnew 2009-09-22 15:52:20

+2 A:

When the API is public, make sure you implement just the core functionality, because anything you give to a client can never be taken back.

Think about a method with just two parameters, assign(Integer entityId, Integer someOtherId), and let the client figure out how to add more items.

If you want to allow convenient access, give them a util class that comes with an interface "Assignment", which just defines getEntityId() and getSomeOtherId(), and with a default implementation like DefaultAssignment, which holds final references to the IDs.

Then your clients can choose which way they want to go, and you don't have to commit to supporting your "Pair" class.
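One way this suggestion could be sketched. Assignment, DefaultAssignment, getEntityId() and getSomeOtherId() are the names used above; AssignmentService, Assignments.applyAll and the in-memory map are assumptions added purely for illustration:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

interface Assignment {
    Integer getEntityId();
    Integer getSomeOtherId();
}

// Default implementation holding final references to the two IDs.
final class DefaultAssignment implements Assignment {
    private final Integer entityId;
    private final Integer someOtherId;

    DefaultAssignment(Integer entityId, Integer someOtherId) {
        this.entityId = entityId;
        this.someOtherId = someOtherId;
    }

    public Integer getEntityId() { return entityId; }
    public Integer getSomeOtherId() { return someOtherId; }
}

// Hypothetical core API: a single two-parameter method.
class AssignmentService {
    // In-memory stand-in for the real persistence layer (assumption).
    private final Map<Integer, Integer> applied = new LinkedHashMap<Integer, Integer>();

    void assign(Integer entityId, Integer someOtherId) {
        applied.put(entityId, someOtherId);
    }

    Map<Integer, Integer> applied() {
        return Collections.unmodifiableMap(applied);
    }
}

// Hypothetical optional utility layered on top of the core method.
final class Assignments {
    private Assignments() {}

    static void applyAll(AssignmentService service, Iterable<? extends Assignment> items) {
        for (Assignment a : items) {
            service.assign(a.getEntityId(), a.getSomeOtherId());
        }
    }
}
```

Clients who prefer their own data structures can call assign directly in a loop; the utility class is a convenience that can evolve separately from the core method.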

MrWhite 2009-09-22 08:37:23

Good point about the interface, this seems to come to a similar conclusion to Brian Agnew's answer. Objects for everything! :-)

Andrzej Doyle 2009-09-22 10:07:27

Exactly and good luck.

MrWhite 2009-09-23 17:42:35

+5 A:

I feel that the analysis you've gone through already identifies the pros and cons of the two methods very well.

I would also lean towards using the Pair class, citing the advantages that you've listed:

More accurately reflects the actual relationship between the entities

Eliminates some classes of dynamic error (e.g. mismatched list lengths)

The big one for me would be the top one.

Always write code which demonstrates the intent. Don't code by thinking solely about the implementation.

Using a class like Pair shows the intent that there is a pair of values which represents an ID and the actual entity itself. It also shows that one Pair object is one discrete unit of data which needs to be handled together -- something that cannot be inferred from having two separate Lists of IDs and entities.

coobird 2009-09-22 08:37:54

+1 for the bolded part in particular - on reflection it makes sense for the clients to gather the arguments in the form that my method really uses them.

Andrzej Doyle 2009-09-22 10:05:48

In my opinion, go with the list of pairs (or your own more specific container) since this seems to be what most naturally reflects the problem you are solving. And it is only one variable to keep track of, and send to methods, instead of two.

Thomas Padron-McCarthy 2009-09-22 08:38:23

+1 A:

I would go for the List of Pairs, if the two integers really belong together and are supposed to represent "one thing".

Another disadvantage of using two separate lists is that the compiler doesn't check for you if the lists are always the same size. And you leave the interpretation up to the client of your method; the client needs to match up the elements in the two lists.
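To illustrate the point, here is a rough sketch of what the two-list version has to do at runtime. TwoListUpdater and assign are hypothetical names, and the returned map merely stands in for the real database update:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical two-list API, shown only to illustrate the extra runtime check.
class TwoListUpdater {
    static Map<Integer, Integer> assign(List<Integer> ids, List<Integer> values) {
        // The compiler cannot enforce this, so it must be checked on every call.
        if (ids.size() != values.size()) {
            throw new IllegalArgumentException(
                    "ids has " + ids.size() + " elements but values has " + values.size());
        }
        Map<Integer, Integer> result = new LinkedHashMap<Integer, Integer>();
        for (int i = 0; i < ids.size(); i++) {
            // Elements are matched purely by position.
            result.put(ids.get(i), values.get(i));
        }
        return result;
    }
}
```

With a list of pairs, neither the length check nor the positional matching is needed, because each element already carries both halves.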

I don't see using a class like Pair as a big disadvantage "because it relies on non-JDK classes", as you say, but if you really don't want to use something like Pair, you could even make it a list of arrays: List<Integer[]>, where each array contains the two Integer objects. (In my opinion that's uglier than using Pair).

Jesper 2009-09-22 08:38:45

+5 A:

Why not go with a Dictionary (map type, whatever Java calls it nowadays, I think it used to be Hashtable)? This would map IDs to values and sidesteps your dilemma.
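In Java this would be the java.util.Map interface. A minimal sketch, assuming each ID appears at most once; MapBasedUpdater, assignValues and valueOf are hypothetical names, and the internal map stands in for the database:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical API taking a Map from entity ID to the value to assign.
class MapBasedUpdater {
    // In-memory stand-in for the database (assumption).
    private final Map<Integer, Integer> store = new LinkedHashMap<Integer, Integer>();

    void assignValues(Map<Integer, Integer> valuesById) {
        for (Map.Entry<Integer, Integer> e : valuesById.entrySet()) {
            store.put(e.getKey(), e.getValue());
        }
    }

    // Returns the stored value, or null if the entity was never assigned.
    Integer valueOf(int entityId) {
        return store.get(entityId);
    }
}
```

Note that a Map silently keeps only the last value put for a given key, so this shape fits only when assigning two values to the same ID in one call makes no sense.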

Daren Thomas 2009-09-22 08:53:17

+1 good point - I'm surprised I didn't think of that. It accurately maps the relationship between the two things as well as any of the other suggestions.

Andrzej Doyle 2009-09-22 10:08:36

A background in scripting languages might help. There is an old Perl (yuck!) adage: if you are not thinking in hashes (dictionaries), you are not thinking in Perl...

Daren Thomas 2009-09-22 11:31:06

+2 A:

The list-of-pairs approach is the obvious thing to do, as it more accurately reflects the intention -- but a Map might be even better, if the IDs are unique.

If you have millions of these pairs, and memory consumption becomes a major consideration, you might want to switch to a pair of lists (since there will be one less object per pair).

mfx 2009-09-22 08:56:37

Go with the List of Pairs or a Map.

I'm not buying any of the "con" arguments you listed.

You should always err on the side of greater abstraction. You're writing in an object-oriented language, and encapsulation and information hiding are, in my opinion, the most important reasons for thinking in objects. Abstraction hides implementation details from users, letting them concentrate on what your class provides. They can forget about how your class accomplishes what they want. Forcing your clients to maintain two lists is asking them to know more than they need to.

duffymo 2009-09-22 09:49:27

Write the unit tests first. Then you can easily see what is easiest to work with, and then choose that as your implementation.

Thorbjørn Ravn Andersen 2009-09-22 10:06:05

-1 this may be true, but has nothing to do with the question.

finnw 2009-09-22 11:17:06

He asks for opinions on which is better. The best is the one that he likes to work with!

Thorbjørn Ravn Andersen 2009-09-22 13:23:53

You might consider making the parameter an Iterator<Pair>. If the calling class has a List<Pair>, getting the iterator is trivial. If not, an iterator might be easier to construct, faster and/or more natural than a full list. In some cases, it might even help you eliminate out-of-memory problems, since the pairs need not all be held in memory at once.
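A small sketch of the idea, under the assumption of a simple hand-written pair type. IntPair, GeneratedPairs and IteratorConsumer are hypothetical names; the generator produces pairs on demand, so only one pair needs to exist at a time:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical pair type, since the JDK has no Pair class.
final class IntPair {
    final int id;
    final int value;

    IntPair(int id, int value) {
        this.id = id;
        this.value = value;
    }
}

// Produces the pairs (0, 0), (1, 2), (2, 4), ... lazily, one per call to next().
class GeneratedPairs implements Iterator<IntPair> {
    private final int count;
    private int next = 0;

    GeneratedPairs(int count) {
        this.count = count;
    }

    public boolean hasNext() {
        return next < count;
    }

    public IntPair next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        IntPair p = new IntPair(next, next * 2);
        next++;
        return p;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }
}

// Hypothetical consumer that accepts any Iterator of pairs.
class IteratorConsumer {
    static long sumOfValues(Iterator<IntPair> pairs) {
        long sum = 0;
        while (pairs.hasNext()) {
            sum += pairs.next().value;
        }
        return sum;
    }
}
```

A caller that already holds a List<IntPair> can simply pass list.iterator() to the same method.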

ammoQ 2009-09-22 10:44:15

I half-disagree with this point:

Does not rely on any non-JDK classes (simple as Pair is to grasp as a concept)

because although there is no Pair class in the JDK, there is an interface: Map.Entry which fits your intended use as a key/value association.
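As a sketch of how that could look using only JDK types: java.util.AbstractMap.SimpleImmutableEntry (available since Java 6) is a ready-made Map.Entry implementation. EntryListExample and its methods are hypothetical names used for illustration:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical example: a list of ID/value pairs built from JDK classes only.
class EntryListExample {
    static List<Map.Entry<Integer, Integer>> samplePairs() {
        List<Map.Entry<Integer, Integer>> pairs = new ArrayList<Map.Entry<Integer, Integer>>();
        pairs.add(new AbstractMap.SimpleImmutableEntry<Integer, Integer>(1, 10));
        pairs.add(new AbstractMap.SimpleImmutableEntry<Integer, Integer>(2, 20));
        return pairs;
    }

    // The key plays the role of the entity ID, the value the number to assign.
    static int totalValue(List<Map.Entry<Integer, Integer>> assignments) {
        int total = 0;
        for (Map.Entry<Integer, Integer> e : assignments) {
            total += e.getValue();
        }
        return total;
    }
}
```

Unlike passing a whole Map, a list of entries preserves the caller's ordering and permits repeated IDs, which may or may not be desirable.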

finnw 2009-09-22 11:22:55
