views:

37

answers:

3

I am writing an interface to a sorted collection of objects. As is the usual, I leave it up to the user to specify how these items are sorted. I am currently torn however, between offering a key-value interface (where the sort key is explicitly separate from the value) or a value-only interface (where the value is either also the sort key, or the user must handle a separate sort key by passing in some comparison function).

In my view, a key-value interface forces the user to always have a key separate from the value, even when some value naturally forms its own key. It does take away responsibility for handling the key from the user however, possibly leading to simpler and cleaner user code when using my API. A value-only interface allows for a more compact representation of values that are their own key, but forces the user to track and handle their own keys in cases where there is a natural key-value distinction.

There is literature to support both approaches of course, though it seems to me (could be wrong on this) that older literature tends to prefer a value-only approach, while newer literature prefers a key-value approach.

I am curious as to your preferences in cases like these. Have we arrived at a point in time where one is generally preferred over the other? If not, what do you usually use, and why?

A: 

Well, this seems to answer your requirements. When there is no keyseelctor specified, the collection assumes that the ordering is defined on the items themselves- they should probably implement IComparable. Otherwise- In case you have items that theoretically can be compared in many ways- you can specify a key selector that would return a projection of an item that is IComaprable by itself.

For example, if you have a class Person with a Name and an Age, and you wish to keep a sorted list by age, You create the collection in the following manner:

new OrderedCollection(person => person.Age);

BTW check out the SortedDictionary and SortedList collections that come with .Net. They might come in handy.

Vitaliy
Could you elaborate as to why you would make that choice?
Soonil
+1  A: 

You seem to be torn between what are called (in various languages, with various distinction, but essentially...)

  • A Dictionary (aka a hashtable even a hash)
  • A List / Array

These two types of containers serve different purposes, altough, with a few tweaks, there's little a List can do that a Dictionary cannot. That's simply because the Dictionary has more information.

In general it is better to be explicit than implicit.
If I hand you a "2 dollars blue crayon" (a list), you'll infer that it is an object with the following properties: {Price = 2$, Color = blue, Type = Crayon} (Dictionary). However such parsing (when needed) requires both effort and knowledge of the domain or the implicit structure of the data; the dictionary approach allows you to handle information in a generic way.

There are a few cases when lists are "better", but that is often tied to a technical / operational consideration, rather than to a intrinsic property of the list approach. For example for implementing search indexes with a simple full text engine, mashing all properties together may be required (This mirrors the way the end-user types in un-labelled keywords, expecting them to be found in any property).

A practical recommendation, in response to the question is to offer:

  • An API which exposes the usage/behavior of both Lists and Dictionary
  • An optimized (size-wise and/or space-wise, depending where it matters) implementation that preserves some of the List's approach "edge" with regards to performance, at least until the container is used as a Dictionary.

Unless of a priori obvious concern regarding performance (eg: containers will receive tens of thousands of items, there will thousands of container instances, the containers will transit over a slow channel etc.), I would focus on the API first, taking my ease with the implementation, for example basing it on a Dictionary. At a later date, if that becomes necessary, the implementation can be revised, for example with the two lists and a map schema.

One last thing: there are plenty of libraries (or even language-builtins) which offer this kind of polymorphic containers: maybe see if one is available for your target system/language...

mjv
You have grasped the problem, but if you are recommending a value-only (List) structure over a key-value (Dictionary), please provide some reasoning.
Soonil
It seems to me that one could look at dictionaries as an (almost) special case of lists (while we are talking about sorted collection at least). Lists abstract away the sort key to let the user deal with it in whatever way they choose, while dictionaries have an explicit sort key attached to their list of values. This makes lists more generic, though perhaps, not always as efficient.
Soonil
Thanks for the detailed answer. I was leaning towards a value-only implementation, but your comments, combined with some more research of my own has pushed me towards a key-value approach. After all, while conceptually a Set may be in some ways a higher abstraction than a Map, you can implement a Set from a Map, but not vice versa.This is more of an exercise for me, rather than something I need, but there are unfortunately the language I'm currently working in (C#) suffers from some serious design flaws when it comes to such containers :/
Soonil
Soonil, you seem to regret the fact that with dictionaries, unlike lists, the order is not `implicitly controlled by the user` (unless he/she makes the effort of specifying a compare function etc. and yet the order then is bound to the function rather than ad-hoc insertion order). With the list you can add items in a given order, remove a few, insert somewhere within, and a default iteration over the list will return in the expected order. That is useful, and can well be emulated in a [suped-up] dictionary however.
mjv
A: 

Just about anything a list can do a dictionary can do, if one creates a wrapper struct or object for key+value, and simply leaves the value part blank. The only thing a dictionary can't really do is add a value of a particular type and then respond directly to a GetEnumerator request with an enumerator that returns that type. It would be easy, however, to construct a wrapper class around a dictionary to do that.

A major difference between a dictionary and a list, however, is the way they handle records where the Key part is identical but the Value part is not. While I'm not a huge fan of the way Dictionary handles things (the default indexed property uses 'replace' semantics, while the add method uses fail-if-exists semantics; my preference would have been to have a mode parameter for "add"), it allows for changing the value associated with a particular key. A "list" can only try to do that by removing the old value and adding a new one, an operation which may cause semantic problems.

supercat