ansaurus

Question

How to handle mismatched lists for a Zip algorithm

Answer 1

+9 A:

There's no "right" or "wrong" way to do this, the implementation details are up to you. You can take several approaches:

1) F# approach: throw an exception.

2) Haskell and Python approach: Truncate output to minimum length of your inputs.

zip [1; 2; 3] ['a'; 'b'; 'c'; 'd'; 'e']
  = [ (1, 'a'); (2, 'b'); (3, 'c')]

It can be occasionally useful to truncate, as is normally the case when you zip a finite and infinite sequence.

3) Ruby approach: zero or nil out unavailable values:

zip [1; 2; 3] ['a'; 'b'; 'c'; 'd'; 'e']
  = [ (1, 'a'); (2, 'b'); (3, 'c'); (nil, 'd'); (nil, 'e')]

4) Shorten tuples 2- and 1-tuples as needed:

zip [1; 2; 3] ['a'; 'b'; 'c'; 'd'; 'e']
  = [ (1, 'a'); (2, 'b'); (3, 'c'); ('d'); ('e')]

I don't know any language which does this.

Juliet 2009-09-24 20:28:16

So what you're saying is that different implementations all do it differently. Figures... :(

Mason Wheeler 2009-09-24 21:11:03

Hence the reason why there's no "wrong" approach ;) In my opinion, you can probably narrow down the best implementation by process of elimination: throwing an exception seems acceptable. Silently truncating a list is appropriate when you use languages that lend themselves easily to infinite lists, which seems to exclude Delphi. Substituting null values seems appropriate when your language doesn't treat primitives like ints and bools different from any other object type, which would exclude Delphi again. It seems best to throw an exception by default.

Juliet 2009-09-24 21:32:30

If you think you might want to ever support passing in a function acting as a lazy evaluator, keep in mind that you would have to just truncate in that case, since you cannot know in advance how many elements are available - so if that might be relevant, you might want to do the same for "regular" zipping, to keep the behavior consistent. This is probably more relevant when you're dealing with closures, though - in my opinion, they lend themeselves better to the lazy evaluation approach than having to declare variables in some class and use that for state management.

Michael Madsen 2009-09-24 21:57:39

The correct approach is #2. The other approaches have the incorrect type. Throwing an exception changes the type of the function so that it's non-total (it doesn't always return a list). That's probably not what you want. Returning nulls is cheating. The type of the function says that for lists of types `[a]` and `[b]` it will return a list `[(a, b)]`. Well, `null` is of neither type `a` nor `b`, so your type would be a lie.

Apocalisp 2009-10-12 00:45:30

That's not always true. In Delphi, **nil** (null) is of type Pointer, and all objects are accessed through transparent pointer references. It's an accepted fact that any object reference can be **nil**, and your code is expected to be able to handle it as a valid value, one way or another. (But if you're using value types and not objects, things get more complicated.)

Mason Wheeler 2009-10-12 12:59:35

Answer 2

+1 A:

Python truncates to the length of the minimum sequence. That works for me.

Steve 2009-09-24 20:31:36

`zip` truncates, but you can give `map` multiple sequences, and it extends with `None`; e.g. `map(tuplify, [1,2,3], [4,5]) == [(1,4), (2,5), (3,None)]` given `def tuplify(*s): return tuple(s)`.

ephemient 2009-09-25 20:54:20

Answer 3

+2 A:

A common thing to do is to zip a list with its own tail. For example to turn a list of points into a path that visits those points. The tail is obviously one shorter than the list, but this is certainly not an exceptional case. Another common thing is to zip a list with all of its tails to get a list of unordered pairs that can be constructed from the list (to construct a complete graph, for example:

liftM2 (=<<) zip tails

This would be really difficult if zip threw exceptions or returned nulls. The expected behaviour therefore is to truncate the output to the length of the shorter list. This is consistent with the type of the function.

Apocalisp 2009-10-12 01:11:35

ansaurus

tags:

views:

answers:

How to handle mismatched lists for a Zip algorithm

related questions