views:

102

answers:

1

I'm trying to be a good erlanger and avoid "++". I need to add a tuple to the end of a list without creating a nested list (and hopefully without having to build it backwards and reverse it). Given tuple T and lists L0 and L1:

When I use [T|L0] I get [tuple,list0].

But when I use [L0|T], I get nested list [[list0]|tuple]. Similarly, [L0|L1] returns [[list0]|list1].

Removing the outside list brackets L0|[T] produces a syntax error.

Why is "|" not symmetric? Is there a way to do what I want using "|"?

+10  A: 

| is not "symmetric" because a non-empty list has a head and a tail where the head is a single item and the tail is another list. In the expression [foo | bar] foo denotes the head of the list and bar is the tail. If the tail is not a proper list, the result won't be a proper list either. If the head is a list, the result will simply be a list with that list as its first element.

There is no way to append at the end of a linked list in less than O(n) time. This is why using ++ for that is generally shunned. If there were special syntax to append at the end of the list, it would still need to take O(n) time and using that syntax wouldn't make you any more of a "good erlanger" than using ++ would.

If you want to avoid the O(n) cost per insertion, you'll need to prepend and then reverse. If you're willing to pay the cost, you might just as well use ++.

A little more detail on how lists work:

[ x | y ] is something called a cons cell. In C terms it's basically a struct with two members. A proper list is either the empty list ([]) or a cons cell whose second member is a proper list (in which case the first member is called its head, and the second member is called its tail).

So when you write [1, 2, 3] this creates the following cons cells: [1 | [2 | [3 | []]]]. I.e. the list is represented as a cons cell whose first member (its head) is 1 and the second member (the tail) is another cons cell. That other cons cell has 2 as its head and yet another cons cell as its tail. That cell has 3 as its head and the empty list as its tail.

Traversing such a list is done recursively by first acting on the head of the list and then calling the traversal function on the tail of the list.

Now if you want to prepend an item to that list, this is very easy: you simply create another cons cell whose head is the new item and whose tail is the old list.

Appending an item however is much more expensive because creating a single cons cell does not suffice. You have to create a list that is the same as the old one, except the tail of the last cons cell must be a new cons cell whose head is the new element and whose tail is the empty list. So you can't append to a list without going through the whole list, which is O(n).

sepp2k
Thanks for the considered detail in your response! Now I'm very curious: what are the internal mechanics of the prepend and append operations? Can you recommend any resources on erlang's internal operations?
tkersh
@tkersh: I hope my edit clarified things for you.
sepp2k
Brilliant! Basically it's a singly linked list but since it's immutable, we can't add to the tail? Makes perfect sense, thanks again.
tkersh
@tkersh: Exactly. No problem. Do consider marking this answer as accepted if it was helpful to you.
sepp2k
I think part of your suggestion to prepend-then-reverse needs more context. It's perfectly fine to just append the value if it's done once as an operation. In this case, both operations end up being O(n). However, when you repetitively append the value in a recursive function, you then need to append N times, giving you O(n²) complexity. That's where prepending-then-reversing is worth it. Prepending is O(1) and you do it N times. Reversing is O(n) too, which gives you a O(2n), or just O(n), which is better than O(n²).
I GIVE TERRIBLE ADVICE
Note also that if the list is meant to be output with `io:format/2-3` or other IO functions, you can just use a cons cell. The Erlang drivers don't care if the list is nested or improper (most of the time) and can output it without a problem. In these cases, prepend-then-reverse becomes a useless operation.
I GIVE TERRIBLE ADVICE