tags:

views:

1200

answers:

8

I was at an interview for a C position in which they presented me with an idiom that I haven't previously encountered. This is a trick that simplifies implementation of various algorithms involving linked lists and I'm wondering if anybody else has encountered this.

Say we have a linked list record defined so:

typedef struct _record
{
    char* value;
    struct _record* next;
} record;

We need a function that inserts a new record so that the entire list remains sorted with respect to the value's in the records. The following implementation is simpler than anything I would have used, albeit less readable.

void insert_sorted(record** r, const char* value)
{
    record* newrec = NULL;
    while(*r && strcmp(value, (*r)->value) > 0)
     r = &((*r)->next); /* move r to point to the next field of the record */
    newrec = malloc(sizeof(record));
    newrec->value = strdup(value);
    newrec->next = *r;
    *r = newrec;
}

When the function is called, r points to the head pointer of the list. During the while loop, r is updated to point to the next field of the record that comes just before the point where we want to put the new record in. The last line of the function either updates the head pointer of the list (if the insertion happens at the beginning) or the next field of the previous record, which is quite cool.

A couple of questions:

  • Does this idiom have a name or is it mentioned in any literature?

  • Are there others like it in the C language?

I thought I knew C pretty well and had pointers and indirection pretty well figured out, but this one took me a while to fully understand.

+4  A: 

I don't see anything I'd call an idiom per se. It looks like standard coding for when you deal with datastructures in C.

My only complaint would be that the callers pointer (*r) is modified. Depending on the use of the function, I'd expect thats an unexpected side effect. Besides removing the unexpected side effect, using a local variable to play the role of *r would make the code more readable.

Frank Schwieterman
Updating *r has the effect of returning a pointer to the new node. That is intentional, otherwise there is no clear way to get at the new node if the values are not unique.
ConcernedOfTunbridgeWells
Why not just make the function return the record* to the new node?
Adam Jaskiewicz
r is used to modify the head pointer of the list, in case it's empty. record *newlist = NULL; insert_sorted( Now newlist points to a single-element linked list.
aib
aib, thats a good scenario/idea. But when the new element does not become the list's head, the caller's pointer is then points to something besides the head. If the point is to tell the caller the head, it should do just that.
Frank Schwieterman
You're right, I missed that the last *r= assignment is unconditional. It should only happen when the head pointer needs to be updated.
aib
+3  A: 

What would be the idiom here? Surely not the implementation of a linked list. The usage of a pointer to pointer construct? The compact loop?

Personally I'd use a pointer return value instead of operating on an input value. Because seeing this input data type would ring a bell, which makes me copy my value before handing it to your function.

Ronny
My thoughts exactly.
Konrad Rudolph
+1 for using a return value, far cleaner IMO.
unwind
+1  A: 

I don't know if this has a name or even if it's some special idiom but, since memory is relatively plentiful nowadays, my linked lists (where the language libraries don't make them available) are a special variant which simplifies the code greatly.

For a start, they're always doubly-linked since this makes traversal in both directions easier, for both processing and insert/delete operations.

An 'empty' list actually consists of two nodes, the head and the tail. By doing this, inserts and deletes do not need to worry about whether the node they're deleting is the head or tail, they can just assume it's a middle node.

Insertion of a new node y before node x then become a simple:

x -> prev -> next = y
y -> next = x
y -> prev = x -> prev
x -> prev = y

Deletion of node x is a simple:

x -> prev -> next = x -> next
x -> next -> prev = x -> prev
free x

Traversal is adjusted to ignore the extraneous head and tail:

n = head -> next
while n != tail
    process n
    n = n -> next

This all serves to make the code a lot easier to understand without all the special handling of the edge cases, at the cost of a couple of nodes of memory.

paxdiablo
Your traversal doesn't allow the pointer to n to be modified, to complete the insert.
Frank Schwieterman
Oops, thanks, @Frank, it's been a while since I did this since I'm mostly doing Java now and it has just about every data structure under the sun already implemented somewhere. Fixed it.
paxdiablo
AmigaOS use this kind of double-linked list extensively.
Tommy
A useful pattern, but this does not answer the question
finnw
@finnw, the question had two parts: (1) "Does this idiom have a name or is it mentioned in any literature?" (2) "Are there others like it in the C language?" - I think you'll find I was answering the second part. That's from vague memory, of course, it was a year and a half ago :-)
paxdiablo
+9  A: 

I'd say the idiom is "the kind of code which gave 'c' a bad name"

  • Unwarrantedly clever
  • Unwarrantedly compact
  • Surprising side effects on caller
  • No error handling on malloc
  • Only works for US English strings
Will Dean
I mistakenly took a job in 2001 for a company that was using C. I knew it was a bad idea, And bad went to worse. I quit a few months later. I shudder thinking about having to go back to such a language.
Tim
It looks straightforward, not "clever", and typical example code, which disregards error checking and uses an obvious library function.
Paul Nathan
Yeah, it's pretty straightforward. I don't really see anything "clever" to it, and it isn't especially compact. I can disregard the lack of error checking and the US English stuff for example code, but that's a pretty nasty side effect.
Adam Jaskiewicz
The code in the question is a bad implementation of a good idiom. I think the opposite is true: this kind of code is what make C useful.
Arkadiy
Wow, Dean really hates C. It's not too clever or compact, it is a typical C code. It does to caller exactly what caller wants. It is good practice to omit error handling while posting here if it obscures the point.
buti-oxa
See, "exactly what caller wants" to me doesn't mean "change the pointer I pass you to point to the new node"; that's an unexpected side-effect.
Adam Jaskiewicz
I think it's been quite a while now since speed of code was more important than readability of code. You can ALWAYS throw faster hardware into the mix, smarter humans are not so easy to come by.
paxdiablo
@Tim, I love that "I mistakenly took a job in 2001 for a company" - what, you were walking past their building and accidentally fell in through the door and signed the acceptance form? :-)
paxdiablo
The passed-in pointer isn't marked const, so of course that signals to the caller that it might be changed?
unwind
c/c++ flame war revives ? :) I barely hold my self here. Bit i will hold ...
Ilya
Ha - funny to come back to this lot. I don't hate 'C' at all, though I think it feels very old fashioned nowadays, and it involves a lot of hard work which is gone with more modern languages. But I *do* think this kind of stuff *is* a good example of why people have moved on.
Will Dean
People not moved on. People still write more C code than everything else. People moved on where it's appropriate. Operating systems, embedded and lot more written in C and it's not gonna change in visible future. Ok i promised to hold. It will be last one :)
Ilya
Agree with the last two points. Strongly disagree with the third. That's the conventional (and simplest) way to mimic pass-by-reference in C. If a similar C# method was defined as `void InsertSorted(ref Record r, string value)` you would not complain about the "unexpected" side-effect would you? It is obvious from the function signature that it can modify `r`.
finnw
+1  A: 

Instead of returning the value of the new node as an in/out parameter, you are better off having that be the return value of the function. This simplifies both the calling code, and the code inside the function (you can get rid of all those ugly double indirections).

record* insert_sorted(const record* head, const char* value)

You are missing error handling for the malloc/strdup failing case btw.

Greg Rogers
I don't think `*head` should be const, because this function might need to modify its `next` pointer.
finnw
+2  A: 

This is a well known thing - double pointer iteration (that's my name for it, I don't know the official name). The goal is to be able to locate a position in single linked list, and then insert before that position (inserting after it is trivial). Implemented naively, this requires two pointers (current and prev) and special code for beginning of the list (when prev == NULL).

The code the way I usually write it is something along the lines of

void insertIntoSorted(Element *&head, Element *newOne)
{
  Element **pp = &head;
  Element *curr;
  while ((curr = *pp) != NULL && less(curr, newOne)) {
    pp = &(pp->next);
  }
  newOne->next = *pp;
  *pp = newOne;
}

Update:

I've forgot theother purpose for this trick - a more important one. It's used to delete elements from single linked lists:

// returns deleted element or NULL when key not found
Element deleteFromList(Element *&head, const ElementKey &key)
{
  Element **pp = &head;
  Element *curr;
  while ((curr = *pp) != NULL && !keyMatches(curr, key)) {
    pp = &(pp->next);
  }
  if (curr == NULL) return NULL;
  *pp = (*pp)->next; // here is the actual delete
  return curr;
}
Arkadiy
C doesn't have references and the type of pp should be Element **, not Element *.
Evan Teran
Fine, so I allowed myself some liberties here. This trick is as valuable in C++ as it is in C
Arkadiy
+1 for the name
finnw
+4  A: 

I've used similar to this to insert into a binary tree. Because when iterating the tree, you usually stop when your pointer becomes NULL (you ran off the tree).

So to insert, you have 3 options,

1: use a variable which tracks the previous value of your iterating pointer.

2: stop when the pointer you would follow is NULL before you follow it, works but slightly less elegant in my opinion.

3: or a more elegant solution is simply use a pointer to a pointer, so you can just do: *it = new_node(); and it'll add it where the NULL used to be in your tree.

For a linked list, while this code works nicely, I usually just use a doubly linked list which makes it trivial to insert at any location.

Evan Teran
This is what I was looking for - both a recognition of the pattern and another use for it - in binary trees. Thanks!
gooli
A: 

To answer the original question, this is known as a pointer centric approach instead of the naive node centric approach. At maiaco.com there is an article that includes an example of the naive node centric approach and a much better example implementation of the pointer centric approach.