ansaurus

Question

Accessing node data through POSIX tdelete()

Answer 1

+1 A:

I've never used this set of functions before, but you appear to be correct. However, I would try using the pointer returned from tfind() as the rootp argument to tdelete(). This should make the second search essentially a no-op. AFAICT, the data structure allows any node to be treated as a root so you would find the node and then delete it from the subtree rooted at the node that you found.

D.Shawley 2010-09-11 14:36:12

Oh, good catch on treating any node as a root.

llasram 2010-09-11 14:43:55

I've considered this. Unfortunately nodes do not store a parent pointer: Deleting an arbitrary node would leave a dangling pointer further up the tree.

Matt Joiner 2010-09-11 14:50:33

Answer 2

A:

Like D.Shawley, I have never used these function before, so I wrote a little test program that leaks memory.

#include <search.h>
#include <stdio.h>
#include <stdlib.h>

int fx(const void *a, const void *b) {
  if (*(int*)a < *(int*)b) return -1;
  return (*(int*)a > *(int*)b);
}

int main(void) {
  int i, *elem;
  void *root = NULL, *val;

  for (i = 0; i < 10; i++) {
    elem = malloc(sizeof *elem); /* LEAK, leak, leak */
    *elem = i/2;
    val = tsearch(elem, &root, fx); /* assume all OK */
    printf("i: %d; elem: %p (%d); val: %p (%x)\n",
           i, (void*)elem, *elem, val, *(int*)val);
  }

  for (i = -1; i < 6; i++) {
    val = tfind(&i, &root, fx);
    printf("i: %d; ", i);
    if (val) {
      printf("@%p (%x)\n", val, *(int*)val);
    } else {
      printf("NOT FOUND\n");
    }
  }

  return 0;
}

Running it outputs

i: 0; elem: 0xcb8010 (0); val: 0xcb8030 (cb8010)
i: 1; elem: 0xcb8060 (0); val: 0xcb8030 (cb8010)
i: 2; elem: 0xcb8080 (1); val: 0xcb80a0 (cb8080)
i: 3; elem: 0xcb80d0 (1); val: 0xcb80a0 (cb8080)
i: 4; elem: 0xcb80f0 (2); val: 0xcb8110 (cb80f0)
i: 5; elem: 0xcb8140 (2); val: 0xcb8110 (cb80f0)
i: 6; elem: 0xcb8160 (3); val: 0xcb8180 (cb8160)
i: 7; elem: 0xcb81b0 (3); val: 0xcb8180 (cb8160)
i: 8; elem: 0xcb81d0 (4); val: 0xcb81f0 (cb81d0)
i: 9; elem: 0xcb8220 (4); val: 0xcb81f0 (cb81d0)
i: -1; NOT FOUND
i: 0; @0xcb8030 (cb8010)
i: 1; @0xcb80a0 (cb8080)
i: 2; @0xcb8110 (cb80f0)
i: 3; @0xcb8180 (cb8160)
i: 4; @0xcb81f0 (cb81d0)
i: 5; NOT FOUND

Apparently the values "inside the binary tree" are copies of pointers to the data. The copies are managed by the <search.h> functions and the data should be managed by the program.

The memory 0xcb8030 should be freed by a call to tdelete(), the corresponding 0xcb8010 (elem with value 0) should be freed by the program (as well as the previously lost 0xcb8060).

pmg 2010-09-11 15:24:00

Matt Joiner 2010-09-11 15:48:56

Change my program by adding the necessary `tdelete()` and `free()` calls so it doesn't leak. `tdelete()` and `free()` calls will all be to different things: there is no overlap.

pmg 2010-09-11 15:58:07

search.h is not C89, bsearch/qsort/... are declared in stdlib.h; you dont know the well known "standard"-compare definition <return *(int*)a-*(int*)b;> ?

2010-09-11 18:14:01

`tsearch`, `tfind`, `tdelete`, etc are functions not defined by the C Standard, no matter what header they're declared in. My documentation suggested `<search.h>` so I used it. I misread the documentation and coded the compare function to return -1, 0, or 1 rather than <0, 0, or >0; but you're right about the "standard"-compare.

pmg 2010-09-11 18:35:34

Answer 3

+1 A:

Matt, I think that your interpretation is correct. For individual deletion of elements this seems unavoidable. (But anyhow it is only a factor of 2 for an operation of \Omega \log N cost ;)

Long time that I didn't use these function, but looking through old code it looks that you can avoid part of the overhead if you destroy the tree all at once (if this is your case) by using two calls of twalk:

count the number N of elements in your tree with an appropriate call to twalk
allocate an array of size N of pointers to tree items
fill this array by a second twalk through the tree
run through this array and for each tree node first delete its data and then the node itself

If your really need such a thing you can find a thread safe C++ encapsulation of these calls in our library parXXL in the directory par/sys.

Jens Gustedt 2010-09-11 16:01:29

I'm glad someone hit on the fact that 2logN is still O(log N). I think destroying the tree can be done quicker: Store the data for the root node, call delete on the root node, then destroy the data. Repeat with the new root node until there is no root node.

Matt Joiner 2010-09-12 05:17:51

I see your answer is inspired by `stree< T >::itemp* par::sys::stree< T >::collect`, I wonder now if my suggestion works without the allocation of storage for the nodes.

Matt Joiner 2010-09-12 05:20:46

Take a look at http://code.google.com/p/anacrolix/source/browse/public/stackoverflow/tdelete_data_access.c#33, you might find this better than use of `collect` in parXXL. Build and valgrind instrs are at the top in a comment.

Matt Joiner 2010-09-12 05:58:27

@Matt. Great, yes works. The problem with your approach is thecomplexity of the `tdelete` operation. Nothing in POSIX states howthis is done, nothing imposes that this is done with just sometree-rotations. So your approach has `N × T(tdelete)` operations.If you do the correct `twalk`, you can be sure to do a leafs-firstdeletion, which might be a bit more tolerant against bogusimplementations. I don't say that we achieve that in parXXL, tough ;-)

Jens Gustedt 2010-09-12 07:42:13

@Jens Gustedt: Good to know, this makes sense. Does the RB tree eglibc employs mean that the node retrieved by `tfind` will be the root node when `tdelete` is called?

Matt Joiner 2010-09-12 07:56:58

@Matt: no idea, this is not only a question of the algorithm they use, AFAIR they claim to have implemented one from Knuth. But it is also really a question of how the particular implementation is done, and there only the source code can tell. I never looked into that.

Jens Gustedt 2010-09-12 08:22:32

ansaurus

tags:

views:

answers:

Accessing node data through POSIX tdelete()

related questions