Hi, I am trying to learn C, and I am running into trouble with char* and char arrays. I am using a generic hash-set container from a library (which I don't want to describe in detail). This library includes the function

    void *HashSetLookup(hashset *h, const void *elemAddr);

which I have to use to search the hash set to see if the element already exists there (the hash and compare functions are part of the hashset struct). In this case I use the hashset to store pointers to C strings, so the element addresses I pass are of type char **. My problem is that the following code gives a segmentation fault:

    char word[1024];
    /* Some code that writes to the word buffer */
    HashSetLookup(stopList, &word);

while this code works fine (and as expected):

    char word[1024];
    /* The same code as before that writes to the word buffer */
    char* tmp = strdup(word);
    HashSetLookup(stopList, &tmp);
    free(tmp);

I thought char word[] and char* were basically the same thing, the only difference being that char word[1024] is on the stack with a fixed length of 1024, while tmp points to the heap and occupies only as much space as necessary (strlen(word)+1).

Therefore I don't understand why I have to make a copy of the string in the heap to be able to call this function. Why does this happen? Is there some more fundamental difference between char* tmp = strdup("something") and char word[1024] = "something"?

A: 

You must have missed something in the first example because "word" is not being used at all.

Anyhow, in most environments, when you write 'char *s = "Hello World"', the string literal is placed in a read-only part of the executable (often next to the code) and must not be modified. Writing to it is undefined behavior and typically gives you a seg fault.

However, a 'char[]' is an ordinary array in writable memory (the stack for a local variable, the data segment for a global or static one), so it can be modified without any problem.

sharjeel
He says in the comment for the first example that there is some code left out that writes to word.
Jergason
You are right. In the first code it should have been HashSetLookup(stopList, &word); I have fixed that now. Thank you.
Siggi
+1  A: 

Hard to tell without the documentation of HashSetLookup.

But it expects a const void * as its second parameter, so you should pass tmp, and not &tmp, because tmp is already a pointer.

I don't see the need for char ** here at all.

Also, you will probably be interested in what HashSetLookup() returns.

Eiko
Thank you for your answer. I actually need to store char** since the HashSet copies whatever the void pointer points to into a fixed-size chunk of memory. This would not work well for strings since they are of various sizes. Therefore all the HashSet functions need to take a char**, since I only want to store pointers to strings stored on the heap. And you are right, I am interested in whatever HashSetLookup() returns. But since it crashed at this point I didn't bother including it in the code.
Siggi
Your signatures and the values you pass do not represent that. If you think it is working now, it is most probably due to just another bug.
Eiko
A: 

Try this code:

    char word[1024];
    /* Some code that writes to the word buffer */
    HashSetLookup(stopList, word);
    // this should also work
    // HashSetLookup(stopList, &word[0]);

When you declare an array, the array's name converts (decays) to a pointer to its first element in most expressions. Hence, to refer to the first array location, simply use the variable name. When you put an ampersand in front of the variable, as you did in your example, you get the address of the whole array instead; the type of that expression is char (*)[1024], not char **.

Similarly, since word[0] refers to the contents of the first location of the array word you can insert an ampersand in front of it to refer to a pointer that points to word[0].

Praetorian
A: 

Since you mentioned char**, I think the problem is that the function tries to write to the location pointed to by the second argument, i.e. when you write:

    HashSetLookup( stopList, &word );

it goes and tries to assign an address to word (that's why it needs its address), which overwrites the start of the buffer with a pointer.

This is demonstrated by the following silly snippet (keep in mind that the address of an array is still the address of its first element):

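    #include <stdio.h>
    #include <stdlib.h>

    /* Treats its argument as a char ** and stores a pointer through it. */
    void func( void* boo )
    {
            char** ptr = ( char** )boo;
            printf( "func: got %p\n", boo );
            *ptr = "bad func";
    }

    int main( int argc, char* argv[] )
    {
            char buf[128] = { 0 }, *p, *orig;
            func( &buf ); /* the first bytes of buf got overwritten with a pointer value */
            printf( "array: %s\n", buf ); /* prints the raw pointer bytes, not "bad func" */
            orig = p = malloc( 128 );
            func( &p ); /* p got a new value */
            printf( "malloc: %s\n", p );
            free( orig ); /* p no longer points to the allocation */
            return 0;
    }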
Nikolai N Fetissov
I thought something similar at first. But if you look at the prototype for HashSetLookup you see that the second argument is a pointer to const. And secondly, this is only a lookup function, so it shouldn't write anything at all.
Siggi
+4  A: 

You mention you need a char **, and there lies the problem: for an array, word and &word evaluate to the same address (they differ only in type), namely the actual location of the array contents. There is no separate pointer object whose address you could pass. It works when you use a pointer because the pointer itself is stored at a different location, while it points to the same array. You don't need strdup; you simply need to create a pointer:

    char* tmp = word;
    HashSetLookup(stopList, &tmp);
casablanca
This works perfectly. Thank you.
Siggi
Glad it works. It's one of those subtle differences between an array and a pointer.
casablanca