views:

180

answers:

3

Hi all,

I'm writing a C code for a class. This class requires that our code compile and run on the school server, which is a sparc solaris machine. I'm running Linux x64.

I have this line to parse (THIS IS NOT ACTUAL CODE BUT IS INPUT TO MY PROGRAM):

while ( cond1 ){

I need to capture the "while" and the "cond1" into separate strings. I've been using strtok() to do this. In Linux, the following lines:

char *cond = NULL;
cond = (char *)malloc(sizeof(char));
memset(cond, 0, sizeof(char));
strcpy(cond, strtok(NULL, ": \t\(){")); //already got the "while" out of the line

will correctly capture the string "cond1".Running this on the solaris machine, however, gives me the string "cone1".

Note that in plenty of other cases within my program, strings are being copied correctly. (For instance, the "while") was captured correctly.

Does anyone know what is going on here?

+10  A: 

The line:

cond = (char *)malloc(sizeof(char));

allocates exactly one char for storage, into which you are then copying more than one - strcpy needs to put, at a bare minimum, the null terminator but, in your case, also the results of your strtok as well.

The reason it may work on a different system is that some implementations of malloc will allocate at a certain resolution (e.g., a multiple of 16 bytes) no matter what actual value you ask for, so you may have some free space there at the end of your buffer. But what you're attempting is still very much undefined behaviour.

The fact that the undefined behaviour may be to work sometimes in no way abrogates your responsibility to avoid such behaviour.

Allocate enough space for storing the results of your strtok and you should be okay.

The safest way to do this is to dynamically allocate the space so that it's at least as big as the string you're passing to strtok. That way there can be no possibility of overflow (other than weird edge cases where other threads may modify the data behind your back but, if that were the case, strtok would be a very bad choice anyway.

Something like (if instr is your original input string):

cond = (char*)malloc(strlen(instr)+1);

This guarantees that any token extracted from instr will fit within cond.

As an aside, sizeof(char) is always 1 by definition, so you don't need to multiply by it.

paxdiablo
Do something like `char *token = strtok(NULL, "stuff");` then allocate your new buffer as `cond = (char*)malloc(strlen(token)+1);` then `strcpy(cond, token);` You save some memory depending on how long those tokens are kept around, and if the input string happens to be a monster.
John
The problem with that is that you're mallocing and freeing for every single token instead of every input string (line, most likely). I suspect the performance hit for that would outweigh any gain in less memory used, unless the lines were truly huge. But that's a subjective thing and depends on your needs, so I won't argue the point too much.
paxdiablo
+1  A: 

You only allocated memory for 1 character but you trying to store at least 6 characters (you need space for the terminating \0). The quick and dirty way to solve this is just say

char cond[128]

instead of malloc.

Apprentice Queue
+2  A: 

cond is being allocated one byte. strcpy is copying at least two bytes to that allocation. That is, you are writing more bytes into the allocation than there is room for.

One way to fix it to use char *cond = malloc (1000); instead of what you've got.

wallyk
Good call. Is it just pure luck, then, that other strings that are way longer than this are being copied correctly?
strictlyrude27
@strictlyrude27: Yes, pure coincidence.
Michael Foukarakis