Hello,

I have developed my own version of strtok, just to practice using pointers.

Can anyone see any limitations with this, or any way I can improve it?

#include <stdio.h>

void stvstrtok(const char *source, char *dest, const char token)
{
    /* Search for the token. */
    int i = 0;
    while(*source)
    {
        *dest++ = *source++;
        if(*source == token)
        {
            source++;
        }
    }
    *dest++ = '\0';
}

int main(void)
{
    char *long_name = "dog,sat ,on ,the,rug,in ,front,of,the,fire";
    char buffer[sizeof(long_name)/sizeof(*long_name)];

    stvstrtok(long_name, buffer, ',');

    printf("buffer: %s\n", buffer);

    getchar();

    return 0;
}
+1  A: 

strtok allows you to iterate through all the tokens. It does this by assuming that the source string is writable and inserting null characters into it at token breaks. Each token it returns is a pointer to a character offset within the source buffer. You can use this fact to know when you have reached the end, and also to keep "state" between calls.

strtok is not a good function to use, since it destroys the source string. It is also not re-entrant.
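
To make the in-place behaviour concrete, here is a minimal sketch. The helper name split_in_place and its parameters are made up for illustration; only strtok itself comes from the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits s in place with strtok and stores up to
   max_tokens token pointers in tokens. Returns the number stored. */
size_t split_in_place(char *s, const char *delims,
                      char **tokens, size_t max_tokens)
{
    size_t n = 0;
    char *tok;

    assert(s != NULL && delims != NULL && tokens != NULL);

    tok = strtok(s, delims);            /* first call: pass the string */
    while (tok != NULL && n < max_tokens) {
        tokens[n++] = tok;              /* points inside s, nothing is copied */
        tok = strtok(NULL, delims);     /* later calls: pass NULL */
    }
    return n;
}
```

Calling it on a writable array such as char text[] = "dog,sat,on" with "," as the delimiter set stores three pointers, and afterwards text itself reads as just "dog" because the first comma has been overwritten with '\0'. Passing a string literal instead would be undefined behaviour, since strtok writes to its input.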

an0nym0usc0ward
as for re-entrancy: strtok_r()
phresnel
+1  A: 

strtok() will save some state, so you can call it multiple times to get multiple tokens. Also, strtok() will "split" the source string so you get multiple destination strings, each one being a token.

All your code does, from what I see, is ignore any input char that equals the token separator, and keep copying up to the null terminator of the source.

edit: Additionally, consider what happens with two consecutive token separators: the first will be ignored by your function and the second will be written to the destination, whereas strtok() treats a sequence of two or more delimiters as a single delimiter (man page: http://man.cx/?page=strtok )
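
A quick way to check this is to run the loop from the question on an input with a doubled separator. The sketch below reproduces that loop; the assert precondition check is an addition for illustration, not part of the original.

```c
#include <assert.h>
#include <stddef.h>

/* The loop from the question, with a precondition check added.
   dest must be large enough for the result; no bounds are checked. */
void stvstrtok(const char *source, char *dest, const char token)
{
    assert(source != NULL && dest != NULL);

    while (*source) {
        *dest++ = *source++;      /* copy one character unconditionally */
        if (*source == token) {   /* then skip at most ONE separator */
            source++;
        }
    }
    *dest = '\0';
}
```

With the input "a,,b" and ',' as the separator, the destination ends up as "a,b": the first comma is skipped and the second is copied. A separator in the very first position is copied as well, because the copy happens before the check.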

phresnel
+4  A: 

A side note: the word 'token' is usually used to describe the parts of the string that are returned, and 'delimiter' is used to describe the thing that separates the tokens. So to make your code clearer you should rename token to delimiter and rename dest to token_dest.

Differences in your function and strtok:

There are several differences between your function and strtok.

  • What your function does is simply remove the token separators
  • You only call your function once to process all parts of the string. With strtok you call it repeatedly, once for each part of the string (subsequent times with NULL as the first param).
  • strtok also destroys the source string, whereas your code uses its own buffer (I think better to use your own buffer as you did).
  • strtok stores the position of the next token after each call. That saved position is then used on subsequent calls where the first parameter is NULL. This is not thread safe, whereas your function is.
  • strtok can use multiple different delimiters, whereas your code uses just one.

That being said, I will give suggestions for how to make a better function, not a function that is closer to strtok's implementation.

How to improve your function (not emulate strtok):

I think it would be better to make the following changes:

  • Have your function simply return the 'next' token
  • Break out of your loop when you have !*source or *source == delimiter
  • Return a pointer to the first character of your source string that contains the next token. This pointer can be used for subsequent calls.
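
The three suggestions above could look roughly like the sketch below. The name next_token and the dest_size parameter are assumptions added for illustration; they are not in the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: copies the next token from source into token_dest,
   writing at most dest_size bytes including the terminator (longer tokens
   are truncated). Returns a pointer to the start of the following token,
   or NULL once the end of source has been reached. */
const char *next_token(const char *source, char *token_dest,
                       size_t dest_size, char delimiter)
{
    size_t n = 0;

    assert(source != NULL && token_dest != NULL && dest_size > 0);

    /* Stop at the end of the string or at the delimiter. */
    while (*source != '\0' && *source != delimiter) {
        if (n + 1 < dest_size) {
            token_dest[n++] = *source;
        }
        source++;
    }
    token_dest[n] = '\0';

    /* If we stopped on a delimiter, the next token starts right after it. */
    return (*source != '\0') ? source + 1 : NULL;
}
```

A caller would start with p pointing at the whole string and keep doing p = next_token(p, tok, sizeof tok, ',') until p is NULL, using tok after each call. Unlike strtok, this version leaves the source untouched, keeps no hidden state, and reports an empty token between two adjacent delimiters.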
Brian R. Bondy
+2  A: 

This code doesn't function at all like strtok(). What were you trying to do, exactly? As for improvements, your code has a serious bug: if the length of source minus the number of delimiters your function skips is greater than or equal to the size of dest, you've got yourself a very classic stack buffer overflow, which seems somewhat ironic to me at the moment. In the main that you've posted this actually does happen, because buffer is sized with sizeof(long_name), which is the size of a pointer rather than the length of the string. Using the function elsewhere is bound to lead you into the path of uncertainty and the slough of despair.
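
One common way to guard against this is to pass the destination size in and refuse to write past it. The following is only a sketch: remove_delims and dest_size are hypothetical names, and unlike the question's loop it drops every delimiter rather than at most one at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded variant: copies source to dest while dropping every
   delim character, never writing more than dest_size bytes.
   Returns 0 on success, -1 if the output had to be truncated. */
int remove_delims(const char *source, char *dest, size_t dest_size, char delim)
{
    size_t n = 0;

    assert(source != NULL && dest != NULL && dest_size > 0);

    for (; *source != '\0'; source++) {
        if (*source == delim) {
            continue;               /* skip separators */
        }
        if (n + 1 >= dest_size) {   /* keep one byte for the terminator */
            dest[n] = '\0';
            return -1;
        }
        dest[n++] = *source;
    }
    dest[n] = '\0';
    return 0;
}
```

The caller passes sizeof buffer when buffer is a real array, and checks the return value to find out whether the result was cut short.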

ebencooke
+1  A: 

strtok destroys the input string with the NUL character, which makes it kind of hostile.

You also need to consider the case of "xyz,,pdq": how many tokens will strtok pull out of that string if ',' is the delimiter?

What do you want your function to do in this case?
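
To compare the two possible answers, here is a small sketch with two counting helpers. Both function names are made up for illustration; the first uses strtok, the second treats every delimiter as a field boundary so that empty fields are kept.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts tokens the way strtok sees them. Note that s is modified. */
size_t count_strtok_tokens(char *s, const char *delims)
{
    size_t n = 0;
    char *tok;

    assert(s != NULL && delims != NULL);

    for (tok = strtok(s, delims); tok != NULL; tok = strtok(NULL, delims)) {
        n++;
    }
    return n;
}

/* Counts fields when empty ones are kept: number of delimiters plus one. */
size_t count_fields(const char *s, char delim)
{
    size_t n = 1;

    assert(s != NULL);

    for (; *s != '\0'; s++) {
        if (*s == delim) {
            n++;
        }
    }
    return n;
}
```

For "xyz,,pdq" the strtok-based count is 2, because the run of commas is treated as one separator, while count_fields gives 3, with an empty field in the middle. Which of the two is right depends on whether empty fields carry meaning in your data, as they do in CSV-style input.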

EvilTeach
+1  A: 

Also, strtok(...) supports multiple delimiter characters. Look into the definitions of strspn(...) and strcspn(...), as they can be used to re-implement strtok(...).
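
As a rough illustration of that idea, here is a sketch of a re-entrant tokenizer built on those two functions. The name my_strtok_r is hypothetical; the parameter order simply mirrors POSIX strtok_r, and this is not meant to match any particular library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: pass the string on the first call and NULL on later
   calls; *save carries the position between calls instead of hidden state. */
char *my_strtok_r(char *str, const char *delims, char **save)
{
    char *end;

    assert(delims != NULL && save != NULL);

    if (str == NULL) {
        str = *save;                    /* continue where we left off */
    }
    str += strspn(str, delims);         /* skip leading delimiters */
    if (*str == '\0') {
        *save = str;
        return NULL;                    /* no more tokens */
    }
    end = str + strcspn(str, delims);   /* first delimiter after the token */
    if (*end != '\0') {
        *end = '\0';                    /* terminate the token in place */
        *save = end + 1;
    } else {
        *save = end;                    /* token ran to the end of the string */
    }
    return str;
}
```

On a writable buffer holding "xyz,,pdq;end" with the delimiter set ",;", successive calls return "xyz", "pdq" and "end", then NULL. Like strtok it modifies its input, but because the state lives in the caller's save pointer, two strings can be tokenized at the same time.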

Roboprog
+1  A: 

By the way, long_name is a pointer to char, so sizeof(long_name) is sizeof(char*), not the size of what long_name points to.
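
A small sketch of the difference, using made-up names (copy_size, as_pointer, as_array) and a shorter string than the one in the question:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes needed to hold a copy of s, terminator included. */
size_t copy_size(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;
}

/* The two kinds of declaration being contrasted. */
static const char *as_pointer = "dog,sat";  /* sizeof gives sizeof(char *) */
static const char as_array[] = "dog,sat";   /* sizeof gives 8: 7 chars + '\0' */
```

Here sizeof as_array is 8 while sizeof as_pointer is whatever a pointer occupies on the platform, commonly 4 or 8. So in the question's main, either declare long_name as an array (char long_name[] = "...") so that sizeof measures the string, or size the buffer from strlen(long_name) + 1.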