How can I strip all \n and \t characters from a string in C?

+4  A: 

This works in my quick and dirty tests. It strips the string in place:

#include <stdio.h>

void strip(char *s) {
    char *p2 = s;
    while(*s != '\0') {
        if(*s != '\t' && *s != '\n') {
            *p2++ = *s++;
        } else {
            ++s;
        }
    }
    *p2 = '\0';
}

int main() {
    char buf[] = "this\t is\n a\t test\n test";
    strip(buf);
    printf("%s\n", buf);
}

And to appease Chris, here is a version which will make a copy first and return it (thus it'll work on literals). You will need to free the result.

#include <stdlib.h>
#include <string.h>

char *strip_copy(const char *s) {
    char *p = malloc(strlen(s) + 1);
    if(p) {
        char *p2 = p;
        while(*s != '\0') {
            if(*s != '\t' && *s != '\n') {
                *p2++ = *s++;
            } else {
                ++s;
            }
        }
        *p2 = '\0';
    }
    return p;
}
Evan Teran
It's rather unsafe to do this operation in place. What if the first line of `main()` changed to `char *buf = ...;` ? What if this was used in more convoluted code and the coders had forgotten which arguments were writable buffers and which weren't?
Chris Lutz
if you need to do it in a copy, then you can always make a copy first. Such code is still fairly simple.
Evan Teran
@Lutz, I don't see how passing `char*` or `char[]` makes a difference. Also, there's a bug in `strip`: the string isn't null-terminated.
strager
@strager: it isn't a string if it isn't terminated...
Evan Teran
@strager: he is concerned about passing something like: `char *buf = "hello\tworld"`, which is illegal because you can't modify a string literal through a pointer. My `strip_copy` addresses that.
Evan Teran
@Teran, You can in C89, IIRC. But I see your point.
strager
@strager - I don't think you could do that in C89. GCC on OS X won't let me do it in any mode.
Chris Lutz
But +1 for defying my expectation that strip-in-place would be inefficient.
Chris Lutz
@Chris: thanks :-). I've made an efficient version of strip_copy which (to my knowledge) does no more work than necessary ;).
Evan Teran
It might be useful to also pass in the string "\t\n" as a parameter, so you can reuse the function to strip other characters.
gnibbler
@gnibbler - Then Evan would just be writing the function I outlined in my answer.
Chris Lutz
@Chris: indeed, we'll leave that one as an exercise for the reader.
Evan Teran
+2  A: 

Basically, you have two ways to do this: you can create a copy of the original string, minus all '\t' and '\n' characters, or you can strip the string "in-place." However, I bet money that the first option will be faster, and I promise you it will be safer.

So we'll make a function:

char *strip(const char *str, const char *d);

We want to use strlen() and malloc() to allocate a new char * buffer the same size as our str buffer. Then we go through str character by character. If the character is not contained in d, we copy it into our new buffer. We can use something like strchr() to see if each character is in the string d. Once we're done, we have a new buffer, with the contents of our old buffer minus characters in the string d, so we just return that. I won't give you sample code, because this might be homework, but here's the sample usage to show you how it solves your problem:

const char *string = "some\n text\t to strip";

char *stripped = strip(string, "\t\n");
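For readers who want to check their own attempt, here is one possible sketch of the function described above. It is not the answer author's code. The name `strip_chars` is hypothetical (chosen so it does not clash with the in-place `strip` from the other answer), and returning NULL on allocation failure is an assumption; the signature otherwise matches `strip(str, d)`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name: strip_chars. Returns a newly allocated copy of str
   with every character that appears in d removed, or NULL if malloc fails.
   The caller must free() the result. */
char *strip_chars(const char *str, const char *d)
{
    char *out = malloc(strlen(str) + 1);  /* result is never longer than str */
    if (out == NULL)
        return NULL;

    char *dst = out;
    for (; *str != '\0'; ++str) {
        /* strchr returns NULL when *str is not one of the characters in d */
        if (strchr(d, *str) == NULL)
            *dst++ = *str;
    }
    *dst = '\0';
    return out;
}
```

Because the input is only read, this is safe to call on a string literal, e.g. `strip_chars("some\n text\t to strip", "\t\n")`; remember to `free()` the returned buffer.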
Chris Lutz
+1 for providing all the information needed to learn **how** to do it. Odds are he's a beginner and this information will be useful.
Evan Teran
+2  A: 

If you want to replace \n or \t with something else, you can use the function strstr(). It returns a pointer to the first place in a string where a given substring occurs, or NULL if the substring is not found. For example:

// Find the first "\n" and overwrite it.
char new_char = 't';
char* pFirstN = strstr(szMyString, "\n");
if (pFirstN != NULL) {  // strstr returns NULL when there is no match
    *pFirstN = new_char;
}

You can run that in a loop to find all \n's and \t's.

If you want to "strip" them, i.e. remove them from the string, you'll need to actually use the same method as above, but copy the contents of the string "back" every time you find a \n or \t, so that "this i\ns a test" becomes: "this is a test".

You can do that with memmove (not memcpy, since the src and dst are pointing to overlapping memory), like so:

char* temp;
// Remove every \n.
while ((temp = strstr(str, "\n")) != NULL) {
    // len is the length of the string from the \n onward, including the \n.
    // Moving len bytes from temp + 1 copies the tail plus its terminating '\0'.
    size_t len = strlen(temp);
    memmove(temp, temp + 1, len);
}

You'll need to repeat this loop again to remove the \t's.
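As a rough sketch of how both characters could be handled in one loop, the variant below swaps `strstr()` for `strpbrk()`, which matches any character from a set, and resumes each search from the previous hit rather than from the start of the string. The helper name `remove_chars` is hypothetical and not part of the original answer.

```c
#include <string.h>

/* Hypothetical helper: removes every character listed in set from str,
   in place. str must be a writable buffer, not a string literal. */
void remove_chars(char *str, const char *set)
{
    char *hit = str;
    /* strpbrk finds the next character that is in set; searching from the
       previous hit avoids rescanning the part already cleaned. */
    while ((hit = strpbrk(hit, set)) != NULL) {
        /* strlen(hit) counts the matched character plus the tail, so moving
           that many bytes from hit + 1 copies the tail and its '\0'. */
        memmove(hit, hit + 1, strlen(hit));
    }
}
```

Calling `remove_chars(buf, "\t\n")` on a writable `buf` strips both characters. The tail is still shifted once per removed character, so a single-pass copy loop remains the better choice for long strings.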

Note: Both of these methods work in place. This might not be safe! (Read Evan Teran's comments for details.) Also, these methods are not very efficient, although they do use library functions for some of the work instead of rolling your own.

Edan Maor
This looks like it would be very inefficient for a long string. You are searching the string (from the start) over and over again. In addition, you are doing a strlen each and every time a char is found. Finally, you are copying the tail end of the string each time a char is found...
Evan Teran
You're right, efficiency definitely wasn't a big concern (this is actually lifted straight from code I had lying around, which only worked with small strings). The strlen can actually be moved outside the loop, I think, and the search can start each time from the last found location instead of the beginning, negating the "search every time" problem. However, I don't see any way of removing the need to memmove, if he wants it done in place. Any suggestions?
Edan Maor
If you want to see an efficient in place strip then take a look at my answer :-P.
Evan Teran
Yep, I looked at it right after posting my comment, now I feel stupid :) At least now I have something to change in my project...
Edan Maor
+1 for a working answer and providing relevant caveats.
Evan Teran
This is what I thought the answer was before I saw Evan's efficient version. +1 for library functions - `memmove()` is sadly relatively unknown.
Chris Lutz
A: 

strpbrk() is a C string function that finds the first occurrence in s of any character in accept and returns a pointer to that position, or NULL if none is found.

#include <string.h>

char *strpbrk(const char *s, const char *accept);

Example:

char search[] = "a string with \t and \n";

char *first_occ = strpbrk( search, "\t\n" );

first_occ will point to the \t, which is the 15th character in search. You can replace it, then call strpbrk() again in a loop until every occurrence has been replaced.
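A minimal sketch of that replace-in-a-loop idea follows. The helper name `replace_chars` and its return value are assumptions made for illustration, and it assumes `replacement` is not `'\0'`.

```c
#include <string.h>

/* Hypothetical helper: overwrites every character of s that appears in
   accept with replacement, and returns how many characters were replaced.
   s must be a writable buffer. */
size_t replace_chars(char *s, const char *accept, char replacement)
{
    size_t count = 0;
    char *p = s;
    while ((p = strpbrk(p, accept)) != NULL) {
        *p++ = replacement;  /* step past the hit so the loop always advances */
        ++count;
    }
    return count;
}
```

With the `search` array from the example, `replace_chars(search, "\t\n", ' ')` would turn the tab and the newline into spaces.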

Hayato
A: 

I like to make the standard library do as much of the work as possible, so I would use something similar to Evan's solution but with strspn() and strcspn().

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SPACE " \t\r\n"

static void strip(char *s);
static char *strip_copy(char const *s);

int main(void)
{
    char s[] = "this\t is\n a\t test\n test";
    char *s1 = strip_copy(s);
    strip(s);
    printf("%s\n%s\n", s, s1 ? s1 : "(out of memory)");
    free(s1);   /* strip_copy returns malloc'd memory */
    return 0;
}

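static void strip(char *s)
{
    char *p = s;
    size_t n;
    while (*s)
    {
        n = strcspn(s, SPACE);
        /* Source and destination can overlap here, so use memmove;
           strncpy on overlapping buffers is undefined behavior. */
        memmove(p, s, n);
        p += n;
        s += n + strspn(s+n, SPACE);
    }
    *p = 0;
}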

static char *strip_copy(char const *s)
{
    char *buf = malloc(1 + strlen(s));
    if (buf)
    {
        char *p = buf;
        char const *q;
        size_t n;
        for (q = s; *q; q += n + strspn(q+n, SPACE))
        {
            n = strcspn(q, SPACE);
            memcpy(p, q, n);
            p += n;
        }
        *p++ = '\0';
        /* Shrink to fit; keep the original block if realloc fails. */
        char *shrunk = realloc(buf, p - buf);
        if (shrunk)
            buf = shrunk;
    }
    return buf;
}
finnw