+1  A: 

Edit: I originally had a malloced workspace, which I thought might be clearer. However, doing it w/o extra memory is almost as simple, and I'm being pushed that way in comments and personal IMs, so, here comes...:-)

void squeezespaces(char* row, char separator) {
  char *current = row;
  int spacing = 0;
  int i;

  for(i=0; row[i]; ++i) {
    if(row[i]==' ') {
      if (!spacing) {
        /* start of a run of spaces -> separator */
        *current++ = separator;
        spacing = 1;
      }
    } else {
      *current++ = row[i];
      spacing = 0;
    }
  }
  *current = 0;
}
Alex Martelli
doing a malloc seems a bit heavy handed!
Mitch Wheat
because you would allocate the memory how?
hhafez
I mean using malloc at all. Can be done in place...
Mitch Wheat
True that :) didn't think of that
hhafez
It's feasible to just overwrite the existing row (just make sure to append the closing `'\0'`!-), I just thought I'd make things simpler for the OP. If the OP comments and/or edits (best: _both_;-) to ask for "no auxiliary memory to be allocated", I'll edit the answer accordingly.
Alex Martelli
OK, I give up (and, you know who you are, stop IRC'ing and IM'ing me), here's the no-extra-memory version instead -- hopefully just as simple;-). Happy now?-)
Alex Martelli
+5  A: 

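Why not use strtok() directly? There's no need to squeeze the spaces first: strtok() treats a run of delimiters as a single separator. (It does overwrite each delimiter it finds with '\0', so hand it a writable buffer.)

All you need to do is call strtok() repeatedly until you have 3 tokens, and then you are done!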
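
A minimal sketch of that loop, assuming the row sits in a writable buffer and that spaces, tabs and newlines all count as delimiters. The helper name `split_columns` and its parameters are made up for illustration:

```c
#include <string.h>

/* Hypothetical helper: split `row` into at most `max` tokens.
   strtok() overwrites delimiters in `row` with '\0', so `row` must be
   writable. Returns the number of tokens stored in `cols`. */
int split_columns(char *row, char *cols[], int max)
{
    int n = 0;
    char *tok = strtok(row, " \t\n");

    while (tok != NULL && n < max) {
        cols[n++] = tok;               /* points into `row`, no copy made */
        tok = strtok(NULL, " \t\n");   /* continue with the same string */
    }
    return n;
}
```

Calling it with `max` set to 3 stops after the third column; the pointers in `cols` stay valid only as long as `row` does.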

hhafez
Hadn't thought of that!
Jergason
Happens to the best of us mate ;) see my comment on Alex Martelli's answer, I said the exact same thing lol
hhafez
+8  A: 

If I may voice the "you're doing it wrong" opinion, why not just eliminate the whitespace while reading? Use fscanf(fp, "%s", string); (where fp is your input FILE *) to read a "word" (non-whitespace), then read the whitespace that follows it. If it's spaces or tabs, keep reading into the same "line" of data. If it's a newline, start a new entry. It's probably easiest in C to get the data into a format you can work with as soon as possible, rather than trying to do heavy-duty text manipulation.
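
Roughly, that could look like the sketch below. The function name `join_words`, the 64-byte word buffer and the `%63s` width limit are my assumptions, not part of the suggestion above; a word longer than 63 characters would simply come out split in two.

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical sketch: copy whitespace-separated words from `in` to `out`,
   joining words on the same line with `sep` and ending each line with '\n'.
   Returns the number of non-empty lines written. */
int join_words(FILE *in, FILE *out, char sep)
{
    char word[64];
    int lines = 0;
    int first = 1;   /* is the next word the first one on its line? */
    int c;

    while (fscanf(in, "%63s", word) == 1) {
        if (!first)
            fputc(sep, out);
        fputs(word, out);
        first = 0;

        /* consume the whitespace after the word, watching for a newline */
        while ((c = getc(in)) != EOF && isspace(c)) {
            if (c == '\n' && !first) {
                fputc('\n', out);
                lines++;
                first = 1;
            }
        }
        if (c != EOF)
            ungetc(c, in);   /* first character of the next word */
    }
    if (!first) {            /* last line had no trailing newline */
        fputc('\n', out);
        lines++;
    }
    return lines;
}
```

Blank lines are skipped because `%s` itself discards leading whitespace, newlines included.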

Chris Lutz
That is something else I hadn't thought of. Hrrm.
Jergason
If you know you're going to be reading a fixed number of columns, you could make a linked-list structure with named entries for each column. If you don't, you still might want to encapsulate the data in a linked list structure to keep track of the number of entries. Or you can use a dynamic array, but be warned that you'll have to pass around the size of the array everywhere you go (which you've probably been warned about a million times already if you've ever read anything about C).
Chris Lutz
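For the dynamic-array option, one way to avoid passing the size around separately is to keep it in a struct next to the data. This is only a sketch; `struct columns`, `columns_push` and `columns_free` are hypothetical names, and the growth policy (start at 4, then double) is an arbitrary choice:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of column strings: the count and capacity
   travel with the data. */
struct columns {
    char  **items;
    size_t  count;
    size_t  cap;
};

/* Appends a copy of `text`. Returns 0 on success, -1 if allocation fails. */
int columns_push(struct columns *cols, const char *text)
{
    if (cols->count == cols->cap) {
        size_t ncap = cols->cap ? cols->cap * 2 : 4;
        char **p = realloc(cols->items, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        cols->items = p;
        cols->cap = ncap;
    }
    char *copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, text);
    cols->items[cols->count++] = copy;
    return 0;
}

/* Releases everything owned by `cols` and resets it to empty. */
void columns_free(struct columns *cols)
{
    for (size_t i = 0; i < cols->count; i++)
        free(cols->items[i]);
    free(cols->items);
    cols->items = NULL;
    cols->count = cols->cap = 0;
}
```

Start from a zero-initialized struct (`struct columns cols = {0};`) and call `columns_free` when you're done with it.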
I could always stand to be warned a few more times. C is not my strong point. I am a wussy scripting-language web programmer man.
Jergason
There's nothing wrong with scripting languages. My native tongue is Perl and I'm currently stretching out my Python, but I enjoy C as well. It's good to learn C, because you learn to do things a new way. It almost made my Perl unreadable because I wanted to process text one character at a time with `getc()` for about an hour until I remembered how much easier it was to do it the way Perl wants you to.
Chris Lutz
Be careful of buffer overflow when using fscanf: a bare %s puts no limit on how much it writes.
Robert
Echoing @Robert, use `fgets` in conjunction with `sscanf`.
Sinan Ünür
I could have sworn there was a way with `scanf()` to limit the number of characters read into a buffer, but I can't find it. Either way, the overflow risk should be noted. You can even use `strchr()` to find whitespace in order to allocate the right amount of space for each buffer before you `sscanf()` the buffer.
Chris Lutz
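For the record, the limit asked about above is a maximum field width in the conversion: `%15s` stores at most 15 characters plus the terminating '\0'. A small sketch of the `fgets` + `sscanf` combination for a line that has already been read; `parse_three` and `COL_LEN` are illustrative names and sizes, not anything from the thread:

```c
#include <stdio.h>

#define COL_LEN 16   /* illustrative size: 15 characters + '\0' */

/* Hypothetical helper: extract up to three whitespace-separated columns
   from one line (e.g. a buffer filled by fgets). Each %15s writes at most
   15 characters plus '\0'. Returns how many columns were filled (0..3). */
int parse_three(const char *line,
                char a[COL_LEN], char b[COL_LEN], char c[COL_LEN])
{
    int n = sscanf(line, "%15s %15s %15s", a, b, c);
    return n < 0 ? 0 : n;   /* sscanf returns EOF for an all-blank line */
}
```

Note that an over-long word is not rejected; the remainder is picked up by the next conversion, so check lengths if that matters.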
A: 

You could read a line then scan it to find the start of each column. Then use the column data however you'd like.

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_COL 3
#define MAX_REC 512

int main (void)
{
    FILE *input;
    char record[MAX_REC + 1];
    char *scan;
    const char *recEnd;
    char *columns[MAX_COL] = { 0 };
    int colCnt;

    input = fopen("input.txt", "r");
    if (input == NULL)
    {
        perror("input.txt");
        return 1;
    }

    while (fgets(record, sizeof(record), input) != NULL)
    {
        memset(columns, 0, sizeof(columns));  // reset column start pointers

        scan = record;
        recEnd = record + strlen(record);

        for (colCnt = 0; colCnt < MAX_COL; colCnt++ )
        {
            while (scan < recEnd && isspace((unsigned char)*scan)) { scan++; }  // bypass whitespace
            if (scan == recEnd) { break; }
            columns[colCnt] = scan;  // save column start
            while (scan < recEnd && !isspace((unsigned char)*scan)) { scan++; }  // bypass column word
            if (scan < recEnd) { *scan++ = '\0'; }  // terminate the word without stepping past the record's end
        }

        if (colCnt > 0)
        {
            printf("%s", columns[0]);
            for (int i = 1; i < colCnt; i++)
            {
                printf("#%s", columns[i]);
            }
            printf("\n");
        }
    }

    fclose(input);
    return 0;
}

Note, the code could still use some robust-ification: check for read errors with ferror; confirm the loop actually ended at end-of-file with feof; ensure the entire record (all column data) was processed, since a line longer than MAX_REC is silently split across two reads. It could also be made more flexible by using a linked list instead of a fixed array, and could be modified to not assume each column only contains a single word (as long as the columns are delimited by a specific character).
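
A rough sketch of what those checks might look like around the read loop. The wrapper name `count_records` and its return convention are assumptions for illustration only:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around the read loop: returns the number of lines
   read, or -1 if a read error occurred or a line did not fit in `record`. */
long count_records(FILE *input, char *record, int size)
{
    long n = 0;

    while (fgets(record, size, input) != NULL) {
        size_t len = strlen(record);

        /* No newline and not at end of file: treat the line as too long
           for the buffer. (Conservative: a final unterminated line that
           exactly fills the buffer is rejected as well.) */
        if (len > 0 && record[len - 1] != '\n' && !feof(input))
            return -1;
        n++;
    }
    if (ferror(input))   /* loop ended because of an error, not EOF */
        return -1;
    return n;
}
```

In the real program the column splitting would go where `n++` is.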

Robert
A: 

The following code modifies the string in place; if you don't want to destroy your original input, you can pass a second buffer to receive the modified string. Should be fairly self-explanatory:

#include <stdio.h>
#include <string.h>
#include <ctype.h>  /* isspace, iscntrl */

char *squeeze(char *str)
{
  int r; /* next character to be read */
  int w; /* next character to be written */

  r=w=0;
  while (str[r])
  {
    if (isspace((unsigned char)str[r]) || iscntrl((unsigned char)str[r]))
    {
      if (w > 0 && !isspace((unsigned char)str[w-1]))
        str[w++] = ' ';
    }
    else
      str[w++] = str[r];
    r++;
  }
  str[w] = 0;
  return str;
}

int main(void)
{
  char test[] = "\t\nThis\nis\ta\b     test.";
  printf("test = %s\n", test);
  printf("squeeze(test) = %s\n", squeeze(test));
  return 0;
}
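
The two-buffer variant mentioned at the top might look something like this. It's a sketch only: the name `squeeze_copy`, the parameter order and the NULL-on-overflow convention are my own assumptions.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical two-buffer variant: writes the squeezed form of `src` into
   `dst` (capacity `dstsize` bytes, including the '\0') and leaves `src`
   untouched. Returns dst, or NULL if dst is too small; in that case dst
   holds the '\0'-terminated part that did fit. */
char *squeeze_copy(char *dst, size_t dstsize, const char *src)
{
    size_t w = 0;   /* next position to write in dst */

    if (dstsize == 0)
        return NULL;
    for (; *src; src++) {
        unsigned char ch = (unsigned char)*src;

        if (isspace(ch) || iscntrl(ch)) {
            if (w == 0 || dst[w - 1] == ' ')
                continue;   /* drop leading and repeated whitespace */
            ch = ' ';
        }
        if (w + 1 >= dstsize) {   /* keep room for the terminator */
            dst[w] = '\0';
            return NULL;
        }
        dst[w++] = (char)ch;
    }
    dst[w] = '\0';
    return dst;
}
```

With the test string from main above, this produces "This is a test." in the destination and leaves the input as it was.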
John Bode
A: 

Here's an alternative function that squeezes out repeated space characters, as defined by isspace() in <ctype.h>. It returns the length of the 'squidged' string.

#include <ctype.h>

size_t squidge(char *str)
{
    char *dst = str;
    char *src = str;
    char  c;
    while ((c = *src++) != '\0')
    {
        if (isspace((unsigned char)c))
        {
            *dst++ = ' ';
            while ((c = *src++) != '\0' && isspace((unsigned char)c))
                ;
            if (c == '\0')
                break;
        }
        *dst++ = c;
    }
    *dst = '\0';
    return(dst - str);
}

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buffer[256];
    while (fgets(buffer, sizeof(buffer), stdin) != 0)
    {
        size_t len = strlen(buffer);
        if (len > 0 && buffer[len - 1] == '\n')  /* strip the newline only if there is one */
            buffer[--len] = '\0';
        printf("Before: %zu <<%s>>\n", len, buffer);
        len = squidge(buffer);
        printf("After:  %zu <<%s>>\n", len, buffer);
    }
    return(0);
}
Jonathan Leffler