This is a classic problem, but I cannot find a simple solution.

I have an input file like:

1 3 9 13 23 25 34 36 38 40 52 54 59 
2 3 9 14 23 26 34 36 39 40 52 55 59 63 67 76 85 86 90 93 99 108 114 
2 4 9 15 23 27 34 36 63 67 76 85 86 90 93 99 108 115 
1 25 34 36 38 41 52 54 59 63 67 76 85 86 90 93 98 107 113 
2 3 9 16 24 28 
2 3 10 14 23 26 34 36 39 41 52 55 59 63 67 76 

Each line holds a variable number of integers separated by spaces.

I would like to parse them into a single array, separating each line with a marker, let's say -1.

The difficulty is that I must handle both integers and line breaks.

Here is my existing code. It loops forever in the sscanf loop, because sscanf always starts at the beginning of the string and cannot resume from a given position.

#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {

  if (argc != 4) {
    fprintf(stderr, "Usage: %s <data file> <nb transactions> <nb items>\n", argv[0]);
    return 1;
  }
  FILE * file;
  file = fopen (argv[1],"r");
  if (file==NULL) {
    fprintf(stderr, "Error: can not open %s\n", argv[1]);
    return 1;
  }
  int nb_trans = atoi(argv[2]);
  int nb_items = atoi(argv[3]);
  int *bdd = malloc(sizeof(int) * (nb_trans + nb_items));
  char line[1024];
  int i = 0;

  while ( fgets(line, 1024, file) ) {
    int item;
    while ( sscanf (line, "%d ", &item )){
      printf("%s %d %d\n", line, i, item);
      bdd[i++] = item;
    }
    bdd[i++] = -1;
  }

  for ( i = 0; i < nb_trans + nb_items; i++ ) {
    printf("%d ", bdd[i]);
  }
  printf("\n");
}
+1  A: 

Read in the input as a string, do a search for a newline, create a new string with -1 where the newline would be, and repeat this until all newlines are replaced with -1. While you're doing this, you could also count the number of spaces so you'll know how large to declare your array. (You should probably do that after replacing the newlines, though.)

Then create your array.

Next, use sscanf or something to interpret the integers from the string in a loop and add them to the array in the right place until all the integers (including the -1s) have been interpreted.
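
A minimal sketch of the sizing pass, assuming the whole input is already in memory as one string. The name count_slots is hypothetical. It counts number tokens rather than spaces, because the sample lines end with a trailing space, and it adds one slot per newline for the -1 marker:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns how many array slots the parsed input needs,
 * i.e. one per whitespace-separated token plus one per newline (for -1). */
size_t count_slots(const char *text)
{
    assert(text != NULL);
    size_t slots = 0;
    int in_token = 0;   /* nonzero while scanning inside a token */

    for (const char *p = text; *p != '\0'; p++) {
        if (*p == '\n') {
            slots++;            /* room for the -1 line separator */
            in_token = 0;
        } else if (isspace((unsigned char)*p)) {
            in_token = 0;
        } else if (!in_token) {
            slots++;            /* first character of a new token */
            in_token = 1;
        }
    }
    return slots;
}
```

A last line without a terminating newline gets no separator slot here, so add one by hand if the file may end that way.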

EDIT: ...And that seems to be pretty close to what you're doing already, going by the code you added to your question while I was typing up my answer.

JAB
+3  A: 

You have a number of options open to you, but this is in general how I would attack it:

Read in the input file as a text file - that is as a bunch of strings - with fgets(). This will read until a line break or EOF is hit. Use a string tokenizer function that scans each line read for spaces and returns the substring before the space. You now have a string representation of an integer. Parse that into an actual int if you wish, or store the substring itself in your array. If you do switch it to an int, you need to beware of overflow if it gets too big.
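
A minimal sketch of the tokenizing step, assuming strtok as the tokenizer and strtol for the conversion (the text above does not name specific functions). The helper name parse_line_tokens is hypothetical; the caller is expected to read each line with fgets and append -1 after the returned values:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `line` on whitespace with strtok, converts
 * each token with strtol, and stores the values in out[]. Returns the number
 * of values stored, or -1 if a token is not a valid int or out[] is too
 * small. Note that strtok modifies `line` in place. */
int parse_line_tokens(char *line, int *out, int max_out)
{
    assert(line != NULL && out != NULL);
    int count = 0;

    for (char *tok = strtok(line, " \t\r\n"); tok != NULL;
         tok = strtok(NULL, " \t\r\n")) {
        char *end;
        errno = 0;
        long value = strtol(tok, &end, 10);

        /* Reject trailing junk ("12x") and values outside the int range. */
        if (*end != '\0' || errno == ERANGE ||
            value < INT_MIN || value > INT_MAX)
            return -1;
        if (count >= max_out)
            return -1;
        out[count++] = (int)value;
    }
    return count;
}
```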

Michael Dorgan
This seems a bit more efficient than my answer. Ah well, it's been a while since I've actually used C. I should practice with it more.
JAB
It's been a while for me as well, but I've not needed to use C++ string stuff much yet so it all is still fairly fresh for me. Thanks for the comment.
Michael Dorgan
A: 

OK, I found the solution. Sorry for the noise, I should have searched more...

http://stackoverflow.com/questions/2195823/reading-unknown-number-of-integers-from-stdin-c

Instead of my scanf loop, use this one:

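  while ( fgets(line, 1024, file) ) {
    char *p, *e;  /* p: current parse position, e: end of the number just read */
    int item;
    for (p = line; ; p = e) {
        item = strtol(p, &e, 10);
        if (p == e)  /* no digits consumed: end of line */
            break;
        bdd[i++] = item;
    }
    bdd[i++] = -1;
  }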
Jérôme
Lol, looks a lot like my suggestion :)
Michael Dorgan
It is. The key piece of this solution is the function strtol, which gives you back a pointer to the end of the parsed number.
Jérôme
Again, beware of integer overflow. I'm not sure how strtol will handle a string too big to fit into a long.
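
For reference, the C standard says that when the value is out of range, strtol returns LONG_MAX or LONG_MIN and sets errno to ERANGE. A minimal sketch of a checked conversion follows; the helper name parse_long_checked is hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts the number at the start of `text`.
 * Returns 1 and stores the value on success, 0 if no digits were found,
 * and -1 if the value does not fit in a long. */
int parse_long_checked(const char *text, long *value)
{
    assert(text != NULL && value != NULL);
    char *end;

    errno = 0;                      /* strtol never resets errno itself */
    long v = strtol(text, &end, 10);

    if (end == text)
        return 0;                   /* nothing was converted */
    if (errno == ERANGE)
        return -1;                  /* v is LONG_MAX or LONG_MIN here */
    *value = v;
    return 1;
}
```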
Michael Dorgan
OK thanks. In my case, I am sure that the entries fit in an int (actually they are never more than 10000), so no problem on this side.
Jérôme
A: 

Here's a complete C program that shows how you can do this. It basically reads in one line at a time with fgets, then uses sscanf to process each of the integers on that line.

It has rudimentary error checking, but it hasn't been tested with bad data (like non-numerics). Still, it should be a good start. Just replace the printf statements with code that will append each number to an array:

#include <stdio.h>
#include <string.h>
#include <errno.h>

int main (void) {
    char line[1000];
    FILE *fIn;
    char *str;
    int val, num;

    // Open input file and process line by line.

    if ((fIn = fopen ("infile.txt", "r")) == NULL) {
        fprintf (stderr, "Cannot open infile.txt, errno = %d\n", errno);
        return 1;
    }

    while (fgets (line, sizeof (line), fIn) != NULL) {
        // Check if line was too long.

        if (line[strlen (line) - 1] != '\n') {
            fprintf (stderr, "Line too long: [%s...]\n", line);
            fclose (fIn);
            return 1;
        }

        // Output the line and start processing it.

        printf ("%s   ", line);
        str = line;

        // Skip white space and scan first integer.

        while (*str == ' ') str++;

        num = sscanf (str, "%d", &val);

        // Process the integer if it was there.

        while ((num != 0) && (num != EOF)) {
            // Print it out then skip to next.

            printf ("[%d] ", val);
            while ((*str != ' ') && (*str != '\0')) str++;
            while (*str == ' ') str++;
            num = sscanf (str, "%d", &val);
        }

        // -1 for line separator.

        printf ("[%d]\n", -1);
    }

    // Close input file and exit.

    fclose (fIn);

    return 0;
}

And here's the output to show you that it's working:

1 3 9 13 23 25 34 36 38 40 52 54 59
   [1] [3] [9] [13] [23] [25] [34] [36] [38] [40] [52] [54] [59] [-1]
2 3 9 14 23 26 34 36 39 40 52 55 59 63 67 76 85 86 90 93 99 108 114
   [2] [3] [9] [14] [23] [26] [34] [36] [39] [40] [52] [55] [59] [63] [67] [76] [85] [86] [90] [93] [99] [108] [114] [-1]
2 4 9 15 23 27 34 36 63 67 76 85 86 90 93 99 108 115
   [2] [4] [9] [15] [23] [27] [34] [36] [63] [67] [76] [85] [86] [90] [93] [99] [108] [115] [-1]
1 25 34 36 38 41 52 54 59 63 67 76 85 86 90 93 98 107 113
   [1] [25] [34] [36] [38] [41] [52] [54] [59] [63] [67] [76] [85] [86] [90] [93] [98] [107] [113] [-1]
2 3 9 16 24 28
   [2] [3] [9] [16] [24] [28] [-1]
2 3 10 14 23 26 34 36 39 41 52 55 59 63 67 76
   [2] [3] [10] [14] [23] [26] [34] [36] [39] [41] [52] [55] [59] [63] [67] [76] [-1]
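
As a rough sketch of the "append each number to an array" step, here is one possible shape. The IntVec type and both function names are hypothetical, and this variant advances through the line with sscanf's %n conversion instead of the manual pointer skipping in the program above:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable array of ints (names are illustrative only). */
typedef struct {
    int *data;
    size_t len;
    size_t cap;
} IntVec;

/* Appends one value, doubling the capacity when full. Returns 0 on success. */
int intvec_push(IntVec *v, int value)
{
    assert(v != NULL);
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 16;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Appends every integer found in `line`, then the -1 separator.
 * "%n" stores how many characters sscanf has consumed so far, so the
 * next call starts right after the number that was just read. */
int append_line(IntVec *v, const char *line)
{
    assert(line != NULL);
    int val, used;

    while (sscanf(line, "%d%n", &val, &used) == 1) {
        if (intvec_push(v, val) != 0)
            return -1;
        line += used;
    }
    return intvec_push(v, -1);
}
```

Start from a zero-initialized IntVec, call append_line once per line returned by fgets, and free the data member when done.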
paxdiablo