I already have some code that reads a text file using fscanf(), and now I need to modify it so that fields that were previously whitespace-free can contain whitespace. The text file is basically in the form of:

title: DATA
title: DATA
etc...

Each line is currently parsed using fgets(inputLine, 512, inputFile); sscanf(inputLine, "%*s %s", &data);, which reads the DATA field and ignores the title, but now some of the data fields need to allow spaces. I still need to ignore the title and the whitespace immediately after it, but then read in the rest of the line including the whitespace.

Is there any way to do this with the sscanf() function?

If not, what is the smallest change I can make to the code to handle the whitespace properly?

UPDATE: I edited the question to replace fscanf() with fgets() + sscanf(), which is what my code is actually using. I didn't really think it was relevant when I first wrote the question which is why I simplified it to fscanf().

+2  A: 

I highly suggest you stop using fscanf() and start using fgets() (which reads a whole line) and then parse the line that has been read.

This will give you considerably more freedom when parsing loosely formatted input.

Andreas Bonini
I updated the question to show that I actually do use fgets(), but I don't understand how exactly it will help. I still have to parse the line once I read it in.
Graphics Noob
Once you've got the entire string, walk it yourself rather than using sscanf.
Anon.
Yes, do it with pointers, or even better use regular expressions. If you used C++ I would have suggested boost; I don't know of any good C libraries but there must be some. I heard POSIX supports them.
Andreas Bonini
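The "walk it yourself" suggestion in the comments above could look roughly like the sketch below. It assumes the line was read with fgets() and that the title is the first run of non-blank characters; the helper name value_of is made up for illustration and is not from the thread.

```c
#include <string.h>

/* Hypothetical helper: returns a pointer to the value part of a
   "title: value with spaces" line and strips the line ending in place.
   The result points into line; nothing is copied. */
static char *value_of(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';     /* cut the newline left by fgets() */
    char *p = line + strcspn(line, " \t");  /* step over the title */
    p += strspn(p, " \t");                  /* step over the blanks after it */
    return p;
}
```

Because the pointer refers to the caller's buffer, the value has to be copied if it must outlive the next fgets() call.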
+2  A: 

The simplest thing would be to issue a

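fscanf(filePtr, "%*s");

to discard the first part and then just call fgets, which reads the rest of the line (including the whitespace after the title and the trailing newline):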

fgets(str, stringSize, filePtr);
Matteo Italia
+4  A: 

If you cannot use fgets() use the %[ conversion specifier (with the "exclude option"):

char buf[100];
fscanf(stdin, "%*s %99[^\n]", buf);
printf("value read: [%s]\n", buf);

But fgets() is way better.


Edit: version with fgets() + sscanf()

char buf[100], title[100];
fgets(buf, sizeof buf, stdin); /* expect string like "title: TITLE WITH SPACES" */
sscanf(buf, "%*s %99[^\n]", title);
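
The fgets() + sscanf() version above does not check whether anything was actually read. A minimal sketch of the same format string wrapped in a function that reports success is shown here; the names parse_value and VALUE_MAX are assumptions for illustration only.

```c
#include <stdio.h>

#define VALUE_MAX 99  /* hypothetical limit; must match the width in the format */

/* Skips the first whitespace-delimited token (the title) and copies the
   rest of the line, spaces included, into value.
   Returns 1 on success, 0 if nothing follows the title. */
static int parse_value(const char *line, char value[VALUE_MAX + 1])
{
    value[0] = '\0';
    /* %*s       read and discard the title
       ' '       skip the whitespace after it
       %99[^\n]  store up to 99 characters that are not a newline */
    return sscanf(line, "%*s %99[^\n]", value) == 1;
}
```

A caller would pass each line obtained from fgets() and skip or report the line when the function returns 0.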
pmg
For this particular case, how is `fgets` "way better"?
Pavel Minaev
Well ... the requirements keep changing (at first there were no spaces in the string). It isn't better for this particular case, but it is better to use `fgets()` now, in anticipation of the next change of requirements :)
pmg
I updated the question to show that I actually do use fgets() to read the line, then sscanf() to parse it, but is there a better way to parse the line after fgets()?
Graphics Noob
`fgets()` + `sscanf()` is a good way to parse (simple) strings.
pmg
If it gets more complicated, you can try regex. If it gets even more complicated, you'll need to write a proper parser (though something like flex/bison, or many other parser generators, would help with that).
Pavel Minaev
+2  A: 

If you insist on using scanf, and assuming that you want newline as a terminator, you can do this:

scanf("%*s %[^\n]", str);

Note, however, that the above, used exactly as written, is a bad idea because there's nothing to guard against str being overflowed (as scanf doesn't know its size). You can, of course, pick a maximum size and specify it as a field width, but then any valid input longer than that limit will be cut short.

If the size of the line, as defined by input format, isn't limited, then your only practical option is to use fgetc to read data char by char, periodically reallocating the buffer as you go. If you do that, then modifying it to drop all read chars until the first whitespace is fairly trivial.
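
A rough sketch of that fgetc approach is below. The function name read_value, the starting capacity of 16 and the doubling strategy are all assumptions chosen for illustration; it also assumes the title starts at the beginning of the line.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line from fp, discards the title and the
   blanks after it, and returns the rest in a malloc'd string (caller frees).
   Returns NULL at end of file or if allocation fails. */
static char *read_value(FILE *fp)
{
    int c;

    /* Drop the title: everything up to the first whitespace character. */
    while ((c = fgetc(fp)) != EOF && !isspace(c))
        ;
    if (c == EOF)
        return NULL;

    /* Drop blanks after the title, but stop at the newline so an empty
       value does not swallow the next line. */
    while (c == ' ' || c == '\t')
        c = fgetc(fp);

    size_t len = 0, cap = 16;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while (c != EOF && c != '\n') {
        if (len + 1 == cap) {               /* keep one byte for the terminator */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
        c = fgetc(fp);
    }
    buf[len] = '\0';
    return buf;
}
```

Calling it in a loop until it returns NULL yields one value per line, each of which must be released with free().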

Pavel Minaev
+1  A: 

A %s specifier in fscanf skips any whitespace on the input, then reads a string of non-whitespace characters up to and not including the next whitespace character.

If you want to read up to a newline, you can use %[^\n] as a specifier. In addition, a ' ' in the format string will skip whitespace on the input. So if you use

fscanf(inputFile, "%*s %[^\n]", str);

it will read the first thing on the line up to the first whitespace ("title:" in your case), and throw it away, then will read whitespace chars and throw them away, then will read all chars up to a newline into str, which sounds like what you want.

Be careful that str doesn't overflow -- you might want to use

fscanf(inputFile, "%*s %100[^\n]", str);

to limit the maximum string length you'll read (100 characters, not counting the terminating NUL, so str must have room for at least 101 bytes).
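
As a hedged sketch of how the width-limited call might be used in practice, the function below checks the return value and throws away whatever is left on the line, so that an over-long value does not leak into the next read. The name read_field is an assumption for illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: reads "title value with spaces" from fp into str,
   which must have room for 100 characters plus the terminating NUL.
   Returns 1 if a value was stored, 0 otherwise. */
static int read_field(FILE *fp, char str[101])
{
    int n = fscanf(fp, "%*s %100[^\n]", str);

    if (n == 1) {
        /* Discard anything beyond 100 characters, plus the newline. */
        int c;
        while ((c = fgetc(fp)) != EOF && c != '\n')
            ;
    }
    return n == 1;
}
```

One caveat: the space in the format skips newlines as well as blanks, so a title with no value makes fscanf read the next line as the value. Reading each line with fgets() first and then applying the same format with sscanf() avoids that.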

Chris Dodd
A: 
Norman Ramsey