In an answer there was an interesting statement: "It's almost always a bad idea to use the fscanf() function as it can leave your file pointer in an unknown location on failure. I prefer to use fgets() to get each line in and then sscanf() that."

Could you expand upon when/why it might be better to use fgets() and sscanf() to read some file?

+2  A: 

In short, buffer overflows.

In general, you should always use calls that let you specify a maximum read size. Otherwise you risk writing past the end of an array, into your other variables or outside the current stack frame entirely.
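
A minimal sketch of what "specify a maximum read size" can look like in practice. The helper names read_line() and first_word() and the 16-byte word buffer are made up for illustration. fgets() takes the buffer size directly, while sscanf() only stays in bounds because the width in "%15s" is kept in step with the array size by hand.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from fp into buf, storing at most size-1 characters,
   and strip the trailing newline if there is one.
   Returns 1 on success, 0 on end of file or read error. */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}

/* Copy the first whitespace-delimited word of line into word.
   word must hold at least 16 bytes: the width 15 leaves room for
   the terminating '\0'. Returns 1 if a word was found, else 0. */
int first_word(const char *line, char *word)
{
    return sscanf(line, "%15s", word) == 1;
}
```

If the array size changes and the width in the format string does not, the protection silently disappears, which is the maintenance risk with the scanf() family.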

ojrac
while buffer overflows are one problem with the scanf() family of functions, they are unrelated to the problem asked about here. -1
Sparr
@Sparr, fgets() controls the buffer use assuming valid parameters are passed. fscanf() only controls the buffer use to the extent that the format string contains well-formed width specifiers that are much harder to keep matched to the actual size of the buffer. That *is* an aspect of what is asked about here.
RBerteig
+9  A: 

Imagine a file with three lines:

   1
   2b
   c

Using fscanf() to read integers, the first line would read fine but on the second line fscanf() would leave you at the 'b', not sure what to do from there. You would need some mechanism to move past the garbage input to see the third line.

If you do an fgets() and sscanf(), you can guarantee that your file pointer moves a line at a time, which is a little easier to deal with. In general, you should still be looking at the whole string to report any odd characters in it.
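
Here is a rough sketch of both approaches on the three-line file above. The function names and the 128-byte line buffer are assumptions for the example, not anything from a library. Longer lines would be split across several fgets() calls.

```c
#include <stdio.h>

/* fscanf() version: read integers until the first mismatch.
   Returns how many integers were read. The stream is left at the
   offending character. */
int count_with_fscanf(FILE *fp)
{
    int value, count = 0;
    while (fscanf(fp, "%d", &value) == 1)
        ++count;
    return count;
}

/* fgets() + sscanf() version: one line per iteration, whatever it holds.
   Lines that are exactly one integer are added to the sum. Every other
   line is counted in *bad and skipped. */
long sum_with_fgets(FILE *fp, int *bad)
{
    char line[128];            /* assumed upper bound on line length */
    long sum = 0;

    *bad = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        int value;
        char extra;
        /* " %c" picks up any non-space character after the number,
           so "2b" yields 2 conversions and is rejected. */
        if (sscanf(line, "%d %c", &value, &extra) == 1)
            sum += value;
        else
            ++*bad;
    }
    return sum;
}
```

On that file the fscanf() loop reads 1 and 2 and then stops with the 'b' still unread, while the fgets() loop sees all three lines and flags "2b" and "c" as bad.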

I prefer the latter approach myself, although I wouldn't agree with the statement that "it's almost always a bad idea to use fscanf()"... fscanf() is perfectly fine for most things.

Chris Arguin
+1  A: 

Basically, short of putting an explicit field width in the format string, there's no way to tell that function not to go out of bounds for the memory area you've allocated for it.

A number of replacements have come out, like fnscanf, which is an attempt to fix those functions by specifying a maximum limit for the reader to write, thus allowing it to not overflow.

cyberconte
while buffer overflows are one problem with the scanf() family of functions, they are unrelated to the problem asked about here. -1
Sparr
"Could you expand upon why it might be better to use fgets() and sscanf() to read some file." I was expanding upon his question. I reject your overambitious "-1"
cyberconte
I take "expand on why" to mean that your answer should be based on the premise already presented, that being the file pointer issue. If he wanted OTHER reasons he would not have linked to the origin of the question, or quoted the relevant part of it.
Sparr
Ahh. I understood it as truly 'other reasons', since he had already explained that reason in the question ;) Different read on the same question, I guess.
cyberconte
Since the original question referenced had to do with using fscanf() to read whole lines, there is a lot more relevance to the comparison to fgets() and concerns about the buffer than just the question of where the file pointer landed on failure to match, although that was the example cited in the other thread.
RBerteig
+1  A: 

When fscanf() fails, due to an input failure or a matching failure, the file pointer (that is, the position in the file from which the next byte will be read) is left in a position other than where it would be had the fscanf() succeeded. This is typically undesirable in sequential file reads. Reading one line at a time results in the file input being predictable, while single line failures can be handled individually.

Sparr
A: 

It's almost always a bad idea to use the fscanf() function as it can leave your file pointer in an unknown location on failure. I prefer to use fgets() to get each line in and then sscanf() that.

You can always use ftell() to find out the current position in the file, and then decide what to do from there. Basically, if you know what to expect, feel free to use fscanf().
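
One way to use that idea, as a sketch only: remember the position with ftell() before the call and go back to it with fseek() if the parse fails. The function name and the "%d,%d" format are just an example, and this only works on seekable streams (regular files, not pipes or terminals).

```c
#include <stdio.h>

/* Try to read "int,int" from fp. On failure, restore the position the
   stream had before the call so the caller can try something else.
   Returns 1 on success, 0 on failure. */
int read_pair_or_rewind(FILE *fp, int *a, int *b)
{
    long start = ftell(fp);

    if (start == -1L)
        return 0;                      /* stream is not seekable */
    if (fscanf(fp, "%d,%d", a, b) == 2)
        return 1;
    fseek(fp, start, SEEK_SET);        /* undo whatever fscanf() consumed */
    return 0;
}
```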

mtasic
+2  A: 

The case where this comes into play is when you match character literals. Suppose you have:

int n = fscanf(fp, "%d,%d", &i1, &i2);

Consider two possible inputs "323,A424" and "323A424".

In both cases fscanf() will return 1 and the next character read will be an 'A'. From the return value and the stream position alone, there is no way to determine whether the comma was matched.

That being said, this only matters if finding the actual source of the error is important. In cases where knowing that the input is malformed is enough, fscanf() is actually superior to writing custom parsing code.
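
To make the two inputs concrete, here is a small sketch that uses sscanf() on strings instead of fscanf() on a file. It also adds a %n directive after the comma, which is not part of the format above. It is one possible workaround, since %n records how many characters had been consumed when it was reached and does not count toward the return value. parse_pair() is a made-up name.

```c
#include <stdio.h>

/* Parse "int,int" from s. Returns what sscanf() returns (the number of
   integers converted). *after_comma receives the offset just past the
   comma, or stays at -1 if the comma was never matched. */
int parse_pair(const char *s, int *i1, int *i2, int *after_comma)
{
    *after_comma = -1;
    return sscanf(s, "%d,%n%d", i1, after_comma, i2);
}
```

With "323,A424" this returns 1 and sets the offset to 4. With "323A424" it also returns 1 but the offset stays at -1, so the two failures can be told apart.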

Don