Hi,

I'm reading in a .txt file. I'm using fscanf to get the data as it is formatted. The line I'm having problems with is this:

result = fscanf(fp, "%s", ap->name);

This is fine until I have a name with whitespace, e.g. St Ives. So I use this to read in the whitespace:

result = fscanf(fp, "%[^\n]s", ap->name);

However, when I try to read in the first name (with no white space) it just doesn't work and messes up the other fscanf.

But when I use [^\n] on a different file, it works fine. I'm not sure what is happening.

If I use fgets in the place of the fscanf above I get "\n" in the variable.

Edit//

Ok, so if I use:

result = fscanf(fp, "%s", ap->name);
result = fscanf(fp, "%[^\n]s", ap->name);

This allows me to read in a string with no whitespace. But when I get a "name" with whitespace it doesn't work.

A: 

I'm not sure how you think [^\n] is supposed to work. [] is a conversion specifier which says "accept a run of one or more characters, each of which is in the set inside the brackets". The ^ inverts the condition, so [^\n] accepts everything except a newline. %s with fscanf only reads until it comes across whitespace. For strings with spaces in them, use a combination of fgets and sscanf instead, and specify a restriction on the length.
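A minimal sketch of that fgets plus sscanf combination follows. The helper name read_name, the 64-byte name buffer, and the 256-byte line buffer are assumptions made for illustration, not anything from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_SIZE 64   /* assumed size of the name buffer */

/* Hypothetical helper: read one line with fgets, then let sscanf copy
   everything up to the newline into name. Returns 1 on success,
   0 on end of file, a read error, or an empty line. */
static int read_name(FILE *fp, char name[NAME_SIZE])
{
    char line[256];

    assert(fp != NULL && name != NULL);
    if (fgets(line, sizeof line, fp) == NULL)
        return 0;

    /* No '\n' in the buffer means the line was longer than the buffer:
       discard the rest so the next call starts on a fresh line. */
    if (strchr(line, '\n') == NULL) {
        int c;
        while ((c = fgetc(fp)) != EOF && c != '\n')
            ;
    }

    /* The width 63 leaves room for the terminating '\0' in a 64-byte buffer. */
    return sscanf(line, "%63[^\n]", name) == 1;
}
```

Because fgets always consumes the newline, a stray '\n' is never left behind for the next read, which is the situation that trips up the bare fscanf calls in the question.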

Mads Elvheim
+1  A: 

Jumm wrote:

If I use fgets in the place of the fscanf above I get "\n" in the variable.

Which is a far easier problem to solve so go with it:

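/* fgets returns NULL at end of file or on error, so check before using the buffer */
if( fgets( ap->name, MAX, fp ) != NULL )
{
    /* find the trailing newline, if any, and overwrite it */
    char *nlptr = strrchr( ap->name, '\n' ) ;
    if( nlptr != 0 )
    {
        *nlptr = '\0' ;
    }
}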
Clifford
There is no problem with fgets() because its second parameter sets a size limit - you are thinking of gets().
anon
I deleted my comment in response to Neil's comment as I was incorrect and apologies for jumping the gun here... Yes, indeed you are correct Neil...need a caffeine kick... :)
tommieb75
By deleting your comment, you have simply made Neil's comment look like a criticism of my post, which I don't think was his intent.
Clifford
A: 

As I gather, you are trying to use a regular expression in the fscanf format string. To my knowledge there is no such thing, nor have I seen it anywhere; enlighten me on this.

The format specifier for reading a string is %s. It could be that you need to write it as %s\n, which will pick up the newline.

But for Pete's sake, do not use the old standard gets family of functions, as specified in Clifford's answer above. That is where buffer overflows happen: the infamous Morris Worm of 1988 exploited a call to gets in the fingerd daemon and caused chaos. Fortunately, that has since been patched, and a lot of programmers have been drilled not to use the function.

Even Microsoft has adopted safe versions of the gets family of functions, which take a parameter indicating the length of the buffer.

EDIT My bad - I did not realize that Clifford indeed has specified the max length for input...Whoops! Sorry! Clifford's answer is correct! So +1 to Clifford's answer.

Thanks Neil for pointing out my error...

Hope this helps, Best regards, Tom.

tommieb75
Wrong - see my comment to Clifford's answer.
anon
This works for single words, but not for strings that contain whitespace. Thanks for the extra info though =]
jumm
A: 

I found the problem.

As Paul Tomblin said, I had an extra new line character in the field above. So using what tommieb75 said I used:

result = fscanf(fp, "%s\n", ap->code);
result = fscanf(fp, "%[^\n]s", ap->name);

And this fixed it!

Thanks for your help.

jumm
If you do this, make sure ap->code and ap->name have enough storage.
Alok
+2  A: 

One problem with this:

result = fscanf(fp, "%[^\n]s", ap->name);

is that you have an extra s at the end of your format specifier. The entire format specifier should just be %[^\n], which says "read in a string which consists of characters which are not newlines". The extra s is not part of the format specifier, so it's interpreted as a literal: "read the next character from the input; if it's an "s", continue, otherwise fail."

The extra s doesn't actually hurt you, though. You know exactly what the next character of input is: a newline. It doesn't match, and input processing stops there, but it doesn't really matter since it's the end of your format string. This would cause problems, though, if you had other format specifiers after this one in the same format string.

The real problem is that you're not consuming the newline: you're only reading in all of the characters up to the newline, but not the newline itself. To fix that, you should do this:

result = fscanf(fp, "%[^\n]%*c", ap->name);

The %*c specifier says to read in a character (c), but don't assign it to any variable (*). If you omitted the *, you would have to pass fscanf() another parameter containing a pointer to a character (a char*), where it would then store the resulting character that it read in.
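As a rough illustration of the difference, the sketch below scans two consecutive lines from an in-memory string, once with %*c and once without. It uses sscanf so it runs without a data file; the function names and the 32-byte field size are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#define FIELD_SIZE 32   /* assumed size of each field buffer */

/* Reads two lines, discarding each newline with %*c.
   Returns the number of fields that were stored. */
static int read_two_lines(const char *text,
                          char first[FIELD_SIZE], char second[FIELD_SIZE])
{
    assert(text != NULL && first != NULL && second != NULL);
    return sscanf(text, "%31[^\n]%*c%31[^\n]%*c", first, second);
}

/* Same, but without %*c: the first newline is never consumed, so the
   second %[^\n] sees '\n' immediately and fails to match anything. */
static int read_two_lines_broken(const char *text,
                                 char first[FIELD_SIZE], char second[FIELD_SIZE])
{
    assert(text != NULL && first != NULL && second != NULL);
    return sscanf(text, "%31[^\n]%31[^\n]", first, second);
}
```

For the input "ABC\nSt Ives\n", the first function stores both fields and returns 2, while the second stores only "ABC" and returns 1. Note that an empty line makes %[^\n] itself fail, and then %*c is never reached, so the newline stays unread in that case.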

You could also use %[^\n]\n, but that would also read in any whitespace which followed the newline, which may not be what you want. When fscanf finds whitespace in its format string (a space, newline, or tab), it consumes as much whitespace from the input as it can (i.e. you can think of it consuming the longest string that matches the regular expression [ \t\n]*).
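One way to see this is to count how many input characters each variant consumes, using the %n conversion. This is only a sketch: the helper names and the 32-byte buffer are made up, and sscanf stands in for fscanf so that no file is needed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: characters consumed when the newline is
   skipped with %*c. Returns -1 if no line could be read. */
static int consumed_with_star_c(const char *text, char out[32])
{
    int n = 0;

    assert(text != NULL && out != NULL);
    if (sscanf(text, "%31[^\n]%*c%n", out, &n) != 1)
        return -1;
    return n;
}

/* Hypothetical helper: characters consumed when the format ends in a
   literal "\n", which matches any run of whitespace in the input. */
static int consumed_with_newline(const char *text, char out[32])
{
    int n = 0;

    assert(text != NULL && out != NULL);
    if (sscanf(text, "%31[^\n]\n%n", out, &n) != 1)
        return -1;
    return n;
}
```

Given "ABC\n   indented\n", the %*c version consumes 4 characters (the word and the newline), whereas the "\n" version consumes 7, because it also swallows the three spaces that indent the next line.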

Finally, you should also specify a maximum length to avoid buffer overruns. You can do this by placing the buffer length in between the % and the [. For example, if ap->name is a buffer of 256 characters, you should do this:

result = fscanf(fp, "%255[^\n]%*c", ap->name);

This works great for statically allocated arrays; unfortunately, if the array is dynamically sized at runtime, there's no easy way to pass the buffer size to fscanf. You'll have to create the format string with snprintf, e.g.:

char format[256];
snprintf(format, sizeof(format), "%%%d[^\n]%%*c", buffer_size - 1);
result = fscanf(fp, format, ap->name);
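A slightly fuller sketch of the same idea, with a check that the generated format actually fit, might look like the following. The helper name scan_line is hypothetical, it takes the buffer size as a size_t (hence %zu), and it uses sscanf on a string so that it can be tried without a file; the same format string should work with fscanf.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: scans one line from input into buffer, never
   writing more than buffer_size bytes. Returns 1 on success, 0 otherwise. */
static int scan_line(const char *input, char *buffer, size_t buffer_size)
{
    char format[32];
    int len;

    /* A field width of 0 is not allowed, so the buffer needs at least 2 bytes. */
    assert(input != NULL && buffer != NULL && buffer_size >= 2);

    /* Builds e.g. "%255[^\n]%*c" when buffer_size is 256. */
    len = snprintf(format, sizeof format, "%%%zu[^\n]%%*c", buffer_size - 1);
    if (len < 0 || (size_t)len >= sizeof format)
        return 0;   /* the format string did not fit */

    return sscanf(input, format, buffer) == 1;
}
```

If the line is longer than the buffer, the conversion stops at the field width, %*c then discards one ordinary character rather than the newline, and the rest of the line stays unread.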
Adam Rosenfield