I need to read a file and send the text from it to a string so I can parse it. However, the program won't know exactly how long the file is, so what would I do if I wanted to use fgets, or is there a better alternative?

Note:

char *fgets(char *str, int num, FILE *stream);
A: 

Allocate a buffer (the one that str points to), and pass the size of the buffer for num. fgets stores only the characters it actually reads (at most num - 1 of them), followed by a null terminator.

Something like:

char str[1000];
fgets(str, sizeof str, file);  /* file is the FILE * returned by fopen */

If the next line only has 10 characters before the newline, then str will hold those 10 characters, the newline, and the null terminator.

Edit: just in case there is any confusion, I didn't intend the above to sound as if the extra space in the buffer isn't in use. I only meant to illustrate that you don't need to know ahead of time how long your string is going to be, as long as you can put a maximum length on it.
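
To make the point concrete, here is a minimal sketch, assuming the stream is already open. The helper name read_one_line and the LINE_BUF_SIZE constant are made up for illustration; only fgets and strlen come from the standard library.

```c
#include <stdio.h>
#include <string.h>

#define LINE_BUF_SIZE 1000  /* assumed upper bound on the line length */

/* Hypothetical helper: reads the next line of fp into buf, which has room
 * for cap bytes. Returns the number of characters stored (including the
 * newline, if one was read), or -1 at end of file or on a read error. */
int read_one_line(FILE *fp, char *buf, int cap)
{
    if (fgets(buf, cap, fp) == NULL)
        return -1;
    return (int)strlen(buf);
}
```

With a buffer of LINE_BUF_SIZE bytes, a line of 10 characters followed by a newline gives a return value of 11, and buf[11] holds the null terminator. If the return value is cap - 1 and the last character is not a newline, the line was longer than the buffer and the rest of it is still waiting in the stream.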

danben
How's that? Unless you reallocate the buffer, any extra space is still being used.
Matthew Flaschen
I am referring to the space in the buffer, not the space in memory. Also, the OP's question was not about how to save memory.
danben
+1  A: 

You can use fgets iteratively, but a simpler alternative is (stdio.h's) getline. It's in POSIX, but it's not standard C.

Since you're using C++ though, can you use std::string with std::getline (declared in <string>)?
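
For the POSIX getline route in plain C, a rough sketch might look like the following. It assumes a POSIX.1-2008 system; the helper longest_line is a made-up name that just demonstrates the loop, and the feature-test macro is there in case the compiler is run in a strict ISO C mode.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations (getline) */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical helper: returns the length of the longest line in fp,
 * not counting the trailing newline. Returns 0 for an empty stream. */
long longest_line(FILE *fp)
{
    char *line = NULL;  /* getline allocates and grows this buffer itself */
    size_t cap = 0;     /* current buffer capacity, maintained by getline */
    ssize_t len;
    long longest = 0;

    while ((len = getline(&line, &cap, fp)) != -1) {
        if (len > 0 && line[len - 1] == '\n')
            len--;      /* do not count the newline */
        if (len > longest)
            longest = len;
    }
    free(line);         /* the caller owns the buffer getline allocated */
    return longest;
}
```

The same buffer is reused for every line, so there is one allocation that grows as needed rather than one allocation per line.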

Matthew Flaschen
+3  A: 

Don't forget that fgets() reads a line at a time, subject to having enough space.

Humans seldom write lines longer than ... 80, 256, pick a number ... characters. POSIX only requires LINE_MAX to be at least 2048 bytes, so 4096 leaves some headroom. I usually use:

char buffer[4096];

while (fgets(buffer, sizeof(buffer), fp)) 
{
    /* ...process line... */
}

If you are worried that someone might provide more than 4K of data in a single line (and a machine-generated file, such as HTML or JavaScript, might contain that), then you have to decide what to do next. You can do any of the following (and there are likely some other options I've not mentioned):

  1. Process the over-long lines in bits without assuming that there was a newline in between.
  2. Allocate memory for a longer line (say 8K to start with), copy the initial 4K into the allocated buffer, and read more data into the second half of the buffer, iterating until you find the end of line.
  3. Use the POSIX 2008 function getline(), which is available on Linux. It does the memory allocation for you.
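
Option 2 can be sketched roughly as below. This is one possible shape, not the only one: the name read_whole_line is hypothetical, it reads straight into the allocated buffer rather than copying from a stack buffer first, and it doubles the allocation each time a line does not fit. It also assumes the input contains no embedded null bytes, since it relies on strlen.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one whole line of any length from fp.
 * Returns a malloc'd string without the trailing newline (the caller
 * frees it), or NULL at end of file, on read error, or if out of memory. */
char *read_whole_line(FILE *fp)
{
    size_t cap = 4096;  /* initial size, doubled whenever it fills up */
    size_t len = 0;     /* characters stored so far */
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    for (;;) {
        size_t room = cap - len;
        int chunk = room > INT_MAX ? INT_MAX : (int)room;

        /* Append the next piece of the line after what we already have. */
        if (fgets(buf + len, chunk, fp) == NULL)
            break;
        len += strlen(buf + len);

        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';  /* complete line: drop the newline */
            return buf;
        }
        if (cap - len <= 1) {     /* buffer full and still no newline */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }

    if (len > 0 && !ferror(fp))
        return buf;               /* last line had no trailing newline */
    free(buf);
    return NULL;
}
```

Calling it in a loop until it returns NULL, and freeing each result, gives the same behaviour as the fgets loop above without any fixed limit on the line length.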
Jonathan Leffler
A: 

If you're not on a POSIX system and don't have getline available, take a look at Chuck Falconer's public domain ggets/fggets functions which dynamically grow a buffer to consume an entire line. (That link seems to be down right now, but archive.org has a copy.)

jamesdlin