Using C, I would like to read the contents of a text file so that I end up with an array of strings, where the nth string holds the nth line of the file. The lines of the file can be arbitrarily long.

What's an elegant way of accomplishing this? I know of some neat tricks to read a text file directly into a single appropriately sized buffer, but breaking it down into lines makes it trickier (at least as far as I can tell).

Thanks very much!

A: 

For C (as opposed to C++), you'd probably wind up using fgets(). However, you might run into issues with your arbitrary-length lines, since fgets() reads into a fixed-size buffer.
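One way around the fixed-size buffer is to keep calling fgets() and grow the buffer until a newline shows up. The following is only a sketch of that idea; `read_line` is a hypothetical helper name, the starting capacity of 128 is arbitrary, and the file is assumed to contain no embedded NUL bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the thread): read one line of any length.
 * Returns a malloc'd string without the trailing newline, or NULL at end
 * of file or on allocation failure. The caller frees the result. */
char *read_line(FILE *fp)
{
    assert(fp != NULL);
    size_t cap = 128, len = 0;          /* 128 is an arbitrary starting size */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* fgets fills the unused tail of the buffer on each call. */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';        /* complete line: drop the newline */
            return buf;
        }
        if (len + 1 == cap) {           /* buffer full, line continues */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    if (len == 0) {                     /* nothing read: end of file */
        free(buf);
        return NULL;
    }
    return buf;                         /* last line had no newline */
}
```

Calling `read_line(fp)` in a loop until it returns NULL yields the lines one at a time; each returned string still has to be stored in a growing array of pointers to get the random access the question asks for.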

Amber
+4  A: 

Breaking it down into lines means parsing the text and replacing all the EOL (by EOL I mean \n and \r) characters with 0. In this way you can actually reuse your buffer and store just the beginning of each line into a separate char * array (all by doing only 2 passes).

This way you need a single read of the whole file plus two parsing passes over the buffer, which would probably improve performance.

Dan Cristoloveanu
This is definitely the best way, though it might require more than one pass over the whole file. You need to count the lines (so that you can allocate the correct size array), replace \n with 0, and then assign the start of each line to the correct spot in the array. Of course you can do this in two passes.
Dan Olson
A very nice idea. I'm going to give it a whirl.
Zach Conn
+1 Not counting the initial copy from the file to the buffer, you can do a single pass with `realloc()` and `strtok()`.
pmg
Agreed that this requires 2 passes. At least I don't know a way to do it in one pass right now. Updated the post accordingly.
Dan Cristoloveanu
Why does it require two passes? Allocate space for the array, as much as you think you'll need, using `malloc()`. Start with the buffer. For each '\n', substitute 0, and put the address of the next char into the array. Keep track of the array size; if it's going to overflow, `realloc()` it.
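A rough sketch of the single-pass idea described in this comment, assuming the file is already in memory as a NUL-terminated buffer. `split_lines` is a hypothetical name and the initial capacity of 16 is arbitrary; the returned pointers point into the buffer, so the buffer must outlive the array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the thread): split buf into lines in place.
 * Each '\n' is overwritten with '\0'. Returns a malloc'd array of pointers
 * into buf and stores the number of lines in *count, or returns NULL if an
 * allocation fails. */
char **split_lines(char *buf, size_t *count)
{
    assert(buf != NULL && count != NULL);
    size_t cap = 16, n = 0;             /* 16 is an arbitrary first guess */
    char **lines = malloc(cap * sizeof *lines);
    if (lines == NULL)
        return NULL;

    char *start = buf;
    while (*start != '\0') {
        if (n == cap) {                 /* array full: double it */
            char **tmp = realloc(lines, cap * 2 * sizeof *lines);
            if (tmp == NULL) {
                free(lines);
                return NULL;
            }
            lines = tmp;
            cap *= 2;
        }
        lines[n++] = start;             /* record where this line begins */

        char *nl = strchr(start, '\n');
        if (nl == NULL)                 /* last line has no newline */
            break;
        *nl = '\0';                     /* terminate the line */
        start = nl + 1;
    }

    *count = n;
    return lines;
}
```

Doubling the capacity keeps the total amount of copying done by realloc() proportional to the final number of lines.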
David Thornley
This worked great. Thanks!
Zach Conn
It's one pass. Use fseek/ftell to find the file size, malloc a buffer of that size, and read the file in with a single I/O call. Make one pass to put a NUL at each newline position, turning the lines into strings, and push_back the start of each line as you go through the buffer.
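The fseek/ftell/malloc step might look roughly like the sketch below. `read_file` is a hypothetical name; the file is opened in binary mode so that ftell reports a byte count, which means "\r\n" pairs are left as they are on Windows, and seeking to SEEK_END on a binary stream is assumed to work, as it does on common platforms.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): load a whole file into a
 * malloc'd, NUL-terminated buffer. Stores the byte count in *size when
 * size is not NULL. Returns NULL on any failure; the caller frees. */
char *read_file(const char *path, size_t *size)
{
    assert(path != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    char *buf = NULL;
    if (fseek(fp, 0, SEEK_END) == 0) {
        long len = ftell(fp);           /* file size in bytes */
        if (len >= 0 && fseek(fp, 0, SEEK_SET) == 0) {
            buf = malloc((size_t)len + 1);
            if (buf != NULL) {
                /* One read for the whole file. */
                size_t got = fread(buf, 1, (size_t)len, fp);
                buf[got] = '\0';        /* make it usable as a C string */
                if (size != NULL)
                    *size = got;
            }
        }
    }
    fclose(fp);
    return buf;
}
```

The extra byte for the terminating NUL is what lets the later newline-replacement pass treat the buffer as an ordinary string.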
EvilTeach
push_back is STL (templates => C++). However, it's true that one could use realloc, doubling the size of the line array each time it fills up. That should scale well enough.
Dan Cristoloveanu
A: 

Perhaps a Linked List would be the best way to do this? The compiler won't like having an array with no idea how big to make it. With a Linked List you can have a really large text file, and not worry about allocating enough memory to the array.

Unfortunately, I haven't learned how to do linked lists, but maybe somebody else could help you.

jonescb
Arbitrary size is an appealing feature of linked lists, but to get it you trade away random access. For instance, you can't get line number 5 without first walking through lines 0-4. Still, building a linked list as an intermediate structure is a good idea; you could then build the array from it easily.
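For illustration, here is a minimal sketch of the list-then-array idea. The names `line_node`, `list_push` and `list_to_array` are hypothetical, and the caller is assumed to count the lines while pushing them. New nodes go on the front of the list, so the array is filled from the back to restore file order.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type (not from the thread): one line per node. */
struct line_node {
    char *text;
    struct line_node *next;
};

/* Push a line onto the front of the list. Returns the new head, or NULL
 * on allocation failure (keep the old head in a temporary in that case). */
struct line_node *list_push(struct line_node *head, char *text)
{
    struct line_node *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->text = text;
    node->next = head;
    return node;
}

/* Convert a newest-first list of `count` lines into an array in file
 * order. Frees the nodes but not the strings. Returns NULL on failure. */
char **list_to_array(struct line_node *head, size_t count)
{
    char **lines = malloc((count ? count : 1) * sizeof *lines);
    if (lines == NULL)
        return NULL;

    size_t i = count;
    while (head != NULL && i > 0) {
        struct line_node *next = head->next;
        lines[--i] = head->text;        /* fill from the end */
        free(head);
        head = next;
    }
    assert(head == NULL && i == 0);     /* count must match the list length */
    return lines;
}
```

The cost is one extra allocation per line and a second walk over the list, which is why growing the array directly with realloc() is usually simpler.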
Dan Olson
Unfortunately, a linked list isn't very appropriate in this case due to some details that I left out of the question (in short, I need random access). I could, of course, read everything into a linked list and then copy the contents to an array, but I was hoping for a more elegant approach.
Zach Conn
A: 

If you have a good way to read the whole file into memory, you are almost there. After you've done that, you could scan the buffer twice: once to count the lines, and once to set the line pointers and replace '\n' (and maybe '\r', if the file is read in Windows binary mode) with '\0'. In between the scans, allocate an array of pointers, now that you know how many you need.
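The two scans could be sketched as follows, assuming the file contents are already in a NUL-terminated buffer. `index_lines` is a hypothetical name; a '\r' is only stripped when it sits directly before a '\n'.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): two-pass line indexing.
 * Returns a malloc'd array of pointers into buf and stores the number of
 * lines in *count, or returns NULL if the allocation fails. */
char **index_lines(char *buf, size_t *count)
{
    assert(buf != NULL && count != NULL);

    /* Pass 1: count the lines. */
    size_t n = 0;
    char *p;
    for (p = buf; *p != '\0'; p++)
        if (*p == '\n')
            n++;
    if (p != buf && p[-1] != '\n')
        n++;                            /* last line has no newline */

    /* Between the passes: allocate exactly as many pointers as needed. */
    char **lines = malloc((n ? n : 1) * sizeof *lines);
    if (lines == NULL)
        return NULL;

    /* Pass 2: terminate each line and record where it starts. */
    size_t i = 0;
    char *start = buf;
    for (p = buf; *p != '\0'; p++) {
        if (*p == '\n') {
            *p = '\0';
            if (p > start && p[-1] == '\r')
                p[-1] = '\0';           /* strip the '\r' of a "\r\n" pair */
            lines[i++] = start;
            start = p + 1;
        }
    }
    if (*start != '\0')
        lines[i++] = start;             /* trailing line without newline */

    *count = i;
    return lines;
}
```

Compared with growing the array on the fly, this version never reallocates, at the cost of reading the buffer twice.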

Bill Forster
A: 

It's possible to read the number of lines in the file (loop fgets), then create a 2-dimensional array with the first dimension being the number of lines+1. Then, just re-read the file into the array.

You'll need to define the length of the elements, though. Or, do a count for the longest line size.

Example code:

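/* Needs <stdio.h> and <string.h>; the statements belong inside a function.
   FILENAME and MAX_LINE are assumed to be defined elsewhere, and every
   line must fit in MAX_LINE - 1 characters. fopen error checks omitted. */
char line[MAX_LINE];
int lineCount = 0, i;

/* First pass: count the lines (fgets reads whole lines; fscanf "%s"
   would only read whitespace-separated words). */
FILE *inFile = fopen(FILENAME, "r");
while (fgets(line, MAX_LINE, inFile) != NULL)
    lineCount++;
fclose(inFile);

/* One extra row so that names[n] holds line n; row 0 is unused. */
char names[lineCount + 1][MAX_LINE];

/* Second pass: re-read the file, one line per row. */
inFile = fopen(FILENAME, "r");
for (i = 1; i <= lineCount && fgets(names[i], MAX_LINE, inFile) != NULL; i++)
    names[i][strcspn(names[i], "\n")] = '\0';  /* strip the newline */
fclose(inFile);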
Hyppy