What is the simplest way (least error-prone, least lines of code, however you want to interpret it) to open a file in C and read its contents into a string (char*, char[], whatever)?

+7  A: 

I tend to just load the entire file into memory as a raw chunk and do the parsing on my own. That way I have the best control over what the standard library does on multiple platforms.

This is a stub I use for this. You may also want to check the return values of fseek, ftell and fread (omitted here for clarity).

char * buffer = 0;
long length;
FILE * f = fopen (filename, "rb");

if (f)
{
  fseek (f, 0, SEEK_END);
  length = ftell (f);
  fseek (f, 0, SEEK_SET);
  buffer = malloc (length);
  if (buffer)
  {
    fread (buffer, 1, length, f);
  }
  fclose (f);
}

if (buffer)
{
  // start to process your data / extract strings here...
}
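
The checks omitted above could be filled in roughly as follows. This is a minimal sketch rather than the stub's author's own code: the function name `read_file` and its `out_len` parameter are hypothetical, and the extra terminating `'\0'` is an addition so the result can be used as a C string. It still relies on ftell, so it only works for seekable files whose size fits in a `long`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file into a malloc'd, NUL-terminated
   buffer. Returns NULL on any failure; the caller must free() the result. */
char *read_file(const char *path, long *out_len)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    char *buffer = NULL;
    long length = -1;

    /* Determine the size by seeking to the end, checking every call. */
    if (fseek(f, 0, SEEK_END) == 0 &&
        (length = ftell(f)) >= 0 &&
        fseek(f, 0, SEEK_SET) == 0)
    {
        buffer = malloc((size_t)length + 1);   /* +1 for the terminator */
        if (buffer)
        {
            size_t got = fread(buffer, 1, (size_t)length, f);
            if (got == (size_t)length)
            {
                buffer[length] = '\0';
                if (out_len)
                    *out_len = length;
            }
            else
            {
                /* Short read: treat it as an error. */
                free(buffer);
                buffer = NULL;
            }
        }
    }

    fclose(f);
    return buffer;
}
```

A caller would write something like `char *text = read_file("filename", NULL);`, test the result against NULL, and `free(text)` when done.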
Nils Pipenbrinck
Awesome, that worked like a charm (and is pretty simple to follow along). Thanks!
Chris Bunch
I would also check the return value of fread, since it might not actually read the entire file due to errors and whatnot.
freespace
Along the lines of what freespace said, you might want to check to ensure the file isn't huge. Suppose, for instance, that someone decided to feed a 6GB file into that program...
rmeador
Definitely, just like Nils said originally, I'm going to go look up the error codes on fseek, ftell, and fread and act accordingly.
Chris Bunch
Seeking to the end just so you can call ftell? Why not just call stat?
dicroce
Like rmeador said, fseek will fail on files >4GB.
KPexEA
True. For large files this solution sucks.
Nils Pipenbrinck
I didn't suggest stat simply because it's not ANSI C (at least I think so). AFAIK the "recommended" way to get a file size is to seek to the end and get the file offset.
Nils Pipenbrinck
This is good and easy... but it will choke if you need to read from a pipe rather than an ordinary file, which is something that most UNIX programs will want to do at some point.
Dan
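
For input that cannot be seeked, such as a pipe or stdin, one common alternative is to read in chunks and grow the buffer as needed. The following is a minimal sketch of that idea, not something posted in this thread: the name `read_stream`, the initial 4096-byte capacity, and the doubling strategy are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything from an already-open stream
   (file, pipe or stdin) into a malloc'd, NUL-terminated buffer.
   Returns NULL on failure; the caller must free() the result. */
char *read_stream(FILE *in, size_t *out_len)
{
    assert(in != NULL);

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (!buf)
        return NULL;

    for (;;)
    {
        /* Always keep one byte free for the terminator. */
        size_t want = cap - len - 1;
        size_t got = fread(buf + len, 1, want, in);
        len += got;
        if (got < want)
            break;              /* short read: end of input or error */

        /* Buffer is full: double it, guarding against overflow. */
        if (cap > SIZE_MAX / 2)
        {
            free(buf);
            return NULL;
        }
        cap *= 2;
        char *tmp = realloc(buf, cap);
        if (!tmp)
        {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }

    if (ferror(in))
    {
        free(buf);
        return NULL;
    }

    buf[len] = '\0';
    if (out_len)
        *out_len = len;
    return buf;
}
```

Because it never calls fseek or ftell, `read_stream(stdin, &n)` works the same whether the input is a regular file or a pipe.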
Don't forget to free the buffer when you are done.
Tim
+2  A: 

"simplest way" and "least error-prone" are often opposites of each other.

Andy Lester
+1  A: 

If "read its contents into a string" means that the file does not contain characters with code 0, you can also use the getdelim() function. It either accepts a block of memory and reallocates it if necessary, or allocates the entire buffer for you, and it reads the file into that buffer until it encounters a specified delimiter or the end of the file. Just pass '\0' as the delimiter to read the entire file.

This function is available in the GNU C Library, http://www.gnu.org/software/libc/manual/html_mono/libc.html#index-getdelim-994

The sample code might look as simple as

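char* buffer = NULL;
size_t len = 0;   /* getdelim needs a valid size pointer, not 0 */
ssize_t bytes_read = getdelim( &buffer, &len, '\0', fp);
if ( bytes_read != -1) {
  /* Success, now the entire file is in the buffer */
}
free(buffer);     /* getdelim allocated it for us */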
dmityugov
I've used this before! It works very nicely, assuming the file you're reading is text (does not contain \0).
ephemient
+3  A: 

Another, unfortunately highly OS-dependent, solution is memory mapping the file. The benefits generally include faster reads and reduced memory use, as the application's view and the operating system's file cache can actually share the same physical memory.

POSIX code would look like:

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int fd = open("filename", O_RDONLY);
    off_t len = lseek(fd, 0, SEEK_END);   /* file size in bytes */
    void *data = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);

Windows, on the other hand, is a little trickier, and unfortunately I don't have a compiler in front of me to test, but the functionality is provided by CreateFileMapping() and MapViewOfFile().

Jeff Mc
A: 

If the file is text, and you want to get the text line by line, the easiest way is to use fgets().

char buffer[100];
FILE *fp = fopen("filename", "r");                 // text mode; do not use "rb"
if (fp) {
  while (fgets(buffer, sizeof(buffer), fp)) {
    /* ... do something with buffer ... */
  }
  fclose(fp);
}
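
One detail worth handling is that fgets stops after sizeof(buffer) - 1 characters, so a line longer than the buffer arrives in several pieces. A piece that does not end in '\n' is either a partial line or an unterminated last line. The sketch below uses that to count lines correctly; the function name `count_lines` is hypothetical and the 100-byte buffer is simply carried over from the snippet above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines in a text file, treating a line
   longer than the buffer as a single line. Returns -1 if the file
   cannot be opened. */
long count_lines(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "r");
    if (!fp)
        return -1;

    char buffer[100];
    long lines = 0;
    int at_line_start = 1;      /* does the next chunk begin a new line? */

    while (fgets(buffer, sizeof(buffer), fp))
    {
        if (at_line_start)
            lines++;

        /* Only a chunk ending in '\n' completes the current line. */
        size_t n = strlen(buffer);
        at_line_start = (n > 0 && buffer[n - 1] == '\n');
    }

    fclose(fp);
    return lines;
}
```

The same newline check can be used to strip the trailing '\n' or to stitch the pieces of a long line back together before processing it.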
selwyn