Hey, so let's say I get a file name as the first command-line argument.

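#include <stdio.h>

int main(int argc, char** argv) {
    unsigned char* fileArray;   // should end up holding the file's contents

    if (argc < 2) return 1;     // no file name given
    FILE* file1 = fopen(argv[1], "r");
}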

Now, how can I read that file, char by char, into fileArray?

Basically, how can I get the contents of a FILE* into a char* when I don't know in advance how much memory I need to malloc?

I know a possible solution is to use a buffer, but my problem is that I'm dealing with files that could have over 900,000 chars, and I don't think it's reasonable to make a buffer that large.

+1  A: 

There are a couple of approaches you can take:

  1. Specify a maximum size that you can handle, then allocate just once (whether as a global or on the heap).
  2. Handle the file in chunks if you're worried about fitting it all into memory at once.
  3. Handle an arbitrary size by using malloc with realloc (growing the buffer as you read bits in).

Number 1 is easy:

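static char buff[900001];                  // or malloc/free of 900001 bytes
size_t count = fread (buff, 1, sizeof buff, fIn);   // fIn is the opened FILE*
if (count > 900000) {
    // problem! the file is bigger than the 900000 bytes we can handle
}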

Number 2 is probably the best way to do it unless you absolutely need the whole file in memory at once. For example, if your program counts the number of words, it can sequentially process the file a few K at a time.
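
To make the word-counting example concrete, here is a minimal sketch of chunked processing. The names count_words and CHUNK_SIZE are made up for illustration, and 4096 is just an example chunk size. The only state that has to survive from one chunk to the next is whether the previous chunk ended in the middle of a word.

```c
#include <ctype.h>
#include <stdio.h>

#define CHUNK_SIZE 4096   /* example value only, tune to taste */

/* Count whitespace-separated words in a stream without ever holding
   more than CHUNK_SIZE bytes of it in memory. (Hypothetical helper.) */
long count_words(FILE *in)
{
    unsigned char chunk[CHUNK_SIZE];
    size_t n;
    long words = 0;
    int in_word = 0;   /* carried across chunk boundaries */

    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
        for (size_t i = 0; i < n; i++) {
            if (isspace(chunk[i])) {
                in_word = 0;
            } else if (!in_word) {
                in_word = 1;   /* first character of a new word */
                words++;
            }
        }
    }
    return words;
}
```

You would call it as count_words(file1) once fopen has succeeded. Memory use stays constant no matter how big the file is.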

For number 3, you can maintain a buffer pointer plus used and max variables. Initially set max to 50K and allocate buffer at that size.

Then try to read one 10K chunk into a fixed buffer tbuff. Add the current used to the number of bytes read into tbuff and, if the sum is greater than max, do a realloc to grow buffer by another 50K (adjusting max at the same time).

Then append tbuff to buffer, adjust used, rinse and repeat. Note that all those values (10K, 50K and so on) are examples only. There are different values you can use depending on your needs.
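
A rough sketch of that loop, using the same buffer, used, max and tbuff names as above. The function name read_all and the two macros are invented for this example, and the sizes are the same placeholder values. Because the read chunk is smaller than the growth step, one realloc per iteration is always enough.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define GROW_BY (50 * 1024)   /* example growth step */
#define CHUNK   (10 * 1024)   /* example read size, must be <= GROW_BY */

/* Read everything from `in` into a heap buffer that grows as needed.
   On success returns the buffer (caller frees) and stores the byte
   count in *len. Returns NULL on allocation or read failure. */
unsigned char *read_all(FILE *in, size_t *len)
{
    unsigned char tbuff[CHUNK];
    size_t used = 0, max = GROW_BY, n;
    unsigned char *buffer = malloc(max);
    if (buffer == NULL)
        return NULL;

    while ((n = fread(tbuff, 1, sizeof tbuff, in)) > 0) {
        if (used + n > max) {
            /* keep the old pointer until realloc is known to have worked */
            unsigned char *tmp = realloc(buffer, max + GROW_BY);
            if (tmp == NULL) {
                free(buffer);
                return NULL;
            }
            buffer = tmp;
            max += GROW_BY;
        }
        memcpy(buffer + used, tbuff, n);   /* append tbuff to buffer */
        used += n;
    }
    if (ferror(in)) {
        free(buffer);
        return NULL;
    }
    *len = used;
    return buffer;
}
```

Growing by a fixed step is what the description above does. Doubling max on each realloc is a common alternative if you expect very large inputs.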

paxdiablo
+2  A: 

If only "real" files (not pipes, devices, ...) are used, you can use stat/fstat or something like

int retval=fseek(file1,0,SEEK_END); // succeeded if ==0  (file seekable, etc.)
long size=ftell(file1); // size==-1 would be error
rewind(file1);

to get the file's size beforehand. Then you can malloc and read. But since the file might change in the meantime, you still have to make sure you don't read beyond the size you malloced.
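
Put together, it could look roughly like this. The function name read_sized is made up for the example, and the sketch assumes the file was opened in binary mode ("rb"), because for text-mode streams the value returned by ftell is not guaranteed to be a byte count. The extra byte for a terminating '\0' is my addition and is optional.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read a seekable file in one go. Returns a heap buffer (caller frees)
   and stores the number of bytes read in *len. Returns NULL if the
   stream is not seekable or on allocation/read failure. */
unsigned char *read_sized(FILE *in, size_t *len)
{
    if (fseek(in, 0, SEEK_END) != 0)
        return NULL;                /* not seekable (pipe, terminal, ...) */
    long size = ftell(in);
    if (size < 0)
        return NULL;
    rewind(in);

    /* +1 leaves room for a '\0' and avoids a zero-byte malloc */
    unsigned char *buffer = malloc((size_t)size + 1);
    if (buffer == NULL)
        return NULL;

    /* Ask for at most `size` bytes, so a file that grew in the
       meantime cannot overflow the allocation. */
    size_t n = fread(buffer, 1, (size_t)size, in);
    if (ferror(in)) {
        free(buffer);
        return NULL;
    }
    buffer[n] = '\0';
    *len = n;                       /* may be less than size if the file shrank */
    return buffer;
}
```

If this returns NULL because the input is not seekable, you can fall back to a growing buffer with realloc.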

smilingthax