views:

386

answers:

4

If I have a large binary file (say it holds 100,000,000 floats), is there a way in C (or C++) to open the file and read a specific float without loading the whole file into memory (i.e. how can I quickly find the 62,821,214th float)? Second question: is there a way to change that specific float in the file without rewriting the entire file?

I'm envisioning functions like:

float readFloatFromFile(const char* fileName, int idx) {
    FILE* f = fopen(fileName,"rb");

    // What goes here?
}

void writeFloatToFile(const char* fileName, int idx, float f) {
    // How do I open the file? fopen can only append or start a new file, right?

    // What goes here?
}
+14  A: 

You know the size of a float is sizeof(float), so multiplication can get you to the correct position:

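FILE *f = fopen(fileName, "rb");  // returns NULL if the file cannot be opened
float result;
// byte offset of element idx; the casts keep the product in fseek's offset type (long)
fseek(f, (long)idx * (long)sizeof(float), SEEK_SET);
fread(&result, sizeof(float), 1, f);  // returns 1 if the float was read
fclose(f);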

Similarly, you can write to a specific position using this method.
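
As a minimal sketch of the write case, assuming the file already exists and stores raw floats in the machine's native format: the helper name write_float_at is hypothetical and not part of the answer above. It opens the file in "rb+" mode so the existing contents are kept, seeks to idx * sizeof(float), and overwrites that single value.

```c
#include <stdio.h>

/* Hypothetical helper: overwrite the float at position idx (zero-based)
   in an existing file. Returns 0 on success, -1 on failure. */
int write_float_at(const char *file_name, long idx, float value)
{
    /* "rb+" opens an existing file for reading and writing
       without truncating it. */
    FILE *f = fopen(file_name, "rb+");
    if (f == NULL)
        return -1;

    /* Seek to the byte offset of element idx, then overwrite 4 bytes
       (sizeof(float) bytes in general). */
    int ok = fseek(f, idx * (long)sizeof(float), SEEK_SET) == 0
          && fwrite(&value, sizeof(float), 1, f) == 1;

    /* fclose flushes buffered data; if it fails the write may be lost. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Note that fseek takes a long offset, so on platforms where long is 32 bits this only reaches the first 2 GB of a file; POSIX fseeko or MSVC's _fseeki64 would be the usual substitutes there.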

Greg Hewgill
Okay, great. It should be fread( though right?
Switch
Oh, you're quite right. I'll fix that.
Greg Hewgill
+4  A: 

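fopen can open a file for modification (not just appending) if you pass the rb+ mode. The wb+ mode also permits reading and writing, but it truncates an existing file to zero length first, so use rb+ when you want to change a value in place. See here: http://www.cplusplus.com/reference/clibrary/cstdio/fopen/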

To position the file at a specific float, call fseek with index*sizeof(float) as the offset and SEEK_SET as the origin. See here: http://www.cplusplus.com/reference/clibrary/cstdio/fseek/
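
Putting the update mode and fseek together, here is a rough sketch of a read-modify-write on a single float. The helper name add_to_float_at is made up for illustration, and the code assumes the file stores raw floats in the machine's native format. One detail worth knowing: on a stream opened for update, C requires a positioning call such as fseek between a read and a following write.

```c
#include <stdio.h>

/* Hypothetical helper: add delta to the float stored at position idx
   (zero-based). Returns 0 on success, -1 on failure. */
int add_to_float_at(const char *path, long idx, float delta)
{
    long offset = idx * (long)sizeof(float);
    float value;
    int ok = 0;

    FILE *f = fopen(path, "rb+");   /* update mode: existing contents are kept */
    if (f == NULL)
        return -1;

    if (fseek(f, offset, SEEK_SET) == 0 &&
        fread(&value, sizeof value, 1, f) == 1) {
        value += delta;
        /* The read advanced the file position, so seek back before writing.
           This fseek also satisfies the rule that a read must not be
           followed directly by a write on the same stream. */
        if (fseek(f, offset, SEEK_SET) == 0 &&
            fwrite(&value, sizeof value, 1, f) == 1)
            ok = 1;
    }

    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

If idx lies past the end of the file, the fread fails and the function returns -1 without changing anything.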

Konamiman
+2  A: 

Here is an example if you would like to use C++ streams:

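#include <fstream>
using namespace std;

int main()
{
    // in | out opens the existing file for reading and writing;
    // with ios::binary alone the open fails
    fstream file("floats.bin", ios::in | ios::out | ios::binary);
    if (!file)
        return 1; // the file could not be opened
    float number;

    // move the read position to the float at index 62821214 (zero-based)
    file.seekg(62821214 * sizeof(float), ios::beg);
    file.read(reinterpret_cast<char*>(&number), sizeof(float));
    file.seekp(0, ios::beg); // move to the beginning of the file
    number = 3.2f;
    // write number at the beginning of the file
    file.write(reinterpret_cast<const char*>(&number), sizeof(float));
}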
AraK
A: 

One way would be to call mmap() on the file. Once you've done that, you can read/modify the file as if it were an in-memory array.

Of course, that method only works if the file fits in your process's address space. If you're running in 64-bit mode, you'll be fine; in 32-bit mode, a file with 100,000,000 floats (about 400 MB, assuming 4-byte floats) should fit, but at another order of magnitude or two above that you might run into trouble.
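
A minimal sketch of what that could look like, assuming a POSIX system and a file of raw native-format floats. The helper name mmap_set_float is hypothetical; it maps the whole file with MAP_SHARED so that a store into the array ends up in the file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (POSIX only): overwrite the float at idx (zero-based)
   through a shared mapping. Returns 0 on success, -1 on failure. */
int mmap_set_float(const char *path, size_t idx, float value)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Use the file size to bounds-check idx and to size the mapping. */
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0 ||
        idx >= (size_t)st.st_size / sizeof(float)) {
        close(fd);
        return -1;
    }
    size_t length = (size_t)st.st_size;

    /* MAP_SHARED carries stores through to the underlying file. */
    float *data = mmap(NULL, length, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (data == MAP_FAILED)
        return -1;

    data[idx] = value;              /* plain array access */

    /* Ask the kernel to write the modified pages back before unmapping. */
    int rc = msync(data, length, MS_SYNC);
    if (munmap(data, length) != 0)
        rc = -1;
    return rc;
}
```

Mapping and unmapping for every access would be wasteful in practice; a real program would more likely keep the mapping open for as long as it works with the file.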

Jeremy Friesner