ansaurus

Question

Reading binary files without buffering the whole file into memory in C++

Answer 1

+7 A:

You could use memory-mapped files to do this: open the file with CreateFile, call CreateFileMapping, then call MapViewOfFile to get a pointer to the data.

best regards, don

Don Dickinson 2009-08-19 18:56:10

This is exactly what I needed, thanks!

Zain 2009-08-20 17:12:41

Answer 2

+1 A:

I believe you want MapViewOfFile.

Drew Hoskins 2009-08-19 18:58:02

Answer 3

+4 A:

Not sure what you mean by CreateFile buffering - CreateFile won't read in the entire contents of the file, and besides, you need to call CreateFile before you can call ReadFile.

ReadFile will do what you want - the OS may do some read-ahead to opportunistically cache data, but it will not read the entire 500 MB file in.
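
To illustrate the point that only the requested bytes end up in your own buffer, here is a minimal sketch of reading a file in fixed-size chunks. It uses the portable std::ifstream rather than the Win32 ReadFile call discussed here, and the name process_in_chunks and its callback signature are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: reads `path` in pieces of at most `chunk_size` bytes
// and hands each piece to `on_chunk`. Only one chunk is held in memory at a
// time. Returns the total number of bytes read (0 if the file cannot be opened).
std::uint64_t process_in_chunks(
    const std::string& path,
    std::size_t chunk_size,
    const std::function<void(const char*, std::size_t)>& on_chunk) {
    assert(chunk_size > 0);
    std::ifstream in(path, std::ios::binary);
    if (!in) return 0;

    std::vector<char> buffer(chunk_size);
    std::uint64_t total = 0;
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();  // may be short at end of file
        if (got <= 0) break;
        on_chunk(buffer.data(), static_cast<std::size_t>(got));
        total += static_cast<std::uint64_t>(got);
    }
    return total;
}
```

With a 64 KB chunk size, a 500 MB file never takes more than 64 KB of your process's buffer space, whatever the OS decides to cache behind the scenes.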

If you really want no buffering, pass FILE_FLAG_NO_BUFFERING to CreateFile, and ensure that your file access offsets and sizes are multiples of the volume sector size. I strongly suggest you do not do this - the system file cache exists for a reason and helps with performance. Caching files in memory should have no effect on the overall system's memory usage - under memory pressure the system file cache will shrink.
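
If you do go the unbuffered route, the sector-multiple requirement comes down to a bit of rounding arithmetic. The sketch below only shows that arithmetic; the helper names are hypothetical, and the sector size is passed in as a parameter (real code would have to query it for the volume, for example through GetDiskFreeSpace, and would also need a suitably aligned buffer).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of any Windows API.

// Round `value` down to the nearest multiple of `sector_size`.
std::uint64_t align_down(std::uint64_t value, std::uint64_t sector_size) {
    assert(sector_size != 0);
    return value - (value % sector_size);
}

// Round `value` up to the nearest multiple of `sector_size`.
std::uint64_t align_up(std::uint64_t value, std::uint64_t sector_size) {
    assert(sector_size != 0);
    const std::uint64_t rem = value % sector_size;
    return rem == 0 ? value : value + (sector_size - rem);
}

struct AlignedRange {
    std::uint64_t offset;  // sector-aligned start of the read
    std::uint64_t length;  // sector-multiple number of bytes to read
};

// Smallest sector-aligned range that covers [offset, offset + length).
AlignedRange aligned_range(std::uint64_t offset, std::uint64_t length,
                           std::uint64_t sector_size) {
    const std::uint64_t start = align_down(offset, sector_size);
    const std::uint64_t end = align_up(offset + length, sector_size);
    return AlignedRange{start, end - start};
}
```

For example, with 512-byte sectors, wanting 100 bytes at offset 1000 means reading 1024 bytes starting at offset 512 and then picking the wanted bytes out of that buffer.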

As others have mentioned, you can use memory-mapped files as well. The difference between memory-mapped files and ReadFile is mainly just the interface - ultimately the file manager will satisfy the requests in a similar manner, including some buffering. The memory-mapped interface appears to be a bit more intuitive, but be aware that any I/O errors that occur while you access the mapped view are raised as exceptions, which must be caught or they will crash your program.

Michael 2009-08-19 19:08:38

He may be worried about virtual memory - in a 32-bit address space there may not be enough room for his 500 MB files. The question of whether it's actually copied into RAM wouldn't be relevant.

Drew Hoskins 2009-08-19 19:15:36

Right, but you don't have to read 500 MB at a time.

Michael 2009-08-19 19:21:35

Answer 4

+4 A:

Calling CreateFile() does not itself buffer or otherwise read the contents of the target file. After calling CreateFile(), you must call ReadFile() to obtain whatever parts of the file you want. For example, to read the first kilobyte of a file:

DWORD cbRead;
BYTE buffer[1024];
HANDLE hFile = ::CreateFile(filename,
                            GENERIC_READ,
                            FILE_SHARE_READ,
                            NULL,
                            OPEN_EXISTING,
                            FILE_ATTRIBUTE_NORMAL,
                            NULL);
::ReadFile(hFile, buffer, sizeof(buffer), &cbRead, NULL);
::CloseHandle(hFile);

In addition, if you want to read an arbitrary portion of the file, you can use SetFilePointer() before calling ReadFile(). For example, to read one kilobyte starting one megabyte into the file:

DWORD cbRead;
BYTE buffer[1024];
HANDLE hFile = ::CreateFile(filename,
                            GENERIC_READ,
                            FILE_SHARE_READ,
                            NULL,
                            OPEN_EXISTING,
                            FILE_ATTRIBUTE_NORMAL,
                            NULL);
::SetFilePointer(hFile, 1024 * 1024, NULL, FILE_BEGIN);
::ReadFile(hFile, buffer, sizeof(buffer), &cbRead, NULL);
::CloseHandle(hFile);

You may, of course, call SetFilePointer() and ReadFile() as many times as you wish while the file is open. A call to ReadFile() implicitly sets the file pointer to the byte immediately following the last byte read by ReadFile().
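
For comparison, the same seek-then-read pattern can be sketched in standard C++, with seekg() playing the role of SetFilePointer() and read() the role of ReadFile(). This is a portable stand-in rather than the Win32 code above, and read_at is a hypothetical name chosen for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads up to `count` bytes starting at byte `offset`.
// Returns fewer bytes if the file ends early, and an empty vector if the
// file cannot be opened or the seek fails.
std::vector<char> read_at(const std::string& path,
                          std::uint64_t offset,
                          std::size_t count) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};

    // Move the read position, much as SetFilePointer(..., FILE_BEGIN) does.
    in.seekg(static_cast<std::streamoff>(offset), std::ios::beg);
    if (!in) return {};

    std::vector<char> buffer(count);
    in.read(buffer.data(), static_cast<std::streamsize>(count));
    // gcount() reports how many bytes were actually read.
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}
```

Calling read_at(path, 1024 * 1024, 1024) would then correspond to the one-kilobyte read at a one-megabyte offset shown earlier, with only that kilobyte ever landing in your buffer.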

Additionally, you should read the documentation for the File Management Functions you use, and check the return values appropriately to trap any errors that might occur.

Windows may, at its discretion, use available system memory to cache the contents of open files, but data cached by this process will be discarded if the memory is needed by a running program (after all, the cached data can just be re-read from the disk if it is needed).

Matthew Xavier 2009-08-19 19:20:09
