I would recommend that you use memory-mapped files (see also http://msdn.microsoft.com/en-us/library/aa366556.aspx). The following simple code shows one way to do this:
LPCTSTR pszSrcFilename = TEXT("Z:\\test.dat");

// Open the file for sequential read access.
HANDLE hSrcFile = CreateFile (pszSrcFilename, GENERIC_READ, FILE_SHARE_READ,
                              NULL, OPEN_EXISTING,
                              FILE_ATTRIBUTE_NORMAL | FILE_FLAG_SEQUENTIAL_SCAN,
                              NULL);

// Create a read-only mapping covering the whole file
// (dwMaximumSizeHigh = dwMaximumSizeLow = 0 means "size of the file").
HANDLE hMapSrcFile = CreateFileMapping (hSrcFile, NULL, PAGE_READONLY, 0, 0, NULL);

// Map the entire file into the address space of the process.
PBYTE pSrcFile = (PBYTE) MapViewOfFile (hMapSrcFile, FILE_MAP_READ, 0, 0, 0);

// Query the file size (error handling omitted here; see the cleaner version below).
DWORD dwInFileSizeHigh, dwInFileSizeLow;
dwInFileSizeLow = GetFileSize (hSrcFile, &dwInFileSizeHigh);
After these few steps you have a pointer pSrcFile which represents the whole file contents. Isn't this what you need? The total size of the mapped block is stored in dwInFileSizeHigh and dwInFileSizeLow: ((__int64)dwInFileSizeHigh << 32) + dwInFileSizeLow.
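Once the view exists you can treat the file as an ordinary byte array. A minimal sketch (the XOR checksum is just an arbitrary illustration, not part of the code above):

unsigned __int64 ullFileSize =
    ((unsigned __int64) dwInFileSizeHigh << 32) + dwInFileSizeLow;

BYTE byChecksum = 0;
for (unsigned __int64 i = 0; i < ullFileSize; i++)
    byChecksum ^= pSrcFile[i];   // every byte is read straight from the mapping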
This uses the same feature of the Windows kernel that is used to implement the swap file (page file). It is buffered by the disk cache and very efficient. If you plan to access the file mostly sequentially, including the flag FILE_FLAG_SEQUENTIAL_SCAN in the call to CreateFile() hints this fact to the system, causing it to read ahead for even better performance.
I see that the file you read in the test example has the name "Z:\test.dat". If it is a file on a network drive you will see a clear performance advantage. Moreover, according to http://msdn.microsoft.com/en-us/library/aa366542.aspx, the limit you face is about 2 GB (the process address space) rather than 16 MB. I recommend mapping up to about 1 GB of the file at a time and then just creating a new view with MapViewOfFile as needed (see the sketch after the quotation below); I am not sure that your code needs to work with such large files. More than that, on the same MSDN page you can read the following:
The size of the file mapping object that you select controls how far into the file you can "see" with memory mapping. If you create a file mapping object that is 500 Kb in size, you have access only to the first 500 Kb of the file, regardless of the size of the file. Since it does not cost you any system resources to create a larger file mapping object, create a file mapping object that is the size of the file (set the dwMaximumSizeHigh and dwMaximumSizeLow parameters of CreateFileMapping both to zero) even if you do not expect to view the entire file. The cost in system resources comes in creating the views and accessing them.
So the usage of memory-mapped files is really cheap. If your program reads only portions of the file contents, skipping large parts of the file, you will also have a large performance advantage, because only the parts of the file which you really access are read in (rounded up to whole pages).
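If you do need to process a file larger than you can map at once, the chunked approach suggested above could look roughly like this. This is only a sketch: CHUNK_SIZE and ProcessChunk are hypothetical names, and the offset passed to MapViewOfFile must be a multiple of the system allocation granularity, which GetSystemInfo reports:

// Map a large file one view at a time instead of all at once.
SYSTEM_INFO si;
GetSystemInfo (&si);

// Round the desired chunk size down to the allocation granularity
// (typically 64 KB); view offsets must be multiples of it.
const unsigned __int64 CHUNK_SIZE =
    (256 * 1024 * 1024 / si.dwAllocationGranularity) * si.dwAllocationGranularity;

unsigned __int64 ullFileSize =
    ((unsigned __int64) dwInFileSizeHigh << 32) + dwInFileSizeLow;

for (unsigned __int64 ullOffset = 0; ullOffset < ullFileSize; ullOffset += CHUNK_SIZE) {
    SIZE_T cbView = (SIZE_T) min (CHUNK_SIZE, ullFileSize - ullOffset);
    PBYTE pbyChunk = (PBYTE) MapViewOfFile (hMapSrcFile, FILE_MAP_READ,
                                            (DWORD) (ullOffset >> 32),
                                            (DWORD) (ullOffset & 0xFFFFFFFF),
                                            cbView);
    if (!pbyChunk) break;             // real code should report GetLastError()
    ProcessChunk (pbyChunk, cbView);  // hypothetical per-chunk processing
    UnmapViewOfFile (pbyChunk);
}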
Cleaner code for the file mapping follows:
DWORD MapFileInMemory (LPCTSTR pszFileName,
                       PBYTE *ppbyFile,
                       PDWORD pdwFileSizeLow, PDWORD pdwFileSizeHigh)
{
    HANDLE hFile = INVALID_HANDLE_VALUE, hFileMapping = NULL;
    DWORD dwStatus = NO_ERROR;

    __try {
        // Open the file for exclusive, sequential read access.
        hFile = CreateFile (pszFileName, FILE_READ_DATA, 0, NULL, OPEN_EXISTING,
                            FILE_ATTRIBUTE_NORMAL | FILE_FLAG_SEQUENTIAL_SCAN,
                            NULL);
        if (hFile == INVALID_HANDLE_VALUE) {
            dwStatus = GetLastError();
            __leave;
        }

        *pdwFileSizeLow = GetFileSize (hFile, pdwFileSizeHigh);
        if (*pdwFileSizeLow == INVALID_FILE_SIZE) {
            dwStatus = GetLastError();
            __leave;
        }

        // Zero size parameters: the mapping covers the whole file.
        hFileMapping = CreateFileMapping (hFile, NULL, PAGE_READONLY, 0, 0, NULL);
        if (!hFileMapping) {
            dwStatus = GetLastError();
            __leave;
        }

        *ppbyFile = (PBYTE) MapViewOfFile (hFileMapping, FILE_MAP_READ, 0, 0, 0);
        if (*ppbyFile == NULL) {
            dwStatus = GetLastError();
            __leave;
        }
    }
    __finally {
        // The view keeps the file mapped; the mapping and file handles
        // can be closed as soon as the view exists.
        if (hFileMapping) CloseHandle (hFileMapping);
        if (hFile != INVALID_HANDLE_VALUE) CloseHandle (hFile);
    }

    return dwStatus;
}
BOOL UnmapFileFromMemory (LPCVOID lpBaseAddress)
{
return UnmapViewOfFile (lpBaseAddress);
}
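A possible usage of the two helpers, as a minimal sketch (the filename is just an example):

#include <windows.h>
#include <tchar.h>
#include <stdio.h>

int _tmain (void)
{
    PBYTE pbyFile = NULL;
    DWORD dwSizeLow = 0, dwSizeHigh = 0;

    DWORD dwStatus = MapFileInMemory (TEXT("Z:\\test.dat"), &pbyFile,
                                      &dwSizeLow, &dwSizeHigh);
    if (dwStatus != NO_ERROR) {
        _tprintf (TEXT("MapFileInMemory failed with error %lu\n"), dwStatus);
        return 1;
    }

    // Combine the two halves of the size exactly as described above.
    unsigned __int64 ullSize = ((unsigned __int64) dwSizeHigh << 32) + dwSizeLow;
    _tprintf (TEXT("Mapped %I64u bytes; first byte: 0x%02X\n"),
              ullSize, ullSize ? pbyFile[0] : 0);

    UnmapFileFromMemory (pbyFile);
    return 0;
}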