Hi,

I need to create relatively big (1-8 GB) files. What is the fastest way to do so on Windows using C or C++? I need to create them on the fly, and speed is really an issue. The files will be used for storage emulation, i.e. they will be accessed randomly at different offsets, and I need all of the storage to be preallocated but not initialized. Currently we fill the whole file with dummy data, and it takes too long.

Thanks.

+16  A: 

Use the Win32 API: CreateFile, SetFilePointerEx, SetEndOfFile, and CloseHandle, in that order.

The trick is in the SetFilePointerEx function. From MSDN:

Note that it is not an error to set the file pointer to a position beyond the end of the file. The size of the file does not increase until you call the SetEndOfFile, WriteFile, or WriteFileEx function.

Windows Explorer actually does the same thing when copying a file from one location to another. It does this so that the file does not have to be re-allocated as it grows on a fragmented disk.

Brian R. Bondy
Tested, it's working as expected. Thanks, Brian.
Ilya
This will be fast only on NTFS and exFAT, not on FAT32 or FAT16. This is because NTFS and exFAT keep track of an "initialized size".
Dominik Weber
+1  A: 

Check out memory mapped files.

They very much match the use case you describe, high performance and random access.

I believe they don't need to be created as large files. You just set a large max size on them and they will be expanded when you write to parts you haven't touched before.

Laserallan
Using memory mapped files also introduces more complications: errors are reported via structured exceptions instead of function return values, and you won't be able to map an entire 8 GB file into memory on 32-bit Windows because you only have 2 GB of virtual address space (or 3 GB if you're lucky).
bk1e
You'll definitely need to use a window (or several, if you are using many parts of the file independently) to map what's relevant into memory. It's not as if you have the entire file accessible when using standard file I/O anyway; there it's just done with fseek calls rather than by changing what's mapped into memory.
Laserallan
A: 

If you're using NTFS then sparse files are the way to go:

A file in which much of the data is zeros is said to contain a sparse data set. Files like these are typically very large—for example, a file containing image data to be processed or a matrix within a high-speed database. The problem with files containing sparse data sets is that the majority of the file does not contain useful data and, because of this, they are an inefficient use of disk space.

The file compression in the NTFS file system is a partial solution to the problem. All data in the file that is not explicitly written is explicitly set to zero. File compression compacts these ranges of zeros. However, a drawback of file compression is that access time may increase due to data compression and decompression.

Support for sparse files is introduced in the NTFS file system as another way to make disk space usage more efficient. When sparse file functionality is enabled, the system does not allocate hard drive space to a file except in regions where it contains nonzero data. When a write operation is attempted where a large amount of the data in the buffer is zeros, the zeros are not written to the file. Instead, the file system creates an internal list containing the locations of the zeros in the file, and this list is consulted during all read operations. When a read operation is performed in areas of the file where zeros were located, the file system returns the appropriate number of zeros in the buffer allocated for the read operation. In this way, maintenance of the sparse file is transparent to all processes that access it, and is more efficient than compression for this particular scenario.

Stu Mackellar
No - he needs to pre-allocate the extents.
Dominik Weber
A: 

Hello,

Use the "fsutil" command:

E:\VirtualMachines>fsutil file createnew
Usage : fsutil file createnew <filename> <length>
   Eg : fsutil file createnew C:\testfile.txt 1000

Regards

P.S. This is for Windows 2000/XP/7.

opal