I need to write data to a drive. I have two options:

  1. Write raw sectors: _write(handle, pBuffer, size);
  2. Write into a file: fwrite(pBuffer, size, count, pFile);

Which way is faster? I expected the raw sector write, _write, to be more efficient, but my test showed the opposite: fwrite is faster and _write takes longer. I've pasted my snippet below; maybe my code is wrong. Can you help me out? Either way is okay by me, but I think the raw write is better, because the data on the drive at least appears to be encrypted...

#define SSD_SECTOR_SIZE 512

// Test 1: write sector-sized blocks straight to the raw volume G:
int g_pSddDevHandle = _open("\\\\.\\G:", _O_RDWR | _O_BINARY, _S_IREAD | _S_IWRITE);
ulMovePointer = 0;
TIMER_START();
while (ulMovePointer < 1024 * 1024 * 1024)
{
    _write(g_pSddDevHandle, szMemZero, SSD_SECTOR_SIZE);
    ulMovePointer += SSD_SECTOR_SIZE;
}
TIMER_END();
TIMER_PRINT();

// Test 2: write the same amount through stdio to a file on F:
FILE * file = fopen("f:\\test.tmp", "a+");
ulMovePointer = 0;  // reset the byte counter so this loop actually runs
TIMER_START();
while (ulMovePointer < 1024 * 1024 * 1024)
{
    fwrite(szMemZero, SSD_SECTOR_SIZE, 1, file);
    ulMovePointer += SSD_SECTOR_SIZE;
}
TIMER_END();
TIMER_PRINT();
+18  A: 

Probably because a direct write isn't buffered. When you call fwrite, you are doing buffered writes, which tend to be faster in most situations. Essentially, each FILE* stream has an internal buffer that is flushed when it becomes full, which means you end up making fewer system calls, as you only write to disk in larger chunks.

To put it another way, in your first loop, you are actually writing SSD_SECTOR_SIZE bytes to disk during each iteration. In your second loop you are not. You are only writing SSD_SECTOR_SIZE bytes to a memory buffer, which, depending on the size of the buffer, will only be flushed every Nth iteration.
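
To make the "every Nth iteration" point concrete, here is a toy model of a buffered stream in C. It is only a sketch: `toy_stream`, `toy_write`, `toy_flush`, `count_flushes` and the 4096-byte `BUFFER_SIZE` are hypothetical names and values chosen for illustration, not how any particular C runtime implements FILE*. The "system call" is replaced by a counter so the effect is visible without touching a disk.

```c
#include <stddef.h>
#include <string.h>

#define RECORD_SIZE 512   /* same as SSD_SECTOR_SIZE in the question */
#define BUFFER_SIZE 4096  /* hypothetical; real stdio buffer sizes vary */

typedef struct {
    unsigned char data[BUFFER_SIZE];
    size_t used;     /* bytes currently waiting in the buffer */
    size_t flushes;  /* how often the buffer was handed to the "OS" */
} toy_stream;

/* Stand-in for the real system call: it only counts invocations. */
static void toy_flush(toy_stream *s) {
    if (s->used > 0) {
        s->flushes++;
        s->used = 0;
    }
}

/* Buffered write: copy into memory, flush only when the buffer is full. */
static void toy_write(toy_stream *s, const void *p, size_t n) {
    const unsigned char *src = p;
    while (n > 0) {
        size_t room = BUFFER_SIZE - s->used;
        size_t chunk = n < room ? n : room;
        memcpy(s->data + s->used, src, chunk);
        s->used += chunk;
        src += chunk;
        n -= chunk;
        if (s->used == BUFFER_SIZE)
            toy_flush(s);
    }
}

/* Number of flushes needed to push `records` 512-byte records. */
size_t count_flushes(size_t records) {
    toy_stream s = { {0}, 0, 0 };
    unsigned char zero[RECORD_SIZE] = {0};
    for (size_t i = 0; i < records; i++)
        toy_write(&s, zero, RECORD_SIZE);
    toy_flush(&s);  /* final flush of any remainder, as fclose() would do */
    return s.flushes;
}
```

With these numbers, eight 512-byte records fit into one buffer, so `count_flushes(8)` reports a single flush, whereas eight unbuffered writes would mean eight trips to the operating system.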

Charles Salvia
+5  A: 

In the _write() case, the value of SSD_SECTOR_SIZE matters. In the fwrite case, the size of each underlying write will typically be the size of the stdio buffer (BUFSIZ by default), not SSD_SECTOR_SIZE. To get a better comparison, make sure the underlying buffer sizes are the same.
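
One way to control the stdio buffer size is the standard setvbuf() function. The following is a minimal sketch, not a benchmark: `write_with_buffer` is a hypothetical helper, the file is opened in "wb" mode rather than the question's "a+", and passing a `buf_size` of 0 selects unbuffered mode so that every fwrite goes straight to the operating system.

```c
#include <stdio.h>
#include <stdlib.h>

#define RECORD_SIZE 512  /* same as SSD_SECTOR_SIZE in the question */

/* Writes `records` zero-filled records to `path` through a stdio stream
 * whose buffer is `buf_size` bytes (0 means unbuffered).
 * Returns the number of records written, or -1 on error. */
long write_with_buffer(const char *path, size_t buf_size, long records) {
    static const unsigned char zero[RECORD_SIZE];
    char *buf = NULL;
    long written = 0;

    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;

    if (buf_size > 0) {
        buf = malloc(buf_size);
        if (!buf) { fclose(f); return -1; }
    }

    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(f, buf, buf_size ? _IOFBF : _IONBF, buf_size) != 0) {
        fclose(f);
        free(buf);
        return -1;
    }

    for (long i = 0; i < records; i++) {
        if (fwrite(zero, RECORD_SIZE, 1, f) != 1)
            break;
        written++;
    }

    /* The buffer must stay alive until the stream is closed. */
    if (fclose(f) != 0)
        written = -1;
    free(buf);
    return written;
}
```

Calling it with a `buf_size` of 512 makes the stream hand data to the operating system in roughly the same unit as the _write() loop, which removes one variable from the comparison.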

However, this is probably only part of the difference.

In the fwrite case, you are measuring how fast you can get data into memory. You haven't flushed the stdio buffer to the operating system, and you haven't asked the operating system to flush its buffers to physical storage. To compare more accurately, you should call fflush() before stopping the timers.

If you actually care about getting data onto the disk rather than just getting the data into the operating system's buffers, you should ensure that you call fsync()/FlushFileBuffers() before stopping the timer.
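
The two flush levels described above can be sketched as a single helper to call just before TIMER_END(). This assumes a FILE* opened on a regular file; `flush_to_disk` is a hypothetical name. On POSIX systems it uses fsync(); on the Microsoft C runtime it uses _commit(), which works on CRT file descriptors (FlushFileBuffers() is the equivalent when you hold a Win32 HANDLE instead).

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() on POSIX systems */
#include <stdio.h>
#ifdef _WIN32
#include <io.h>      /* _commit */
#else
#include <unistd.h>  /* fsync */
#endif

/* Pushes pending data down both levels:
 *   1. stdio buffer -> operating system (fflush)
 *   2. operating system cache -> storage device (fsync / _commit)
 * Returns 0 on success, -1 on failure. */
int flush_to_disk(FILE *f) {
    if (fflush(f) != 0)
        return -1;
#ifdef _WIN32
    return _commit(_fileno(f));
#else
    return fsync(fileno(f));
#endif
}
```

Without the second step the timer only measures how quickly the data reaches the operating system's cache, which is the effect described in the previous paragraph.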

Other obvious differences:

  • The drives are different. I don't know how different.

  • The semantics of a write to a device are different from the semantics of a write to a file system. The file system is allowed to delay writes to improve performance until explicitly told not to (e.g., with a standard handle, by a call to FlushFileBuffers()); writes directly to a device aren't necessarily optimised in that way. On the other hand, the file system must do extra I/O to manage metadata (block allocation, directory entries, etc.).

I suspect that you're seeing a difference in policy about how fast things actually get onto the disk. Raw disk performance can be very fast, but you need big writes and preferably multiple concurrent outstanding operations. You can also avoid buffer copying by using the right options when you open the handle.

janm
It turned out that fwrite is 10 times faster than _write. SSD_SECTOR_SIZE is 512.
Macroideal
If you call fflush() after each call to fwrite, the performance should come out roughly equal. However, as janm mentioned, there are other variables involved here as well, such as OS cache.
Charles Salvia
If you really care about performance, I'd try larger writes, say 1MB at a time. Or even just a single call to the write function and then a call to the flush function. Modern drives don't write sectors, they write tracks. To get an accurate comparison, you should flush buffers.
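
A rough sketch of that idea in C: write the same total in a configurable chunk size, then sync once at the end. `write_in_chunks` and the `open_out`/`write_fd`/`sync_fd`/`close_fd` macros are hypothetical names; the macros only paper over the POSIX and Microsoft CRT spellings of the same calls. The sketch targets a regular file and treats a short write as an error, which is a simplification.

```c
#include <fcntl.h>
#include <stdlib.h>
#ifdef _WIN32
#include <io.h>
#include <sys/stat.h>
#define open_out(p) _open((p), _O_WRONLY | _O_CREAT | _O_TRUNC | _O_BINARY, \
                          _S_IREAD | _S_IWRITE)
#define write_fd _write
#define sync_fd  _commit
#define close_fd _close
#else
#include <unistd.h>
#define open_out(p) open((p), O_WRONLY | O_CREAT | O_TRUNC, 0644)
#define write_fd write
#define sync_fd  fsync
#define close_fd close
#endif

/* Writes `total` zero bytes to `path` in pieces of `chunk` bytes and
 * syncs once at the end. Returns the number of write calls issued,
 * or -1 on error. */
long write_in_chunks(const char *path, size_t total, size_t chunk) {
    if (chunk == 0)
        return -1;
    char *buf = calloc(1, chunk);  /* zero-filled, like szMemZero */
    if (!buf)
        return -1;
    int fd = open_out(path);
    if (fd < 0) { free(buf); return -1; }

    long calls = 0;
    int ok = 1;
    while (total > 0) {
        size_t n = total < chunk ? total : chunk;
        /* Simplification: a short write is treated as a failure. */
        if (write_fd(fd, buf, (unsigned)n) != (long)n) { ok = 0; break; }
        total -= n;
        calls++;
    }
    if (ok && sync_fd(fd) != 0)  /* flush OS buffers before timing stops */
        ok = 0;
    if (close_fd(fd) != 0)
        ok = 0;
    free(buf);
    return ok ? calls : -1;
}
```

Writing 1 GB with a `chunk` of 512 issues 2,097,152 write calls, while a `chunk` of 1 MB (1024 * 1024) issues 1,024, so timing both settings shows how much of the cost is per-call overhead.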
janm
Syncing is extremely important if you're writing a total amount much smaller than total memory, as the OS may never actually write the data until well after the program terminates. You can use FlushFileBuffers to force the data to actually hit the disk: http://msdn.microsoft.com/en-us/library/aa364439%28VS.85%29.aspx
bdonlan