views:

334

answers:

6

Hi all,

My problem is this: I have a C/C++ app that runs under Linux, and this app receives a constant-rate, high-bandwidth (~27 MB/sec) stream of data that it needs to stream to a file (or files). The computer it runs on is a quad-core 2 GHz Xeon running Linux. The filesystem is ext4, and the disk is a solid-state E-SATA drive which should be plenty fast for this purpose.

The problem is Linux's too-clever buffering behavior. Specifically, instead of writing the data to disk immediately, or soon after I call write(), Linux will store the "written" data in RAM, and then at some later time (I suspect when the 2GB of RAM starts to get full) it will suddenly try to write out several hundred megabytes of cached data to the disk, all at once. The problem is that this cache-flush is large, and holds off the data-acquisition code for a significant period of time, causing some of the current incoming data to be lost.

My question is: is there any reasonable way to "tune" Linux's caching behavior, so that either it doesn't cache the outgoing data at all, or if it must cache, it caches only a smaller amount at a time, thus smoothing out the bandwidth usage of the drive and improving the performance of the code?

I'm aware of O_DIRECT, and will use it if I have to, but it does place some behavioral restrictions on the program (e.g. buffers must be aligned on, and a multiple of, the disk sector size) that I'd rather avoid if I can.

A: 

You can adjust the page cache settings in /proc/sys/vm (see /proc/sys/vm/dirty_ratio and /proc/sys/vm/swappiness specifically) to tune the page cache to your liking.

Charles Salvia
I'd look more at the IO scheduler than the page cache. The tunables you mention won't help here, as *all* the written blocks are dirty by definition, and never need to be swapped back in.
Andy Ross
+2  A: 

If you have latency requirements that the OS cache can't meet on its own (the default I/O scheduler is usually optimized for bandwidth, not latency), you are probably going to have to manage your own memory buffering. Are you writing out the incoming data immediately? If so, I'd suggest dropping that architecture and going with something like a ring buffer, where one thread (or multiplexed I/O handler) writes from one side of the buffer while the incoming reads are copied into the other side.

At some size, this will be large enough to handle the latency required by a pessimal OS cache flush. Or not, in which case you're actually bandwidth limited and no amount of software tuning will help you until you get faster storage.
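A minimal sketch of the single-producer/single-consumer ring buffer described above; the class name and sizes are illustrative, not from the OP's code. The capacity must be a power of two so indices can be masked instead of taken modulo.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// One thread (acquisition) calls push(); another (the disk writer) calls pop().
// head_ and tail_ are monotonically increasing counters; the ring is full when
// head_ - tail_ equals the capacity.
class ByteRing {
public:
    explicit ByteRing(size_t capacity) : buf_(capacity), head_(0), tail_(0) {
        assert((capacity & (capacity - 1)) == 0);  // power of two
    }
    // Producer side: returns false (caller drops or retries) when there is no room.
    bool push(const char* data, size_t len) {
        size_t head = head_.load(std::memory_order_relaxed);
        size_t tail = tail_.load(std::memory_order_acquire);
        if (buf_.size() - (head - tail) < len) return false;
        for (size_t i = 0; i < len; ++i)
            buf_[(head + i) & (buf_.size() - 1)] = data[i];
        head_.store(head + len, std::memory_order_release);
        return true;
    }
    // Consumer side: copies up to max bytes out; returns how many were taken.
    size_t pop(char* out, size_t max) {
        size_t tail = tail_.load(std::memory_order_relaxed);
        size_t head = head_.load(std::memory_order_acquire);
        size_t n = (head - tail) < max ? (head - tail) : max;
        for (size_t i = 0; i < n; ++i)
            out[i] = buf_[(tail + i) & (buf_.size() - 1)];
        tail_.store(tail + n, std::memory_order_release);
        return n;
    }
private:
    std::vector<char> buf_;
    std::atomic<size_t> head_, tail_;
};
```

The writer thread would then loop on pop() and write(), while the acquisition thread only ever touches push(), so a slow flush stalls the writer but not the acquisition.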

Andy Ross
+2  A: 

If we are talking about std::fstream (or any C++ stream object)

You can specify your own buffer using:

std::streambuf* std::ios::rdbuf(std::streambuf* sb);

By defining your own buffer you can customize the behavior of the stream.

Alternatively you can always flush the buffer manually at pre-set intervals.

Note: there is a reason for having a buffer. It is quicker than writing to the disk directly (every 10 bytes). There is very little reason to write to a disk in chunks smaller than the disk block size. If you write too frequently, the disk controller will become your bottleneck.

But I have an issue with your using the same thread for the write process, requiring it to block the read processes.
While the data is being written there is no reason another thread cannot continue to read data from your stream (you may need to do some fancy footwork to make sure they are reading/writing to different areas of the buffer). But I don't see any real potential issue with this, as the I/O system will go off and do its work asynchronously (potentially stalling your write thread, depending on your use of the I/O system, but not necessarily your application).

Martin York
I agree. Manually flushing the buffer would be a good temporary work-around so that he could have the time to do what you're suggesting ... multiple threads.
Steve Lazaridis
Good idea regarding the multiple threads. But why would a custom buffer help? He's having issues with the OS page cache. When his app flushes the fstream buffer, the data isn't necessarily going directly to disk; it's just going to the OS page cache. He'd need something like O_DIRECT to bypass the page cache. Using an fstream buffer seems pointless to me if he has a continuous stream of data that is constantly being written to disk.
Charles Salvia
His question has nothing to do with C++ buffering. It has to do with the filesystem/OS
Zanson
@Zanson: Not the way I read it, but as always I could be wrong (I can see how it could be interpreted that way). But I think it would be better for the original person asking the question to tell me I am wrong, rather than some random person who has a different interpretation of the question than I do.
Martin York
The question is about Linux buffering, not the C++ buffering. Also the observed buffer sizes (megabytes) don't match libstdc++ behavior.
MSalters
@MSalters: Maybe. But I am not convinced. It reads to me like the OP suspects some caching problem that he is attributing to Linux. There is no hard evidence in the question that absolutely means this is a Linux page cache problem as opposed to a stream buffering problem. But the fact that the code causes the read to halt while a write is being performed sort of suggests that the OP is a little inexperienced in this area and thus unaware of the difference. As for the megabytes, that seems like speculation rather than a quantitative analysis (otherwise I would expect to see a more exact number).
Martin York
Just for the record: we're current using write(), not fwrite() or C++ streams, to do the writing. So application-layer caching should not be an issue, it's the OS's buffering that is problematic.
Jeremy Friesner
+5  A: 

You can use posix_fadvise() with the POSIX_FADV_DONTNEED advice (possibly combined with calls to fdatasync()) to make the system flush the data and evict it from the cache.

See this article for a practical example.

Hasturkun
A: 

Well, try this ten-pound-hammer solution, which might prove useful for seeing whether I/O system caching contributes to the problem: every 100 MB or so, call sync().

wallyk
I've tried that, it wasn't sufficient :^)
Jeremy Friesner
A: 

You could use a multithreaded approach: have one thread simply read data packets and add them to a FIFO, and the other thread remove packets from the FIFO and write them to disk. This way, even if the write to disk stalls, the program can continue to read incoming data and buffer it in RAM.
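A minimal sketch of that two-thread FIFO using a blocking queue; the Packet type and class name are illustrative. Unlike a lock-free ring, this version simply blocks the writer thread when the queue is empty.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

using Packet = std::vector<char>;  // stand-in for whatever a data packet is

class PacketQueue {
public:
    // Called by the reader thread for each incoming packet.
    void push(Packet p) {
        {
            std::lock_guard<std::mutex> lk(m_);
            q_.push_back(std::move(p));
        }
        cv_.notify_one();
    }
    // Called by the disk-writer thread; blocks until a packet is available.
    Packet pop() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        Packet p = std::move(q_.front());
        q_.pop_front();
        return p;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<Packet> q_;
};
```

The writer thread then loops on pop() followed by write(), so a slow flush only grows the queue in RAM instead of stalling acquisition.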

LnxPrgr3