Hi all,

Any recommendations out there for Windows application tuning resources (books, web sites, etc.)?

I have a C++ console application that needs to feed a hardware device with a considerable amount of data at a fairly high rate. (The buffer is 32K in size and gets consumed at ~800k bytes per second.)

It streams data without underruns, except when I do something that causes file I/O, such as opening a folder. (It seems to be only marginally meeting its timing requirements.)

Anyway, a good book or resource to brush up on real-time performance with Windows would be helpful.

Thanks!

+1  A: 

The buffer will last for only about 40 milliseconds. You can't guarantee zero underruns on Windows with such strict timing requirements. In user-mode land you are potentially looking at delays of hundreds of milliseconds while kernel threads do what they need to do. They run at higher priorities than you can ever gain. The thread quantum on the workstation version is 3 times the clock tick (3 x 15.625 msec), which is already beyond 40 milliseconds. You can't even reliably compete with user-mode threads that boosted their priority and take their sweet old time.
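To make the arithmetic above concrete, here is a small sketch that recomputes both numbers. It assumes "32K" means 32 KiB (32768 bytes) and takes the 15.625 msec tick and the 3-tick quantum straight from this answer; the function and constant names are made up for illustration.

```cpp
#include <cassert>
#include <cstdio>

// Figures taken from the question and the answer above.
constexpr double kBufferBytes   = 32.0 * 1024.0;  // "32K", read as 32 KiB (assumption)
constexpr double kDrainBytesSec = 800000.0;       // "~800k bytes per second"
constexpr double kClockTickMs   = 15.625;         // clock tick quoted in the answer
constexpr int    kQuantumTicks  = 3;              // quantum length quoted in the answer

// Time until a full buffer runs dry if nothing refills it, in milliseconds.
constexpr double buffer_lifetime_ms(double bytes, double bytes_per_sec) {
    return bytes / bytes_per_sec * 1000.0;
}

// Length of one scheduler quantum, in milliseconds.
constexpr double quantum_ms(int ticks, double tick_ms) {
    return ticks * tick_ms;
}

// Prints both figures so they can be compared side by side.
void print_timing_budget() {
    const double life    = buffer_lifetime_ms(kBufferBytes, kDrainBytesSec);
    const double quantum = quantum_ms(kQuantumTicks, kClockTickMs);
    // With these inputs a single quantum outlasts the buffer.
    assert(quantum > life);
    std::printf("buffer lasts %.2f ms, one quantum is %.3f ms\n", life, quantum);
}
```

With these inputs the buffer lasts roughly 41 ms and one quantum is just under 47 ms, so one other thread running out its full quantum is enough to cause an underrun.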

If a bigger buffer is not an option then you are looking at a device driver to get this kind of service guarantee. Or something in between that can provide a larger buffer.

Hans Passant
+3  A: 

The best you can hope for on commodity Windows is "usually meets timing requirements". If the system is running any processes other than your target app, it will occasionally miss deadlines due to scheduling inconsistencies. However, if your app/hardware can tolerate the occasional miss, there are a few things you can do to reduce the number of misses.

  1. Set your process's priority to REALTIME_PRIORITY_CLASS
  2. Change the scheduler's granularity to 1ms resolution via the timeBeginPeriod() function (part of the Windows Multimedia libraries)
  3. Avoid as many system calls in your main loop as possible (this includes allocating memory). Each syscall is an opportunity for the OS to put the process to sleep and, consequently, an opportunity for the non-deterministic scheduler to miss the next deadline.
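Steps 1 and 2 could look roughly like the sketch below. TimingScope is a hypothetical helper, not part of any Windows API; only SetPriorityClass, GetPriorityClass, GetCurrentProcess, timeBeginPeriod and timeEndPeriod are real calls. One thing to watch for: without the privilege to raise the base priority (typically an elevated process), Windows quietly grants HIGH_PRIORITY_CLASS instead of REALTIME_PRIORITY_CLASS, so the sketch reads the class back rather than trusting the return value. Outside Windows the class compiles to a no-op.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>               // timeBeginPeriod / timeEndPeriod
#pragma comment(lib, "winmm.lib")   // MSVC only; with MinGW link using -lwinmm
constexpr bool kIsWindows = true;
#else
constexpr bool kIsWindows = false;
#endif

// Hypothetical RAII helper: requests 1 ms timer resolution and the realtime
// priority class on construction, and undoes both on destruction.
class TimingScope {
public:
    TimingScope() {
#ifdef _WIN32
        timer_1ms_ = (timeBeginPeriod(1) == TIMERR_NOERROR);
        old_class_ = GetPriorityClass(GetCurrentProcess());
        if (SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS)) {
            // The call can succeed yet grant only HIGH_PRIORITY_CLASS,
            // so check what was actually applied.
            realtime_ = (GetPriorityClass(GetCurrentProcess()) ==
                         REALTIME_PRIORITY_CLASS);
        }
#endif
    }

    ~TimingScope() {
#ifdef _WIN32
        if (old_class_ != 0) SetPriorityClass(GetCurrentProcess(), old_class_);
        if (timer_1ms_) timeEndPeriod(1);   // must pair with timeBeginPeriod(1)
#endif
    }

    TimingScope(const TimingScope&) = delete;
    TimingScope& operator=(const TimingScope&) = delete;

    bool timer_1ms() const { return timer_1ms_; }  // 1 ms resolution granted?
    bool realtime() const { return realtime_; }    // realtime class really applied?

private:
    bool timer_1ms_ = false;
    bool realtime_ = false;
#ifdef _WIN32
    DWORD old_class_ = 0;   // priority class to restore; 0 means unknown
#endif
};
```

Create one TimingScope at the top of main() and keep it alive for as long as the streaming loop runs. If realtime() comes back false, the process is most likely running without the needed privilege.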

If this doesn't get the job done for you, you might consider trying a Linux distribution with realtime kernel patches applied. I've found those to provide near-perfect timing (accurate to within tens of microseconds over the course of several hours). That said, nothing short of a true realtime OS will actually give you perfection, but the realtime Linux distros are much closer than commodity Windows.

Rakis
Thanks for the info, @Rakis. With respect to Windows: unfortunately we are stuck, since the custom piece of hardware only has a Windows driver.
JeffV
+1  A: 

The first thing I would do is tune it until it's as lean as possible. I use this method, for these reasons. Since it's a console app, another option is to try out LTProf, which will show you if there is anything you can fruitfully optimize. When that's done, you will be in the best position to look for buffer timing issues, as @Hans suggested.

Mike Dunlavey
+1  A: 

"Optimizing software in C++" from agner.org is a great optimization manual.

As Rakis said, you will need to be very careful in the processing loop:

  • No memory allocation. Use the stack and preallocated memory instead.
  • No throws. Exceptions are quite expensive; on Win32 they have a cost even when nothing is thrown.
  • No polymorphism. You will save some indirections.
  • Use inline extensively.
  • No locks. Try lock-free approaches when possible.
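As one way to combine "no memory allocation" and "no locks", here is a minimal sketch of a fixed-capacity single-producer/single-consumer ring. SpscRing is a hypothetical name, and the sketch assumes exactly one thread calls push and exactly one other thread calls pop; it is not safe with more producers or consumers.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical single-producer/single-consumer queue. Capacity is fixed at
// compile time, so push/pop never allocate and never take a lock.
template <typename T, std::size_t N>
class SpscRing {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    // Producer thread only. Returns false when the ring is full.
    bool push(const T& item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == N) return false;          // no free slot
        slots_[head & (N - 1)] = item;
        head_.store(head + 1, std::memory_order_release);  // publish the slot
        return true;
    }

    // Consumer thread only. Returns false when the ring is empty.
    bool pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return false;              // nothing to read
        out = slots_[tail & (N - 1)];
        tail_.store(tail + 1, std::memory_order_release);  // free the slot
        return true;
    }

private:
    std::array<T, N> slots_{};            // storage preallocated inside the object
    std::atomic<std::size_t> head_{0};    // count of items ever pushed
    std::atomic<std::size_t> tail_{0};    // count of items ever popped
};
```

For large blocks it is probably cheaper to queue indices or pointers into a preallocated pool than to copy the blocks themselves through the ring.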
fnieto
Thanks, my processing in this case is basically receiving the blocks from tcp (ZMQ actually), aggregating them and feeding them to the hardware API.
JeffV
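For the receive/aggregate/feed loop described in the comment above, the aggregation step can be done without any allocation. This is only a sketch: ChunkAggregator and feed are hypothetical names, the incoming blocks are assumed to arrive as plain byte ranges (no ZMQ calls are shown), and the sink stands in for whatever the hardware API's write call really is.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical aggregator: collects variable-sized incoming blocks into one
// fixed, preallocated chunk and reports when the chunk is ready to hand over.
template <std::size_t ChunkBytes>
class ChunkAggregator {
    static_assert(ChunkBytes > 0, "chunk size must be non-zero");

public:
    // Copies as much of [data, data + len) as fits; returns bytes consumed.
    std::size_t append(const unsigned char* data, std::size_t len) {
        const std::size_t n = std::min(len, ChunkBytes - used_);
        if (n != 0) std::memcpy(chunk_.data() + used_, data, n);
        used_ += n;
        return n;
    }

    bool full() const { return used_ == ChunkBytes; }
    std::size_t size() const { return used_; }
    const unsigned char* data() const { return chunk_.data(); }
    void reset() { used_ = 0; }

private:
    std::array<unsigned char, ChunkBytes> chunk_{};  // preallocated storage
    std::size_t used_ = 0;                           // bytes currently held
};

// Pushes one incoming block through the aggregator, calling sink(ptr, size)
// each time a chunk fills up. The sink is a placeholder for the device write.
template <std::size_t ChunkBytes, typename Sink>
void feed(ChunkAggregator<ChunkBytes>& agg, const unsigned char* data,
          std::size_t len, Sink&& sink) {
    while (len > 0) {
        const std::size_t n = agg.append(data, len);
        data += n;
        len -= n;
        if (agg.full()) {
            sink(agg.data(), agg.size());
            agg.reset();
        }
    }
}
```

A ChunkAggregator<32 * 1024> would match the buffer size from the question, assuming the device takes whole 32K writes.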