I have a 1 MB pipe:

if (0 == CreatePipe(&hRead, &hWrite, 0, 1024*1024))
{
    printf("CreatePipe failed\n");
    return success;
}

I send 4000 bytes at a time (bytesReq = 4000):

while ((bytesReq = (FileSize - offset)) != 0)
{

    // Send data to the Decoder.cpp thread, which converts it to human-readable CSV
    if ((0 == WriteFile(hWrite,
                        readBuff,
                        bytesReq,
                        &bytesWritten,
                        0)) ||
        (bytesWritten != bytesReq))
    {
        // GetLastError() returns a DWORD, so print it with %lu
        printf("WriteFile failed error = %lu\n", GetLastError());
        break;
    }

}

Another thread, at the other end of the pipe, reads only 4 bytes at a time.

When I made the pipe smaller, the total time for sending and reading dropped noticeably.

Total time by pipe size:
1024*1024 = 2 min (original size)
1024*512 = 1 min 47 sec
10,000 = 1 min 33 sec
Anything below 10,000 = 1 min 33 sec

How can this be?

+4  A: 

Less waiting.

If the pipe buffer is too big, then one process writes all the data and closes its end of the pipe before the second process even begins.

When the pipe is too big, the processes are executed serially.

S.Lott
@S.Lott: So the read end of the pipe doesn't start until all the writing is done or the pipe is full? I thought it could start reading as soon as there is data available in the pipe.
Tommy
It **can**, but it may not. And in this case, that's the most logical explanation. Your processes are not overlapping in time. Why not? The reader must be waiting for the writer. Why? Because the reader isn't getting scheduled. Why not? The OS is waiting for some event to make the reader able to run. That's my theory.
S.Lott
@S.Lott: Yeah, that's true. That stinks; the whole reason I used a pipe and a thread is so that the work overlaps in time.
Tommy
@Tommy: So experiment. Try different-sized pipes and measure the performance. You'll find one that's optimal. That's important to know. It needs to be a configuration parameter, since different OSes (and different releases of an OS) may have different rules for scheduling.
S.Lott
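A minimal sketch of treating the pipe size as a configuration parameter instead of a hard-coded constant. The helper name pipe_size_from_string and the environment variable PIPE_SIZE_BYTES mentioned below are hypothetical, not part of the original program. The result would be passed as the nSize argument of CreatePipe, which the Windows documentation describes as only a suggestion to the system.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a pipe size in bytes from a text setting.
   Returns fallback when text is NULL, does not start with a digit,
   has trailing junk, is zero, or does not fit in 32 bits
   (CreatePipe takes the size as a 32-bit DWORD). */
static unsigned long pipe_size_from_string(const char *text,
                                           unsigned long fallback)
{
    char *end = NULL;
    unsigned long value;

    if (text == NULL || !isdigit((unsigned char)*text))
        return fallback;

    errno = 0;
    value = strtoul(text, &end, 10);
    if (errno != 0 || *end != '\0' || value == 0 || value > 0xFFFFFFFFUL)
        return fallback;

    return value;
}
```

One possible use, assuming the setting comes from the environment: call pipe_size_from_string(getenv("PIPE_SIZE_BYTES"), 1024UL * 1024UL) and hand the result to CreatePipe, so each timing run can use a different size without a rebuild.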
@S.Lott: I seem to have found that any size orders of magnitude smaller than 1 MB takes the same time, 1 min 33 sec. I need to find out what else the scheduler is based on. Also, wouldn't a smaller pipe be more serial? You mentioned that when the pipe is too big, the processes are executed serially. How is that so?
Tommy
@Tommy: With any size below 1 MB, the writer side fills the pipe quickly. The OS interleaves writing and reading to maximize parallelism. Check `top` or whatever to see that both parts are using equal shares of CPU. At this point, you are limited by some other resource. For example, the process that writes to the pipe could be reading an external file (slowly), or the process that reads from the pipe could be writing a log (slowly). You have **many** possible bottlenecks. The pipe isn't one of them.
S.Lott
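A minimal sketch of one way to check which stage is the bottleneck: accumulate the wall-clock time spent in each stage separately and compare the totals. The StageTimer type and its functions are hypothetical names introduced for illustration. The sketch assumes a C11 library that provides timespec_get; TIME_UTC is not a monotonic clock, so a system clock adjustment during a run would skew the numbers.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical accumulator for the wall-clock time spent in one stage. */
typedef struct {
    const char *name;     /* label used in the report */
    double seconds;       /* total time accumulated so far */
    struct timespec t0;   /* start of the interval being measured */
} StageTimer;

/* Mark the start of one timed interval. */
static void stage_begin(StageTimer *t)
{
    timespec_get(&t->t0, TIME_UTC);
}

/* Add the time elapsed since the matching stage_begin to the total. */
static void stage_end(StageTimer *t)
{
    struct timespec t1;

    timespec_get(&t1, TIME_UTC);
    t->seconds += (double)(t1.tv_sec - t->t0.tv_sec)
                + (double)(t1.tv_nsec - t->t0.tv_nsec) / 1e9;
}

/* Print the accumulated total for one stage. */
static void stage_report(const StageTimer *t)
{
    printf("%-12s %.3f s\n", t->name, t->seconds);
}
```

In the program from the question, one timer could wrap the external file read, another the WriteFile call, and a third the log write on the reading thread. Each thread should use its own timers, since the struct is not synchronized.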
@S.Lott: I see. Yes, the process that writes to the pipe does read an external file, and the process that reads from the pipe is writing a log. I am not sure if there is any way to really speed them up. Thank you for the knowledge.
Tommy
@Tommy: More parallel operations. Make each process into smaller steps. Increase the number of steps. Reduce the amount of data touched at each step.
S.Lott