Hi,

I am writing an application for embedded Linux in which 5% of the processor time goes to reading a file and 95% to processing it. Can I get a performance improvement if I read the file in one thread and keep processing in another thread?

I am reading from an MMC card, which has DMA support. The file is 20 MB and is divided into chunks of 2 KB. I will queue chunks from the reader thread and process them in the processor thread, so thread synchronization is needed only when inserting into and deleting from the queue.

I am programming for ARM9.

Which should be faster: single-threaded or multi-threaded?

+1  A: 

The only way to know for sure is to try it. But it sounds as if the file only needs to be read in chunks as the processing code asks for them. Since you're processor bound, the most improvement you could expect is the 5% of the time it takes to read.

Two threads would require an in-memory buffer to hold the next chunk of file so that it's immediately available for processing, and many embedded systems are extremely limited in available memory.
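To make the buffering cost concrete, here is a minimal sketch of a two-slot hand-off between a reader thread and the processing code, using POSIX threads. The names `chunk_ring`, `process_file`, `chunk_fn` and `count_bytes` are made up for illustration, and `count_bytes` only stands in for the real processing. It assumes 2 KB chunks as in the question and treats read errors the same as end of file.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

#define CHUNK_SIZE 2048  /* chunk size from the question */
#define NUM_SLOTS  2     /* one slot is filled while the other is processed */

/* Hypothetical callback type standing in for the real processing code. */
typedef void (*chunk_fn)(const unsigned char *buf, size_t len, void *ctx);

struct chunk_ring {
    unsigned char data[NUM_SLOTS][CHUNK_SIZE];
    size_t len[NUM_SLOTS];
    int head;   /* next slot the reader fills */
    int tail;   /* next slot the processor consumes */
    int count;  /* filled slots, including the one being processed */
    int done;   /* set by the reader at end of file or on error */
    int fd;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
};

static void *reader_thread(void *arg)
{
    struct chunk_ring *r = arg;
    for (;;) {
        pthread_mutex_lock(&r->lock);
        while (r->count == NUM_SLOTS)
            pthread_cond_wait(&r->not_full, &r->lock);
        int slot = r->head;
        pthread_mutex_unlock(&r->lock);

        /* The blocking read runs outside the lock so processing can overlap. */
        ssize_t n = read(r->fd, r->data[slot], CHUNK_SIZE);

        pthread_mutex_lock(&r->lock);
        if (n > 0) {
            r->len[slot] = (size_t)n;
            r->head = (r->head + 1) % NUM_SLOTS;
            r->count++;
        } else {
            r->done = 1;  /* EOF and errors are treated alike in this sketch */
        }
        pthread_cond_signal(&r->not_empty);
        pthread_mutex_unlock(&r->lock);
        if (n <= 0)
            return NULL;
    }
}

/* Stand-in for real processing: only counts the bytes it is handed. */
void count_bytes(const unsigned char *buf, size_t len, void *ctx)
{
    (void)buf;
    *(size_t *)ctx += len;
}

/* Reads fd in a helper thread and calls fn on each chunk in the caller's
 * thread. Returns 0 on success, -1 if the reader thread could not start. */
int process_file(int fd, chunk_fn fn, void *ctx)
{
    struct chunk_ring r = { .fd = fd };
    pthread_t tid;

    pthread_mutex_init(&r.lock, NULL);
    pthread_cond_init(&r.not_empty, NULL);
    pthread_cond_init(&r.not_full, NULL);
    if (pthread_create(&tid, NULL, reader_thread, &r) != 0)
        return -1;

    for (;;) {
        pthread_mutex_lock(&r.lock);
        while (r.count == 0 && !r.done)
            pthread_cond_wait(&r.not_empty, &r.lock);
        if (r.count == 0) {  /* reader finished and every chunk is consumed */
            pthread_mutex_unlock(&r.lock);
            break;
        }
        int slot = r.tail;
        pthread_mutex_unlock(&r.lock);

        fn(r.data[slot], r.len[slot], ctx);  /* process without the lock */

        pthread_mutex_lock(&r.lock);
        assert(r.count > 0);
        r.tail = (r.tail + 1) % NUM_SLOTS;
        r.count--;
        pthread_cond_signal(&r.not_full);
        pthread_mutex_unlock(&r.lock);
    }
    pthread_join(tid, NULL);
    pthread_cond_destroy(&r.not_full);
    pthread_cond_destroy(&r.not_empty);
    pthread_mutex_destroy(&r.lock);
    return 0;
}
```

With two slots the extra memory is about 4 KB plus bookkeeping. A call such as `process_file(fd, count_bytes, &total)` with `size_t total = 0` would run the whole file through the stand-in callback. You would normally build this with `-pthread`.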

Robert Harvey
I agree, but theoretically what should be faster?
Sunny
+1  A: 

Right now, when you make the call to read, your program blocks while the data is read. Then it starts up again when it's done and I presume your processing code takes over. The time when it's blocked won't show up as "cpu time" via "time" because the process is in a sleep state during this period. (This depends on DMA being available, which it is.)

You will probably see the wall-clock time of the whole program improve by about the time it takes to read in that file, but your CPU time will not go down (and will probably go up due to synchronization).

Dave
Also, you probably want to read in bigger chunks than 2 KB to cut down on synchronization overhead and increase cache performance.
Dave
+1  A: 

There are a couple of things you will want to make sure of.

  1. Can both activities be done in parallel? If the hardware/architecture is going to cause the processing thread to block the other thread then there will be no gain.

  2. The maximum gain you can expect is 5% (based on Amdahl's law). Is the complexity in coding worth that?
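As a rough check of that 5% figure, here is a small sketch of Amdahl's law in C. The function name `amdahl_speedup` is made up for illustration: `p` is the fraction of run time being sped up (0.05 for the read) and `s` is how much faster that fraction becomes.

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a fraction p of the run time
 * is made s times faster and the rest is unchanged. */
double amdahl_speedup(double p, double s)
{
    assert(p >= 0.0 && p <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - p) + p / s);
}
```

Even if overlapping hides the read completely (s growing without bound), the speedup only approaches 1 / 0.95, about 1.053, so the run time shrinks by at most 5%.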

I would recommend looking at more efficient ways of processing the file. Look closely at what the processing thread is doing and see whether that work can be made cheaper.

Q Boiler
+1  A: 

You would probably get some improvement from being able to process data while the read is going on, but there will necessarily be some overhead as well. As with any optimization problem, measurement is the key.

The real question is whether it's worth implementing something in order to measure the difference. For a 5% maximum gain, I suspect the answer is no, but it's up to you how much the potential for some of that 5% is worth versus your time.

Does your platform support memory-mapped files? That would allow you to leave the reading up to the O/S, which it probably does pretty well.

Tim Sylvester
+1  A: 

If you read the data sequentially, the additional thread is probably not worth it, because the kernel will read ahead in the file and cache the contents in memory. Memory mapping the file, unless you are writing for an embedded system (one where the MMC is memory-mapped), changes little: the file has to be loaded into memory at some point, and these loads will just be triggered by attempted reads rather than by an explicit call.

Robert Obryk
+2  A: 

I recommend not using another thread. Instead use posix_fadvise() to tell Linux to read more of your file in advance. The kernel can be reading the file via DMA while your program is processing data.

This assumes that the kernel has enough free memory for data buffering. If your data processing is using all of the memory then the kernel will ignore posix_fadvise().

The exact call that you need would look something like this:

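/* Needs <fcntl.h>, <unistd.h> and <stdlib.h>. fd is the already-open file
   and process() is your own chunk handler; both are assumed to exist. */
char buffer[2*1024];
off_t pos = 0;   /* file offset just past the chunk we read last */
ssize_t ret;

while( 1 ) {
  ret = read(fd, buffer, 2*1024);
  if( ret < 0 ) abort();         /* read error */
  if( ret == 0 ) break;          /* end of file */
  if( ret != 2*1024 ) abort();   /* the file should be whole 2 KB chunks */
  pos += ret;
  /* ask the kernel to start fetching the next 8 KB while we process */
  ret = posix_fadvise(fd, pos, 8*1024, POSIX_FADV_WILLNEED);
  if( ret ) abort();             /* non-zero is an error number */
  process(buffer);
}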
Zan Lynx
Cool, there is nothing else like this. Thanks a lot.
Sunny
+1  A: 

I wrote an article about Multithreaded File Access on ddj.com. It probably answers part of your question.

RED SOFT ADAIR