Okay, so I've written a (rather unoptimized) program before to encode images to JPEGs; however, now I am working with MPEG-2 transport streams and the H.264-encoded video within them. Before I dive into programming all of this, I am curious what the fastest way to deal with the actual file is.

Currently I am file-mapping the .mts file into memory to work on it, although I am not sure if it would be faster to (for example) read 100 MB of the file into memory in chunks and deal with it that way.

These files require a lot of bit-shifting and such to read flags, so I am wondering whether, when I reference the memory, it is faster to read 4 bytes at once as an integer or 1 byte at a time as a character. I thought I read somewhere that x86 processors are optimized for 4-byte granularity, but I'm not sure if this is true...

Thanks!

+5  A: 

Memory-mapped files are usually the fastest option available if you need the file synchronously. (There are some asynchronous APIs that let the OS reorder things for a slight speed increase in some cases, but it sounds like that wouldn't help in your application.)

The main advantage you get with mapped files is that you can work on the file in memory while the OS is still reading it from disk, and you don't have to manage your own locking or threaded file-reading code.
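
For reference, a minimal sketch of the mapping itself, assuming a POSIX system and a hypothetical file name; on Windows the equivalent calls are CreateFile, CreateFileMapping and MapViewOfFile:

    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdint>
    #include <cstdio>

    int main()
    {
        int fd = open("capture.mts", O_RDONLY);   // hypothetical file name
        if (fd < 0) { std::perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) != 0) { std::perror("fstat"); return 1; }

        // Map the whole file read-only; the OS pages it in on demand as you touch it.
        void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { std::perror("mmap"); return 1; }
        const uint8_t* data = static_cast<const uint8_t*>(p);

        // ... parse transport-stream packets starting at data[0] ...
        (void)data;

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }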

As far as memory references go, on x86 memory is read a whole cache line at a time no matter what size you're actually working with. The extra time associated with non-byte-granular operations comes from the fact that multi-byte integers need not be aligned to their natural boundary. For example, an ADD will take longer if its operand isn't aligned on a 4-byte boundary, but for something like a memory copy there will be little difference. If you are working with inherently character data, it's going to be faster to keep it that way than to read everything as integers and bit-shift things around.
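
To make the 1-byte-vs-4-byte question concrete, here is a small sketch (my own helper names, not from any library) of the two usual ways to pull a 32-bit big-endian field out of a byte buffer; a decent compiler turns the memcpy version into a single load:

    #include <cstdint>
    #include <cstring>

    // Byte at a time: never cares about alignment, works the same on any CPU.
    inline uint32_t read_be32_bytes(const uint8_t* p)
    {
        return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
               (uint32_t(p[2]) << 8)  |  uint32_t(p[3]);
    }

    // memcpy into an int: one (possibly unaligned) 32-bit load, then a byte swap.
    inline uint32_t read_be32_word(const uint8_t* p)
    {
        uint32_t v;
        std::memcpy(&v, p, sizeof v);   // avoids undefined behaviour from pointer casts
        return __builtin_bswap32(v);    // GCC/Clang built-in; assumes a little-endian host
    }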

If you're doing H.264 or MPEG-2 encoding, the bottleneck is probably going to be CPU time rather than disk I/O in any case.

Billy ONeal
+2  A: 

If you have to access the whole file, it is always faster to read it into memory and do the processing there. Of course, that also wastes memory, and you have to lock the file somehow so you won't get concurrent access from some other application, but optimization is about compromises anyway. Memory mapping is faster if you're skipping over (large) parts of the file, because then you don't have to read them at all.
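
A minimal sketch of the read-it-all-in route (hypothetical file name), for comparison with the mapping approach above:

    #include <cstdint>
    #include <fstream>
    #include <iterator>
    #include <vector>

    int main()
    {
        std::ifstream in("capture.mts", std::ios::binary);   // hypothetical file name
        if (!in) return 1;

        // Slurp the whole file into one contiguous buffer.
        std::vector<uint8_t> buf{std::istreambuf_iterator<char>(in),
                                 std::istreambuf_iterator<char>()};

        // ... process buf.data() / buf.size() here ...
        return buf.empty() ? 1 : 0;
    }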

Yes, accessing memory at 4-byte (or even 8-byte) granularity is faster than accessing it byte-wise. Again it's a compromise - depending on what you have to do with the data afterwards, and how skilled you are at fiddling with the bits in an int, it might not be faster overall.
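
As an illustration of that bit fiddling, here is a rough sketch of pulling the interesting fields out of the 4-byte header of a 188-byte MPEG-2 TS packet; the struct and function names are my own:

    #include <cstdint>

    struct TsHeader {
        bool     payload_unit_start;
        uint16_t pid;                 // 13 bits, straddles bytes 1 and 2
        uint8_t  adaptation_field_control;
        uint8_t  continuity_counter;
    };

    inline bool parse_ts_header(const uint8_t* pkt, TsHeader& h)
    {
        if (pkt[0] != 0x47) return false;                        // sync byte
        h.payload_unit_start       = (pkt[1] & 0x40) != 0;
        h.pid                      = uint16_t(((pkt[1] & 0x1F) << 8) | pkt[2]);
        h.adaptation_field_control = (pkt[3] >> 4) & 0x03;
        h.continuity_counter       =  pkt[3] & 0x0F;
        return true;
    }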

As for everything regarding optimization:

  1. measure
  2. optimize
  3. measure
DevSolar
+1, wasn't it 'measure, measure, optimize, measure again'?
David Rodríguez - dribeas
A: 

Regarding the best size to read from memory, I'm sure you will enjoy reading this post about memory-access performance and cache effects.

fnieto
A: 
Will
A: 

One thing to consider about memory-mapping files is that if a file is larger than the available address range, only a portion of it can be mapped at a time. To access the remainder of the file, the first part has to be unmapped and the next part mapped in its place.
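
A rough sketch of that sliding-window idea, again assuming POSIX mmap (the offset handed to mmap must be page-aligned); on Windows you would pass the offset to MapViewOfFile instead. The names here are mine:

    #include <sys/mman.h>
    #include <unistd.h>
    #include <cstddef>
    #include <cstdint>

    struct FileWindow {
        const uint8_t* data = nullptr;
        std::size_t    size = 0;
    };

    // Map `length` bytes of `fd` starting at `offset` (assumed page-aligned),
    // unmapping whatever window was mapped before.
    inline bool remap_window(int fd, off_t offset, std::size_t length, FileWindow& w)
    {
        if (w.data) munmap(const_cast<uint8_t*>(w.data), w.size);
        void* p = mmap(nullptr, length, PROT_READ, MAP_PRIVATE, fd, offset);
        if (p == MAP_FAILED) { w.data = nullptr; w.size = 0; return false; }
        w.data = static_cast<const uint8_t*>(p);
        w.size = length;
        return true;
    }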

Since you're decoding MPEG streams, you may want to use a double-buffered approach with asynchronous file reading. It works like this:

blocksize = 65536 bytes (or whatever)
currentblock = new byte [blocksize]
nextblock = new byte [blocksize]
read currentblock
while processing
   asynchronously read nextblock
   parse currentblock
   wait for asynchronous read to complete
   swap nextblock and currentblock
endwhile
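
One way that pseudocode might look in C++ (a sketch only, with a trivial stand-in for the real parser and a hypothetical file name), using std::async for the background read:

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <future>
    #include <utility>
    #include <vector>

    // Stand-in for the real work: walk the 188-byte TS packets in the block, etc.
    static void parse_block(const uint8_t* data, std::size_t len)
    {
        (void)data;
        (void)len;
    }

    int main()
    {
        const std::size_t blocksize = 65536;
        std::vector<uint8_t> current(blocksize), next(blocksize);

        std::FILE* f = std::fopen("capture.mts", "rb");   // hypothetical file name
        if (!f) return 1;

        std::size_t got = std::fread(current.data(), 1, blocksize, f);
        while (got > 0) {
            // Kick off the read of the next block while this one is being parsed.
            auto pending = std::async(std::launch::async, [&] {
                return std::fread(next.data(), 1, blocksize, f);
            });

            parse_block(current.data(), got);

            got = pending.get();          // wait for the asynchronous read to complete
            std::swap(current, next);     // the block just read becomes the current one
        }

        std::fclose(f);
        return 0;
    }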
Skizz