I am using a hardware interface to send data. It requires me to set up a DMA buffer, which needs to be aligned on 64-bit boundaries.

The DMA engine expects buffers to be aligned on at least 32-bit boundaries (4 bytes). For optimal performance the buffer should be aligned on 64-bit boundaries (8 bytes). The transfer size must be a multiple of 4 bytes.

I use posix_memalign to create a buffer like this...

posix_memalign((void**)&pPattern, 0x1000, DmaBufferSizeinInt32s * sizeof(int));

pPattern is a pointer to an int, and is the start of my buffer which is DmaBufferSizeinInt32s deep.

Is my buffer aligned on 64 bits?

+6  A: 

Yes, your buffer IS aligned on 64 bits. It's ALSO aligned on a 4 KB boundary (hence the 0x1000), and any address that is a multiple of 4096 bytes is also a multiple of 8 bytes. If you don't want the 4 KB alignment then pass 0x8 instead of 0x1000 ...
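
To make the point concrete, here is a minimal sketch that allocates a buffer with the same 0x1000 alignment as in the question and checks both boundaries. The helper names `is_aligned` and `page_aligned_is_also_64bit_aligned` are made up for illustration, and the check only covers the virtual address that the process sees.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* exposes posix_memalign in strict modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if ptr lies on an `alignment`-byte
   boundary, 0 otherwise. `alignment` must be non-zero. */
int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)ptr % alignment) == 0;
}

/* Hypothetical helper: allocates `count` ints on a 4 KB boundary, as in
   the question, and reports whether the result is also 8-byte aligned.
   Returns 1 if both checks pass, 0 if not, -1 if allocation failed. */
int page_aligned_is_also_64bit_aligned(size_t count)
{
    void *raw = NULL;

    if (posix_memalign(&raw, 0x1000, count * sizeof(int)) != 0)
        return -1;

    int ok = is_aligned(raw, 0x1000) && is_aligned(raw, 8);
    free(raw);
    return ok;
}
```

Calling `page_aligned_is_also_64bit_aligned(4000)` should return 1 whenever the allocation succeeds, because a 4096-byte boundary is always an 8-byte boundary too.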

Edit: I would also note that usually when writing DMA chains you are writing them through uncached memory or through some kind of non-cache based write queue. If this is the case you want to align your DMA chains to the cache line size as well to prevent a cache write-back overwriting the start or end of your DMA chain.

Goz
Not sure if I need the 4KByte boundary alignment... should I?
Krakkos
I am basically writing data objects which are each 10 x 32-bit words. I want to send whole numbers of 10 x 32-bit words each time. I'm currently DMA'ing 400 x 320-bit data objects in each DMA transfer. I'm not sure how the size of my buffer (400 x 10 x 32 bits) is related to the alignment, if at all. Should I tweak the size of the buffer?
Krakkos
I can't answer that question. I don't know what your platform is, for one. Under Windows, memory is allocated in 4K pages. This means you can only set an entire page to uncached at a time, and thus you may well need the 4K alignment. Alas, though, I cannot say for sure without knowing a lot more about your system ...
Goz
System is RedHat Enterprise Linux, kernel 2.6.18.8. Running embedded on a single board computer.
Krakkos
Running on an x86? If so I'd guess that Linux also uses 4K pages in the TLB, so the 4K alignment would be there to ensure you definitely aren't in the cache and aren't affecting things that are supposed to be cached.
Goz
@Krakkos - it seems like you might want to ask a question about pointers to good information or tutorials for performing DMA on Linux systems. It looks like buffer alignment issues might only be the start of what you need to know about what I'd guess is a pretty complex subject. It sure as hell is complex on Windows anyway.
Michael Burr
@Michael... That's a fair comment... I am stumbling around really. :) Luckily the underlying device driver takes care of all the actual DMA processing, but I have to allocate some memory, which I will use as a data buffer. I then pass a pointer to the buffer to the device driver and it does the DMA transfer. The only stipulation seems to be that it is aligned on a minimum of 64 bits for good performance.
Krakkos
+2  A: 

As Goz pointed out, but (imo) a bit less clearly: you're asking for alignment by 0x1000 bytes (the second argument), which is much more than 64 bits.

You could change the call to just:

posix_memalign((void**)&pPattern, 8, DmaBufferSizeinInt32s * sizeof(int));

This might make the call cheaper (less wasted memory), and in any case is clearer, since you ask for something that more closely matches what you actually want.
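
A fuller version of that call might look like the sketch below. It assumes the buffer holds 32-bit words, and the wrapper name `alloc_dma_words` is hypothetical, not part of any driver API. Note that `posix_memalign` reports failure through its return value rather than through `errno`, and that the alignment must be a power of two and a multiple of `sizeof(void *)`, which 8 satisfies on both 32-bit and 64-bit builds.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* exposes posix_memalign in strict modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: allocates `count` zeroed 32-bit words on an
   8-byte (64-bit) boundary. Returns NULL on failure or if count is 0.
   The caller releases the buffer with free(). */
int32_t *alloc_dma_words(size_t count)
{
    void *raw = NULL;

    /* Reject empty requests and sizes that would overflow size_t. */
    if (count == 0 || count > SIZE_MAX / sizeof(int32_t))
        return NULL;

    /* A non-zero return value is the error number. */
    if (posix_memalign(&raw, 8, count * sizeof(int32_t)) != 0)
        return NULL;

    memset(raw, 0, count * sizeof(int32_t));
    return raw;
}
```

For the transfer described in the comments (400 objects of 10 words each), the call would be `alloc_dma_words(400 * 10)`, which requests 16000 bytes, a multiple of 4 as the DMA engine requires.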

unwind
OK, I think I see now... the middle argument to `posix_memalign` is the alignment. And whilst my value was a multiple of 64 bits, it was actually set to 4096 bytes.
Krakkos
A: 

I don't know your hardware and I don't know how you are getting your pPattern pointer, but this seems risky all around. Most DMA I am familiar with requires physically contiguous RAM. The operating system only provides virtually contiguous RAM to user programs. That means that a memory allocation of 1 MB might be composed of up to 256 unconnected 4K RAM pages.

Much of the time, memory allocations will be made of physically contiguous pieces, which can lead to things working most of the time but not always. You need a kernel device driver to provide safe DMA.

I wonder about this because if your pPattern pointer is coming from a device driver, then why do you need to align it more?

Zan Lynx
I guess the problem there depends on whether you need more than 4K of RAM ...
Goz