Hi!

I'm writing some (ARM) inline assembly code that works on a huge array of C structs in a loop and stores some data into another array.

The processor supports the PLD prefetch instruction.

If I'm accessing the data in sequential order, is there a performance gain if I use the prefetch instruction to load the start address of the next struct in the array before I start processing the current one? Or should I prefetch the next-but-one struct in each iteration? Or prefetch a certain number of bytes ahead?

Does it also make sense to prefetch an address in the destination array?

Thanks!

+1  A: 

This depends heavily on the processor's inner workings. Prefetching may or may not increase performance; you have to review the documentation for your specific core.

Performance can increase if the processor has a separate subunit for loading data that works in parallel with the computation subunit. Also bear in mind that a prefetch is itself an instruction, so you should issue it only once per cache line's worth of data, not more often; otherwise you just increase the processor load and waste time. If the load subunit is not separate and you still issue the prefetch instruction, performance can even decrease: nothing runs in parallel, so there is no gain, and the extra instructions add load.

You should not prefetch data from the array you only write to. It's just a waste of time.
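The advice above (one prefetch per cache line, some distance ahead, source array only) might look roughly like the following in C. This is only a sketch under stated assumptions: `record_t`, `sum_records`, the 32-byte cache line and the four-line prefetch distance are all made up for illustration, and the right values have to come from your processor's manual and from measurement. It uses the GCC/Clang builtin `__builtin_prefetch` instead of inline assembly; on ARM targets that support it, the compiler typically emits PLD for it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type; the real struct layout is not given in the question. */
typedef struct {
    int32_t a;
    int32_t b;
    int32_t c;
    int32_t d;
} record_t; /* 16 bytes */

/* Assumptions for illustration only; check the manual for your core. */
#define CACHE_LINE_BYTES     32
#define PREFETCH_AHEAD_BYTES (4 * CACHE_LINE_BYTES) /* tuning guess, measure it */

#define RECORDS_PER_LINE (CACHE_LINE_BYTES / sizeof(record_t))
#define RECORDS_AHEAD    (PREFETCH_AHEAD_BYTES / sizeof(record_t))

/* Reads src sequentially and writes one result per record into dst. */
void sum_records(const record_t *src, int32_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Issue one prefetch per cache line of source data, not one per
         * record, and skip it when the target would be past the end. */
        if (i % RECORDS_PER_LINE == 0 && i + RECORDS_AHEAD < n) {
            __builtin_prefetch(&src[i + RECORDS_AHEAD], 0 /* read */, 3 /* keep in cache */);
        }

        /* The destination array is only written, so it is not prefetched. */
        dst[i] = src[i].a + src[i].b + src[i].c + src[i].d;
    }
}
```

The "once per cache line" spacing is only exact if `src` starts on a cache-line boundary. Whether any of this helps still has to be measured on the target hardware, since a prefetch is only a hint and never changes the result of the loop.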

sharptooth
Hello, and thank you for your reply! What is this property of 'loading data into the processor in parallel with computation' called? What do I have to look for in the manuals?
genesys
There should be a block diagram of the processor showing several "bricks" called something like "arithmetic operations unit", "memory access unit" and so on. There should also be a description of which units can work in parallel and under which conditions. For example, look at this AMD Athlon guide: http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf "Appendix B" covers the topics you should be interested in, in great detail. That's the kind of material you should look for in your processor's guide.
sharptooth