I need a suggestion on how to copy a block of memory efficiently, in a single operation if possible, in C++ or assembly language.

I have a pointer to a memory location and an offset. Think of the memory as a 2D array, consisting of rows and columns, that I need to copy.

+21  A: 

How about memcpy?

Aistina
Yes, use memcpy, as it is usually optimal for the target architecture. On x86 architectures, optimal implementations use a few 128-bit SSE registers.
Eric Bainville
Well, I had already tried that. What memcpy does is copy one row at a time. Consider that I have a block consisting of 5000 rows or more, in a function that is called all the time, 10000 times.
Abdul Khaliq
If all rows are contiguous in memory, you can copy all rows in a single memcpy call. If the gaps between the rows in memory are small, a single memcpy call will probably be the fastest way. If all rows are allocated separately, then a loop of memcpy will be needed.
Eric Bainville
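To illustrate the distinction in the comment above, here is a minimal sketch assuming the block lives in one allocation and each row starts a fixed number of bytes (a "pitch") after the previous one. The name `copy_block` and its parameters are hypothetical, not something from the question. Rows that were allocated separately would need a loop over the individual row pointers instead.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies `rows` rows of `row_bytes` bytes each.
// `dst_pitch` and `src_pitch` are the byte distances between the starts
// of consecutive rows in the destination and source buffers.
// The two buffers must not overlap.
void copy_block(void* dst, std::size_t dst_pitch,
                const void* src, std::size_t src_pitch,
                std::size_t row_bytes, std::size_t rows)
{
    if (src_pitch == row_bytes && dst_pitch == row_bytes) {
        // No gaps between rows on either side: one call copies everything.
        std::memcpy(dst, src, row_bytes * rows);
        return;
    }

    // Rows are separated by padding: copy them one at a time.
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t r = 0; r < rows; ++r) {
        std::memcpy(d, s, row_bytes);
        d += dst_pitch;
        s += src_pitch;
    }
}
```

If the gaps are small and the destination has the same layout, copying the gaps along with the data in one call may still be faster than the loop; that is worth measuring rather than assuming.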
Beware that the origin and destination memory areas must not overlap. If they overlap, you need an algorithm that performs N non-overlapping memcpy calls instead of a single operation.
David Rodríguez - dribeas
Sorry, memcpy has been deemed not safe. :P ( http://stackoverflow.com/questions/876557/microsoft-sdl-and-memcpy-deprecation )
Sanjaya R
A: 

memcpy?

Martin
+2  A: 

Reading your comments, it sounds like you might want to use parallelism. There are instructions to do this, but they only operate on registers, not memory.

This is a consequence of the computer architecture (I'm assuming x86).

You can only access one memory location at a time because the computer has only one address bus. If you tried to access more than one location at a time, you would overload the bus and nothing would work properly.

If you can put the data you need in registers, then you can use a lot of cool processor instructions, such as MMX or SSE, to perform parallel calculations. But as for copying memory in parallel, it's not possible.

As others have said, use memcpy. It's reliable, debugged, and fast.

samoz
+1  A: 

If you need to implement such functionality yourself and it has to be efficient, I suggest you look at Duff's Device.

Well, your answer helped; it saved me a few milliseconds.
Abdul Khaliq
A: 

REP MOVSD in assembly perhaps? Hard to say without more information on exactly what you're trying to copy... Or, you can reprogram the DMA controller to do it too, but it'll actually end up being slower than just using the processor. :-)

Brian Knoblauch
+1  A: 

Use memmove() if the source and destination overlap. Usually memcpy() and memmove() have already been highly optimized in your compiler's C library. If you do write a replacement, at least benchmark it against the library versions to make sure you're not slowing down your code.
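As a small sketch of the overlapping case, assume a contiguous block in which every row must move down by one row. The helper name `shift_rows_down` is hypothetical. The source and destination ranges share all but one row, so memcpy would be undefined behavior here, while memmove handles it.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: moves the first `count` rows of `block` down by
// one row within the same buffer. The buffer must hold at least
// `count + 1` rows of `row_bytes` bytes each.
void shift_rows_down(unsigned char* block, std::size_t row_bytes, std::size_t count)
{
    // Source [block, block + count * row_bytes) and destination
    // [block + row_bytes, ...) overlap, so memmove is required.
    std::memmove(block + row_bytes, block, row_bytes * count);
}
```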

"I have a block consisting of 5000 rows or more, in a function that is called all the time, 10000 times."

Also, consider changing your data structure. Perhaps instead of a 2D array, you can have a 1D array of pointers to secondary arrays (the columns). Then, instead of copying entire rows, you only need to copy or move the pointers. You could also pool the column arrays in a free list so that you're not spending lots of time allocating and freeing them.
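A minimal sketch of that idea, assuming each row is its own allocation: here `std::vector<int>` stands in for the secondary arrays, and the `RowTable` type and its members are hypothetical names. Swapping two vectors exchanges their internal pointers in constant time, so no element data is copied. The free-list pool is left out to keep the example short.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical layout: a 1D table whose entries each own one secondary array.
struct RowTable {
    std::vector<std::vector<int>> rows;

    RowTable(std::size_t row_count, std::size_t col_count)
        : rows(row_count, std::vector<int>(col_count)) {}

    // Exchanges two rows by swapping the vectors' internal pointers;
    // the elements themselves stay where they are in memory.
    void swap_rows(std::size_t a, std::size_t b)
    {
        std::swap(rows[a], rows[b]);
    }
};
```

The trade-off is that rows are no longer contiguous with each other, so a single memcpy over the whole block is no longer possible.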

Adisak