I have a function that calls memcpy, and it is taking up an enormous number of cycles. Is there a faster alternative or approach than memcpy for moving a block of memory?

+3  A: 

Usually the standard library shipped with the compiler will implement memcpy() the fastest way possible for the target platform already.

sharptooth
+36  A: 

memcpy is likely to be the fastest way you can copy bytes around in memory. If you need something faster - try figuring out a way of not copying things around, e.g. swap pointers only, not the data itself.
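As a minimal sketch of the "swap pointers only" idea, assuming two equally sized buffers that trade roles (the `double_buffer` struct and `swap_buffers` function are hypothetical names, not from any library), the exchange costs three pointer assignments regardless of buffer size:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical double buffer: two equally sized blocks referenced by pointer. */
struct double_buffer {
    unsigned char *front;  /* block currently being read */
    unsigned char *back;   /* block currently being filled */
    size_t size;           /* size of each block in bytes */
};

/* Exchange the roles of the two blocks without touching their contents. */
static void swap_buffers(struct double_buffer *db)
{
    unsigned char *tmp;

    assert(db != NULL);
    tmp = db->front;
    db->front = db->back;
    db->back = tmp;
}
```

This only works when nothing else holds on to the old `front` pointer and expects its contents to stay put.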

nos
+1 for "...figuring out a way of not copying..."
Jerry Coffin
+1: I agree entirely
High Performance Mark
+1. We recently had an issue where some of our code SUDDENLY slowed down tremendously and consumed lots of extra memory when processing a certain file. It turned out the file had a huge metadata block, while other files had no metadata or only small blocks. That metadata was copied, copied, copied, consuming both time and memory. We replaced the copying with pass-by-const-reference.
sharptooth
+2  A: 

It's generally faster not to make a copy at all. Whether you can adapt your function to avoid copying I don't know, but it's worth looking into.

High Performance Mark
+1  A: 

Sometimes functions like memcpy, memset, ... are implemented in two different ways:

  • once as a real function
  • once as some assembly that's immediately inlined

Not all compilers use the inlined-assembly version by default. Your compiler may use the function variant by default, which adds the overhead of a function call. Check your compiler documentation to see how to select the intrinsic variant of the function (command-line option, pragmas, ...).

Edit: See http://msdn.microsoft.com/en-us/library/tzkfha43%28VS.80%29.aspx for an explanation of intrinsics on the Microsoft C compiler.
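A rough sketch of asking for the intrinsic form, assuming MSVC's `#pragma intrinsic(memcpy)` (the `/Oi` option does the same globally) and the `__builtin_memcpy` builtin of GCC and Clang. The `COPY_BYTES` macro, `struct sample`, and `copy_sample` are hypothetical names used only for illustration, and whether the copy is actually expanded inline still depends on the optimization level and the size being copied:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(_MSC_VER)
/* MSVC: request the intrinsic (inlined) form of memcpy for this file. */
#pragma intrinsic(memcpy)
#define COPY_BYTES(dst, src, n) memcpy((dst), (src), (n))
#elif defined(__GNUC__)
/* GCC and Clang: call the builtin so the compiler may expand it inline. */
#define COPY_BYTES(dst, src, n) __builtin_memcpy((dst), (src), (n))
#else
/* Fallback: plain library call. */
#define COPY_BYTES(dst, src, n) memcpy((dst), (src), (n))
#endif

/* Hypothetical small record, used only to have something to copy. */
struct sample {
    int id;
    double value;
};

/* Copy one record; the size is a compile-time constant, which helps inlining. */
static void copy_sample(struct sample *dst, const struct sample *src)
{
    assert(dst != NULL && src != NULL);
    COPY_BYTES(dst, src, sizeof *dst);
}
```

Many compilers already do this for small, fixed-size copies when optimization is enabled, so it is worth inspecting the generated assembly before changing anything.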

Patrick
A: 

I assume you must have huge areas of memory that you want to copy around, if the performance of memcpy has become an issue for you?

In this case, I'd agree with nos's suggestion to figure out some way NOT to copy stuff.

Instead of having one huge blob of memory to be copied around whenever you need to change it, you should probably try some alternative data structures instead.

Without really knowing anything about your problem area, I would suggest taking a good look at persistent data structures and either implementing one of your own or reusing an existing implementation.
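To illustrate what a persistent data structure buys you, here is a minimal sketch of a persistent singly linked list. The names `node`, `list_prepend`, and `list_length` are hypothetical, and freeing is left out for brevity. Adding an element creates a new version that shares the old tail instead of copying it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A node is never modified after creation, so tails can be shared safely. */
struct node {
    int value;
    const struct node *next;
};

/* Return a new list version with `value` in front of `tail`.
   The old version `tail` stays valid and unchanged. Returns NULL on
   allocation failure. */
static const struct node *list_prepend(const struct node *tail, int value)
{
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = tail;
    return n;
}

/* Count the nodes reachable from `head`. */
static size_t list_length(const struct node *head)
{
    size_t count = 0;

    while (head != NULL) {
        count++;
        head = head->next;
    }
    return count;
}
```

Two versions built from the same base share every node of that base, so creating a new version costs one small allocation rather than a copy of the whole structure. A real implementation would also need reference counting or some other scheme to decide when shared nodes can be freed.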

Roland Tepp
+1  A: 

Check your compiler/platform manual. On some microprocessors and DSP kits, memcpy is much slower than intrinsic functions or DMA operations.

Yousf
+2  A: 

If your platform supports it, look into whether you can use the mmap() system call to leave your data in the file; generally the OS can manage that better. And, as everyone has been saying, avoid copying if at all possible; pointers are your friend in cases like this.
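A minimal sketch of the mmap() approach, assuming a POSIX system (`map_file` and `unmap_file` are hypothetical helper names). Instead of reading the file into a buffer and copying it around, the file's pages are mapped read-only and accessed in place:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only. Returns NULL on failure (including an empty
   file); on success *len_out receives the file size in bytes. */
static const unsigned char *map_file(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd;

    assert(path != NULL && len_out != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return p;
}

/* Release a mapping obtained from map_file(). */
static void unmap_file(const unsigned char *p, size_t len)
{
    munmap((void *)p, len);
}
```

Pages are brought in on demand when first touched, so a large file costs nothing up front for the parts you never read.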

Andrew McGregor
+1  A: 

Please offer us more details. On the i386 architecture, memcpy is very likely the fastest way of copying. But on a different architecture for which the compiler doesn't have an optimized version, it may be best to rewrite memcpy yourself. I did this on a custom ARM architecture using assembly language. If you transfer BIG chunks of memory, then DMA is probably the answer you are looking for.

Please offer more details - architecture, operating system (if relevant).

Iulian Şerbănoiu
A: 

You may want to have a look at this:

http://www.danielvik.com/2010/02/fast-memcpy-in-c.html

Another idea I would try is to use COW techniques to duplicate the memory block and let the OS handle the copying on demand as soon as the page is written to. There are some hints here using mmap(): http://stackoverflow.com/questions/1565177/can-i-do-a-copy-on-write-memcpy-in-linux
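As a rough sketch of the copy-on-write idea, assuming a POSIX system and data that lives in a file (`cow_map` is a hypothetical helper name): a writable `MAP_PRIVATE` mapping shares pages with the file until a page is written, at which point the kernel copies just that page. Writes are never carried through to the file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return a private, writable view of a file. Pages are shared with the
   file until first written; each written page is then copied on demand.
   Returns NULL on failure; on success *len_out receives the size. */
static unsigned char *cow_map(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd;

    assert(path != NULL && len_out != NULL);
    fd = open(path, O_RDONLY);  /* read access is enough for MAP_PRIVATE */
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return p;
}
```

The caller releases the view with `munmap(p, len)`. The copy happens at page granularity, so this pays off mainly when the block is large and only a small part of it gets modified.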

hurikhan77
A: 

nos is right, you're calling it too much.

To see where you are calling it from and why, just pause it a few times under the debugger and look at the stack.

Mike Dunlavey
A: 

Agner Fog has a fast memcpy implementation: http://www.agner.org/optimize/#asmlib

KindDragon
A: 

Memory-to-memory copying is usually supported by the CPU's instruction set, and memcpy will usually use that. This is usually the fastest way.

You should check what exactly your system is doing. On Linux, watch for swapping in and out and for virtual memory effectiveness with sar -B 1 or vmstat 1, or by looking in /proc/meminfo or /proc/vmstat. You may see that your copy has to push out a lot of pages to free space, or read them back in, etc.

That would mean your problem isn't in what you use for the copy, but in how your system uses memory. You may need to decrease the file cache, start writing out earlier, lock the pages in memory, etc.
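If you want to sample swap activity from inside the program rather than from a shell, one possible sketch follows. It assumes a Linux system where /proc/vmstat lists cumulative counters as "name value" pairs, including pswpin and pswpout; `read_counter` is a hypothetical helper name:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan "name value" pairs from f and store the value for `name` in *out.
   Returns 0 on success, -1 if the counter is not found. */
static int read_counter(FILE *f, const char *name, unsigned long long *out)
{
    char key[64];
    unsigned long long value;

    assert(f != NULL && name != NULL && out != NULL);
    while (fscanf(f, "%63s %llu", key, &value) == 2) {
        if (strcmp(key, name) == 0) {
            *out = value;
            return 0;
        }
    }
    return -1;
}
```

Opening "/proc/vmstat" with fopen() and calling `read_counter(f, "pswpout", &n)` before and after the copy gives a rough idea of whether the copy is forcing pages out to swap; a large difference points at memory pressure rather than at memcpy itself.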

n-alexander