Is there an asynchronous memcpy function in Linux? I want it to use DMA and notify me when the copy completes.
As far as I know, a CPU doesn't/can't do DMA to itself, so you need external hardware on the bus to do the trick for you.
However, most hardware cannot address all of physical memory, so an exact memcpy clone isn't possible unless your use case places very strict constraints on the memory address ranges involved. Otherwise the kernel would have to memcpy the block into a buffer the device can reach, which would defeat the purpose of cloning memcpy in the first place :)
Still, if you want to create a "clone" of a memory block without using memcpy (a bad idea, by the way, because DMA memory access is usually slower than the CPU's), you can send the memory block to the video card and pull it back into another buffer. You might even be able to put the block in video memory (putbitmap()? :)) and do a hardware-accelerated bitblt() to create a copy on the fly.
Do you mind sharing your actual goal, so that people can come up with smarter/better tricks?
On a multicore processor, or even just a processor with hyper-threading, you can more or less get what you want by running the usual (synchronous) memcpy in a separate thread. I'm not saying it's a good idea, just pointing out the obvious.
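For what it's worth, the thread approach could look roughly like the sketch below. This uses plain POSIX threads, not DMA, and every name in it (memcpy_async, copy_job, copy_done_fn, mark_done) is made up for illustration; nothing like this exists in libc. The completion callback runs on the worker thread, and the caller is assumed to keep both buffers alive until the copy finishes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: called on the worker thread after the copy. */
typedef void (*copy_done_fn)(void *dst, size_t n, void *user);

/* Everything the worker needs; heap-allocated so it outlives the caller's frame. */
struct copy_job {
    void *dst;
    const void *src;
    size_t n;
    copy_done_fn done;
    void *user;
};

static void *copy_worker(void *arg)
{
    struct copy_job *job = arg;
    assert(job != NULL);

    /* The copy itself is the ordinary synchronous memcpy. */
    memcpy(job->dst, job->src, job->n);

    /* Notify the caller, if a callback was supplied. */
    if (job->done)
        job->done(job->dst, job->n, job->user);

    free(job);
    return NULL;
}

/*
 * Start the copy on a new thread. Returns 0 on success, nonzero on failure.
 * On success *tid holds the thread handle, which the caller should join
 * (or detach) at some point.
 */
int memcpy_async(pthread_t *tid, void *dst, const void *src, size_t n,
                 copy_done_fn done, void *user)
{
    struct copy_job *job = malloc(sizeof *job);
    if (job == NULL)
        return -1;

    job->dst = dst;
    job->src = src;
    job->n = n;
    job->done = done;
    job->user = user;

    int rc = pthread_create(tid, NULL, copy_worker, job);
    if (rc != 0) {
        free(job);
        return rc;
    }
    return 0;
}

/* Example callback: sets an int flag passed through the user pointer. */
void mark_done(void *dst, size_t n, void *user)
{
    (void)dst;
    (void)n;
    *(int *)user = 1;
}
```

To use it, call memcpy_async(&tid, dst, src, len, mark_done, &flag), do other work, then pthread_join(tid, NULL) before touching dst. Spawning a thread per copy has real overhead, so this only has a chance of paying off for large blocks; a long-lived worker with a queue would be the next step if you go down this road. Build with -pthread.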
You can play some tricks with mremap. Or you can hack FFmpeg to use different buffers for different frames.