Freeing CUDA memory painfully slow

+3 Q:

I am allocating some fairly large float arrays (i.e. 9,000,000 elements each) on the GPU using cudaMalloc((void**)&(storage->data), size * sizeof(float)). At the end of my program, I free this memory using cudaFree(storage->data);.

The problem is that the first deallocation is really slow, around 10 seconds, whereas the others are nearly instantaneous.

My question is the following: what could cause this difference? Is deallocating memory on a GPU usually that slow?

+1 A:

It should not be that slow; on Linux with CUDA 2.2 it takes a fraction of a second. Have you tried running the host and device profilers to see exactly why it is slow? How many separate allocations do you perform? That does carry some penalty, but not one that large.

aaa 2010-01-28 23:22:05

+2 A:

As pointed out on the NVIDIA forums, it's almost certainly a problem with the way you are timing things rather than with cudaFree.

Eric 2010-01-29 13:16:57

Yes, that was the problem. I asked on both SO and the NVIDIA forums to make sure that someone competent would answer, and I got what I wanted on both ;) Awesome, guys! Thanks!

Wookai 2010-01-29 15:22:15
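
The thread does not say what exactly was wrong with the timing. One common cause, assumed here, is that kernel launches are asynchronous while cudaFree waits for outstanding GPU work, so the first cudaFree appears to absorb the run time of kernels that are still executing. The sketch below illustrates that assumption: busyKernel, its iteration count and the helper secondsSince are hypothetical names invented for this example, and cudaDeviceSynchronize is the call in current toolkits (CUDA 2.x-era code would use cudaThreadSynchronize instead).

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel whose only purpose is to keep the GPU busy for a while.
__global__ void busyKernel(float *data, size_t n, int iters) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        for (int k = 0; k < iters; ++k) {
            x = x * 1.0001f + 0.5f;
        }
        data[i] = x;
    }
}

// Hypothetical helper: wall-clock seconds elapsed since t0.
static double secondsSince(std::chrono::steady_clock::time_point t0) {
    return std::chrono::duration<double>(
        std::chrono::steady_clock::now() - t0).count();
}

int main() {
    const size_t n = 9000000;  // array size taken from the question
    float *a = nullptr;
    float *b = nullptr;
    if (cudaMalloc((void**)&a, n * sizeof(float)) != cudaSuccess ||
        cudaMalloc((void**)&b, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(a, 0, n * sizeof(float));

    const int threads = 256;
    const int blocks = (int)((n + threads - 1) / threads);

    // The launch returns to the host right away; the GPU keeps working.
    busyKernel<<<blocks, threads>>>(a, n, 10000);

    // Wait for pending GPU work first, so that its run time is not
    // attributed to whichever blocking call happens to come next.
    auto t0 = std::chrono::steady_clock::now();
    cudaError_t err = cudaDeviceSynchronize();
    printf("waiting for kernel: %.3f s (%s)\n",
           secondsSince(t0), cudaGetErrorString(err));

    // Now each cudaFree is timed on its own.
    t0 = std::chrono::steady_clock::now();
    cudaFree(a);
    printf("first cudaFree:     %.3f s\n", secondsSince(t0));

    t0 = std::chrono::steady_clock::now();
    cudaFree(b);
    printf("second cudaFree:    %.3f s\n", secondsSince(t0));
    return 0;
}
```

If the cudaDeviceSynchronize call is removed, the wait should move into the first cudaFree, which would reproduce the pattern described in the question: one slow deallocation followed by nearly instantaneous ones.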
