I have noticed that when calling cuBLAS functions I can use matrix memory allocated with either cudaMalloc() or cublasAlloc(). In my tests, both the matrix transfer rates and the computation are slower for arrays allocated with cudaMalloc() than for arrays allocated with cublasAlloc(), although using cudaMalloc() has other advantages. Why is that the case? It would be great to hear some comments.
A:
cublasAlloc() is essentially a wrapper around cudaMalloc(), so there should be no difference. Is there anything else that changes in your code?
Tom
2009-11-19 10:38:26
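One way to check the claim above is to time the same upload into both kinds of buffer. The following is only a minimal sketch, assuming the legacy cuBLAS API (cublas.h, which declares cublasAlloc()) is available in your toolkit. The helper time_upload(), the matrix size and the repetition count are made up for illustration and are not from the original posts. Each measurement starts with an untimed warm-up copy, since the first CUDA calls in a program include one-time initialization cost that can make whichever allocation is tested first look slower.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>
#include <cublas.h>  /* legacy cuBLAS API: cublasInit, cublasAlloc, cublasSetMatrix */

/* Hypothetical helper: average time in ms to upload an n x n float matrix. */
static float time_upload(const float *host, float *dev, int n, int reps)
{
    cudaEvent_t start, stop;
    float ms = 0.0f;
    int i;

    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    /* Untimed warm-up copy so one-time setup cost is not measured. */
    cublasSetMatrix(n, n, sizeof(float), host, n, dev, n);

    cudaEventRecord(start, 0);
    for (i = 0; i < reps; ++i)
        cublasSetMatrix(n, n, sizeof(float), host, n, dev, n);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms / reps;
}

int main(void)
{
    const int n = 1024;     /* assumed matrix dimension */
    const int reps = 20;    /* assumed number of timed copies */
    const size_t bytes = (size_t)n * n * sizeof(float);
    float *host = (float *)malloc(bytes);
    float *d_cuda = NULL;   /* buffer from cudaMalloc() */
    float *d_cublas = NULL; /* buffer from cublasAlloc() */
    int i;

    if (host == NULL)
        return 1;
    for (i = 0; i < n * n; ++i)
        host[i] = 1.0f;

    if (cublasInit() != CUBLAS_STATUS_SUCCESS)
        return 1;
    if (cudaMalloc((void **)&d_cuda, bytes) != cudaSuccess)
        return 1;
    if (cublasAlloc(n * n, sizeof(float), (void **)&d_cublas) != CUBLAS_STATUS_SUCCESS)
        return 1;

    printf("cudaMalloc  buffer: %.3f ms per upload\n",
           time_upload(host, d_cuda, n, reps));
    printf("cublasAlloc buffer: %.3f ms per upload\n",
           time_upload(host, d_cublas, n, reps));

    cudaFree(d_cuda);
    cublasFree(d_cublas);
    cublasShutdown();
    free(host);
    return 0;
}
```

If the two timings come out close, the difference seen in the original code more likely comes from something else that changes between the two versions, such as the order of the measurements or how the host memory is allocated.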