I have a CUDA program like this:

for (int i = 0; i < 100000; i++) {
  if (i % 2 == 0) {
    bind_x(x);                      // bind x to the texture
    kernel_code<<<A, B>>>(M, x, y); // calculate y = M*x
  }
  else {
    bind_x(y);                      // bind y to the texture
    kernel_code<<<A, B>>>(M, y, x); // calculate x = M*y
  }

  cudaThreadSynchronize();

  if (i % 2 == 0)
    unbind_x(x);
  else
    unbind_x(y); // unbind y from the texture
}

I heard that without cudaThreadSynchronize() the CPU continues to run without waiting for the kernel to finish. So should I call cudaThreadSynchronize() before unbind_x()? I tried running with and without it, and the result is the same, although in theory it shouldn't be.

+2  A: 

Any CUDA operation (except the explicitly asynchronous ones) implicitly performs the equivalent of cudaThreadSynchronize(). So the results should be the same.

KoppeKTop
Adding to that... "asynchronous operations" include the cudaMemcpyAsync family but also, importantly, other kernel launches.
Tom
Of course. Naturally, if kernel_code<<<>>>() runs asynchronously, then the other kernels run asynchronously as well. Thanks for the addition.
KoppeKTop
However, my bind and unbind are not kernel calls. The kernel has to fetch x from texture memory, and without a synchronize the texture may already be unbound by then.
iKid
Sorry, texture binding is also asynchronous, so it will be queued.
iKid
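
To illustrate the point about kernel launches being asynchronous but ordered, here is a minimal sketch of the same ping-pong loop with no synchronization between iterations. It relies on the fact that launches issued to the same stream (here the default stream) execute in issue order. The kernel scale_by_two and the helper run_pingpong are hypothetical stand-ins for kernel_code and the loop in the question; the sketch uses plain global memory rather than textures, and cudaDeviceSynchronize(), which replaced the deprecated cudaThreadSynchronize().

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for kernel_code: out[i] = 2 * in[i].
__global__ void scale_by_two(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * in[i];
}

// Runs `iters` ping-pong launches with no synchronization between them.
// Returns the first element of the final buffer, or -1.0f on a CUDA error.
float run_pingpong(int n, int iters) {
    float *x = nullptr, *y = nullptr;
    if (cudaMalloc(&x, n * sizeof(float)) != cudaSuccess) return -1.0f;
    if (cudaMalloc(&y, n * sizeof(float)) != cudaSuccess) {
        cudaFree(x);
        return -1.0f;
    }

    // Start with x filled with ones.
    std::vector<float> host(n, 1.0f);
    cudaError_t err = cudaMemcpy(x, host.data(), n * sizeof(float),
                                 cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    for (int i = 0; i < iters && err == cudaSuccess; ++i) {
        // Each launch returns to the host immediately, but launches in the
        // same stream run one after another on the device.
        if (i % 2 == 0)
            scale_by_two<<<blocks, threads>>>(x, y, n);  // y = 2*x
        else
            scale_by_two<<<blocks, threads>>>(y, x, n);  // x = 2*y
        err = cudaGetLastError();  // catches launch errors without waiting
    }

    // The last launch wrote to y if iters is odd, to x otherwise.
    float* result = (iters % 2 == 1) ? y : x;
    float first = 0.0f;
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // single wait
    if (err == cudaSuccess)
        err = cudaMemcpy(&first, result, sizeof(float),
                         cudaMemcpyDeviceToHost);

    cudaFree(x);
    cudaFree(y);
    return err == cudaSuccess ? first : -1.0f;
}
```

On a machine with a working CUDA device, run_pingpong(1024, 10) should return 1024.0f, since each of the ten launches doubles the values and sees the output of the previous one. The explicit cudaDeviceSynchronize() is kept for clarity; a blocking cudaMemcpy on the default stream would also wait for the preceding kernels.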