ansaurus

Question

Simultaneous launch of Multiple Kernels using CUDA for a GPU

Answer 1

+1 A:

Only devices with CUDA compute capability 2.0 and better (i.e. Fermi) can support multiple simultaneous kernel executions. See section 3.2.6.3 of the CUDA 3.0 programming guide, which states:

Some devices of compute capability 2.0 can execute multiple kernels concurrently. Applications may query this capability by calling cudaGetDeviceProperties() and checking the concurrentKernels property.

The maximum number of kernel launches that a device can execute concurrently is four.

A kernel from one CUDA context cannot execute concurrently with a kernel from another CUDA context.

Kernels that use many textures or a large amount of local memory are less likely to execute concurrently with other kernels.

Edric 2010-06-14 10:27:11

Answer 2

+2 A:

You will need SM 2.0 or above for concurrent kernels.

To get concurrent execution you need to manually indicate that there is no dependence between the two kernels. This is because the compiler cannot determine that one kernel will not modify data being used in the other, this could be by reading from and writing to the same buffer which seems simple enough, but is actually much harder to detect since there can be pointers inside data structures and so on.

To express the independence you must launch the kernels in different streams. The fourth parameter in the triple-chevron syntax specifies the stream, check out the Programming Guide or the SDK concurrentKernels sample.

Tom 2010-06-17 10:29:14

ansaurus

tags:

views:

answers:

Simultaneous launch of Multiple Kernels using CUDA for a GPU

related questions