CUDA optimization techniques

tags:

cuda
gpgpu


I have written CUDA code to solve an NP-complete problem, but the performance was not what I expected.

I know about "some" optimization techniques (using shared memory, textures, zero-copy...).

What are the most important optimization techniques CUDA programmers should know about?

+2 A:

You should read NVIDIA's CUDA Programming Best Practices guide: http://developer.download.nvidia.com/compute/cuda/3_0/toolkit/docs/NVIDIA_CUDA_BestPracticesGuide.pdf

It lists many performance tips, each with an associated priority. Here are some of the top-priority ones:

1. Use your device's memory bandwidth to work out an upper bound on your kernel's performance, and compare the kernel's effective bandwidth against it.
2. Minimize memory transfers between host and device, even if that means doing some calculations on the device that are not efficient there.
3. Coalesce all global memory accesses.
4. Prefer shared memory access to global memory access.
5. Avoid divergent branches within a single warp, since the warp then executes each branch path one after the other.
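
As a rough illustration of the effective-bandwidth and coalescing tips above, here is a minimal sketch that times a coalesced copy against a strided one. The kernel names, the helper functions, the array size and the stride of 33 are all made up for this example, and the numbers you get depend entirely on your hardware.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Coalesced: consecutive threads of a warp touch consecutive floats.
__global__ void copyCoalesced(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

// Strided: consecutive threads touch floats `stride` elements apart, so one
// warp's accesses are spread over many memory segments. With n a power of two
// and an odd stride, every element is still copied exactly once.
__global__ void copyStrided(const float* in, float* out, int n, int stride) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int j = (int)(((long long)i * stride) % n);
        out[j] = in[j];
    }
}

// Effective bandwidth in GB/s: (bytes read + bytes written) / 1e9 / seconds.
double effectiveBandwidthGBs(size_t bytesRead, size_t bytesWritten, float ms) {
    assert(ms > 0.0f);
    return (double)(bytesRead + bytesWritten) / 1e9 / (ms / 1e3);
}

// Times one launch (after a warm-up launch) and returns milliseconds.
// stride == 1 selects the coalesced kernel.
float timeCopy(const float* in, float* out, int n, int stride) {
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    float ms = 0.0f;
    for (int rep = 0; rep < 2; ++rep) {  // rep 0 is the warm-up
        cudaEventRecord(start);
        if (stride == 1) copyCoalesced<<<blocks, threads>>>(in, out, n);
        else             copyStrided<<<blocks, threads>>>(in, out, n, stride);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        cudaEventElapsedTime(&ms, start, stop);
    }
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

// Hypothetical driver: prints the effective bandwidth of both access patterns.
void runCopyDemo() {
    const int n = 1 << 24;
    const size_t bytes = (size_t)n * sizeof(float);
    float *in = nullptr, *out = nullptr;
    if (cudaMalloc(&in, bytes) != cudaSuccess ||
        cudaMalloc(&out, bytes) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        cudaFree(in);
        return;
    }
    cudaMemset(in, 0, bytes);
    float msCoalesced = timeCopy(in, out, n, 1);
    float msStrided = timeCopy(in, out, n, 33);
    printf("coalesced: %.1f GB/s\n", effectiveBandwidthGBs(bytes, bytes, msCoalesced));
    printf("strided:   %.1f GB/s\n", effectiveBandwidthGBs(bytes, bytes, msStrided));
    cudaFree(in);
    cudaFree(out);
}
```

Calling runCopyDemo() from main and comparing both figures with the peak memory bandwidth of your card gives a feel for how far a kernel is from the upper bound.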

Edric 2010-06-22 07:04:36

6. Avoid bank conflicts.

PS: In my application, I found that statically allocated shared memory is faster than dynamically allocated shared memory (with kernel<<<blocks, threads, sharedMemSize>>>()).

All of this is described in the Best Practices Guide.
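
To make the bank-conflict tip and the static versus dynamic shared memory remark concrete, here is a small sketch of a tiled matrix transpose written both ways. It assumes shared memory is organized in 32 banks of 4-byte words; the kernel names, TILE, paddedTileBytes and runTransposeDemo are invented for this example. The extra padding column makes the threads of a warp hit different banks when they read a tile column.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int TILE = 32;

// Bytes of shared memory for one padded tile (used for the dynamic launch).
size_t paddedTileBytes() { return (size_t)TILE * (TILE + 1) * sizeof(float); }

// Static shared memory: the size is fixed at compile time.
// Without the +1, a column read tile[threadIdx.x][threadIdx.y] would map all
// 32 threads of a warp to the same bank.
__global__ void transposeStatic(const float* in, float* out, int n) {
    __shared__ float tile[TILE][TILE + 1];
    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < n && y < n) tile[threadIdx.y][threadIdx.x] = in[y * n + x];
    __syncthreads();
    x = blockIdx.y * TILE + threadIdx.x;  // block indices swapped for the output tile
    y = blockIdx.x * TILE + threadIdx.y;
    if (x < n && y < n) out[y * n + x] = tile[threadIdx.x][threadIdx.y];
}

// Dynamic shared memory: the size comes from the third launch parameter.
__global__ void transposeDynamic(const float* in, float* out, int n) {
    extern __shared__ float tileDyn[];
    const int pitch = TILE + 1;  // same padding, applied by hand
    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < n && y < n) tileDyn[threadIdx.y * pitch + threadIdx.x] = in[y * n + x];
    __syncthreads();
    x = blockIdx.y * TILE + threadIdx.x;
    y = blockIdx.x * TILE + threadIdx.y;
    if (x < n && y < n) out[y * n + x] = tileDyn[threadIdx.x * pitch + threadIdx.y];
}

// Hypothetical driver: transposes an n x n matrix and checks the result on the host.
bool runTransposeDemo(int n, bool useDynamic) {
    assert(n > 0);
    const size_t count = (size_t)n * n;
    const size_t bytes = count * sizeof(float);
    std::vector<float> hIn(count), hOut(count, -1.0f);
    for (size_t i = 0; i < count; ++i) hIn[i] = (float)i;

    float *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, bytes) != cudaSuccess) { cudaFree(dIn); return false; }
    cudaMemcpy(dIn, hIn.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    if (useDynamic) transposeDynamic<<<grid, block, paddedTileBytes()>>>(dIn, dOut, n);
    else            transposeStatic<<<grid, block>>>(dIn, dOut, n);

    // cudaMemcpy waits for the kernel and also reports earlier launch errors.
    cudaError_t err = cudaMemcpy(hOut.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    if (err != cudaSuccess) return false;

    for (int r = 0; r < n; ++r)
        for (int c = 0; c < n; ++c)
            if (hOut[(size_t)r * n + c] != hIn[(size_t)c * n + r]) return false;
    return true;
}
```

Both kernels do the same work, so timing them side by side (for example with cudaEvent timers) is one way to check on your own hardware whether the static variant really is faster.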

LonliLokli 2010-06-22 09:38:31
