I have a CUDA project. It consists of several .cpp files that contain my application logic and one .cu file that contains multiple kernels plus a __host__ function that invokes them.

Now I would like to determine the number of registers used by my kernel(s). My normal compiler call looks like this:

nvcc -arch compute_20 -link src/kernel.cu obj/..obj obj/..obj .. -o bin/..exe -l glew32 ...

Adding the "-Xptxas –v" compiler flag to this call unfortunately has no effect. The compiler still produces the same textual output as before. The compiled .exe also works the same way as before with one exception: My framerate jumps to 1800fps, up from 80fps.

A: 

Pass the verbose option through to ptxas when you compile:

nvcc --ptxas-options=-v

aaa
That doesn't work either. I've tried all the various notations for that flag that can be found on the internet.
Dave
@Dav try removing the link option and compile only
aaa
@aaa carp In this case the compiler complains about undefined external symbols.
Dave
@Dav break the process in two: first compile, then link.
aaa
@aaa carp I tried nvcc -c ..cu -arch compute_20 --ptxas-options=-v - the compiler outputs a ..obj file but no register count
Dave
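
For reference, the two-step build suggested in the comments above could look roughly like the sketch below. The file names (src/kernel.cu, obj/kernel.obj, bin/app.exe) are placeholders, since the real paths are elided in the question. The switch from compute_20 to sm_20 is an assumption, not something confirmed in this thread: as far as I understand, when only a virtual architecture is given, nvcc embeds PTX and leaves the ptxas step to run time, so there would be nothing for -v to report during the build.

```shell
#!/bin/sh
# Placeholder paths: the real file names are elided in the question.
SRC="src/kernel.cu"
OBJ="obj/kernel.obj"
EXE="bin/app.exe"

# Step 1: compile only (-c) and ask ptxas for verbose output.
# Assumption: sm_20 (a real architecture) is used instead of compute_20
# so that ptxas is actually run at build time.
COMPILE_CMD="nvcc -c $SRC -arch sm_20 --ptxas-options=-v -o $OBJ"

# Step 2: link the kernel object. The other object files and libraries
# from the original command line would be appended here.
LINK_CMD="nvcc $OBJ -o $EXE"

# Print the commands; run them only if RUN=1 is set and nvcc is on the PATH.
echo "$COMPILE_CMD"
echo "$LINK_CMD"
if [ "${RUN:-0}" = "1" ] && command -v nvcc >/dev/null 2>&1; then
    $COMPILE_CMD && $LINK_CMD
fi
```

If the assumption holds, the register usage per kernel should be printed during step 1, not during the link step.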
A: 

Not exactly what you were looking for, but you can use the CUDA Visual Profiler shipped with the NVIDIA GPU Computing SDK. Besides a lot of other useful information, it shows the number of registers used by each kernel in your application.

Dave