Is there a preprocessor macro predefined by the CUDA compiler (nvcc) that I can use? (Like _WIN32 for Windows and so on.)
I need this for header code that will be common between nvcc and VC++ compilers. I know I can go ahead and define my own and pass it as an argument to the nvcc compiler (-D), but it would be great if there is one already defined.
...
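One possible approach, as a minimal sketch: nvcc predefines `__CUDACC__` while compiling CUDA source files, and host-only compilers such as VC++ do not, so a shared header can branch on it. The `HOST_DEVICE` macro and the `square` function are hypothetical names used only for illustration.

```cpp
// Sketch of a header shared between nvcc and a host-only compiler (e.g. VC++).

// __CUDACC__ is predefined by nvcc when it compiles CUDA source files.
// A host-only compiler never defines it, so the CUDA qualifiers are hidden there.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__  // hypothetical helper macro
#else
#define HOST_DEVICE                      // expands to nothing for host-only builds
#endif

// Under nvcc this is callable from both host and device code;
// under a host-only compiler it is an ordinary inline function.
HOST_DEVICE inline int square(int x) {
    return x * x;
}
```

With this layout the same header can be included from .cu and .cpp files without passing a custom -D flag to nvcc.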
I have some CUDA code that nvcc (well, technically ptxas) likes to take upwards of 10 minutes to compile. While it isn't small, it certainly isn't huge (~5000 lines).
The delay seems to come and go between CUDA version updates, but previously it only took a minute or so instead of 10.
When I used the -v option, it seemed to get st...
I am trying to test some typical CUDA functions during the configure process. How can I write this in my configure.ac? Something like:
AC_TRY_COMPILE([],
[
__global__ static void test_cuda() {
const int tid = threadIdx.x;
const int bid = blockIdx.x;
__syncthreads();
}
],
[cuda_comp=ok],[cuda_comp=no])
But nvcc is not defined...
The error I get is this
"C:\CUDA\bin\nvcc.exe" -arch sm_10 -ccbin "C:\Program Files\Microsoft Visual Studio 9.0\VC\bin" -deviceemu -D_DEVICEEMU -Xcompiler "/EHsc /W3 /nologo /Od /Zi /MTd " -I"C:\CUDA\include" -I"../../common/inc" -maxrregcount=32 --compile -o "Debug\matrixMul.cu.obj" "c:\Documents and Settings\All Users.SYSROOT...
I have many structs (classes) and standalone functions that I would like to compile separately and then link to the CUDA kernel, but I am getting the "External calls are not supported" error while compiling (not linking) the kernel. nvcc forces me to always use inline functions from the kernel. This is very frustrating!! If somebody has figured ...
I am just starting to learn how to use CUDA. I am trying to run some simple example code:
float *ah, *bh, *ad, *bd;
ah = (float *)malloc(sizeof(float)*4);
bh = (float *)malloc(sizeof(float)*4);
cudaMalloc((void **)
cudaMalloc((void **)
... initialize ah ...
/* copy array on device */
cudaMemcpy(ad,ah,sizeof(float)*N,cudaMemcpyHostTo...
I'm running Windows 7 Pro x64 on a Core i5 with an NVIDIA 3100m, which is CUDA compatible.
I've tried installing both the 32-bit and 64-bit CUDA toolkits from NVIDIA; unfortunately, with either of them I cannot compile anything. nvcc says "cannot find a supported cl version. Only MSVC 8.0 and MSVC 9.0 are supported".
I have the x86 ...
I am trying to compile a project by compiling object files and then linking them together, nothing fancy:
hello.o : hello.h hello.cu
nvcc hello.cu -c -o hello.o
#...
main.o : $(objs)
nvcc *.o -o exec
When I get to the link phase, just about every method is shown to be missing and undeclared, despite the fact that nm shows tha...
I would ask this in the CUDA forums, but for some reason I can't get past the first page of the registration, so here goes:
nVidia Card: 9800 GT
CUDA toolkit 3.0
Compiled for: compute capability 1.1
Scenario 1:
float result = 0;
float f1 = tex2D( tex, u, v );
float f2 = tex2D( tex, u + 1, v + 1 );
long long ll1 = __float2ll_rn...
When I try to build my project on a 64 bit Windows 7 using VS 2010 in Debug 64 bit configuration I get this error along with two other errors.
error: linkage specification is incompatible with previous "hypot" in math.h line 161
error: linkage specification is incompatible with previous "hypotf" in math.h line 161
error: function "abs(l...
I have a CUDA project. It consists of several .cpp files that contain my application logic and one .cu file that contains multiple kernels plus a __host__ function that invokes them.
Now I would like to determine the number of registers used by my kernel(s). My normal compiler call looks like this:
nvcc -arch compute_20 -link src/kerne...