I have a CUDA project. It consists of several .cpp files that contain my application logic and one .cu file that contains multiple kernels plus a __host__ function that invokes them.
Now I would like to determine the number of registers used by my kernel(s). My normal compiler call looks like this:
nvcc -arch compute_20 -link src/kerne...
I'm building a workstation and want to get into some heavy CUDA programming. I don't want to go all out on Tesla cards and have pretty much narrowed it down to either the Quadro 4000 or the GeForce 480, but I don't really understand the difference. On paper the 480 has more cores (480 vs. 256 for the 4000), but the ...
I know it's ridiculous, but I need it for storage optimization. Is there any good way to implement it in C++?
It has to be flexible enough that I can use it as a normal data type, e.g. Vector< int20 >, with operator overloading, etc.
...
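One possible shape for such a type, as a minimal sketch rather than a definitive implementation: a value class that wraps to 20 bits, plus a separate packed container that actually stores 20 bits per element (a plain `std::vector` of the value class would still use 4 bytes each). The names `Int20` and `PackedInt20Array` are hypothetical, and two's-complement wraparound on overflow is an assumption, since the question does not say what overflow should do.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical value type: a signed integer that wraps around at 20 bits.
class Int20 {
public:
    Int20(std::int32_t v = 0) : value_(wrap(v)) {}

    std::int32_t value() const { return value_; }

    Int20& operator+=(Int20 rhs) { value_ = wrap(value_ + rhs.value_); return *this; }
    Int20& operator-=(Int20 rhs) { value_ = wrap(value_ - rhs.value_); return *this; }
    friend Int20 operator+(Int20 a, Int20 b) { return a += b; }
    friend Int20 operator-(Int20 a, Int20 b) { return a -= b; }
    friend bool operator==(Int20 a, Int20 b) { return a.value_ == b.value_; }
    friend bool operator<(Int20 a, Int20 b) { return a.value_ < b.value_; }

private:
    // Keep the low 20 bits and sign-extend from bit 19.
    static std::int32_t wrap(std::int32_t v) {
        const std::uint32_t u = static_cast<std::uint32_t>(v) & 0xFFFFFu;
        return (u & 0x80000u) ? static_cast<std::int32_t>(u) - 0x100000
                              : static_cast<std::int32_t>(u);
    }

    std::int32_t value_;
};

// Hypothetical container: stores each element in exactly 20 bits (2.5 bytes).
class PackedInt20Array {
public:
    explicit PackedInt20Array(std::size_t count)
        : count_(count), bytes_((count * 20 + 7) / 8, 0) {}

    std::size_t size() const { return count_; }
    std::size_t byteSize() const { return bytes_.size(); }

    Int20 get(std::size_t i) const {
        const std::size_t bit = i * 20;
        const std::uint32_t word = load24(bit / 8);
        // The shift is 0 for even indices and 4 for odd ones.
        return Int20(static_cast<std::int32_t>((word >> (bit % 8)) & 0xFFFFFu));
    }

    void set(std::size_t i, Int20 v) {
        const std::size_t bit = i * 20;
        const std::size_t byte = bit / 8;
        const unsigned shift = static_cast<unsigned>(bit % 8);
        const std::uint32_t field = static_cast<std::uint32_t>(v.value()) & 0xFFFFFu;
        std::uint32_t word = load24(byte);
        // Clear the 20-bit slot, then write the new value; the neighbouring nibble is preserved.
        word = (word & ~(0xFFFFFu << shift)) | (field << shift);
        for (int k = 0; k < 3; ++k)
            bytes_[byte + k] = static_cast<std::uint8_t>(word >> (8 * k));
    }

private:
    // Read three consecutive bytes as a little-endian 24-bit word.
    std::uint32_t load24(std::size_t byte) const {
        return static_cast<std::uint32_t>(bytes_[byte]) |
               (static_cast<std::uint32_t>(bytes_[byte + 1]) << 8) |
               (static_cast<std::uint32_t>(bytes_[byte + 2]) << 16);
    }

    std::size_t count_;
    std::vector<std::uint8_t> bytes_;
};
```

The value class deliberately has no implicit conversion back to `int`, so mixed expressions such as `a + 1` resolve to the `Int20` overloads instead of becoming ambiguous. Whether the packing is worth the extra shifts on every access depends on how much memory it actually saves in the real workload.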
I need to copy 64-bit integer data from host to device memory.
Both variables are declared as "unsigned __int64", and I used cudaMemcpyToSymbol().
When I check with Parallel Nsight, the copied data is shown as a negative integer.
I guess the most significant bit of the lower 4 bytes is being treated as a sign bit, which it is not supposed to be.
can a...
Hi,
I have a problem that seemingly can only be solved by enumerating all possible solutions and then picking the best one. To do so, I devised a backtracking algorithm that enumerates the solutions and stores the best one found. It works fine so far.
Now, I wanted to port this algorithm to CUDA. Therefore, I created a procedure that gene...
We have to compress a ton o' (monochrome) image data and move it quickly. If one were to use just the parallelizable stages of JPEG compression (DCT and run-length encoding of the quantized results) and run them on a GPU so that each block is compressed in parallel, I am hoping that would be very fast and still yield a very significant compress...
I want to set up a CUDA emulator on my Ubuntu 10.04 machine, since I don't have the hardware. Can someone provide some instructions? I think Nvidia does provide an emulator; how can I set it up? For now I don't care about performance, even if it's slow. Thanks.
...
I don't know how to include cutil.h on Linux. I know where the file is, but I don't know how to include it. Ideas, please.
...
Hi,
I've written a CUDA plugin (dynamic library), and I have a program written in C which uses dlopen() to load this plugin. I am using dlsym() to get the functions from this plugin. For my application it is very important that every time the plugin is loaded, the program gets a new handle from the dlopen() call (the library file may be modified s...
What is the best Nvidia video card for CUDA development? A single GTX 295 has 2 GPUs; is it possible to have 2 GTX 295s and use all 4 GPUs in my CUDA code?
Is it better to get two 480 cards rather than two 295s? Would a Fermi be better than both cards?
...
I'm calculating the Euclidean distance between n-dimensional points using OpenCL. I get two lists of n-dimensional points and I should return an array that contains just the distances from every point in the first table to every point in the second table.
My approach is to do the regular double loop (for every point in Table1{ for every ...
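For reference, the all-pairs computation described above can be sketched on the host as a plain double loop. This is a C++ CPU baseline, not the poster's OpenCL kernel; the function name `pairwiseDistances` and the flat row-major layout (point `i` of a table occupies elements `i * dim` through `i * dim + dim - 1`) are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns n1 * n2 distances, with out[i * n2 + j] holding
// the Euclidean distance between point i of table1 and point j of table2.
// Both tables are flat arrays of points with `dim` coordinates each; dim must be > 0.
std::vector<float> pairwiseDistances(const std::vector<float>& table1,
                                     const std::vector<float>& table2,
                                     std::size_t dim) {
    const std::size_t n1 = table1.size() / dim;
    const std::size_t n2 = table2.size() / dim;
    std::vector<float> out(n1 * n2);

    for (std::size_t i = 0; i < n1; ++i) {
        for (std::size_t j = 0; j < n2; ++j) {
            float sum = 0.0f;
            for (std::size_t d = 0; d < dim; ++d) {
                const float diff = table1[i * dim + d] - table2[j * dim + d];
                sum += diff * diff;
            }
            out[i * n2 + j] = std::sqrt(sum);
        }
    }
    return out;
}
```

Each (i, j) pair is independent of every other pair, which is what makes the problem a natural fit for one work-item per output element on a GPU. A baseline like this is also handy for checking the device results.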
The template and cppIntegration examples in the CUDA SDK (version 3.1) use extern declarations to link function calls from the host code to the device code.
However, Tom's comment at http://stackoverflow.com/questions/2090974/how-to-separate-cuda-code-into-multiple-files#comment-2024913 indicates that the usage of extern is deprecated.
If this the...
Direct Question: How do I create a simple hello world CUDA project within visual studio 2010?
Background: I've written CUDA kernels. I'm intimately familiar with the .vcproj files from Visual Studio 2005 -- tweaked several by hand. In VS 2005, if I want to build a CUDA kernel, I add a custom build rule and then explicitly define the...
Hello, I'm trying to convert a simple numerical analysis code (trapezium rule numerical integration) into something that will run on my CUDA-enabled GPU. There is a lot of literature out there, but it all seems far more complex than what is required here! My current code is:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#defi...
Hello, everyone!
Please tell me which GPGPU technologies already exist and which hardware vendors implement GPGPU.
I've been reading articles on various sites since this morning and have become confused.
...
I'm using Xcode 3.2 on Mac OS 10.6 to build a very simple HelloWorld program for CUDA,
but it fails to build. Any ideas?
This is the code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <CUDA/CUDA.h>
__device__ char napis_device[14];
__global__ void helloWorldOnDevice(void){
napis_d...
Hi! I need help with CUDA C. I am trying to write image processing tools, and I can't understand how to use Bitmap (C++) together with CUDA. Please help me. P.S. Sorry for my bad English.
...
Is there any way I can call CUDA functions such as
cudaMemcpy(...);
in a .cpp file, or call it in a class method?
...
Hi.
I have been trying to configure OpenCV 2.1 and CUDA 3.1 on Visual Studio 2008 on a 64-bit Windows XP machine for the past week, but all in vain.
OpenCV alone is working fine. CUDA 3.1 alone is working fine as well.
I am using the 64-bit build of CUDA 3.1 ... but for OpenCV I am using the 32-bit installation (as provided on SourceForge) - Possible ...
I am going to use CUDA to develop programs on GPUs. My plan is to place 3 Nvidia 480 cards on a single motherboard. Is this possible?
If yes, what motherboard do you recommend?
...