Hi all, I am writing a CUDA kernel that computes a histogram of a picture, but I have no idea how to return an array from the kernel, and the array gets changed by other threads while one thread is reading it. Is there any possible solution?

__global__ void Hist(
    TColor *dst, // input image
    int imageW,
    int imageH,
    int *data
){
    const int ix = blockDim.x * blockIdx.x + threadIdx.x;
    const int iy = blockDim.y * blockIdx.y + threadIdx.y;

    if(ix < imageW && iy < imageH)
    {
        // assign the red value of this thread's pixel to pixel
        int pixel = get_red(dst[imageW * (iy) + (ix)]);

        data[pixel]++; // ?? problem statement: other threads update the same element
    }
}

@param d_dst: the input image; TColor is equivalent to float4.

@param data: the histogram array, of size [255]

extern "C" void
cuda_Hist(TColor *d_dst, int imageW, int imageH, int *data)
{
  dim3 threads(BLOCKDIM_X, BLOCKDIM_Y);
  dim3 grid(iDivUp(imageW, BLOCKDIM_X), iDivUp(imageH, BLOCKDIM_Y));
  Hist<<<grid, threads>>>(d_dst, imageW, imageH, data);
}
A: 

Histogramming is not particularly efficient when implemented with CUDA (or with GPGPU in general) - typically you need to generate lots of partial histograms in shared memory and then sum them. You might want to consider keeping this particular task on the CPU.

Paul R
However, my task is to compute the histogram using CUDA, and I can't finish it. The data cannot be updated by one thread at a time.
kitw
A: 

You will have to either use an atomic function to keep other threads from updating the same memory location at the same time, or use partial histograms. Either way, it is not that efficient unless the input image is very, very large.
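
A minimal sketch of the atomic variant, under some assumptions that are not in the question: get_red() is a hypothetical stand-in that reads the red channel from the .x component as a float in [0, 1], the histogram has 256 bins (one per 8-bit value), and hist_atomic() is an illustrative host helper. It also shows one way to get the array back from the kernel: keep it in device memory and copy it to the host with cudaMemcpy after the launch.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define NUM_BINS 256  // assumption: one bin per 8-bit red value (0..255)

// Hypothetical stand-in for the question's get_red(): assumes the red
// channel is stored in .x as a float in [0, 1].
__device__ int get_red(float4 c)
{
    int r = (int)(c.x * 255.0f + 0.5f);
    return min(max(r, 0), NUM_BINS - 1);  // clamp to a valid bin index
}

__global__ void HistAtomic(const float4 *src, int imageW, int imageH, int *data)
{
    const int ix = blockDim.x * blockIdx.x + threadIdx.x;
    const int iy = blockDim.y * blockIdx.y + threadIdx.y;

    if (ix < imageW && iy < imageH)
    {
        int pixel = get_red(src[imageW * iy + ix]);
        // atomicAdd performs the read-modify-write as one indivisible step,
        // so concurrent increments of the same bin are not lost.
        atomicAdd(&data[pixel], 1);
    }
}

// Illustrative host helper: uploads the image, runs the kernel, and copies
// the histogram back to the host. Error checking is omitted for brevity.
std::vector<int> hist_atomic(const std::vector<float4> &image, int imageW, int imageH)
{
    float4 *d_src = nullptr;
    int *d_data = nullptr;
    cudaMalloc(&d_src, image.size() * sizeof(float4));
    cudaMalloc(&d_data, NUM_BINS * sizeof(int));
    cudaMemcpy(d_src, image.data(), image.size() * sizeof(float4),
               cudaMemcpyHostToDevice);
    cudaMemset(d_data, 0, NUM_BINS * sizeof(int));  // bins must start at zero

    dim3 threads(16, 16);
    dim3 grid((imageW + threads.x - 1) / threads.x,
              (imageH + threads.y - 1) / threads.y);
    HistAtomic<<<grid, threads>>>(d_src, imageW, imageH, d_data);

    // cudaMemcpy waits for the kernel to finish before copying the result.
    std::vector<int> hist(NUM_BINS);
    cudaMemcpy(hist.data(), d_data, NUM_BINS * sizeof(int),
               cudaMemcpyDeviceToHost);

    cudaFree(d_src);
    cudaFree(d_data);
    return hist;
}
```

Atomics on 32-bit integers in global memory need a device of compute capability 1.1 or higher. When many pixels share the same red value, the increments to that bin are serialized, which is where the inefficiency mentioned above comes from.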

sjchoi
+1  A: 

Have you looked at the SDK sample? The "histogram" sample is available in the CUDA SDK (currently version 3.0 on the NVIDIA developer site, version 3.1 beta available for registered developers).

The documentation with the sample explains nicely how to handle your summation, either using global memory atomics on the GPU or by collecting the results for each block separately and then doing a separate reduction (either on the host or the GPU).
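
The per-block approach could look roughly like the sketch below. This is not the SDK sample's code, just an illustration under stated assumptions: get_red() is a hypothetical helper that reads red from .x as a float in [0, 1], there are 256 bins, shared-memory atomics are available (compute capability 1.2 or higher), and hist_partial() is an illustrative host helper that does the final reduction on the CPU.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define NUM_BINS 256  // assumption: one bin per 8-bit red value (0..255)
#define BLOCK_X 16
#define BLOCK_Y 16

// Hypothetical helper: assumes red is stored in .x as a float in [0, 1].
__device__ int get_red(float4 c)
{
    int r = (int)(c.x * 255.0f + 0.5f);
    return min(max(r, 0), NUM_BINS - 1);
}

// Each block builds its own histogram in shared memory and writes it to its
// own slice of `partial` (numBlocks * NUM_BINS ints).
__global__ void HistPartial(const float4 *src, int imageW, int imageH, int *partial)
{
    __shared__ int local[NUM_BINS];
    const int tid = threadIdx.y * blockDim.x + threadIdx.x;
    const int nthreads = blockDim.x * blockDim.y;

    // Zero the block-local bins cooperatively.
    for (int b = tid; b < NUM_BINS; b += nthreads) local[b] = 0;
    __syncthreads();

    const int ix = blockDim.x * blockIdx.x + threadIdx.x;
    const int iy = blockDim.y * blockIdx.y + threadIdx.y;
    if (ix < imageW && iy < imageH)
        atomicAdd(&local[get_red(src[imageW * iy + ix])], 1);
    __syncthreads();

    // Store this block's result; no other block touches this slice.
    const int blockId = blockIdx.y * gridDim.x + blockIdx.x;
    for (int b = tid; b < NUM_BINS; b += nthreads)
        partial[blockId * NUM_BINS + b] = local[b];
}

// Illustrative host helper: runs the kernel and sums the per-block
// histograms on the host. Error checking is omitted for brevity.
std::vector<int> hist_partial(const std::vector<float4> &image, int imageW, int imageH)
{
    dim3 threads(BLOCK_X, BLOCK_Y);
    dim3 grid((imageW + BLOCK_X - 1) / BLOCK_X, (imageH + BLOCK_Y - 1) / BLOCK_Y);
    const int numBlocks = grid.x * grid.y;

    float4 *d_src = nullptr;
    int *d_partial = nullptr;
    cudaMalloc(&d_src, image.size() * sizeof(float4));
    cudaMalloc(&d_partial, numBlocks * NUM_BINS * sizeof(int));
    cudaMemcpy(d_src, image.data(), image.size() * sizeof(float4),
               cudaMemcpyHostToDevice);

    HistPartial<<<grid, threads>>>(d_src, imageW, imageH, d_partial);

    std::vector<int> partial(numBlocks * NUM_BINS);
    cudaMemcpy(partial.data(), d_partial, partial.size() * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_src);
    cudaFree(d_partial);

    // Host-side reduction: add up bin b across all blocks.
    std::vector<int> hist(NUM_BINS, 0);
    for (int blk = 0; blk < numBlocks; ++blk)
        for (int b = 0; b < NUM_BINS; ++b)
            hist[b] += partial[blk * NUM_BINS + b];
    return hist;
}
```

Contention is limited to the threads of one block, and the final sum runs over numBlocks * 256 integers, which is small next to the image itself. The reduction could also be done in a second kernel instead of on the host.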

Tom