gpgpu

How do I get started with CUDA development on Ubuntu 9.04?

How do I get started with CUDA development on Ubuntu 9.04? Are there any prebuilt binaries? Are the default accelerated drivers sufficient? My thought is to actually work with OpenCL, but that seems to be hard to do right now, so I thought that I would start with CUDA and then port my application to OpenCL when that is more readily avail...

How to obtain OpenCL SDK?

I was perusing the http://www.khronos.org/ web site and only found headers for OpenCL (not OpenGL, which I don't care about). How can I obtain an OpenCL SDK? ...

Is this GPU video transcoding project feasible?

Hello there. Recently I was approached by a guy who wants to do video transcoding using the GPU. He basically wants me to create an application for him that he can sell or earn advertising revenue from. He has asked me to tell him what I can achieve for 5000 US dollars of pay. Now, I am a graduate student and won an award fo...

Why are GPU threads in CUDA and OpenCL allocated in a grid?

I'm just learning OpenCL and I'm at the point of trying to launch a kernel. Why is it that the GPU threads are managed in a grid? I'm going to read more about this in detail, but it would be nice to have a simple explanation. Is it always like this when working with GPGPUs? ...

Running OpenCL on hardware from mixed vendors

I've been playing with the ATI OpenCL implementation in their Stream 2.0 beta. The OpenCL in the current beta only uses the CPU for now; the next version is supposed to support GPU kernels. I downloaded Stream because I have an ATI GPU in my work machine. I write software that would benefit hugely from using the GPU. However th...

How to scale Cholesky factorization on multiple GPUs

Hello folks, I have implemented Cholesky factorization for solving large systems of linear equations on a GPU using the ATI Stream SDK. Now I want to exploit the computational power of more GPUs and run this code on multiple GPUs. Currently I have one machine with one GPU installed, and the Cholesky factorization is running properly. I want...

Is GPGPU a hack?

Hello folks, I started working on GPGPU some days ago and successfully implemented Cholesky factorization with good performance. I then attended a conference on High Performance Computing where some people said that "GPGPU is a hack". I am still confused about what that means and why they called it a hack. One said that this is a hack b...

ATI Stream SDK on Ubuntu 9.04

Hello all, I have used the ATI Stream SDK on Windows XP SP3 and implemented one algorithm on the GPU. Now I am interested in scaling this algorithm across multiple GPUs on multiple machines, so I switched to Ubuntu to use MPI (to send messages). I googled this, but I only found references for installation on SLES and RHEL, and I am looking for Ubuntu 9.04...

High precision math on GPU

I'm interested in implementing an algorithm on the GPU using HLSL, but one of my main concerns is that I would like a variable level of precision. Are there techniques out there to emulate 64-bit precision and higher that could be implemented on the GPU? Thanks! ...

Why hasn't GPGPU seen widespread use?

Why do programmers tend to turn to assembly instead of GPGPU when running into serious performance problems? I know that there are different architectures, but implementing an algorithm on two different architectures seems like a reasonable cost when considering the performance benefit (at least in some cases). ...

CUDA: What reasons could there be for nvcc taking several minutes to compile?

I have some CUDA code that nvcc (well, technically ptxas) likes to take upwards of 10 minutes to compile. While it isn't small, it certainly isn't huge (~5000 lines). The delay seems to come and go between CUDA version updates, but previously it only took a minute or so instead of 10. When I used the -v option, it seemed to get st...

What should a very simple Makefile look like for CUDA compilation under Linux?

Hi, I want to compile a very basic hello-world-level CUDA program under Linux. I have three files: the kernel (helloWorld.cu), the main method (helloWorld.cpp), and a common header (helloWorld.h). Could you write me a simple Makefile to compile this with nvcc and g++? Thanks, Gabor ...

Is there a list somewhere of video cards that support GPGPU programming?

Mine is a "NVIDIA GeForce 9500 GS" and everywhere I've searched I can only find "9500 GT" ... does that mean the 9500 GS does not support any GPGPU language such as CUDA? ...

Fetching the vertices from the backbuffer (HLSL) on XNA

Hello, and sorry for the obscure title :} I'll try to explain as best I can. First of all, I am new to HLSL, but I understand the pipeline and stuff that are from the fairy world. What I'm trying to do is use the GPU for general computations (GPGPU). What I don't know is: how can I read* the vertices (that have been transformed us...

How to debug DirectX 11 Compute Shaders?

I've started using DirectX 11 Compute Shader technology for GP-GPU programming. I wrote quite a complex program in HLSL, and when I wanted to debug it, I realized that the PIX utility from the DX SDK (August 2009) does not support Compute Shaders... I know that NVIDIA is going to release Nexus for Visual Studio, which will support Direct Comp...

Is there any GPGPU library for iPhone?

Is there any GPGPU library for iPhone? ...

Using Delphi to take advantage of GPGPU technology?

GPGPU is the principle of using the parallel processors on video cards for massive increases in performance. Does anyone have any ideas about using GPGPU in Delphi, using either OpenCL or CUDA? CUDA was/is NVIDIA only, but they have also adopted the OpenCL "standard". I found a few Delphi samples from Google searches but they either c...

Microsoft Accelerator V2 - toArray2D question

Hi, I am new to Microsoft.Accelerator. Take a look at the following code (it is F# but it is similar to C#): type FPA = Microsoft.ParallelArrays.FloatParallelArray let fi = List.init 9 (fun i -> new FPA(i, [|10;10|])) let process (fi: FPA list) : FPA list = fi // complicated function let newfi = process fi let target = new DX9Target(...

Gaussian filter with Brahma

I am trying to write a Gaussian filter with Brahma with the DirectX provider, but I get a "The generated HLSL was invalid." exception. Has anyone written anything similar? Could you tell me if my approach is OK? The code I have written so far is: var provider = new ComputationProvider(); var data = new DataParallelArray2D<float>(provid...

Reducing Number of Registers Used in CUDA Kernel

I have a kernel which uses 17 registers; reducing it to 16 would bring me 100% occupancy. My question is: are there methods that can be used to reduce the number of registers used, short of completely rewriting my algorithms in a different manner? I have always kind of assumed the compiler is a lot smarter than I am, so for example I o...