How to optimize a CUDA program for better performance?
Hi, I wrote a MATLAB program (CUDA) to generate a key. How can I optimize the CUDA program to get better performance? ...
Should I learn OpenCL if I only want to program NVIDIA GPUs? ...
Hi, I have 16-bit grayscale data on which I would like to perform the following operations. For every pixel: 1) Compute sample 's' by downsampling 16bit->8bit using a LUT 2) Store the sample in an RGB (24bit - 8bit per sample) texture with R=s, G=s, B=s. At the end I would like to have data that I could use in a Windows DIB directly ( unsigned short 8bit per sample RGB...
Hi. I'm just starting out learning OpenCL. I'm trying to get a feel for what performance gains to expect when moving functions/algorithms to the GPU. The most basic kernel given in most tutorials is a kernel that takes two arrays of numbers and sums the value at the corresponding indexes and adds them to a third array, like so: __ker...
Hello, I have to convert several full PAL videos (720x576@25) from YUV 4:2:2 to RGB, in real time, and probably apply a custom resize to each. I have thought of using the GPU, as I have seen an example that does just this (except that it's 4:4:4, so the bpp is the same in source and destination)-- http://www.fourcc.org/source/YUV420P-OpenGL-GLS...
So I finally took the time to learn CUDA and get it installed and configured on my computer and I have to say, I'm quite impressed! Here's how it does rendering the Mandelbrot set at 1280 x 678 pixels on my home PC with a Q6600 and a GeForce 8800GTS (max of 1000 iterations): Maxing out all 4 CPU cores with OpenMP: 2.23 fps Running the...
General-purpose computing on graphics processing units (GPGPU) is a very attractive concept to harness the power of the GPU for any kind of computing. I'd love to use GPGPU for image processing, particles, and fast geometric operations. Right now, it seems the two contenders in this space are CUDA and OpenCL. I'd like to know: Is Op...
How is the NVIDIA PhysX engine implemented on NVIDIA GPUs? Is it a co-processor, or are the physics algorithms implemented as fragment programs executed in the GPU pipeline? ...
While answering another StackOverflow question (this one), I stumbled upon an interesting sub-problem. What is the fastest way to sort an array of 6 ints? As the question is very low level (the code will be executed by a GPU), we can't assume libraries are available (and the call itself has its cost), only plain C to avoid emptying instruction ...
UPDATE: Danvil solved it in a comment below. My texture format was GL_RGB not GL_RGBA which of course means that the alpha values aren't kept. Don't know why I didn't realize... Thanks Danvil. I am rendering to a texture using a GLSL shader and then sending that texture as input to a second shader. For the first texture I am using RGB c...
I want to know what sort of financial applications can be implemented using a GPGPU. I'm aware of option pricing and stock price estimation using Monte Carlo simulation on a GPGPU with CUDA. Can someone enumerate the various possibilities for using GPGPU in any application in the finance domain, ...
Does CUDA support double-precision floating-point numbers? I would also like to know the reasons why or why not. ...
I have some (financial) tasks which should map well to GPU computing, but I'm not really sure if I should go with OpenCL or DirectCompute. I did some GPU computing, but it was a long time ago (3 years). I did it through OpenGL since there was not really any alternative back then. I've seen some OpenCL presentations and it looks really n...
I want to verify whether the text log files created by my program, which runs at my customer's site, have been tampered with. How do you suggest I go about doing this? I searched a bunch here and on Google but couldn't find my answer. Thanks! Edit: After reading all the suggestions so far here are my thoughts. I want to keep it simple, and since...
Does the parallel-for in .NET 4.0 take advantage of GPU computing automatically, or do I have to configure some drivers so that it uses the GPU? ...
My hands have been itching to learn GPGPU programming for some time. I finally have some time on my hands so I want to use it as wisely as possible. I'm really interested in your experiences with GPGPU programming: any pointers, references to good literature, links to sites, interesting projects etc. My interests lie mainly in scie...
I have written CUDA code to solve an NP-complete problem, but the performance was not what I expected. I know about "some" optimization techniques (using shared memory, textures, zero-copy...). What are the most important optimization techniques CUDA programmers should know about? ...
Hello: Does the global work size (dimensions) need to be a multiple of the work group size (dimensions) in OpenCL? If so, is there a standard way of handling matrices that are not a multiple of the work group dimensions? I can think of two possibilities: Dynamically set the size of the work group dimensions to a factor of the global work dimensions. (thi...
I've been reading about CUDA and OpenCL and have learned that before these frameworks existed, developers could only use low-level APIs like OpenGL and D3D. Unfortunately I haven't been able to find much information about it. Was it a widespread or commercial practice, or was it just something they used in research and military labs? I'm sure so...
How can I programmatically determine a GPU's memory bus width and memory clock rate? I want to use these numbers to compute the maximum theoretical memory bandwidth. I'm mostly interested in NVIDIA GPUs. ...
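For the bandwidth arithmetic this question mentions, a minimal sketch in plain C may help. It assumes the memory clock is given in kHz and the bus width in bits, which is how the CUDA runtime reports them (`cudaDeviceGetAttribute` with `cudaDevAttrMemoryClockRate` and `cudaDevAttrGlobalMemoryBusWidth`). The function name `peak_bandwidth_gbps` and the `transfers_per_clock` parameter are hypothetical, introduced only for illustration; a value of 2 assumes double-data-rate memory, and the right value depends on the memory type and on how the driver reports the clock.

```c
/* Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).
 * Hypothetical helper, not part of any library.
 *   memory_clock_khz    : memory clock in kHz
 *   bus_width_bits      : memory bus width in bits
 *   transfers_per_clock : data transfers per clock cycle
 *                         (assumption: 2 for double-data-rate memory) */
double peak_bandwidth_gbps(double memory_clock_khz,
                           int bus_width_bits,
                           int transfers_per_clock)
{
    double clock_hz = memory_clock_khz * 1000.0;     /* kHz -> Hz */
    double bytes_per_transfer = bus_width_bits / 8.0; /* bits -> bytes */
    return clock_hz * bytes_per_transfer * transfers_per_clock / 1.0e9;
}
```

With a 1,000,000 kHz memory clock, a 256-bit bus, and two transfers per clock, this works out to 1e9 Hz x 32 bytes x 2 = 64 GB/s. Measured bandwidth is typically lower than this theoretical figure.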