Hello,

I have to convert several full PAL videos (720x576@25) from YUV 4:2:2 to RGB in real time, probably with a custom resize for each. I have thought of using the GPU, as I have seen an example that does just this (except that it's 4:4:4, so the bpp is the same in source and destination): http://www.fourcc.org/source/YUV420P-OpenGL-GLSLang.c

However, I don't have any experience with GPUs and I'm not sure what can be done. The example, as I understand it, just converts the video frame from YUV and displays it on the screen.

Is it possible to get the processed frame back instead? Would it be worth the effort to send it to the GPU, get it transformed, and send it back to main memory, or would that kill performance?

To be a bit platform-specific: assuming I work on Windows, is it possible to get an OpenGL or DirectDraw surface from a window so the GPU can draw directly to it?

+2  A: 

The real question is, what do you hope to get out of this?

At the frame rate you are receiving video, you could use something like Intel Integrated Performance Primitives (IPP) to do the couple of operations that you need and easily keep up with the stream.
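
To give a sense of how little work there is per pixel, here is a minimal plain-C sketch of the conversion. It is not IPP code and not tuned in any way. It assumes the 4:2:2 data is packed as Y0 Cb Y1 Cr (often called YUY2/YUYV), that the samples are BT.601 limited range, and that the output is packed 8-bit R, G, B. The function names are made up for illustration; if your source uses another byte order (UYVY, for instance) the indices change but the arithmetic stays the same.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate value to the 0..255 range of an 8-bit channel. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Convert one Y'CbCr sample to R, G, B.
 * Assumption: BT.601 limited range (Y 16..235, Cb/Cr centred on 128),
 * using the common integer approximation with coefficients scaled by 256. */
static void ycbcr_to_rgb(int y, int cb, int cr, uint8_t *rgb)
{
    int c = 298 * (y - 16);
    int d = cb - 128;
    int e = cr - 128;

    rgb[0] = clamp_u8((c + 409 * e + 128) / 256);           /* R */
    rgb[1] = clamp_u8((c - 100 * d - 208 * e + 128) / 256); /* G */
    rgb[2] = clamp_u8((c + 516 * d + 128) / 256);           /* B */
}

/* Convert a packed 4:2:2 frame (assumed byte order Y0 Cb Y1 Cr) to RGB24.
 * width must be even; src holds width*2 bytes per row, dst width*3. */
void yuyv_to_rgb24(const uint8_t *src, uint8_t *dst,
                   size_t width, size_t height)
{
    for (size_t row = 0; row < height; ++row) {
        const uint8_t *in = src + row * width * 2;
        uint8_t *out = dst + row * width * 3;

        /* Each 4-byte group carries two pixels that share one Cb/Cr pair. */
        for (size_t x = 0; x < width; x += 2) {
            int y0 = in[0], cb = in[1], y1 = in[2], cr = in[3];
            ycbcr_to_rgb(y0, cb, cr, out);
            ycbcr_to_rgb(y1, cb, cr, out + 3);
            in += 4;
            out += 6;
        }
    }
}
```

Rows are independent, so splitting the frame into horizontal bands, one per thread, is a straightforward way to spread this over several cores.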

If you want to learn how to do GPU programming, this is a nice, easy problem that you could implement.

It is possible to get the processed frame by doing a readback from the GPU to main memory. The actual mechanics will vary depending on which API you use (OpenGL, DirectX, CUDA, OpenCL). I've done it with much higher-resolution video and still kept up with a 25 fps stream. However, this all depends on the hardware that you will be using.

DirectX and OpenGL both have great tutorials on using windows surfaces as render targets.

Jose
I'm not sure I made this clear. I have to process several videos at the same time. The upper limit will surely be between 15 and 20, so I'll have to handle perhaps 500 fps. I hadn't heard of IPP before and have been playing with it for a couple of days. It seems great, and the YUV to RGB conversion seems to be just fast enough. However, there's no way to resize a non-planar YUV picture directly, and the RGB resize is extremely slow: resizing only one video puts my four-core Xeon CPU usage at 70-80%. I will take a look at CUDA now.
Jaime Pardos
The CPU usage issue seems to be a bug in the latest IPP version when using more than one thread to resize. I finally solved my problem with IPP, thank you very much.
Jaime Pardos
+1  A: 

I have actually programmed this both for CUDA in C and with pthreads in C (just for fun, mind you). I found that the GPU works so fast that you spend 50-80% of your time sending data back and forth, even if you completely fill up the GPU's memory every time. Because of this, the CPU did the work pretty much as fast as the GPU could. This problem is extremely thread friendly, as you may have figured out, so with modern hardware memory bandwidth is the biggest issue.

I tested this with Core i7 as CPU, and GeForce 8800GT/GTX 285 as graphics card. The GTX285 processed afaik 1500fps of 1920x1080 video, so no matter what you choose, things will be blazingly fast.
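
To put the transfer cost in numbers for the workload discussed in this thread (720x576 frames, roughly 500 frames per second across all streams), here is a small back-of-the-envelope helper in C. The bytes-per-pixel values are my assumptions: 2 for packed 4:2:2 going to the GPU and 3 for RGB24 coming back. The function name is made up for illustration.

```c
/* Raw bytes per second needed to move frames of the given size.
 * bytes_per_pixel: 2 for packed YUV 4:2:2, 3 for RGB24, 4 for RGB32.
 * frames_per_second: total over all streams. */
unsigned long long bytes_per_second(unsigned width, unsigned height,
                                    unsigned bytes_per_pixel,
                                    unsigned frames_per_second)
{
    return (unsigned long long)width * height
         * bytes_per_pixel * frames_per_second;
}

/* For 720x576 at 500 fps in total:
 *   upload   (4:2:2, 2 bytes/pixel): 414,720,000 bytes/s
 *   readback (RGB24, 3 bytes/pixel): 622,080,000 bytes/s
 * so a little over 1 GB/s of raw traffic in both directions combined,
 * before any per-transfer overhead. */
```

Whether a particular bus and driver sustain that in practice is something to measure on the target machine; the numbers only show the raw volume that has to move.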

Maister
As I commented on the previous reply, I'm unable to get this kind of performance with my CPU, even with the highly optimized IPP library. I'm 'only' going to process 500 fps of PAL video, and it seems that with a bit of luck it will be fast enough to do the color conversion (forget about resizing), but 1920x1080 is about 6 times bigger, so you are speaking of perhaps 8,000-9,000 fps. Will try CUDA. Any suggestions about where to start? I don't have the slightest idea about this GPGPU thing, and it seems quite overwhelming. Thank you very much.
Jaime Pardos
BTW, it's a quad-core Xeon 5130 @ 2.00 GHz. Not an i7, of course, but still...
Jaime Pardos