views: 1669
answers: 4

I am wondering what the key thing is that helps you in GPGPU development, and of course what constraints you find unacceptable.

What comes to mind for me:

  • Key advantage: the raw power of these things
  • Key constraint: the memory model (a minimal sketch of what I mean is below)
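
As a concrete illustration of the memory-model constraint, here is a minimal CUDA sketch (CUDA is just my assumed API for the example; other GPGPU toolkits have the same shape): device memory is a separate allocation, and every byte has to be copied across the bus before a kernel can touch it, then copied back afterwards.

    #include <cuda_runtime.h>
    #include <stdio.h>
    #include <stdlib.h>

    // Trivial kernel: double each element in place.
    __global__ void double_elements(float *data, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            data[i] *= 2.0f;
    }

    int main(void)
    {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);

        float *h_data = (float *)malloc(bytes);   // host (CPU) memory
        for (int i = 0; i < n; ++i)
            h_data[i] = (float)i;

        // The constraint in action: a separate device allocation plus
        // explicit copies over the bus in both directions.
        float *d_data;
        cudaMalloc((void **)&d_data, bytes);
        cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);

        double_elements<<<(n + 255) / 256, 256>>>(d_data, n);

        cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);
        printf("h_data[42] = %f\n", h_data[42]);

        cudaFree(d_data);
        free(h_data);
        return 0;
    }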

What's your view?

+2  A: 

I found this article interesting; it discusses how GPUs won't be as necessary as CPU speeds and core counts keep increasing.

http://arstechnica.com/articles/paedia/gpu-sweeney-interview.ars

Lou Franco
I pretty much agree with Tim's views, even more so when you take Larrabee's arrival into account.
Fabien Hure
A: 

GPUs used to be interesting for their parallel architectures and for the extra silicon that was mostly idle and hence could be used on the side for general-purpose programming tasks -

see - http://en.wikipedia.org/wiki/CUDA

but it might not be too relevant in the face of Lou's answer above.

+3  A: 

You have to be careful with how you interpret Tim Sweeney's statements in that Ars interview. He's saying that having two separate platforms (the CPU and GPU), one suitable for single-threaded performance and one suitable for throughput-oriented computing, will soon be a thing of the past, as our applications and hardware grow towards one another.

The GPU grew out of technology limitations of the CPU, which made arguably more natural algorithms like ray tracing and photon mapping nigh-undoable at reasonable resolutions and framerates. In came the GPU, with a wildly different and restrictive programming model, but maybe 2 or 3 orders of magnitude better throughput for applications painstakingly coded to that model. The two machine models had (and still have) essentially different coding styles, languages (OpenGL, DirectX and shader languages vs. traditional desktop languages), and workflows. This makes code reuse, and even algorithm/programming-skill reuse, extremely difficult, and it hamstrings any developer who wants to use the dense parallel compute substrate by forcing them into that restrictive programming model.
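
To make "essentially different coding styles" concrete, here is a hedged sketch of the same operation written both ways (CUDA chosen as one representative GPGPU model; the same contrast holds for shader languages): the CPU version is an ordinary loop, while the GPU version inverts control, so the loop body becomes a kernel that a grid of threads applies one element each.

    // CPU style: one thread walks the whole array.
    void saxpy_cpu(int n, float a, const float *x, float *y)
    {
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }

    // GPU style (CUDA): the loop disappears; each of the thousands of
    // threads launched computes one element, reconstructing its index
    // from its position in the grid.
    __global__ void saxpy_gpu(int n, float a, const float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            y[i] = a * x[i] + y[i];
    }

    // Launched as: saxpy_gpu<<<(n + 255) / 256, 256>>>(n, a, d_x, d_y);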

Finally, we're coming to a point where this dense compute substrate is programmable in much the same way as a CPU. Although there is still a sizeable performance delta between one "core" of these massively parallel accelerators (though the threads of execution within, for example, an SM on the G80 are not exactly cores in the traditional sense) and a modern x86 desktop core, two factors drive convergence of these two platforms:

  • Intel and AMD are moving towards more, simpler cores on x86 chips, converging the hardware with the GPU (whose units are in turn becoming more coarse-grained and programmable over time).
  • This and other forces are spawning many new applications that can take advantage of Data- or Thread-Level Parallelism (DLP/TLP), effectively utilizing this kind of substrate; a sketch of the pattern follows this list.
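
As a rough illustration of what writing for DLP/TLP looks like (again assuming CUDA purely for concreteness), the common idiom is to decouple the code from the core count, so the same kernel runs unchanged whether the chip offers dozens or thousands of execution units:

    // Grid-stride loop: the launch configuration decides how many
    // threads exist, and each thread strides over the data, so the
    // same code scales with however much parallel hardware is present.
    __global__ void scale_gridstride(int n, float a, float *data)
    {
        int stride = blockDim.x * gridDim.x;
        for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
            data[i] *= a;
    }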

So what Tim was saying is that the two distinct platforms will converge, to an even greater extent than, for instance, OpenCL affords. A salient quote from the interview:

TS: No, I see exactly where you're heading. In the next console generation you could have consoles consist of a single non-commodity chip. It could be a general processor, whether it evolved from a past CPU architecture or GPU architecture, and it could potentially run everything—the graphics, the AI, sound, and all these systems in an entirely homogeneous manner. That's a very interesting prospect, because it could dramatically simplify the toolset and the processes for creating software.

Right now, in the course of shipping Unreal 3, we have to use multiple programming languages. We use one programming language for writing pixel shaders, another for writing gameplay code, and then on PlayStation 3 we use yet another compiler to write code to run on the Cell processor. So the PlayStation 3 ends up being a particular challenge, because there you have three completely different processors from different vendors with different instruction sets and different compilers and different performance techniques. So, a lot of the complexity is unnecessary and makes load-balancing more difficult.

When you have, for example, three different chips with different programming capabilities, you often have two of those chips sitting idle for much of the time, while the other is maxed out. But if the architecture is completely uniform, then you can run any task on any part of the chip at any time, and get the best performance tradeoff that way.

Matt J
A: 

The key advantage is gigaflops - raw power. Disadvantages include a limited, non-orthogonal instruction set and programming model.
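
To put a rough number on "raw power", here is a back-of-the-envelope peak for a G80-class board (the generation mentioned in the answer above; exact figures vary by part and by whether you count the extra MUL issue):

    \text{peak} \approx 128 \text{ SPs} \times 1.35 \text{ GHz} \times 2 \tfrac{\text{flops}}{\text{cycle (MAD)}} \approx 346 \text{ GFLOPS (single precision)}

Compare that with the theoretical single-precision peak of whatever x86 chip is on your desk, and it is clear why people put up with the awkward programming model.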

Here's a survey paper: http://graphics.idav.ucdavis.edu/publications/print_pub?pub_id=907

The Wikipedia article is a pretty good start.

Lou Franco points to an interview with Tim Sweeney; here are the slides of a talk he gave, which go into more detail: http://www.scribd.com/doc/5687/The-Next-Mainstream-Programming-Language-A-Game-Developers-Perspective-by-Tim-Sweeney

You might also nose around: http://gpgpu.org

ja