I'm working on translating a CUDA application (this, if you must know) to OpenCL. The original application uses the C-style CUDA API, with a single stream just to avoid the automatic busy-wait when reading back the results.
Now, I notice that OpenCL command queues look a lot like CUDA streams. But the device read command (and likewise the write and kernel-execution commands) also takes parameters for events. So I'm wondering: what does it take to execute a device write, a number of kernel launches (e.g. one call to one kernel, then 100 calls to another kernel), and a device read, all sequentially?
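To make the question concrete, here is roughly the sequence I'm after. This is just a sketch: the queue, buffer, kernels, sizes and host pointers are placeholders for my actual objects, and I've left out all error checking.

```c
#include <CL/cl.h>

/* Sketch only: the write -> kernels -> read sequence I want,
   with placeholder objects and no error checking. */
static void run_sequence(cl_command_queue queue, cl_mem buf,
                         cl_kernel kernelA, cl_kernel kernelB,
                         size_t global, size_t size,
                         const void *host_in, void *host_out)
{
    clEnqueueWriteBuffer(queue, buf, CL_FALSE, 0, size, host_in, 0, NULL, NULL);
    clEnqueueNDRangeKernel(queue, kernelA, 1, NULL, &global, NULL, 0, NULL, NULL);
    for (int i = 0; i < 100; ++i)
        clEnqueueNDRangeKernel(queue, kernelB, 1, NULL, &global, NULL, 0, NULL, NULL);
    /* Blocking read, so the host waits here for everything above to finish. */
    clEnqueueReadBuffer(queue, buf, CL_TRUE, 0, size, host_out, 0, NULL, NULL);
}
```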
- If I just enqueue them into the same queue one after the other, as in the sketch above, will they execute sequentially like they do in CUDA?
- If that doesn't work, can/should I daisy-chain events, making each call's wait list the previous call's event? (See the second sketch after this list for what I mean.)
- Or should I add all the previous events to each call's wait list, as if there were an N^2 search for dependencies or something?
- Or do I just have to event.wait() on each call individually, as AMD's tutorial says to?
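Here's roughly what I mean by the daisy-chaining in the second bullet. Again just a sketch with placeholder names and no error checking, so treat it as the pattern I have in mind rather than working code:

```c
#include <CL/cl.h>

/* Sketch of the daisy-chaining idea: each command's wait list is just the
   previous command's event. Placeholder objects, no error checking. */
static void run_chained(cl_command_queue queue, cl_mem buf,
                        cl_kernel kernelA, cl_kernel kernelB,
                        size_t global, size_t size,
                        const void *host_in, void *host_out)
{
    cl_event prev, next, done;

    clEnqueueWriteBuffer(queue, buf, CL_FALSE, 0, size, host_in,
                         0, NULL, &prev);
    clEnqueueNDRangeKernel(queue, kernelA, 1, NULL, &global, NULL,
                           1, &prev, &next);
    clReleaseEvent(prev);
    prev = next;

    for (int i = 0; i < 100; ++i) {
        clEnqueueNDRangeKernel(queue, kernelB, 1, NULL, &global, NULL,
                               1, &prev, &next);
        clReleaseEvent(prev);
        prev = next;
    }

    /* Non-blocking read that waits on the last kernel's event, then a
       single host-side wait on the read's event instead of busy-waiting. */
    clEnqueueReadBuffer(queue, buf, CL_FALSE, 0, size, host_out,
                        1, &prev, &done);
    clReleaseEvent(prev);
    clWaitForEvents(1, &done);
    clReleaseEvent(done);
}
```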
Thanks!