I'm working on translating a CUDA application (this if you must know) to OpenCL. The original application uses the C-style CUDA API, with a single stream just to avoid the automatic busy-wait when reading the results.

Now I notice that OpenCL command queues look a lot like CUDA streams. But in the device read command, and likewise in the write and kernel execute commands, I notice parameters for events too. So I'm wondering, what does it take to execute a device write, a number of kernels (e.g. one call to one kernel then 100 calls to another kernel), and a device read, all sequentially?

  1. If I just enqueue them sequentially into the same queue, will they execute sequentially like they do in CUDA?
  2. If that doesn't work, can/should I daisy-chain events, making each call's wait list the previous call's event?
  3. Or should I add all previous events to each call's wait list, like if there's an N^2 search for dependencies or something?
  4. Or do I just have to event.wait() for each call individually, like it says to in AMD's tutorial?

Thanks!

+2  A: 

That depends on how you create the command queue. In clCreateCommandQueue there is a properties parameter that can contain CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, which enables non-sequential execution in the command queue.

If that property is set, commands might execute out of order or in parallel, and the only way to synchronize them is by using events.

When that property is not set, commands execute sequentially in the queue.

Matias Valdenegro
The one place I didn't think to look. Thanks!
Ken_g6
I'm going to have to amend my statement to "mostly accepted". It seems to be true that a number of kernels enqueued in order compute in order. However, the buffer reads and writes appear to execute out-of-order unless I both use events *and* waits, possibly even if the read/write is specified as synchronous. See also the atiopencl example in BOINC (boinc.berkeley.edu) for a working example.
Ken_g6
That's not how CL works; see the OpenCL spec, section 5.1, on creating command queues. If you see different behaviour, it's an implementation error (bug).
Matias Valdenegro