Hi,

I have seen both versions in tutorials, but I could not find out what their advantages and disadvantages are. Which one is the proper one?

cl_mem input = clCreateBuffer(context,CL_MEM_READ_ONLY,sizeof(float) * DATA_SIZE, NULL, NULL);
clEnqueueWriteBuffer(command_queue, input, CL_TRUE, 0, sizeof(float) * DATA_SIZE, inputdata, 0, NULL, NULL);

vs.

cl_mem input = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(float) * DATA_SIZE, inputdata, NULL);

Thanks.

[Update]

I added CL_MEM_COPY_HOST_PTR to the second example to make it correct.

A: 

Well, the main difference between these two is that the first one allocates memory on the device and then copies data to that memory. The second one only allocates.

Or did you mean clCreateBuffer(context,CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,sizeof(float) * DATA_SIZE, inputdata, NULL);?

stevehb
Hi stevehb, you were right. I forgot the CL_MEM_COPY_HOST_PTR. So how do they differ now?
Framester
I don't think they do, except for the ability to do asynchronous transfers, as Grizzly mentioned.
stevehb
A: 

I assume that inputdata is not NULL.

In that case the second approach should not work at all, since the specification says that clCreateBuffer returns NULL and sets the error:

CL_INVALID_HOST_PTR if host_ptr is NULL and CL_MEM_USE_HOST_PTR or CL_MEM_COPY_HOST_PTR are set in flags or if host_ptr is not NULL but CL_MEM_COPY_HOST_PTR or CL_MEM_USE_HOST_PTR are not set in flags.

So you presumably mean either

clCreateBuffer(context,CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,sizeof(float) * DATA_SIZE, inputdata, NULL);

or

clCreateBuffer(context,CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,sizeof(float) * DATA_SIZE, inputdata, NULL);

The first one (CL_MEM_COPY_HOST_PTR) should be more or less the same as the first approach you showed, while the second one (CL_MEM_USE_HOST_PTR) won't actually copy the data, but will instead use the supplied memory location for buffer storage (the implementation may cache portions or all of it in device memory). Which of those two is better obviously depends on the usage scenario.
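To make the CL_INVALID_HOST_PTR rule quoted above concrete, here is a minimal sketch of the check in plain C. It is not the OpenCL implementation: the `SK_*` constants and `check_host_ptr` are hypothetical stand-ins (with values chosen to mirror the corresponding `CL_*` definitions) so the snippet compiles without the OpenCL headers, and it models only this one rule.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the OpenCL flag bits and error codes,
 * so this sketch builds without <CL/cl.h>. */
enum {
    SK_MEM_READ_ONLY     = 1 << 2,
    SK_MEM_USE_HOST_PTR  = 1 << 3,
    SK_MEM_COPY_HOST_PTR = 1 << 5,
    SK_SUCCESS           = 0,
    SK_INVALID_HOST_PTR  = -37
};

/* Models only the quoted rule: host_ptr must be non-NULL exactly when
 * USE_HOST_PTR or COPY_HOST_PTR is set in flags. */
static int check_host_ptr(unsigned long flags, const void *host_ptr)
{
    int wants_ptr = (flags & (SK_MEM_USE_HOST_PTR | SK_MEM_COPY_HOST_PTR)) != 0;
    int has_ptr   = (host_ptr != NULL);

    return (wants_ptr == has_ptr) ? SK_SUCCESS : SK_INVALID_HOST_PTR;
}
```

Under this sketch, the call from the original question (CL_MEM_READ_ONLY alone together with a non-NULL inputdata) is rejected, while adding either host-pointer flag makes the same call pass the check.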

Personally I prefer the two-step approach of first allocating the buffer and afterwards filling it with clEnqueueWriteBuffer, since I find it easier to see what happens. Of course the one-step version might be faster (or it might not, that's just a guess).

Grizzly
Hi Grizzly, you were right. I forgot the CL_MEM_COPY_HOST_PTR. So there are no hard facts that speak for one or the other?
Framester
At least from the specification there shouldn't be. Of course performance might (or might not) vary, but that would be implementation dependent and subject to change, so I wouldn't count on it anyhow. If performance is critical for the memory transfer, it could be interesting to look into asynchronous memory transfers too (using CL_FALSE as the blocking parameter of clEnqueueWriteBuffer). Again, whether or not it's faster depends on the implementation; for a CPU device the fastest should be using CL_MEM_USE_HOST_PTR. In general I would try to ensure that the memory transfer time doesn't matter that much and be done with that.
Grizzly
Thanks, performance is not sooo important, it was more of an academic question.
Framester