I'm allocating a cl_mem buffer on a GPU and working on it. This works fine until the buffer exceeds a certain size; beyond that, the allocation itself still succeeds, but kernel execution or copying fails. I want to use the device's memory for faster operation, so I allocate like this:

buf = clCreateBuffer (cxGPUContext, CL_MEM_WRITE_ONLY, buf_size, NULL, &ciErrNum);

What I don't understand is the size limit. I'm copying about 16 MByte, but I should be able to use about 128 MByte (see CL_DEVICE_MAX_MEM_ALLOC_SIZE).

Why do these numbers differ so much?


Here's an excerpt from oclDeviceQuery:

 CL_PLATFORM_NAME:  NVIDIA
 CL_PLATFORM_VERSION:  OpenCL 1.0 
 OpenCL SDK Version:  4788711

  CL_DEVICE_NAME:          GeForce 8600 GTS
  CL_DEVICE_TYPE:          CL_DEVICE_TYPE_GPU
  CL_DEVICE_ADDRESS_BITS:              32
  CL_DEVICE_MAX_MEM_ALLOC_SIZE:  128 MByte
  CL_DEVICE_GLOBAL_MEM_SIZE:     255 MByte
  CL_DEVICE_LOCAL_MEM_TYPE:      local
  CL_DEVICE_LOCAL_MEM_SIZE:      16 KByte
  CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:  64 KByte
A: 

clCreateBuffer will not actually create a buffer on the device. This makes sense, since at the time of creation the driver does not know which device will use the buffer (recall that a context can have multiple devices). The buffer will be created on the actual device when you enqueue a write or when you launch a kernel that takes the buffer as a parameter.

As for the 16 MByte limit, are you using the latest driver (195.xx)? If so, you should contact NVIDIA, either through the forums or directly.

Tom
A: 

Don't forget whatever other memory you happen to have used on the device (and, if this is also your graphics card, the memory that your display is using).
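As a rough sketch of that bookkeeping: a request has to fit under the per-allocation limit and also under whatever is left of global memory. The function request_fits and its in_use parameter are hypothetical; OpenCL does not give you the in_use figure, so it would have to be your own estimate of other buffers plus display memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of OpenCL. All sizes are in bytes.
 *   max_alloc  : CL_DEVICE_MAX_MEM_ALLOC_SIZE
 *   global_mem : CL_DEVICE_GLOBAL_MEM_SIZE
 *   in_use     : caller's estimate of memory already taken on the device
 * Returns true if the request passes both limits. */
bool request_fits(uint64_t request, uint64_t max_alloc,
                  uint64_t global_mem, uint64_t in_use)
{
    if (request > max_alloc)
        return false;               /* single allocation too large */
    if (in_use > global_mem)
        return false;               /* estimate already exceeds the device */
    return request <= global_mem - in_use;
}
```

With the numbers from the question (128 MByte maximum allocation, 255 MByte global memory), a 16 MByte request passes this check as long as no more than 239 MByte is already in use. Passing the check is no guarantee that the driver will succeed, for example because of fragmentation.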

(Is there a way to get the current available memory, or the largest fragment, or somesuch?)

andrew cooke
Yes, clGetDeviceInfo() has several parameters related to total memory, maximum single allocation, etc., such as CL_DEVICE_GLOBAL_MEM_SIZE and CL_DEVICE_MAX_MEM_ALLOC_SIZE.
Tom