Hey all,

I have been having a tough time setting up an experiment where I allocate memory on the device with CUDA, take the pointer to that device memory, use it in OpenCL, and return the results. I want to see if this is possible. I had a hard time getting a CUDA project to work, so I just used Nvidia's template project from their SDK. In the makefile I added -lOpenCL to the libs section of common.mk. Everything is fine when I do that, but when I add the OpenCL #include to template.cu so I can start making OpenCL calls, I get over 100 errors. They all look similar to this, but with different function names at the end:

/usr/lib/gcc/x86_64-linux-gnu/4.4.1/include/xmmintrin.h(334): error: identifier "__builtin_ia32_cmpeqps" is undefined

I am having a hard time figuring out why. Please help if you can. Also, if there is an easier way to set up a project that can call both the CUDA and OpenCL APIs, let me know.

+1  A: 

I haven't really worked with CUDA, so I don't know how helpful my answer is.

From what I understand, you are trying to use OpenCL directly from your CUDA host code, which, if I remember correctly, is compiled by Nvidia's own compiler instead of the standard gcc. So the problem is probably that this compiler doesn't implement the builtins needed by the mentioned headers. Look here for a similar problem and its solution: http://forums.nvidia.com/lofiversion/index.php?t88573.html

It seems you have to put everything that needs the OpenCL API into a different (non-CUDA) compilation unit, so that it gets compiled by the regular compiler rather than Nvidia's.
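A rough sketch of what that split could look like, assuming nothing about your actual project: the file names, the function ocl_platform_count, and the USE_OPENCL build flag are all made up for illustration. The only real OpenCL call is clGetPlatformIDs. The idea is that the .cu file only ever sees a declaration with plain C types, while the OpenCL header is included in a file that g++ compiles.

```cpp
// ---- ocl_wrapper.h (hypothetical name): safe to include from template.cu ----
// Only plain C types appear here, so nvcc never has to parse the OpenCL headers.
extern "C" int ocl_platform_count(void);

// ---- ocl_wrapper.cpp (hypothetical name): compiled by g++, not by nvcc ----
#ifdef USE_OPENCL  // assumed build flag; define it when linking with -lOpenCL
#include <CL/cl.h>

extern "C" int ocl_platform_count(void) {
    cl_uint count = 0;
    // Passing 0 entries and a null array asks only for the platform count.
    cl_int err = clGetPlatformIDs(0, NULL, &count);
    return err == CL_SUCCESS ? (int)count : -1;
}
#else
// Fallback so the sketch still builds on a machine without an OpenCL SDK.
extern "C" int ocl_platform_count(void) { return -1; }
#endif
```

From template.cu you would then include only ocl_wrapper.h and call ocl_platform_count() like any other host function, and link the two object files together at the end. A negative return value means the OpenCL call failed or OpenCL was not compiled in.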

However, I wouldn't count on the pointer sharing itself working, since OpenCL buffers aren't just pointers to memory but also carry some meta-information. There is no real reason it should work, and if it does, there is no guarantee that it will continue to.

What you could try, if you really want to, is using OpenGL for the interop, since both OpenCL and CUDA have extensions that allow creating buffers from OpenGL buffers.

However, why do you need to do this? What's keeping you from using Apple's implementation in the short term? IIRC it's open source, and most of it (the OpenCL parts) should be platform independent anyway.

Grizzly
I like your idea of using OpenGL buffers; I believe I heard about that before. It seems like a much safer way to do it. I looked at the link you posted. It seems like wrappers will work, and I'll give it a try since I can't think of anything else. We are currently using cufft because we can access it through JCUDA, and since that is Java, we can run our program on a Linux, Mac, or Windows machine. With Apple's FFT library we'd have to port it ourselves to make it accessible through JOCL, which the maintainer has already expressed interest in doing himself.