Problem

I'm trying to create a CUDA application that integrates well with .NET. The design goal is to have several CUDA functions that can be called from managed code. Data should also be able to persist on the device between function calls, so that it can be passed to multiple CUDA functions.

It is important that each individual piece of data is only accessed by a single OS thread (as required by CUDA).

My Strategy

I'm wrapping the CUDA functionality and device pointers in Managed C++ code. A CUDA device pointer can be wrapped in a DevicePointer class written in MC++. If the class tracks which thread is using it, it can enforce that only a single thread ever accesses the CUDA device pointer.

I'll then design the program so that only a single thread would attempt to access any given piece of data.
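The thread-tracking idea behind the wrapper can be sketched in plain C++. (The real wrapper would be a C++/CLI ref class holding an actual CUDA device pointer; the `DevicePointer` class and its members here are illustrative, with a `void*` standing in for device memory.)

```cpp
#include <stdexcept>
#include <thread>

// Sketch: a device-pointer wrapper that records the thread that created it
// and rejects access from any other thread. In the real project this would
// be a C++/CLI ref class wrapping a CUDA device pointer.
class DevicePointer {
public:
    explicit DevicePointer(void* devPtr)
        : devPtr_(devPtr), owner_(std::this_thread::get_id()) {}

    // Every accessor funnels through this ownership check.
    void* get() const {
        if (std::this_thread::get_id() != owner_)
            throw std::runtime_error("DevicePointer used from a foreign thread");
        return devPtr_;
    }

private:
    void* devPtr_;
    std::thread::id owner_;  // thread that allocated the device memory
};
```

Any thread other than the creating one gets an exception instead of a dangling CUDA access, which makes violations of the single-thread rule fail loudly during development.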

Where I need help

I've done some research, and read about the distinction between managed threads and OS threads. It seems that there is, in general, a many to many relationship between the two.

This means that even though I'm only using a single managed thread, it could switch OS threads, and I'd lose access to the device pointer.

Is there any way to force the CLR to not move a managed thread between OS threads?

+3  A: 

Use the Thread.BeginThreadAffinity and Thread.EndThreadAffinity methods:

try
{
    Thread.BeginThreadAffinity(); // tell the host this code depends on the current OS thread

    // your code
    // ...
}
finally
{
    Thread.EndThreadAffinity();
}
Thomas Levesque
A: 

I doubt that you need to do anything.

IIRC, the "OS thread switch" means that the OS can move the thread from one processor core to another (or even to another processor in multi-socket systems) when, in its alleged wisdom, it thinks that would improve performance.

But Cuda doesn't really care which processor core/"OS thread" is running the code. As long as only one managed thread at a time can access the data there shouldn't be any race condition.

The thread affinity APIs are generally only used when someone gets totally anal about the difference in performance of accessing CPU memory locations from different cores. But since your persistent data is (I assume) in GPU texture buffers and not in CPU memory, even that is irrelevant.

Die in Sente
Actually it does matter, because the CUDA library uses thread-local storage (TLS) on the OS thread for contexts. So you need to ensure the managed thread stays tied to the OS thread it was originally created on.
Drew Marsh
Thanks, Drew. I didn't know that!
Die in Sente
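The TLS issue Drew describes can be illustrated in plain C++: if a library keeps its active context in thread-local storage, state set on one OS thread is simply invisible from any other. (The `currentContext` variable below is a stand-in for the CUDA driver's per-thread context, not its actual implementation.)

```cpp
#include <thread>

// Stand-in for a library keeping its active context in thread-local
// storage, the way the CUDA driver associates a context with the
// calling OS thread. Each OS thread sees its own copy.
thread_local void* currentContext = nullptr;

void setContext(void* ctx) { currentContext = ctx; }
void* getContext()         { return currentContext; }
```

If the CLR silently moved a managed thread to a different OS thread between `setContext` and `getContext`, the code would suddenly see a null context — which is exactly why the managed thread must stay pinned to one OS thread.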