I have a suspicion that there might be a race condition in a certain C++ multithreading situation involving virtual method calls in a vtable dynamic dispatching implementation (for which a vtable pointer is stored as a hidden member in the object with virtual methods). I would like to confirm whether or not this is actually an issue, and I am specifying boost's threading library so we can assume some frame of reference.
Suppose an object "O" has a boost::mutex member on which the entirety of its constructor/destructor and methods are scope-locked (similar to the Monitor concurrency pattern). A thread "A" constructs an object "O" on the heap without external synchronization (i.e. WITHOUT a shared mutex enclosing the "new" operation, through which it could synchronize with other threads; note, though, that there is still the "internal, Monitor" mutex locking the scope of its constructor). Thread "A" then passes a pointer to the "O" instance (which it just constructed) to another thread "B" by means of a synchronized mechanism--for instance, a synchronized readers-writers queue (note: only the pointer to the object is being passed--not the object itself). After construction, neither thread "A" nor any other thread performs any write operation on the "O" instance which "A" constructed.
The thread "B" reads the pointer value of the object "O" from the synchronized queue, after which it immediately leaves the critical section guarding the queue. Then the thread "B" performs a virtual method call on the object "O." Here is where I think an issue may arise.
Now, my understanding of virtual method calls in a [quite probable] vtable implementation of dynamic dispatching is that the calling thread "B" must dereference the pointer to "O" in order to obtain the vtable pointer stored as a hidden member of the object, and that this happens BEFORE the method body is entered (naturally, because the method body to execute cannot be determined until the vtable pointer stored in the object itself has been read). Assuming the above holds for such an implementation, is this not a race condition?
Since the vtable pointer is retrieved by thread "B" (by dereferencing the pointer to the object "O" located in the heap) prior to any memory visibility guaranteeing operations taking place (ie acquiring the member variable mutex in the object "O"), then it is not certain that "B" will perceive the vtable pointer value that "A" originally wrote on the object "O"'s construction, correct? (ie, it may instead perceive a garbage value, resulting in undefined behavior, correct?).
If the above is a valid possibility, does this not imply that making virtual method calls on exclusively internally synchronized objects that are shared between threads is undefined behavior?
And--likewise--since the standard is agnostic to a vtable implementation, how could one ever guarantee that the vtable pointer is safely visible to other threads prior to a virtual call? I suppose one could externally synchronize ("externally" as in, for instance, "surrounding with a shared mutex lock()/unlock() block") the constructor call and then at least the initial virtual method call in each of the threads, but this seems like some awfully discordant programming.
So, if my suspicions are true, then a possibly more elegant solution would be to use inlined, non-virtual member functions which lock the member mutex and then subsequently forward to a virtual call. But--even then--could we guarantee that the constructor initialized the vtable pointer within the confines of the lock() and unlock() guarding the constructor body itself?
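To make the forwarding idea concrete, here is a minimal sketch of what I have in mind. All names (LockedInterface, call, doCall, Counter, runForwardingDemo) are hypothetical, and std::mutex is used as a stand-in for boost::mutex so the snippet compiles without Boost. Note that it only moves the read of the vtable pointer inside the lock; it says nothing about where the constructor writes it.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical names throughout; std::mutex stands in for boost::mutex.
class LockedInterface
{
public:
    virtual ~LockedInterface() {}

    // Non-virtual entry point: calling it requires no vtable lookup,
    // and the mutex is acquired before the virtual dispatch below.
    void call()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        doCall(); // the vtable pointer is read here, while the lock is held
    }

private:
    virtual void doCall() = 0;
    std::mutex mutex_;
};

class Counter : public LockedInterface
{
public:
    Counter() : count_(0) {}

    // Unsynchronized read; acceptable only in this single-threaded demo.
    int count() const { return count_; }

private:
    virtual void doCall() { ++count_; }
    int count_;
};

// Calls the locked forwarding function twice through the base interface.
int runForwardingDemo()
{
    Counter c;
    LockedInterface & base = c;
    base.call();
    base.call();
    assert(c.count() == 2);
    return c.count();
}
```

The design choice here is that the mutex lives in the base class, so every caller goes through the same locked, non-virtual function before any dynamic dispatch takes place.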
If someone could help me clear this up and confirm/deny my suspicions, I would be very grateful.
EDIT: code demonstrating the above
#include <boost/thread/locks.hpp>
#include <boost/thread/mutex.hpp>

class Interface
{
public:
    virtual ~Interface() {}
    virtual void dynamicCall() = 0;
};

class Monitor : public Interface
{
    boost::mutex mutex;
public:
    Monitor()
    {
        boost::unique_lock<boost::mutex> lock(mutex);
        // initialize
    }
    virtual ~Monitor()
    {
        boost::unique_lock<boost::mutex> lock(mutex);
        // destroy
    }
    virtual void dynamicCall()
    {
        boost::unique_lock<boost::mutex> lock(mutex);
        // do w/e
    }
};
// for simplicity, the numbers following each statement specify the order of execution, and these two functions are assumed

// void passMonitorToSharedQueue( Interface * monitor )
//     Thread A passes the 'monitor' pointer value to a
//     synchronized queue, pushes it on the queue, and then
//     notifies Thread B that a new entry exists

// Interface * getMonitorFromSharedQueue()
//     Thread B blocks until Thread A notifies Thread B
//     that a new 'Interface *' can be retrieved, at which
//     point it retrieves and returns it

void threadBFunc()
{
    // named 'iface' because 'if' is a reserved keyword
    Interface * iface = getMonitorFromSharedQueue(); // (1)
    iface->dynamicCall(); // (4) (ISSUE HERE?)
}

void threadAFunc()
{
    Interface * monitor = new Monitor; // (2)
    passMonitorToSharedQueue(monitor); // (3)
}
At point (4), I'm under the impression that the vtable pointer value which "Thread A" wrote to memory may not be visible to "Thread B", as I don't see any reason to assume that the compiler will generate code such that the vtable pointer is written within the constructor's locked mutex block.
For instance, consider the situation of multicore systems where each core has a dedicated cache. According to this article, caches are commonly aggressively optimized and--despite forcing cache coherence--do not enforce a strict ordering on cache coherence if there are no synchronization primitives involved.
Perhaps I am misunderstanding the implications of the article, but wouldn't that mean that "A"'s write of the vtable pointer to the constructed object (and there is no indication that this write occurs within the constructor's locked mutex block) may not be perceived by "B" before "B" reads the vtable pointer? If A and B are executed on different cores ("A" on core0 and "B" on core1), the cache coherence mechanism may re-order the update of the vtable pointer value in core1's cache (the update that would make it consistent with the value of the vtable pointer in core0's cache, which "A" wrote) such that it occurs after "B"'s read ... if I'm interpreting the article correctly.
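For reference, the two queue functions assumed in the code above could be sketched roughly as follows. This is only an illustration under stated assumptions: the std:: types stand in for their boost equivalents, dynamicCall() returns an int purely so the demo has something to check, and the globals and runHandOffDemo are hypothetical. Under the C++11 memory model, unlocking a mutex synchronizes with the next lock of that same mutex, so everything "A" wrote before the push (which would include any vtable pointer written during construction) happens-before what "B" does after the pop. Whether a given pre-C++11 toolchain documents the same guarantee is a separate matter.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-ins; dynamicCall() returns int only for the demo.
struct Interface
{
    virtual ~Interface() {}
    virtual int dynamicCall() = 0;
};

struct Monitor : Interface
{
    virtual int dynamicCall() { return 42; }
};

static std::mutex queueMutex;
static std::condition_variable queueNotEmpty;
static std::queue<Interface *> sharedQueue;

void passMonitorToSharedQueue(Interface * monitor)
{
    {
        std::lock_guard<std::mutex> lock(queueMutex);
        sharedQueue.push(monitor);
    } // unlocking here is a release operation in the C++11 model
    queueNotEmpty.notify_one();
}

Interface * getMonitorFromSharedQueue()
{
    // Locking is an acquire operation; wait() re-acquires before returning.
    std::unique_lock<std::mutex> lock(queueMutex);
    queueNotEmpty.wait(lock, [] { return !sharedQueue.empty(); });
    Interface * monitor = sharedQueue.front();
    sharedQueue.pop();
    return monitor;
}

// Runs thread B (consumer) and thread A (producer) once and returns
// the value B obtained from the virtual call.
int runHandOffDemo()
{
    int result = 0;
    std::thread b([&result] {
        Interface * monitor = getMonitorFromSharedQueue(); // (1)
        assert(monitor != nullptr);
        result = monitor->dynamicCall();                   // (4)
        delete monitor;
    });
    std::thread a([] {
        Interface * monitor = new Monitor;                 // (2)
        passMonitorToSharedQueue(monitor);                 // (3)
    });
    a.join();
    b.join(); // join makes B's write to 'result' visible here
    return result;
}
```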