I'm a C++ developer who has primarily programmed on Solaris and Linux until recently, when I was forced to create an application targeted to Windows.

I've been using a communication design based on a C++ I/O stream backed by a TCP socket. A single thread reads continuously from the stream (most of the time it is blocked in the socket read, waiting for data) while other threads send through the same stream (synchronized by a mutex).

When moving to Windows, I elected to use boost::asio::ip::tcp::iostream to implement the socket stream. I was dismayed to find that the above multithreaded design resulted in deadlock on Windows. It appears that operator<<(std::basic_ostream<...>,std::basic_string<...>) declares a 'Sentry' that locks the entire stream for both input and output operations. Since my read thread is always waiting on the stream, send operations from other threads deadlock when this Sentry is created.

Here is the relevant part of the call stack during operator<< and Sentry construction:

    ...
    ntdll.dll!7c901046()    
    CAF.exe!_Mtxlock(_RTL_CRITICAL_SECTION * _Mtx=0x00397ad0)  Line 45 C
    CAF.exe!std::_Mutex::_Lock()  Line 24 + 0xb bytes C++
    CAF.exe!std::basic_streambuf<char,std::char_traits<char> >::_Lock()  Line 174 C++
    CAF.exe!std::basic_ostream<char,std::char_traits<char> >::_Sentry_base::_Sentry_base(std::basic_ostream<char,std::char_traits<char> > & _Ostr={...})  Line 78 C++
    CAF.exe!std::basic_ostream<char,std::char_traits<char> >::sentry::sentry(std::basic_ostream<char,std::char_traits<char> > & _Ostr={...})  Line 95 + 0x4e bytes C++
>   CAF.exe!std::operator<<<char,std::char_traits<char>,std::allocator<char> >(std::basic_ostream<char,std::char_traits<char> > & _Ostr={...}, const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & _Str="###")  Line 549 + 0xc bytes C++
    ...

I would be fine if the istream and ostream components were locked separately, but that is not the case.
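To illustrate the mechanism outside of the Visual Studio headers, here is a minimal sketch. It is not the library code: `SingleLockChannel` and `writer_can_send_while_reader_blocked` are hypothetical names, and a `std::timed_mutex` stands in for the stream buffer's lock so the writer can give up after a timeout instead of hanging forever.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a stream whose sentry takes ONE lock
// for both input and output operations.
struct SingleLockChannel {
    std::timed_mutex lock;  // plays the role of the stream buffer's lock
};

// Returns true if a writer manages to take the lock while a reader
// is blocked holding it.
bool writer_can_send_while_reader_blocked() {
    SingleLockChannel ch;
    std::promise<void> reader_has_lock;
    std::promise<void> release_reader;
    std::future<void> release = release_reader.get_future();

    std::thread reader([&] {
        // The "sentry" is taken at the start of the read...
        std::lock_guard<std::timed_mutex> sentry(ch.lock);
        reader_has_lock.set_value();
        // ...and held while the thread waits (stands in for the socket read).
        release.wait();
    });

    reader_has_lock.get_future().wait();

    // The writer's "sentry" needs the same lock, so this attempt times out.
    bool acquired = ch.lock.try_lock_for(std::chrono::milliseconds(50));
    if (acquired) ch.lock.unlock();

    release_reader.set_value();  // let the reader finish
    reader.join();
    return acquired;
}
```

With a plain blocking lock in place of the timed attempt, the writer would wait for as long as the reader stays blocked, which is the hang I am seeing.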

Is there an alternate implementation of the stream operators that I can use? Can I direct it not to lock? Should I implement my own (not sure how to do this)?

Any suggestions would be appreciated.

(Platform is Windows 32- and 64-bit. Behavior observed with Visual Studio 2003 Pro and 2008 Express)

A: 

Perhaps you could implement a locking layer yourself? That is, have a separate istream and ostream, which you yourself lock when they are invoked. Periodically, check if both are unlocked, and then read from one into the other.
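A rough sketch of the first half of that idea, assuming you can get at the two directions as separate stream objects. `SplitLockedChannel`, `send` and `receive` are made-up names for illustration, and the periodic hand-off step is left out:

```cpp
#include <istream>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper: one mutex per direction, so a reader that is
// blocked waiting for input never holds the lock a writer needs.
class SplitLockedChannel {
public:
    SplitLockedChannel(std::istream& in, std::ostream& out)
        : in_(in), out_(out) {}

    // Writers serialize only against other writers.
    void send(const std::string& msg) {
        std::lock_guard<std::mutex> guard(write_mutex_);
        out_ << msg << '\n' << std::flush;
    }

    // Readers serialize only against other readers.
    bool receive(std::string& line) {
        std::lock_guard<std::mutex> guard(read_mutex_);
        return static_cast<bool>(std::getline(in_, line));
    }

private:
    std::istream& in_;
    std::ostream& out_;
    std::mutex read_mutex_;
    std::mutex write_mutex_;
};
```

This only controls the locks you add yourself. If the two streams still share a stream buffer that the library locks internally, that lock stays in play.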

rlbond
A: 

Did you explicitly flush the stream after writing to it? This blog post implies that your data might simply be "stuck" in the buffer. If that's true, then perhaps you appear to deadlock because there's nothing available for reading yet. Add stream << std::flush to the end of your send operations.

An alternative (albeit less efficient) solution suggested by the blog post is to turn off the stream's output buffering:

    stream.rdbuf()->pubsetbuf(0, 0);
Kristo
I think his call stack shows it's a lock problem.
Zan Lynx
I actually do flush the stream at message boundaries, but that's not the problem here. The other end of this connection is not stuck waiting for data. Rather, two threads are deadlocked on the iostream 'sentry' that the Windows streams define. Thanks for your consideration.
Adam
+1  A: 

According to the Boost documentation [1], having two threads access a single object without mutexes is "unsafe". Just because it worked on the Unix platforms is no guarantee that it will work on the Windows platform.

So your options are:

  1. Rewrite your code so your threads don't access the object simultaneously
  2. Patch the boost library and send the changes back
  3. Ask Chris really nicely if he will do the changes for the Windows platform

[1] http://www.boost.org/doc/libs/1_39_0/doc/html/boost_asio/overview/core/threads.html

teambob
Thanks for the input. Just to clarify -- this question relates specifically to the Visual Studio implementation of the basic_ostream template, which extends a locking and error checking class called Sentry. This is not an issue with the boost::asio library (although your reference is a good thing to note).
Adam
Do you need to convert from tcp::iostream to Visual Studio's std::iostream? You might have more luck if you keep it tcp::iostream all the way.
teambob
Hadn't thought that through all the way...I usually write my class in/out (read/write, etc.) in terms of the lowest common denominator (istream/ostream). Thanks for the food for thought.
Adam
A: 

This question has languished for long enough. I'm going to report what I ended up doing even though there's a chance I'll be derided.

I had already determined that the problem was that two threads were coming to a deadlock while trying to access an iostream object in separate read and write operations. I could see that the Visual Studio implementation of string stream insertion and extraction operators both declared a Sentry, which locked the stream buffer associated with the stream being operated on.

I knew that, for the stream in question for this deadlock, the stream buffer implementation was boost::asio::basic_socket_streambuf. I inspected the implementation to see that read and write operations (underflow and overflow) actually operate on different buffers (get vs. put).
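To show what I mean by separate get and put buffers, here is a stripped-down sketch. It is not the Boost.Asio code: `SplitStreambuf`, `source_` and `sink_` are names I made up, and plain strings stand in for the socket's incoming and outgoing data.

```cpp
#include <cstddef>
#include <iostream>
#include <streambuf>
#include <string>

// Hypothetical stream buffer with independent get and put areas.
// Input is served from source_; output is collected in sink_.
class SplitStreambuf : public std::streambuf {
public:
    explicit SplitStreambuf(const std::string& incoming)
        : source_(incoming), pos_(0) {
        setg(get_buf_, get_buf_, get_buf_);            // empty get area
        setp(put_buf_, put_buf_ + sizeof(put_buf_));   // empty put area
    }

    const std::string& sent() const { return sink_; }

protected:
    // Refills the get area only; the put area is never touched.
    int_type underflow() override {
        if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
        if (pos_ >= source_.size()) return traits_type::eof();
        std::size_t n = source_.copy(get_buf_, sizeof(get_buf_), pos_);
        pos_ += n;
        setg(get_buf_, get_buf_, get_buf_ + n);
        return traits_type::to_int_type(*gptr());
    }

    // Drains the put area only; the get area is never touched.
    int_type overflow(int_type ch) override {
        sink_.append(pbase(), pptr());
        setp(put_buf_, put_buf_ + sizeof(put_buf_));
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }

    // Called on flush: push whatever is pending in the put area.
    int sync() override {
        overflow(traits_type::eof());
        return 0;
    }

private:
    std::string source_;
    std::size_t pos_;
    std::string sink_;
    char get_buf_[8];
    char put_buf_[8];
};
```

Because underflow and overflow work on disjoint state here, one reader and one writer do not step on each other's buffer pointers. Whether that holds for a real stream buffer depends on its implementation, which is why I checked the one I was using.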

With the above verified, I chose to simply circumvent the locking for this application. To do that, I used project-specific pre-processor definitions to exclude the locking code in the basic_istream implementation of the locking sentry:

    class _Sentry_base
     { // stores thread lock and reference to input stream
    public:
     __CLR_OR_THIS_CALL _Sentry_base(_Myt& _Istr)
      : _Myistr(_Istr)
      { // lock the stream buffer, if there
    #ifndef MY_PROJECT
      if (_Myistr.rdbuf() != 0)
       _Myistr.rdbuf()->_Lock();
    #endif
      }

     __CLR_OR_THIS_CALL ~_Sentry_base()
      { // destroy after unlocking
    #ifndef MY_PROJECT
      if (_Myistr.rdbuf() != 0)
       _Myistr.rdbuf()->_Unlock();
    #endif
      }

Upside:

  • It works
  • Only my project (with the appropriate defines) is affected

Downside:

  • Feels a little hacky
  • Each platform where this is built will need this modification

I plan to mitigate the latter point by loudly documenting this in the code and project documentation.

I realize that there may be a more elegant solution to this, but in the interest of expediency I chose a direct solution after due diligence to understand the impacts.

Adam