views:

942

answers:

3

I was asked to look at some code for a friend. (I rightly hesitated due to the MFC and lots of bad code, but he won...)

This is a dialog box based application that uses a CAsyncSocket.

The problem manifests in some nonstop debugbreaks and other similar things - there are also problem with an MFC ENSURE() macro - checking the socket for nullness. All of the issues happen deep in MFC.

Some googling showed up possible resource leaks if one uses themes in Vista/XP, but I don't think that is the problem here.

The code is pretty poor based on my couple hours of debugging, but basically it is doing the following:

(When connection is established there is no problem - it is only the case when no connection)

  • Calls Connect(server, socket) (on the derived CAsyncSocket object)
  • In the OnConnect() we are notified that connection did not work/not connected.
  • Inside a window timer for the main dialog/app there is a timer. When the timer event/handler is called we check if connected.
  • If we have detected that we are not connected (the OnConnect() was not good) then we call CAsyncSocket::Close(), then call CAsyncSocket::Create() (with no params) then call CAsyncSocket::Connect(server, port)

Note that the initial call to Connect() had no preceding call to Create().

My first real question:

  • What is the difference between the two and why is the Create() needed? (if I remove that then it no longer crashes, but I also don't connect when I re-establish connectivity)

The general question:

  • What exactly is wrong with the design of the code above?
  • How should this work in general?

EDIT:

I fixed the code so that all the paths go through calling Create() then Connect().

I still have a problem with an assert in CAsyncSocket::DoCallBack() - the last line of the code below is asserting:

void PASCAL CAsyncSocket::DoCallBack(WPARAM wParam, LPARAM lParam)
{
    if (wParam == 0 && lParam == 0)
     return;

    // Has the socket be closed - lookup in dead handle list
    CAsyncSocket* pSocket = CAsyncSocket::LookupHandle((SOCKET)wParam, TRUE);

    // If yes ignore message
    if (pSocket != NULL)
     return;

    pSocket = CAsyncSocket::LookupHandle((SOCKET)wParam, FALSE);
    if (pSocket == NULL)
    {
     // Must be in the middle of an Accept call
     pSocket = CAsyncSocket::LookupHandle(INVALID_SOCKET, FALSE);
     ENSURE(pSocket != NULL);

If I step through that then I get the message box: "Encountered an improper argument"

I think (but am not sure) that MFC is trying to call back the socket AFTER I have closed it. It is in a callback method (DoCallback()) but I already called Close() on the socket.

So it does look like an MFC problem, unless I am supposed to unsubscribe first.

+4  A: 

The issue is likely to be poorly written code and there's a good chance that it has little or nothing to do with MFC. But it's hard to say for sure from your description which boils down to , "I have an MFC application that throws all sorts of debug assertions. What should I do?".

It might be that a rewrite is in order, but no one can really say without knowing more about the application. I think you (or your friend) will need to make that determination.

Michael Burr
added code description
Tim
The limitations of CAsyncSocket and MFC in general sometimes make people code in ways that are bad. I am not specifically blaming MFC for the bad code, but it does make one do some contortionist coding.
Tim
+5  A: 

Your choice really. If you think you'll have more luck with another sockets implementation, then do it.

However, Microsoft has a lot of developers (and I believe some of them may even be good ones). You may, just may, want to consider the possibility that the fault doesn't all lay at their end.

The amount of help you can get for their APIs and products is also good, in my opinion.

Perhaps if you took the time to understand the MFC model, you would get that "AHA" moment and understand it better. I'm no fan of Winsock - I'm more used to the UNIX world where sync was the way to go, and you just ran separate processes/threads if you wanted async-type behavor.

CAsyncSocket, I suspect, is still hamstrung by the fact that MFC is a single-threaded model (in terms of GUI) even though Windows has had real pre-emptive threads for quite a while. [I may be wrong about that hamstrung comment, it's been a while since I used Win32 directly].


Update:

Based on your update where you stated what you were doing, I'm fairly certain you are not allowed to connect before creating. Quoting http://msdn.microsoft.com/en-us/library/3d46645f(VS.80).aspx,

To use a CAsyncSocket object, call its constructor, then call the Create function to create the underlying socket handle ... and for a client socket call the Connect member function.

As to why, I think this is an extra complexity added because Windows needs to do async sockets in an event-pumping environment since they cannot block the main GUI thread.

In UNIXy environments, there is either no event thread (normal processes) or network ops are just farmed off to another thread manually (in GUI apps).

This was most likely a design decision made in WinSock long ago (probably in Windows 3.11 which was a much more restrictive environment in which to do async operations) and carried through [although that's conjecture on my part, the UNIX sockets API has never had this sort of async behavior where messages are pumped, it always tended to use select() or threads/processes].


Further update:

That assertion (not exception) usually occurs if you have closed and/or deleted the socket object while there is a pending operation on it. In your case, I'd suggest it's still trying to do the connect when you close it.

Then when the connect succeeds or fails, the callback is called and it cannot find your socket in the tables.

This isn't an MFC problem, it's your friend's code violating the contract. If you do a connect (or any async operation), you have to wait for success or failure before closing the socket (or working further on it) - in this case, that means wait for the call to your OnConnect() function.

From memory, you call the create() when you create the async socket, then everything else happens in response to messages arriving in the message queue (i.e., calls to your OnXXX() functions). Like all Win32 GUI stuff, the messages are supposed to drive the program (the code runs in response to messages). This code is looking more and more like classic coding where the program drives everything - that way lies madness as you'll have your program and the async socket 'thread' fighting for control.

I haven't looked at it for quite a while but you should be able to get your hands on a CHATSRVR sample program which will show you how to do it.

paxdiablo
added code description
Tim
thanks for the description. That sounds like it may be the problem. Unfortunately I am not sure if the time period between the connect call and a response/notification of success or failure is deterministic. It seems like there is nothing I can do - for example if the user wants to shut down the app or try a connection to a different server I would just have to create another socket I guess. Perhaps that is the answer - leave that one pending forever and mark it as deletable when it gets the callback notification.Thanks,tim
Tim
Tim, I don't think shutting down the app would be a problem since that *should* release all relevant resources anyway. With user trying a different connection, I'd suggest either (1) do as you suggest, mark current attempt as obsolete so it can be deleted on success/fail, and start another one for the user; or (2) tell the user only one pending connection is available at a time (e.g., something gray out the "connect" button) - you don't have to cater to their *every* desire :-)
paxdiablo
+1  A: 

I am having a similar problem. This is a grip not an answer. Why cant the great programmers as MS give us some hint in this stupid message something like module, file and line of the failure.

CharlieBisbee