I have a device that generates messages over a serial port. When I reboot the device, the IO Completion Port stops reading bytes.

The code calls GetQueuedCompletionStatus():

BOOL bRet = GetQueuedCompletionStatus(
        m_hCompletionPort, 
        &dwBytesTransferred, 
        &dwCompletionKey, 
        &pOverlapped, 
        INFINITE);

PortMon looks like:

...
IRP_MJ_WRITE    Serial1    SUCCESS     LENGTH: 7    REBOOT.
IRP_MJ_READ     Serial1    CANCELLED   LENGTH: 1

Logging shows the following result:

bRet=true, dwBytesTransferred=7, pOverlapped=0x0202B028, GetLastError()=997
(sleep forever)

Is there any way to detect this failure and reestablish communications?

I can monitor a heartbeat and close/reopen the serial port, but it doesn't seem right that the Windows API allows serial communications to silently drop like this.

+1  A: 

If you do WaitForSingleObject on the handle for the serial port that you opened to start reading data, does the handle become signalled when the device is rebooted? Maybe this is a way to tell when you need to open the port again?

1800 INFORMATION
Thanks. I was trying to handle this without having to factor out the use of the IO Completion Port, but it looks like I don't have much choice.
You do have a choice: see my answer!
janm
+1  A: 

IO Completion Ports can certainly handle this case without problem. You don't need to close and reopen the device.

The most likely problem in this case is that you have an error on the line (caused by the device reset) that you have not cleared using ClearCommError().

You need to use SetCommState() and SetCommTimeouts() appropriately for your device up front. In the DCB you pass to SetCommState(), you need to set fAbortOnError. If you do dequeue an error, you need to call ClearCommError() before you requeue another read.
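As a rough illustration of the "dequeue an error, clear it, requeue" flow, here is a minimal sketch of how the three outputs of GetQueuedCompletionStatus() might be mapped to a next action. The names `NextStep` and `classifyCompletion` are hypothetical, and the numeric constant is copied from the Windows SDK so the snippet compiles without `<windows.h>`. One point worth noting: GetLastError() is only meaningful when the call returns FALSE, so the 997 (ERROR_IO_PENDING) logged in the question alongside bRet=true is most likely left over from an earlier overlapped call rather than a result of the completion itself.

```cpp
#include <cstdint>

// Value copied from the Windows SDK so the sketch compiles anywhere;
// real code would include <windows.h> and use WAIT_TIMEOUT directly.
constexpr std::uint32_t kWaitTimeout = 258; // WAIT_TIMEOUT

// Hypothetical names: what the read loop should do next.
enum class NextStep {
    ProcessData,      // I/O succeeded: consume the bytes, queue the next read
    ClearAndRequeue,  // a failed I/O was dequeued: ClearCommError(), then queue a read
    CheckHeartbeat,   // nothing was dequeued within the timeout
    PortFailure       // the wait itself failed (for example, the port handle was closed)
};

// ok            : return value of GetQueuedCompletionStatus()
// hasOverlapped : whether the call stored a non-null OVERLAPPED pointer
// lastError     : GetLastError(), sampled immediately after the call
inline NextStep classifyCompletion(bool ok, bool hasOverlapped, std::uint32_t lastError)
{
    if (ok) {
        return NextStep::ProcessData;      // lastError is stale here; ignore it
    }
    if (hasOverlapped) {
        // A completion packet for a failed operation. An aborted or cancelled
        // read typically reports 995 (ERROR_OPERATION_ABORTED).
        return NextStep::ClearAndRequeue;
    }
    if (lastError == kWaitTimeout) {
        return NextStep::CheckHeartbeat;   // only possible with a finite timeout
    }
    return NextStep::PortFailure;
}
```

With this shape, the ClearAndRequeue branch is where ClearCommError() would be called before issuing the next ReadFile(); whether a given driver actually delivers the failed read to the completion port is something to confirm with logging.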

janm
+1  A: 

RE: janm (I can't seem to add a comment to your answer, sorry)

I did try setting various flags, including the DCB's fAbortOnError, but GetQueuedCompletionStatus() would still wait infinitely. I also tried periodically timing out the call, and checking the serial port for errors. The serial port always looked fine, yet the disconnection would still permanently break the IO Completion Port. The device rebooting probably creates a transient error state... I say probably, because I've never been able to detect it!
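For reference, the "periodically time out and check" approach can be paired with the heartbeat idea from the question. The sketch below is only an illustration under assumptions: `SilenceWatchdog` is a hypothetical helper, not part of any Windows API, and the silence limit depends entirely on how often the device normally sends data.

```cpp
#include <chrono>

// Hypothetical helper: tracks how long the serial line has been silent.
class SilenceWatchdog {
public:
    using Clock = std::chrono::steady_clock;

    SilenceWatchdog(Clock::duration limit, Clock::time_point now)
        : limit_(limit), lastActivity_(now) {}

    // Call whenever a read completes with at least one byte.
    void noteActivity(Clock::time_point now) { lastActivity_ = now; }

    // True once the device has been silent for longer than the limit.
    bool expired(Clock::time_point now) const
    {
        return now - lastActivity_ > limit_;
    }

private:
    Clock::duration limit_;
    Clock::time_point lastActivity_;
};
```

In a read loop this would be used by passing a finite timeout to GetQueuedCompletionStatus() instead of INFINITE, calling `noteActivity()` on every successful read, and checking `expired()` whenever the call times out. What to do on expiry (clear errors and queue a fresh read, or close and reopen the port) is a separate decision.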

A fellow developer also had a crack at this problem, and they too failed. So we just rewrote the code to use overlapped serial port reads, and now it works fine.

There is probably something, somewhere that we missed... in the end we wasted more time trying to solve the mystery than it took to rewrite the code.

murrayh
Interesting. I have developed a number of systems that do exactly this while using IO Completion Ports; my designs have tended to use large numbers of single-byte reads. These are used for control systems where, during testing, the serial port was subjected to all kinds of abuse. Another possibility is that there is a bug in the serial device driver. This is more likely if you're using a third-party serial board, rather than the Windows native driver with a standard serial port.
janm