We have an application that sends TCP messages periodically at a defined rate (using MODBUS TCP). If a message is not received within a set period, an alarm is raised. Every once in a while, however, messages appear to be delayed. Investigation has shown that the delay coincides with the ARP cache being refreshed, which causes the TCP message to be resent.

The IP stack provider has told us that this is the expected behaviour for TCP. The questions are: Is this expected behaviour for an IP stack? If not, how do other stacks work around the period when IP/MAC address translation is not available? If this is the expected behaviour, how can we reduce the delay in TCP messages during this period? (Permanent ARP entries have been tried, but are not the best solution.)

A: 

In my last job I worked with a company building routers and switches. Our implementation would queue packets waiting for ARP replies and send them when the ARP reply was received. Therefore, no TCP retransmit required.
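To make the queuing idea concrete, here is a minimal sketch in Python. It is not the implementation described above; the class `ArpPendingQueue`, its method names, and the per-host queue limit are all assumptions made for illustration, and "transmitting" a frame is modelled as appending to a list.

```python
from collections import defaultdict, deque


class ArpPendingQueue:
    """Toy model of a stack that holds packets while ARP resolution is pending."""

    def __init__(self, max_per_host=4):
        self.cache = {}  # IP -> MAC, the ARP cache
        # Bounded queue per unresolved IP (the limit is an assumed value).
        self.pending = defaultdict(lambda: deque(maxlen=max_per_host))
        self.sent = []          # (mac, payload) frames put on the wire
        self.arp_requests = []  # IPs for which an ARP request was issued

    def send(self, ip, payload):
        mac = self.cache.get(ip)
        if mac is not None:
            self.sent.append((mac, payload))
            return
        queue = self.pending[ip]
        if not queue:
            # First packet for an unresolved IP triggers one ARP request.
            self.arp_requests.append(ip)
        queue.append(payload)

    def on_arp_reply(self, ip, mac):
        self.cache[ip] = mac
        # Flush everything that was waiting for this address, in order.
        for payload in self.pending.pop(ip, ()):
            self.sent.append((mac, payload))

    def expire(self, ip):
        # Simulates the ARP entry timing out.
        self.cache.pop(ip, None)


stack = ArpPendingQueue()
stack.send("10.0.0.5", b"modbus request 1")   # queued, ARP request issued
stack.on_arp_reply("10.0.0.5", "aa:bb:cc:dd:ee:ff")
print(stack.sent)
```

With this design the first segment leaves as soon as the ARP reply arrives, so a TCP retransmit is only needed if the reply takes longer than the retransmission timeout.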

Retransmission in TCP occurs when an ACK is not received within a given time. If the ARP reply takes a long time, or is itself lost, you might be getting a retransmission even though the device waiting for the ARP reply is queuing the packet.

It would appear from your question that the interval between TCP messages is shorter than the ARP refresh time. This implies that using the ARP entry does not keep it refreshed; some stacks do refresh an entry on use, and that behaviour would be helpful in your situation.

A packet trace of the situation occurring could be helpful - are you actually losing the first packet? How long does the ARP reply take?

In order to stop the ARP cache timing out, you might want to try to find something that will refresh it, such as another ARP request for the same address, or a gratuitous ARP. I found a specification for MODBUS TCP but it didn't help. Can you post some details of your network - media, devices, speeds?
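For reference, a gratuitous ARP is an ordinary ARP packet in which the sender and target IP addresses are the same, sent to the Ethernet broadcast address. The sketch below only builds the 42-byte frame; the function name, MAC address, and IP address are made-up examples. Actually transmitting it needs a raw socket and usually elevated privileges, which is not shown, and whether the peer refreshes its cache on receipt depends on its stack.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP).
    eth_header = b"\xff" * 6 + mac_bytes + struct.pack("!H", 0x0806)

    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware length 6, protocol length 4, operation 1 (request).
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # Sender MAC/IP, then target MAC (zeroed) and target IP == sender IP.
    arp_body = mac_bytes + ip_bytes + b"\x00" * 6 + ip_bytes

    return eth_header + arp_header + arp_body


frame = build_gratuitous_arp("02:00:00:aa:bb:cc", "192.168.10.17")
print(len(frame), frame.hex())
```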

Tony van der Peet
A: 

Your description suggests that the peer ARP entries expire between TCP segments and cause some subsequent segments to fail due to the lack of a current MAC destination.

If you have the MODBUS devices on a separate subnet, then perhaps the destination router will be kind enough to queue the segment until it receives a valid MAC. If you cannot use a separate subnet, you could try to force the session to have keep-alives activated - this would cause a periodic empty message to be sent that would keep the ARP timers resetting. If the overhead of the keep-alive is too high and you completely control the application in your system, you could try to force zero-length messages through to the peer.
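If the stack exposes a BSD-style sockets API, enabling keep-alives might look roughly like the following. This is a sketch under that assumption: the helper name `enable_keepalive` and the timing values are illustrative, and the `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` options are not available on every platform, so they are set only where present. The idle time would need to be shorter than the ARP timeout for this to have any chance of helping.

```python
import socket


def enable_keepalive(sock, idle_s=5, interval_s=5, probes=3):
    """Turn on TCP keep-alive probes for an existing socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The fine-grained timers are platform-specific; skip those that are missing.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idle time before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is dropped.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```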

Pekka
A: 

@pekka: On Cisco L3 switches it seems that sending back a (unicast) heartbeat does not refresh the ARP timer. The TCP ACKs don't seem to do the job either.

Pinging the broadcast address of the subnet, however, does reset the ARP timer in the router.
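If the broadcast address is not known offhand, it can be derived from any host address and prefix length on the subnet. A small sketch using Python's standard `ipaddress` module; the helper name and the addresses are made-up examples:

```python
import ipaddress


def broadcast_address(cidr: str) -> str:
    """Return the directed broadcast address of the subnet containing `cidr`."""
    # strict=False accepts a host address such as 192.168.10.17/24.
    return str(ipaddress.ip_network(cidr, strict=False).broadcast_address)


print(broadcast_address("192.168.10.17/24"))  # 192.168.10.255
```

Note that many hosts are configured to ignore broadcast pings, and the iputils `ping` on Linux needs the `-b` option to target a broadcast address.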

Wim