When sending messages from a self-hosted WCF service to many clients (about 10 or so), messages are sometimes delayed significantly longer than I'd expect (several seconds to reach a client on the local network). Does anyone have an idea why this would be and how to fix it?
Some background: the application is a stock ticker style service. It receives messages from a 3rd party server and re-publishes them to clients that connect to the service. It's very important that messages are published as quickly as possible, and in most cases the time between receiving a message and publishing it to all clients is less than 50ms (it's so quick it approaches the resolution of DateTime.Now).
Over the past few weeks, we've been monitoring some occasions when messages are delayed by 2 or 3 seconds. A few days ago, we got a big spike and messages were being delayed by 40-60 seconds. Messages are not being dropped as far as I can tell (unless the entire connection is dropped). The delay does not appear to be specific to any one client; it affects all clients (including ones on the local network).
I send messages to the clients by spamming the ThreadPool. As quickly as messages arrive, I call BeginInvoke() once per message per client. The theory is that if any one client is slow to receive a message (because it's on dialup and downloading updates or something), it won't impact other clients. That isn't what I'm observing though; all clients (including ones on the local network) appear to be delayed by a similar duration.
The volume of messages I'm dealing with is 100-400 per second. Messages contain a string, a guid, a date and, depending on the message type, 10-30 integers. I've observed them using Wireshark as being less than 1kB each. We have 10-20 clients connected at any one time.
The WCF service is hosted in a Windows service on a Windows 2003 Web Edition server. I'm using the NetTCP binding with SSL/TLS encryption enabled and custom username/password authentication. The server has a 4Mbit internet connection, a dual core CPU and 1GB RAM, and is dedicated to this application. The service is set to ConcurrencyMode.Multiple. The service process, even under high load, rarely exceeds 20% CPU usage.
So far, I've tweaked various WCF configuration options such as:
- serviceBehaviors/serviceThrottling/maxConcurrentSessions (currently 102)
- serviceBehaviors/serviceThrottling/maxConcurrentCalls (currently 64)
- bindings/netTcpBinding/binding/maxConnections (currently 100)
- bindings/netTcpBinding/binding/listenBacklog (currently 100)
- bindings/netTcpBinding/binding/sendTimeout (currently 45s, although I've tried it as high as 3 minutes)
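For reference, a rough sketch of where those settings sit in the config. The values are the ones listed above; the behavior name PublisherBehavior is a placeholder I've made up for illustration, and the binding name Tcp is taken from the endpoint's bindingConfiguration attribute shown further down.

```xml
<!-- Sketch only: "PublisherBehavior" is a hypothetical name. -->
<behaviors>
  <serviceBehaviors>
    <behavior name="PublisherBehavior">
      <serviceThrottling maxConcurrentSessions="102"
                         maxConcurrentCalls="64" />
    </behavior>
  </serviceBehaviors>
</behaviors>
<bindings>
  <netTcpBinding>
    <binding name="Tcp"
             maxConnections="100"
             listenBacklog="100"
             sendTimeout="00:00:45" />
  </netTcpBinding>
</bindings>
```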
It appears to me that the messages are being queued inside WCF once some threshold is reached (which is why I've been increasing the throttling limits). But to affect all clients, one or two slow clients would need to max out all outgoing connections. Does anyone know if this is true of the WCF internals?
I can also improve efficiency by coalescing incoming messages when I send them to the client. However, I suspect there's something underlying going on and coalescing won't fix the problem in the long term.
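To make clear what I mean by coalescing, here is a minimal sketch, not my actual code. The idea is to drain everything that has queued up since the last timer tick into one batch and make a single callback per client with the whole batch, instead of one callback per message. DrainBatch is a hypothetical helper, and sending the batch would need a batch-accepting callback operation that my contract does not currently have.

```vbnet
Imports System
Imports System.Collections.Generic

Module CoalescingSketch
    ' Hypothetical helper: removes every item currently in the queue and
    ' returns them as one list, holding the lock only while draining.
    Function DrainBatch(Of T)(ByVal queue As Queue(Of T)) As List(Of T)
        Dim batch As New List(Of T)()
        SyncLock queue
            While queue.Count > 0
                batch.Add(queue.Dequeue())
            End While
        End SyncLock
        Return batch
    End Function

    Sub Main()
        ' Stand-in for the feed queue; strings instead of real data items.
        Dim feedQueue As New Queue(Of String)()
        feedQueue.Enqueue("tick 1")
        feedQueue.Enqueue("tick 2")
        feedQueue.Enqueue("tick 3")

        Dim batch As List(Of String) = DrainBatch(feedQueue)

        ' One send per client would go here, carrying the whole batch.
        Console.WriteLine("Batch size: " & batch.Count)
    End Sub
End Module
```

With 100-400 messages per second and a 50ms timer, each batch would hold roughly 5-20 messages, so the number of callback invocations per client would drop by about that factor.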
WCF Config (with company names changed):
<system.serviceModel>
<host>
<baseAddresses>
<add baseAddress="net.tcp://localhost:8100/Publisher"/>
</baseAddresses>
</host>
<endpoint address="ThePublisher"
binding="netTcpBinding"
bindingConfiguration="Tcp"
contract="Company.Product.Server.Publisher.IPublisher" />
</behavior>
Code used to send messages:
Private Sub HandleDataBackground(ByVal sender As Object, ByVal e As Timers.ElapsedEventArgs)
    If Me._FeedDataQueue.Count > 0 Then
        ' Dequeue any items received in last 50ms.
        While True
            Dim dataAndReceivedTime As DataWithReceivedTimeArg
            SyncLock Me._FeedDataQueue
                If Me._FeedDataQueue.Count = 0 Then Exit While
                dataAndReceivedTime = Me._FeedDataQueue.Dequeue()
            End SyncLock
            ' Publish data to all clients.
            Me.SendDataToClients(dataAndReceivedTime)
        End While
    End If
End Sub
Private Sub SendDataToClients(ByVal data As DataWithReceivedTimeArg)
    Dim clientsToReceive As IEnumerable(Of ClientInformation)
    SyncLock Me._ClientInformation
        clientsToReceive = Me._ClientInformation.Values.Where(Function(c) Contract.CollectionContains(c.ContractSubscriptions, data.Data.Contract) AndAlso c.IsUsable).ToList()
    End SyncLock
    For Each clientInfo In clientsToReceive
        Dim futureChangeMethod As New InvokeClientCallbackDelegate(Of DataItem)(AddressOf Me.InvokeClientCallback)
        futureChangeMethod.BeginInvoke(clientInfo, data.Data, AddressOf Me.SendDataToClient)
    Next
End Sub
Private Sub SendDataToClient(ByVal callback As IFusionIndicatorClientCallback, ByVal data As DataItem)
    ' Send
    callback.ReceiveData(data)
End Sub
Private Sub InvokeClientCallback(Of DataT)(ByVal client As ClientInformation, ByVal data As DataT, ByVal method As InvokeClientCallbackMethodDelegate(Of DataT))
    Try
        ' Send
        If client.IsUsable Then
            method(client.CallbackObject, data)
            client.LastContact = DateTime.Now
        Else
            ' Make sure the callback channel has been removed.
            SyncLock Me._ClientInformation
                Me._ClientInformation.Remove(client.SessionId)
            End SyncLock
        End If
    Catch ex As CommunicationException
        ....
    Catch ex As ObjectDisposedException
        ....
    Catch ex As TimeoutException
        ....
    Catch ex As Exception
        ....
    End Try
End Sub
A sample of one of the message types:
<DataContract(), KnownType(GetType(DateTimeOffset)), KnownType(GetType(DataItemDepth)), KnownType(GetType(DataItemDepthDetail)), KnownType(GetType(DataItemHistory))> _
Public MustInherit Class DataItem
    Implements ICloneable
    Protected _Contract As String
    Protected _MessageId As Guid
    Protected _TradeDate As DateTime
    <DataMember()> _
    Public Property Contract() As String
        ...
    End Property
    <DataMember()> _
    Public Property MessageId() As Guid
        ...
    End Property
    <DataMember()> _
    Public Property TradeDate() As DateTime
        ...
    End Property
    Public MustOverride Function Clone() As Object Implements System.ICloneable.Clone
End Class
<DataContract()> _
Public Class DataItemDepth
    Inherits DataItem
    Protected _VolumnPriceDetail As IList(Of DataItemDepthItem)
    <DataMember()> _
    Public Property VolumnPriceDetail() As IList(Of DataItemDepthItem)
        ...
    End Property
    Public Overrides Function Clone() As Object
        ...
    End Function
End Class
<DataContract()> _
Public Class DataItemDepthItem
    Protected _Volume As Int32
    Protected _Price As Int32
    Protected _BidOrAsk As BidOrAsk ' BidOrAsk is an Int32 enum
    Protected _Level As Int32
    <DataMember()> _
    Public Property Volume() As Int32
        ...
    End Property
    <DataMember()> _
    Public Property Price() As Int32
        ...
    End Property
    <DataMember()> _
    Public Property BidOrAsk() As BidOrAsk ' BidOrAsk is an Int32 enum
        ...
    End Property
    <DataMember()> _
    Public Property Level() As Int32
        ...
    End Property
End Class