We are facing random spikes in high throughput transaction processing system using sockets for IPC.
Below is the setup used for the run:
- The client opens and closes new connection for every transaction, and there are 4 exchanges between the server and the client.
- We have disabled the
TIME_WAIT
, by setting the socket linger (SO_LINGER
) option via getsockopt as we thought that the spikes were caused due to the sockets waiting inTIME_WAIT
. - There is no processing done for the transaction. Only messages are passed.
- OS used Centos 5.4
The average round trip time is around 3 milli seconds, but some times the round trip time ranges from 100 milli seconds to couple of seconds.
Steps used for Execution and Measurement and output
Starting the server
$ python sockServerLinger.py > /dev/null &
Starting the client to post 1 million transactions to the server. And logs the time for a transaction in the client.log file.
$ python sockClient.py 1000000 > client.log
Once the execution finishes the following command will show the execution time greater than 100 milliseconds in the format
<line_number>:<execution_time>
.$ grep -n "0.[1-9]" client.log | less
Below is the example code for Server and Client.
Server
# File: sockServerLinger.py
import socket, traceback,time
import struct
host = ''
port = 9999
l_onoff = 1
l_linger = 0
lingeropt = struct.pack('ii', l_onoff, l_linger)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, lingeropt)
s.bind((host, port))
s.listen(1)
while 1:
try:
clientsock, clientaddr = s.accept()
print "Got connection from", clientsock.getpeername()
data = clientsock.recv(1024*1024*10)
#print "asdasd",data
numsent=clientsock.send(data)
data1 = clientsock.recv(1024*1024*10)
numsent=clientsock.send(data)
ret = 1
while(ret>0):
data1 = clientsock.recv(1024*1024*10)
ret = len(data)
clientsock.close()
except KeyboardInterrupt:
raise
except:
print traceback.print_exc()
continue
Client
# File: sockClient.py
import socket, traceback,sys
import time
i = 0
while 1:
try:
st = time.time()
s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
while (s.connect_ex(('127.0.0.1',9999)) != 0):
continue
numsent=s.send("asd"*1000)
response = s.recv(6000)
numsent=s.send("asd"*1000)
response = s.recv(6000)
i+=1
if i == int(sys.argv[1]):
break
except KeyboardInterrupt:
raise
except:
print "in exec:::::::::::::",traceback.print_exc()
continue
print time.time() -st