I am desperate to get some help on this one.

We run into problems when a high volume of messages is sent over a TCP socket connection to an HSM on a different server for card validation.

For volume testing, we processed 3 blocks of 30,000 transactions each at 15 transactions/sec, and during the third block transactions started to be rejected.

The situation was:

  1. We processed blocks 1 and 2, with 30,000 transactions each, successfully.
  2. In the third block the system processed only 8,000 transactions successfully, and after that the connections to the HSM were blocked.
  3. We saw that all HSM sockets were in use, so transactions were rejected.

We think some of the sockets are not being closed, or are timing out, because of the volume of messages.

Below is the gist of the code.

    ; Open the device
    OPEN SOCK:(CONNECT=HOST_":"_PORT_":TCP":DELIMITER=$C(13,10,58,27,95):ATTACH="HSMCLIENT"):TIMOUT:"SOCKET"

    ELSE  SET ER=1 CLOSE SOCK QUIT "-1"
    ; Use the socket
    USE SOCK
    ; Write the request. The request message is packed and the byte stream is written
    WRITE HREQ,#
    ; Read the first two bytes from the socket to identify the length of the response
    READ BRESP#2:TIMOUT ELSE  SET ER=1 CLOSE SOCK QUIT "-1"
    ; Calculate the length of the incoming data
    SET RESPLEN=$A(BRESP,1)_$A(BRESP,2)
    ; Now read the data of the calculated length
    READ BRESP#RESPLEN:TIMOUT ELSE  SET ER=1 CLOSE SOCK QUIT "-1"
    ; Cleanup
    CLOSE SOCK
    #ENDBYPASS

If you can provide any suggestions or recommendations, that would be really appreciated.

Thanks

A: 

You're most likely leaving sockets in TIME_WAIT. TIME_WAIT is held by whichever side initiates the active close (the first to send a FIN). Your code closes the socket as soon as it has read the response, so unless the HSM closes first, those sockets accumulate on your client machine.

Given the rate of connection establishment, the duration of the TIME_WAIT period, and the finite number of ephemeral ports, you eventually run out of available ports and can't initiate any more connections.

You may be able to avoid TIME_WAIT altogether by aborting your connection on completion of your transaction (sending an RST rather than a FIN). You do this in code by enabling the linger option with a timeout of zero before closing your connection. Alternatively, your HSM may have a command that you can send that means "thanks for that, I'm done, please close the connection", which would allow it to initiate the active close and so move the TIME_WAIT from the client machine to the HSM (this may not help if the HSM only supports a limited number of sockets, as you'll just switch the problem from you not being able to initiate connections to the HSM not being able to accept them...).
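
To make the abortive close concrete, here is a minimal sketch in Python rather than M, since the thread does not show whether the M runtime exposes this socket option. It assumes a POSIX-style `SO_LINGER` layout (two C ints); `abortive_close` and `demo` are hypothetical names used only for illustration. Linger is switched on with a zero timeout, so `close()` sends an RST and the closing side should skip TIME_WAIT.

```python
import socket
import struct


def abortive_close(sock: socket.socket) -> None:
    """Close a TCP socket with an RST instead of the normal FIN handshake."""
    # struct linger {l_onoff, l_linger}: linger enabled (1) with a zero timeout
    # makes close() discard unsent data and reset the connection.
    # Assumption: POSIX layout of two C ints; other platforms may differ.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def demo() -> str:
    """Loopback check: report what the peer sees after an abortive close."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()
    peer.settimeout(5.0)
    try:
        abortive_close(client)
        data = peer.recv(16)
        # An empty read would mean the peer saw an orderly FIN instead.
        return "fin" if data == b"" else "data"
    except ConnectionResetError:
        return "reset"
    finally:
        peer.close()
        listener.close()
```

Bear in mind that an abortive close throws away any data still queued for sending, so it is only appropriate once the full response has been read, as it has been at the point where the code in the question closes the socket.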

Len Holgate