The TIBCO EMS user's guide (pg 292) says that the backup server will work indefinitely to either A) become the primary server or B) reconnect to the primary server. It also says clients may receive fail-over notification when the switch is successful (see also TIBCO EMS .NET reference pg 220). I have some questions spinning off of these facts...

  1. What kind of errors occur on the client side while the servers are attempting fail-over/reconnect?
  2. What is the appropriate response from the client?
    • Get new Connection objects from the ConnectionFactory until one works?
    • Wait for fail-over notification? (Are the existing Connection instances repaired at that point, or do I need to get a new instance?)

I hope the scenario is clear. Any related information or advice would be appreciated too.

+2  A: 

I can at least answer #1 above.

If you have enabled Tibems.SetExceptionOnFTSwitch(true); and have set up an exception handler to capture the messages the server sends to the client, you will see the following:

For single-server, non-fault-tolerant connection failures: "Connection has been terminated".

For fault-tolerant connection failures: "Connection has performed fault-tolerant switch to "

If you attempt to publish while the connection is down, a TIBCO.EMS.IllegalStateException is thrown with the "Producer is closed" message.
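To make the three cases above easier to act on, here is a minimal sketch that sorts an exception message into one of them. It is plain Java with no EMS dependency, so it only looks at the message text quoted above; the class name EmsFailureKind and the Kind enum are made up for illustration, and matching on message text is an assumption that may break if the wording differs between EMS versions. A .NET handler could do the same with string checks.

```java
// Hypothetical helper: classifies the message text quoted in this answer.
// Nothing here calls the EMS library; it only inspects strings.
class EmsFailureKind {
    enum Kind { TERMINATED, FT_SWITCH, PRODUCER_CLOSED, OTHER }

    static Kind classify(String message) {
        if (message == null) {
            return Kind.OTHER;
        }
        // Fault-tolerant switch: the message continues after "to", so match the prefix only.
        if (message.contains("Connection has performed fault-tolerant switch to")) {
            return Kind.FT_SWITCH;
        }
        // Single-server (non-fault-tolerant) connection failure.
        if (message.contains("Connection has been terminated")) {
            return Kind.TERMINATED;
        }
        // Raised when publishing while the connection is down.
        if (message.contains("Producer is closed")) {
            return Kind.PRODUCER_CLOSED;
        }
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(classify("Connection has been terminated"));
        System.out.println(classify("Producer is closed"));
    }
}
```

If the client library exposes an error code or a dedicated exception type for these cases, that would be a sturdier thing to switch on than the text.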

For #2 above, I think the answer is to let the EMS library handle as much as possible. Once we got the EMS reconnect functionality working, it kept trying to reconnect until the server became available again, and once it reconnected, it was as if there had never been a problem. The only gotcha is probably trying to publish a message before the EMS connection is back. This is where the exception handler comes in: once notified that you are in failover mode, you can adjust exception handling on the publisher side to suppress the error until the connection is back. What I don't know is how to tell when all reconnect attempts have been exhausted.
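The publisher-side suppression described above could look roughly like the sketch below. It is plain Java with no EMS dependency: the send callback stands in for the real producer call, java.lang.IllegalStateException stands in for the "Producer is closed" error, and FailoverAwarePublisher with all of its methods is a hypothetical name. Holding messages in a queue and replaying them after the switch is my own assumption about what "suppress" should mean; dropping or logging them would be just as valid.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical wrapper: swallows publish errors while the connection is down
// and replays the held messages once the caller reports the connection is back.
class FailoverAwarePublisher {
    private final Consumer<String> send;              // stand-in for the real producer send call
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean connectionDown = false;

    FailoverAwarePublisher(Consumer<String> send) {
        this.send = send;
    }

    // Returns true if the message was sent now, false if it was held back.
    synchronized boolean publish(String message) {
        if (connectionDown) {
            pending.add(message);
            return false;
        }
        try {
            send.accept(message);
            return true;
        } catch (IllegalStateException e) {
            // Assumed signal that the producer is closed: hold the message instead of failing.
            connectionDown = true;
            pending.add(message);
            return false;
        }
    }

    // Call this from the exception handler when the fail-over notification arrives.
    synchronized void onConnectionRestored() {
        connectionDown = false;
        while (!pending.isEmpty()) {
            try {
                send.accept(pending.peek());
            } catch (IllegalStateException e) {
                connectionDown = true;                // still down: keep the rest queued
                return;
            }
            pending.remove();
        }
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

The queue here is unbounded, so a long outage would need a size cap or a timeout in real code. That is also one place to deal with the open question about exhausted reconnect attempts: give up after a limit you choose rather than waiting on the library.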

Anyway, it seems our two worlds are closely related when it comes to EMS. I hope our findings (based on your comments on my questions) help you.

ScottCher