views:

620

answers:

4

I have clients that need to all connect to a single server process. I am using UDP discovery for the clients to find the server. I have the client and server exchange IP address and port number, so that a TCP/IP connection can be established after completion of the discovery. This way the packet size is kept small. I see that this could be done in one of two ways using UDP:

  1. Each client sends out its own multicast message in search of the server, which the server then responds to. The client can repeat sending this multicast message in regular intervals (in the case that the server is down) until the server responds.
  2. The server sends out a multicast message beacon at regular intervals. The clients subscribe to the multicast group and in this way receives the server's multicast message and complete the discovery.

In 1. if there are many clients then initially there would be many multicast messages transmitted (one from each client). Only the server would subscribe and receive the multicast messages from the clients. Once the server has responded to the client, the client ceases to send out the multicast message. Once all clients have completed their discovery of the server no further multicast messages are transmitted on the network. If however, the server is down, then each client would be sending out a multicast message beacon in intervals until the server is back up and can respond.

In 2. only the server would submit a multicast message beacon in regular intervals. This message would end up getting routed to all clients that are subscribed to the multicast group. Once the clients receive the packet the client's UDP listening socket gets closed and they are no longer subscribed to the multicast group. However, the server must continue to send the multicast beacon, so that new clients can discover it. It would continue sending out the beacon at regular intervals regardless of whether any clients are out their requiring discovery or not.

So, I see pros and cons either way. It seems to me that #1 would result in heavier load initially, but this load eventually reduces down to zero. In #2 the server would continue sending out a beacon forever.

UDP and multicast is a fairly new topic to me, so I am interested in finding out which would be the preferred approach and which would result in less network load.

+1  A: 

I would recommend method #2, as it is likely (depending on the application) that you will have far more clients than you will servers. By having the server send out a beacon, you only send one packet every so often, rather than one packet for each client.

The other benefit of this method, is that it makes it easier for the clients to determine when a new server becomes available, or when an existing server leaves the network, as they don't have to maintain a connection to each server, or keep polling each server, to find out.

X-Cubed
+1  A: 

Both are equally viable methods.

The argument for method #1 would be that in normal principle, clients initiate requests, and servers listen and respond to them.

The argument for method #2 would be that the point of multicast is so that one host can send a packet and it can be received by many clients (one-to-many), so it's meant to be the reverse of #1.

OK, as I think about this I'm actually drawn to #2, server-initiated beacon. The problem with #1 is that let's say clients broadcast beacons, and they hook up with the server, but the server either goes offline or changes its IP address.

When the server is back up and sends its first beacon, all the clients will be notified at the same time to reconnect, and your entire system is back up immediately. With #1, all of the clients would have to individually realize that the server is gone, and they would all start multicasting at the same time, until connected back with the server. If you had 1000 clients and 1 server your network load would literally be 1000x greater than method #2.

I know these messages are most likely small, and 1000 packets at a time is nothing to a UDP network, but just from a design standpoint #2 feels better.

Edit: I feel like I'm developing a split-personality disorder here, but just thought of a powerful point of why #1 would be an advantage... If you ever wanted to implement some sort of natural load balancing or scaling with multiple servers, design #1 works well for this. That way the first "available" server can respond to the client's beacon and connect to it, as opposed to #2 where all the clients jump to the beaconing server.

routeNpingme
+1  A: 

Your option #2 has a big limitation in that it assumes that the server can communicate more or less directly with every possible client. Depending on the exact network architecture of your operational system, this may not be the case. For example, you may be depending that all routers and VPN software and WANs and NATs and whatever other things people connect networks together with, can actually handle the multicast beacon packets.

With #1, you are assuming that the clients can send a UDP packet to the server. This is an entirely reasonable expectation, especially considering the very next thing the client will do is make a TCP connection to the same server.

If the server goes down and the client wants to find out when it's back up, be sure to use exponential backoff otherwise you will take the network down with a packet storm someday!

Greg Hewgill
Option #1 is also based on sending a multicast packet, it's just going in the opposite direction - so similar caveats apply.
caf
With #1, if the server goes down, we would get a sudden spike in multicast messages from clients. However, only the server would be subscribed to the multicast group. So,if I understand this correctly the router would route all messages to a single end point (the server). With #2 you could have 100+ clients subscribed to the server's multicast and the router would route 100+ messages to each of the client end points. Wouldn't the network load be the same in both cases? Would receipt of 100+ packets arriving at the same time at the server (#2) be a concern? This gradually/rapidly reduces to 0.
Elan
Correction: Actually, with #1, the client packets would not all arrive at the same time at the routers and server. The client's each have different startup times. The client beacon could transmit let's say in 30sec intervals. Each packet is maybe 200 bytes in size. Even with 1000 packets wouldn't we have ample (computer) time for the routers and server to process? The client could also slow down the beacon if there are too many failed attempts.
Elan
+1  A: 

I've used option #2 in the past several times. It works well for simple network topologies. We did see some throughput problems when UDP datagrams exceeded the Ethernet MTU resulting in a large amount of fragmentation. The largest problem that we have seen is that multicast discovery breaks down in larger topologies since many routers are configured to block multicast traffic.

The issue that Greg alluded to is rather important to consider when you are designing your protocol suite. As soon as you move beyond simple network topologies, you will have to find solutions for address translation, IP spoofing, and a whole host of other issues related to the handoff from your discovery layer to your communications layer. Most of them have to do specifically with how your server identifies itself and ensuring that the identification is something that a client can make use of.

If I could do it over again (how many times have we uttered this phrase), I would look for standards-based discovery mechanisms that fit the bill and start solving the other protocol suite problems. The last thing that you really want to do is come up with a really good discovery scheme that breaks the week after you deploy it because of some unforeseen network topology. Google service discovery for a starting list. I personally tend towards DNS-SD but there are a lot of other options available.

D.Shawley
This is very helpful information. I am developing a product in C# that will be deployed to customers. Needless to say, network and router configurations could vary greatly from one site to the next. There will have a client service running on user workstations and a single server host. WCF will be used for normal messages.DNS-SD is an interesting proposition, I need to look into this some more. I found one interesting article: http://www.smallegan.com/blog/?p=28
Elan