views:

480

answers:

3

Consider a large-scale, heterogeneous network of various devices that provide services to one another in a peer-to-peer fashion. Service availability across all nodes is currently tracked with TCP sockets marked as keep-alive, which usually stay open for as long as the node is online. As a result, every node holds an open socket to every other node (within a subnet of the peer-to-peer infrastructure).

What arguments exist regarding the scalability of using TCP keep-alive in this way?

My alternative approach is to use a publish/subscribe model, where nodes push new services to the network as they become available, and their peers cache them for when they want to subscribe to a service. Does this sound feasible?

A: 

Yeah, using keep-alive seems like a bad idea for any P2P network. Not only would I keep connections open only while data is being transferred, I would also send node state updates over a separate socket altogether so they do not interfere with file transmissions.

Spencer Ruport
+1  A: 

From what you wrote, I take it that the communication is strictly point-to-point, with connections held for a considerable duration ('leases'). If so, you will gain nothing from a publish-subscribe model. If not, then yes, you could change the network model to match the communication pattern, and your idea sounds reasonable.

Regarding the scalability of keep-alive: since a TCP socket with keep-alive is essentially just a contract between two endpoints, there is no (or only a very small) intrinsic cost to maintaining it. In practice YMMV, since different socket implementations require different resources, and other actions might be required to keep the channel open. There are, however, many implementations that need very few resources per open socket (select()-type, for example).
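For reference, this is roughly what marking a socket as keep-alive looks like. It is only a sketch: the helper name `make_keepalive_socket` and the timing values are made up for illustration, and the per-socket tuning options (`TCP_KEEPIDLE`, `TCP_KEEPINTVL`, `TCP_KEEPCNT`) are platform-specific, so the code applies them only where Python exposes them.

```python
import socket

def make_keepalive_socket(idle=60, interval=10, probes=5):
    """Create a TCP socket with keep-alive probing enabled.

    idle/interval/probes are illustrative values, not recommendations.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_KEEPALIVE asks the OS to probe an idle connection periodically.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Fine-grained timing knobs are not available on every platform,
    # so look each one up and skip it if it is missing or rejected.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # failed probes before the peer is declared dead
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            pass  # option exists in Python but the OS refused it
    return sock
```

The probes themselves are handled by the operating system, so the application does no extra work per connection beyond holding the socket open.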

A discovery service (publish/subscribe of services) makes most sense if there are many implementers of the same type of service, and you cannot (or do not want to) predict statically where they will appear.

In short, I would say that you should only change the design if the type of communication that you have fits the current architecture badly. Your idea certainly sounds very feasible, but more information about the communication patterns would be necessary to make an estimation of the outcome.

disown
A: 

If your TCP keep-alive mechanism is used only for tracking service availability (meaning you never send service requests or responses across these connections), TCP sockets are definitely overkill. Each open TCP socket consumes resources (kernel state and buffers) on both endpoints, and in a full mesh that cost grows with the number of peers.

A more scalable method would be a publish/subscribe model in which each node sends UDP publish messages at regular intervals to advertise that its service still exists. A disconnecting node could also publish a service-down message to declare the end of its service gracefully.
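A minimal sketch of that heartbeat scheme, to make the idea concrete. Everything here is an assumption for illustration: the JSON message layout, the names `encode`, `publish` and `ServiceCache`, and the interval and expiry values. A peer counts a service as available while its last "up" announcement is recent enough, and drops it immediately on a "down" message.

```python
import json
import socket
import time

HEARTBEAT_INTERVAL = 5.0          # seconds between announcements (assumed value)
EXPIRY = 3 * HEARTBEAT_INTERVAL   # tolerate a couple of lost datagrams

def encode(kind, node, service):
    """Serialize an 'up' or 'down' announcement as a small JSON datagram."""
    msg = {"kind": kind, "node": node, "service": service}
    return json.dumps(msg).encode("utf-8")

def publish(sock, addr, kind, node, service):
    """Send one announcement over a UDP socket.

    addr could be a broadcast or multicast address for the subnet.
    """
    sock.sendto(encode(kind, node, service), addr)

class ServiceCache:
    """Per-peer cache of which nodes currently advertise which services."""

    def __init__(self, expiry=EXPIRY):
        self.expiry = expiry
        self.last_seen = {}  # (node, service) -> time of last 'up' message

    def on_datagram(self, data, now=None):
        """Update the cache from one received announcement."""
        now = time.monotonic() if now is None else now
        msg = json.loads(data.decode("utf-8"))
        key = (msg["node"], msg["service"])
        if msg["kind"] == "down":
            self.last_seen.pop(key, None)   # graceful shutdown
        else:
            self.last_seen[key] = now       # heartbeat refreshes the entry

    def providers(self, service, now=None):
        """Return the nodes whose announcement for `service` has not expired."""
        now = time.monotonic() if now is None else now
        return sorted(
            node
            for (node, svc), seen in self.last_seen.items()
            if svc == service and now - seen <= self.expiry
        )
```

With this shape, a node that crashes without sending "down" simply ages out of every peer's cache after `expiry` seconds, and no peer has to hold a connection open to it.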

Going further, if you want the best results on really large-scale networks and are ready to put in some time and effort, consider a structured P2P mechanism such as a DHT (distributed hash table).
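To give a feel for the core idea behind a DHT, here is a small consistent-hashing sketch: each service name hashes to a point on a ring, and the node whose ID follows that point is responsible for storing who provides the service. This is a simplification under a stated assumption, namely that every peer knows the full node list. Real DHTs add routing so that a node only needs to know a small part of the network. The names `ring_id` and `Ring` are hypothetical.

```python
import bisect
import hashlib

def ring_id(name):
    """Map a node or service name to a position on the hash ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

class Ring:
    """Consistent-hash ring over a known set of node names."""

    def __init__(self, nodes):
        # Sort nodes by ring position so lookups can use binary search.
        self._entries = sorted((ring_id(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._entries]

    def owner(self, service):
        """Return the node responsible for `service`.

        That is the first node at or after the service's ring position,
        wrapping around to the start of the ring if necessary.
        """
        idx = bisect.bisect_left(self._positions, ring_id(service))
        return self._entries[idx % len(self._entries)][1]
```

The useful property is that when a node joins or leaves, only the keys next to it on the ring change owner, so service records do not have to be reshuffled across the whole network.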

nik