From a simplicity standpoint, the quickest and easiest way to accomplish what you're looking for is to 'round-robin' your cluster, so that for every request a machine is selected (by a cluster management service or some such) to process it. Client requests don't go directly to the machine that handles them; they instead point to a single endpoint, which acts as a proxy and distributes incoming requests to machines based on availability and load. To quote the link referenced below:
Network Load Balancing is a way to configure a pool of machines so they take turns responding to requests. It’s most commonly seen implemented in server farms: identically configured machines that spread out the load for a web site, or maybe a Terminal Server farm. You could also use it for a firewall(ISA) farm, vpn access points, really, any time you have TCP/IP traffic that has become too much load for a single machine, but you still want it to appear as a single machine for access purposes.
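To make the "take turns" idea concrete, here is a minimal sketch of round-robin selection that skips unavailable machines. It is only an illustration of the concept, not how NLB itself is implemented; the `RoundRobinPool` class and the server names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RoundRobinPool:
    """Hands out machines in turn, skipping any marked unavailable.

    Hypothetical illustration only; a real load balancer also tracks
    load, health checks, session affinity, and so on.
    """
    machines: list[str]
    unavailable: set[str] = field(default_factory=set)
    _next: int = 0

    def pick(self) -> str:
        # Try each machine at most once, starting from the rotation point.
        for _ in range(len(self.machines)):
            candidate = self.machines[self._next]
            self._next = (self._next + 1) % len(self.machines)
            if candidate not in self.unavailable:
                return candidate
        raise RuntimeError("no machines available")


# Made-up server names; "server-b" is treated as being down.
pool = RoundRobinPool(["server-a", "server-b", "server-c"])
pool.unavailable.add("server-b")
picks = [pool.pick() for _ in range(4)]
print(picks)  # ['server-a', 'server-c', 'server-a', 'server-c']
```

Clients only ever talk to the pool (the single endpoint); which machine actually answers is decided per request.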
As for your application being 'active', that requirement does not factor into this equation: whether 'active' or 'passive', the application still makes a request to your servers.
Commercial load balancers exist for serving HTTP-style requests, so those may be worth looking into, but given the load balancing features built into Windows Server 2008, you may be best served by tapping into those.
For more info on how to configure that in Windows Server 2008, see this article.
This article is much more technical and focuses on using NLB with Exchange, but the principles should still apply to your situation.
See here for another detailed walk-through of NLB setup and configuration.
Failing that, you may be well served by searching / posting on ServerFault, since your application code is not (and should not be) strictly aware that the NLB even exists.
EDIT: added another link.
EDIT (the 2nd): The OP has corrected my erroneous conclusion about the 'active' vs. 'passive' concept. My answer to that is very similar to my original answer, except that the 'active' service (which, since you're using WCF, could easily be a Windows service) could be split into two parts: the actual processing portion and the management portion. The management portion would run on a single server and act as a round-robin load balancer for the other servers doing the actual processing. It's slightly more complicated than the original scenario, but I believe it would provide a good deal of flexibility and offer a clean separation between your processing and management logic.
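As a rough sketch of that management/processing split (in Python rather than WCF, purely to show the shape of it): the `Manager` class and `make_worker` helper are hypothetical names, and the workers are in-process stand-ins for what would really be calls to services on other servers.

```python
from itertools import cycle
from typing import Callable

# A worker takes a job description and returns a result string.
Worker = Callable[[str], str]


def make_worker(name: str) -> Worker:
    """Stand-in for the processing portion running on another server.

    Assumption: a real worker would be reached over the network
    (e.g. a service call) instead of a local function call.
    """
    def process(job: str) -> str:
        return f"{name} processed {job}"
    return process


class Manager:
    """Management portion: hands each job to the next worker in turn."""

    def __init__(self, workers: list[Worker]) -> None:
        if not workers:
            raise ValueError("at least one worker is required")
        self._rotation = cycle(workers)

    def dispatch(self, job: str) -> str:
        worker = next(self._rotation)
        return worker(job)


manager = Manager([make_worker("worker-1"), make_worker("worker-2")])
results = [manager.dispatch(f"job-{i}") for i in range(3)]
for line in results:
    print(line)
```

The point of the split is that the processing code never needs to know how jobs are distributed; you can swap the rotation for something load-aware later without touching the workers.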