tags:

views:

821

answers:

2

What are some good articles/ resources to understand how load balancing is configured with Biztalk --- both in terms of inherent abilities of the product as well as employing NLB (network load balancing with Windows 2003 or later editions)?

EDIT: I am specifically interested in the impact of application protocol on load balancing? For example, how two instances of Biztalk server handle TCP/IP connections when the other party (to which Biztalk makes a connection request) doesn't allow more than one connection, etc.

+4  A: 

The obvious resource is MSDN - There is a section entitled Planning for High Activity that covers most of the concepts and will give you the right terminology to then go looking for other resources on the web. As with a lot of Microsoft server products, MSDN also has lots of white papers covering specific BizTalk scenarios.

Most good BizTalk books also include a section on load balancing concepts (Professional BizTalk Server 2006 has an example).

Beyond that, there are several key concepts that you may find helpful, particularly around the use of terminology (some of BizTalk's usage can be misleading).

Load Balancing

BizTalk Server is, by the nature of its architecture, load balancing. What that means is that if you have more than one BizTalk Host connecting to a MessageBox database, the messages within the database will be spread evenly across the hosts participating in the BizTalk group. (with caveats around just what BizTalk processess have been configured to run in each Host).

There is also the concept of Network Load Balancing which is Microsoft Network Load Balancing Services or any equivalent service. In BizTalk this applies at the web level, for receive adapters using the HTTP protocol (e.g. the HTTP adapter, the SOAP adapter and WCF HTTP adapters). This load balancing is not actually a BizTalk service but is instead a load balancing layer provided on top of the BizTalk isolated host adapters to ensure high availability of the web resources. It is configured the same as any other NLB service.

Clustering

When clustering is mentioned in BizTalk it is used to refer to one of two things - clustering at the SQL layer to provide high availability and failover, and BizTalk Host Clustering.

SQL Clustering - this is simply (though it isn't simple to do, just say) a matter of providing a SQL server cluster that runs the BizTalk server databases, allowing for database failover. This is not a BizTalk specific technology.

BizTalk Host clustering - in this case a BizTalk Server Host is marked as being clustered when creating it inside BizTalk. This is a BizTalk specific setting that essentially states that one and only one instance of the host will be running at a time, and that by extension all the resources within this host will also only have a single instance. It is primarily intended for usage for adapters like the FTP and MSMQ adapters that behave incorrectly when more that one is allowed to run at the same time.


This edit is in response to the OP's comment asking for further details. Hopefully this make things clearer. If you have more questions about specifics I can possibly answer them but this pretty much exhausts my theory knowledge about high availability environment configuration. I'm primarily a BizTalk dev and solution designer, when it comes to network intricacies there are people where I work who fill in the nitty gritty detail and implementation of these designs.

Network Load Balancing for HTTP Based Adapters

The key point I was trying to express here was that Network Load Balancing in the context of BizTalk is no different for any other Network Load Balancing scenario.

BizTalk has two type of hosts, In Process and Isolated. In Process hosts are individual BizTalk services running on servers (with one host instance per server). Isolated hosts are actually delegates to a web server (IIS) that handle all HTTP based adapters (the HTTP adapter and the SOAP adapter plus certain configurations for the WCF adapter)

When you introduce Network Load Balancing to a BizTalk environment what you are doing is intoducing it at the web server layer, for the Isolated host hosted adapters.

Here is the MSDN page for the introduction to NLB. One of the key points about NLB is expressed in the page in the following quote:

Network Load Balancing allows all of the computers in the cluster to be addressed by the same set of cluster IP addresses (but also maintains their existing unique, dedicated IP addresses).

By setting up NLB you allow multiple isolated host servers to handle internet traffic directed at a single dedicated IP address. The NLB configuration farms out the work.

Clustering BizTalk Adapter Handlers

In my answer above I stated that certain BizTalk adapters behave incorrectly when allowed to run within multiple BizTalk Host Instances. This is very adapter specific in terms of the why, so the best expansion on that answer I can give is the following quote from the MSDN documentation, dealing with the FTP adapter specifically.

For most of the BizTalk integrated adapters, high availability can be achieved by creating multiple adapter handlers to run on BizTalk host instances on different BizTalk servers within a BizTalk group. FTP adapter receive handlers should not, however, be configured to run in multiple BizTalk host instances simultaneously. This recommendation is made because the FTP receive adapter uses the FTP protocol to retrieve files from the target system and the FTP protocol does not lock files to ensure that multiple copies of the same file are not retrieved simultaneously when running multiple instances of the FTP receive adapter.

As they say, the FTP adapter utilises the FTP protocol which does not lock files. Because BizTalk is natively a highly parallel system, if you allowed multiple BizTalk hosts to host an instance of the FTP adapter you would end up with multiple copies of the same FTP message received into your BizTalk system. What BizTalk clustering does is ensure that any clustered BizTalk hosts will run on 1 and only 1 host instance. By placing your FTP receive handler inside a clusterd host, you ensure that:

  • you will always have an FTP adapter running so long as a BizTalk host is running
  • you will never have more than one FTP adapter running.

Additionally you can use a BizTalk clustered host to reduce load on a system. For example, a BizTalk SQL adapter receive location that has been configured to poll, will poll on all host instances. While this would not necessarily cause multiple message instances, it could cause undue load on the SQL server you poll, or even create deadlock scenarios depending on the design of the called stored procedure, so clustering the SQL Adapter receive handler can be a good idea.

David Hall
Can you please shed more light on issues/ ideas related to load balancing "adapters." The first part of your answer mentions HTTP, SOAP and WCF but the exact issue (as to why a different strategy is needed) isn't explained. Your second part mentions FTP And MSMQ which you say can't be load balanced at all. Shed some more light, please!
Jaywalker
I can't upvote enough. This is one of the best answers/ advice I received on SO.
Jaywalker
A: 

Hi David,

Currently in our environment we have just one BizTalk server using the HTTP adapter (receive XML files via AS2 with SSL certificate for encryption and digital signature) but we want to add a second BizTalk server to share the HTTP requests coming from our EDI suppliers.

Is there any restriction with BizTalk 2006 R2 STANDARD Edition for configuring NLB when using the HTTP adapter in this scenario?

Many Thanks, Jose

Jose Gonzalez