views:

70

answers:

2

This is my first attempt at building a sort of wiki (general one) so please support! :) I am compiling a list of methods that deal with transfering and distributing data. Please do add your thoughts to my list. I'll update the document and re-post for everyone's benefit. If I have missed a method out, please let me know and I will edit it!

  1. Replication
  2. BCP/Bulk Insert
  3. Distributed Queries/Linked Servers
  4. SSIS

Replication

  1. Applications require data to be consolidated from one or more servers to a central repository
  2. Move your reporting functionality off your main SQL Server to a separate SQL Server
  3. Support a large group of users from multiple servers
  4. Need to filter rows/columns

BCP/Bulk Insert

I am not very familiar with BCP utility or how it can be helpful in light of other services available. This is where contribution would be helpful! But here is what I can gather:

  1. BCP: Transfer things very quickly
  2. Bulk Insert: Useful as a inline T-SQL solution. Can be scheduled. However, data format must be very clean

Distributed Queries/Linked Servers

  1. Linked servers: Provide access to a wider variety of data sources than SQL Server replication provides
  2. Linked servers: Connect with data sources that replication does not support or which require ad hoc access
  3. Linked servers: Perform better than OPENDATASOURCE or OPENROWSET
  4. OPENDATASOURCE and OPENROWSET functions: Convenient for retrieving data from data sources on an ad hoc basis. OPENROWSET has BULK facilities as well that may/may not require a format file which might be fiddley
  5. OPENQUERY: Doesn't support variables
  6. All are T-SQL solutions. Relatively easy to implement and set up
  7. All are dependent on connection between source and destionation which might affect performance and scalability

SSIS Again I dont have much experience with SSIS. Here is what I know:

  1. Connection to different databases

What are your experiences and thoughts regards the all the above mentioned methods?

A: 

SSIS can do a lot more than just move data around. It can perform operations on XML files, call web services, send email, execute arbitrary script (VB.NET in 2005, C# in 2008). It can perform operations on MSMQ and WMI, operate on spreadsheets and flat files, etc.

John Saunders
A: 

Service Broker

  1. Extreme performance. Tight integration with SQL Server engine transactions and data access allows for high data throughput.
  2. Offers reliable guaranteed delivery in face of unreliable network links and downtime in communication partners availability. The applications can simply SEND the data even when offline or not connected, SSB will handle transparently the needed retries while maintaining Exactly Once In Order guarantees.
  3. Decoupling of physical location from application code via routes. Applications can send to data logical names, the resolution of logical name to physical location is resolved at deployment, allowing for in-flight adjustments as the topology changes.
  4. Built in support for high availability solutions, clustering and mirroring. Mirroring can be deployed transparently without application changes as only routing is changed.
  5. Cross domain communication. Certificate based authentication and authorization allows for secure communication between unrelated windows domains, like B2B scenarios based on Internet.
  6. Activation on message arrival allows for dynamic scalability of data transfer processing.
  7. Correlated processing exclusive lock-out. Conversation groups remove the burden of locking correlated data itmes processing from the application developer. As long is the conversation group key is used as data partitioning key for business items SSB guarantees than no two requests will process the same business item.
  8. Priority support. Higher priority items can be send and processed ahead of the queue. This applies to the message delivery infrastructure honoring the message priority and the RECEIVE verb also dequeueing messages based on priority.
  9. Large scale deployments. SSB allows for solutions involving thousands of communication partners. It has built-in support for multiplexing of communications by deploying dedicated message forwarders that act as communication nodes, reducing the number of physical connections (sockets) and allowing back ends to connect to very large number of clients (tens and hundreds of thousands).
  10. Built-in load balancing for scale out. New data processing sites can be added transparently on-the-fly as simple deployment time routing operations.
  11. SSB is a general communication framework and as such does not offer out-o-the box table replication capabilities (change tracking, conflict resolution or schema change propagation), these are left as application semantics of the SSB application.
  12. Everything is T-SQL. No new language to learn, no new tools to master.
  13. Poor choice of support tools for set-up, monitoring and troubleshooting. SQL Server 2008 has a diagnostics tool that can help troubleshoot most SSB runtime and deployment issues.
Remus Rusanu