We have customers who, for unassailable reasons, cannot use SQL Server's built-in backup features because those features back up the entire database. These customers need to break the database up into subsets based on who owns the data, so the appropriate parties can back up their own data according to their own rules. My question is two-fold:

  1. How would I do something like this?
  2. Have you ever been asked to partition your backups like this? If not, have you ever been asked to do something that appears to fly in the face of the industry standard? People in our company suggest we/I should simply "roll our own" backup process that backs up just the required subsets of data. This means, of course, that we/I should also "roll our own" restore process. When I argue that this is the definition of reinventing the wheel and part of why we chose SQL Server in the first place, I get the sense that they think I am being a tech snob and/or lazy.

I suspect their opinions are based on experience with another product that was Access-based and stored each logical unit in a separate database that could simply be copied.

Any advice other than find a different employer?

+1  A: 

I can see why one would like to do it this way: it's an easy way to open up the DB for an advanced customer and let the customer work with, and mess around with, only its own data. They can use that to create their own reports with direct access to the data source and do whatever they want with it.

I would call that an "export" and "import" of data, not a "backup", but that's playing with words. We do that kind of export a lot in some of our systems.

On the "How to" I have to have more information, do they want it exported to another server, same sever but another database, or something else?

It could be done by jobs running at night or by a service pushing the data. Other tools exist for this too, such as DTS/SSIS packages running at night or on a trigger, or a custom program that fetches the data when requested.

Edit: to answer the comment:
In most cases we drop the existing subset db, restore an empty db, and fill it with the filtered data. Another way is to take a full backup, restore it as a new database, and delete the rows that are not part of the subset.

I presume that the subset db is more of a "read-only" db with statistical data, so you don't have to worry about writing over changes and so on.
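
For illustration, a minimal T-SQL sketch of that drop-and-refill approach; the names (SubsetDb, MainDb, dbo.Orders, CustomerId) and the pre-made empty-schema backup are assumptions, not anything from a real system:

    -- Drop the old subset database and restore a backup that contains
    -- only the empty schema (assumed to exist already).
    IF DB_ID('SubsetDb') IS NOT NULL
        DROP DATABASE SubsetDb;

    RESTORE DATABASE SubsetDb
        FROM DISK = 'D:\Backups\SubsetDb_EmptySchema.bak'
        WITH REPLACE;

    -- Fill it with one customer's rows, filtered out of the main database.
    -- (IDENTITY columns would also need SET IDENTITY_INSERT here.)
    INSERT INTO SubsetDb.dbo.Orders (OrderId, CustomerId, OrderDate, Amount)
    SELECT OrderId, CustomerId, OrderDate, Amount
    FROM   MainDb.dbo.Orders
    WHERE  CustomerId = 42;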

Stefan
And is it manageable? Do you export it into another SQL database and then detach it or do you export it into some kind of file? Do you really restore from such an export?
flipdoubt
@flipdoubt, I edited my answer with information about that.
Stefan
+1  A: 

I can totally understand why a company would want to do this, especially if they offer a hosted solution and are sharing a single database between multiple customers, or something similar. It seems like filtering records out by a customerId field and dumping them to a file would just work and that would be the end of it... however, they are playing with fire on this one.

Without looking at the database in question, it is hard to give pointers on why this is a bad idea. But a few come to mind immediately:

  1. Loss of transaction log backups, and with them point-in-time recovery.

  2. Auto-incrementing IDs don't take kindly to inserting "missing" records by hand, and switching on IDENTITY_INSERT and/or disabling constraints during the load is just asking for referential integrity issues (see the sketch after this list).

  3. What about shared data? Are there any tables being used across multiple customers? What happens when that data changes over time... where will you retrieve the latest backup just for that data? And how would it affect other customers living in the same database?

  4. Foreign keys... you would have to analyze all of the tables and ensure that tables without foreign keys are inserted first. It isn't impossible, but there's quite a bit of room for error.

  5. What happens when the schema changes? If you are backing up all this data as individual inserts, then they will no longer work as-is without matching the schemas back up.
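
To make points 2 and 4 concrete, here is a rough sketch of what hand-loading rows back in tends to look like; the tables and values are made up:

    -- Re-inserting rows with their original auto-generated IDs requires
    -- IDENTITY_INSERT, one table at a time, parents before children.
    SET IDENTITY_INSERT dbo.Customers ON;
    INSERT INTO dbo.Customers (CustomerId, Name)
    VALUES (42, 'Contoso');                 -- parent row must exist first
    SET IDENTITY_INSERT dbo.Customers OFF;

    SET IDENTITY_INSERT dbo.Orders ON;
    INSERT INTO dbo.Orders (OrderId, CustomerId, OrderDate)
    VALUES (1001, 42, '2009-01-15');        -- fails if the parent row is
    SET IDENTITY_INSERT dbo.Orders OFF;     -- missing, unless the FK is disabled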

There's all kinds of things to consider. Personally, if I were them, I would start with a single SQL Server backup of the entire database (better yet, separate their customers out into different databases instead of making them all share one big database), creating differentials daily (or whatever schedule best fits their needs). Then, as an additional service, they could offer some method of exporting and importing data, whether it be via XML, CSV, whatever. Allow the customers to perform backups of their data through export, and if needed, they can re-import it any time, allowing for duplicate checking, etc.
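
A minimal sketch of that standard approach, assuming a placeholder database name (CustomerData) and backup path:

    -- Weekly full backup.
    BACKUP DATABASE CustomerData
        TO DISK = 'D:\Backups\CustomerData_full.bak'
        WITH INIT;

    -- Daily differential backup (changes since the last full backup).
    BACKUP DATABASE CustomerData
        TO DISK = 'D:\Backups\CustomerData_diff.bak'
        WITH DIFFERENTIAL, INIT;

    -- Frequent log backups if point-in-time recovery matters
    -- (requires the FULL recovery model).
    BACKUP LOG CustomerData
        TO DISK = 'D:\Backups\CustomerData_log.trn';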

With this approach, you can always guarantee you have a method of reviving a backup that meets Microsoft standards. Data isn't a toy, and SQL Server isn't a beast for nothing... there's a lot more to SQL Server's backup process than just pulling data out of a database and throwing it somewhere. An entire company can be brought to its knees just by failing to properly safeguard its data, and the worst part is that most don't realize until the last minute that their custom backup process doesn't work at restore time... ouch.

Last, but not least, there may be tools that fit the job. Red Gate offers a lot of great SQL Server tools, like SQL Compare, SQL Data Compare, and their own SQL Backup application. Regardless, I would use them as a last resort...

http://www.red-gate.com/

Lusid
+1  A: 

The short answer is there isn't a native way to handle this.

The longer answer is: if you created a new database with just the schema, then loaded the customer's data into it from the main database, you could then back up the smaller database into a single backup file and give it to them.

SSIS will probably be your best bet, as you can use its native tasks to grab all the table schemas and create them empty, then define the transformations for the customer-specific tables, then loop through the lookup tables copying over the data for all of those tables.
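
Whatever tool builds the per-customer copy, the hand-off step is just an ordinary native backup of the smaller database. A sketch under assumed names (Customer42Db, MainDb, and the tables shown):

    -- Copy shared lookup tables as-is...
    INSERT INTO Customer42Db.dbo.Countries (CountryId, Name)
    SELECT CountryId, Name FROM MainDb.dbo.Countries;

    -- ...and only this customer's rows from the customer-specific tables.
    INSERT INTO Customer42Db.dbo.Orders (OrderId, CustomerId, OrderDate)
    SELECT OrderId, CustomerId, OrderDate
    FROM   MainDb.dbo.Orders
    WHERE  CustomerId = 42;

    -- Hand the customer a single, standard backup file of the small database.
    BACKUP DATABASE Customer42Db
        TO DISK = 'D:\Exports\Customer42Db.bak'
        WITH INIT;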

mrdenny
A: 

Do you have the option of creating multiple databases? You might be able to derive a solution where one "central" database contains views which effectively union the other databases' tables together. I know some web filtering apps do this; of course, they don't do the updates this way. But it may be workable, and in that case every database can be backed up using native means.
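
A rough sketch of what the central database might contain, assuming per-customer databases CustomerA and CustomerB with identically shaped Orders tables (all names are placeholders):

    -- In the central database: a view that unions the per-customer databases.
    CREATE VIEW dbo.AllOrders
    AS
    SELECT 'CustomerA' AS SourceDb, OrderId, OrderDate, Amount
    FROM   CustomerA.dbo.Orders
    UNION ALL
    SELECT 'CustomerB' AS SourceDb, OrderId, OrderDate, Amount
    FROM   CustomerB.dbo.Orders;
    GO

    -- Each underlying database is then backed up natively, on its own schedule.
    BACKUP DATABASE CustomerA TO DISK = 'D:\Backups\CustomerA.bak' WITH INIT;
    BACKUP DATABASE CustomerB TO DISK = 'D:\Backups\CustomerB.bak' WITH INIT;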

K. Brian Kelley
A: 

Well, if you really have to do something like this, then probably the most secure way is to have a data transfer of some kind (replication, Service Broker, etc.) that moves the subsets of data into their own databases (one db per backup-able subset). Then you can back up those databases.

Since you're dealing with only subsets, I'd use Service Broker for this, since it guarantees no loss of data.
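
Once each subset lives in its own database, the backup side is plain T-SQL; a sketch that assumes the subset databases follow a hypothetical Subset_ naming convention:

    -- Back up every database whose name starts with 'Subset_'.
    DECLARE @db sysname, @sql nvarchar(max);

    DECLARE dbs CURSOR FOR
        SELECT name FROM sys.databases WHERE name LIKE 'Subset[_]%';

    OPEN dbs;
    FETCH NEXT FROM dbs INTO @db;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @sql = N'BACKUP DATABASE ' + QUOTENAME(@db) +
                   N' TO DISK = ''D:\Backups\' + @db + N'.bak'' WITH INIT;';
        EXEC sp_executesql @sql;
        FETCH NEXT FROM dbs INTO @db;
    END

    CLOSE dbs;
    DEALLOCATE dbs;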

Mladen Prajdic