views:

104

answers:

1

I am doing investigation on filestream (asking on stackoverflow while reading whitepapers and google searching), in my current screnario documents are managed in this way:

1) I have a DB table where I keep the document id and the doc path (like \fileserver\DocumentRepository\file000000001.pdf)

2) I have a document folder (\fileserver\DocumentRepository) where I store the documents

Of course I need to change this to a varbinary(max)/filestream storage.

What is the best way to perform this task?

Is it possible to say "\fileserver\DocumentRepository\file000000001.pdf" is assigned to a varbinary(max) field or I have to explicitly insert it? So somehow tell to the varbinary(max) field: "now you are a pointer to the existing document".

+1  A: 

You can't assign an existing file to a varbinary(max)/filestream value. You have to explicitly insert it.

That being said, if for some reason this is not an option for you (e.g. you can't afford copying huge amounts of data or would hit a disk space problem while copying) there are some hacks to do the migration with 0-copy. The trick would be to do the following steps:

  1. Switch the DB to simple recovery model.
  2. Insert placeholder filestream files for all the files you're about to migrate. When inserting, use a varbinary value of 0x. While inserting, collect the (document id/file path) => (filestream file name) pairs.
  3. Stop Sql Server.
  4. Overwrite the empty filestream files with the real documents (use move/hard links to avoid copying data).
  5. Start Sql Server, perform some sanity checks (DBCC) and start a new backup chain.

Obviously, this hack is not recommended and prone do database corruption. :)

Pawel Marciniak
Yes perfect. Now that I made some tests I think that it is better for me to check for available disk space and "duplicate" the documents, or inserting and deleting one at a time. The hack you suggest is exactly the answer to my question, anyway since it is a real hack it is not suitable for me. Thanks a lot.