I have an application which will have approximately 25,000 records when the initial data import is complete. These records will each have 1-3 associated 'file attachments' (.doc, .pdf, etc). Can anyone give me advice on how to implement this functionality? Specifically, where would you store the files and how would you organize them?

I am reluctant to store them directly in the database, as this would result in a huge database. Does this seem like a valid concern? If so, I don't think I would want to see up to 100,000 files in a single folder either.

A: 

There is a group of people who cringe at the notion of storing files in a database. If you are one of those people, you can take an approach similar to the one below.

Create a table that stores all files associated with a record. For each file, create a unique key (I use a GUID) to store in that table along with other file metadata (file name, size, location, users, dates, etc.). Store the files themselves on the server.

This allows you to have a quick querying source for files and also allows you to move repositories if you need to.
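A minimal sketch of this approach, assuming Python with SQLite standing in for whatever database the application actually uses. The table name `files`, its columns, and the helper names `init_db`, `store_file`, and `resolve_path` are all hypothetical. The path is stored relative to a storage root, so moving the repository only means pointing the application at a new root.

```python
import shutil
import sqlite3
import uuid
from pathlib import Path


def init_db(conn: sqlite3.Connection) -> None:
    # One row per stored file; file_id is the GUID that also names the file on disk.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS files (
            file_id       TEXT PRIMARY KEY,
            record_id     INTEGER NOT NULL,
            original_name TEXT NOT NULL,
            size_bytes    INTEGER NOT NULL,
            relative_path TEXT NOT NULL,
            uploaded_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def store_file(conn: sqlite3.Connection, storage_root: Path,
               record_id: int, source: Path) -> str:
    """Copy `source` into the repository under a GUID name and record its metadata."""
    file_id = str(uuid.uuid4())
    # Keep the extension so the stored file still opens normally from disk.
    relative_path = file_id + source.suffix.lower()
    storage_root.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, storage_root / relative_path)
    conn.execute(
        "INSERT INTO files (file_id, record_id, original_name, size_bytes, relative_path) "
        "VALUES (?, ?, ?, ?, ?)",
        (file_id, record_id, source.name, source.stat().st_size, relative_path),
    )
    conn.commit()
    return file_id


def resolve_path(conn: sqlite3.Connection, storage_root: Path, file_id: str) -> Path:
    """Look up where a stored file lives under the current storage root."""
    row = conn.execute(
        "SELECT relative_path FROM files WHERE file_id = ?", (file_id,)
    ).fetchone()
    if row is None:
        raise KeyError(file_id)
    return storage_root / row[0]
```

Because only the relative path is stored, the same rows stay valid if the files are later copied to a different drive or share.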

Cody C
Interesting idea. Perhaps also use a YYYYMM folder naming scheme so that too many files or folders don't end up in a single folder? The current legacy system has a problem where users cannot even open the 'parent' folder because there are 25,000 sub-folders underneath it and Explorer chokes on it.
Shaun Rowan
+3  A: 

If you can use SQL Server 2008, it has the "FILESTREAM" feature. You can define a VARBINARY(MAX) column with the FILESTREAM attribute, and it will store the file on the filesystem (perhaps on a NAS device). You can then either read the data yourself to pass to callers, or else give the callers the file system path to the file and let them read it.

John Saunders
Good answer for SQL Server 2008.
David Stratton
+1  A: 

Store them on the filesystem. (I could point to hundreds of posts with the same advice, and from experience, you're better off in the long run even if the files are small to begin with.)

Set up a folder that the web app has read/write access to and create a page that allows users to upload to this folder in whatever logical structure makes sense.

As for the db structure, I would have a separate table just for file attachments, with a foreign key pointing to the main record they are associated with.
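One way the table layout might look, sketched in Python with SQLite purely for illustration. The table names `records` and `attachments`, the columns, and the helpers `open_db`, `dated_folder`, and `attachments_for` are assumptions, not part of the original advice. The `dated_folder` helper uses the YYYYMM folder scheme suggested earlier in the thread as one possible "logical structure".

```python
import sqlite3
from datetime import date
from pathlib import Path

SCHEMA = """
CREATE TABLE records (
    record_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL
);
CREATE TABLE attachments (
    attachment_id INTEGER PRIMARY KEY,
    record_id     INTEGER NOT NULL REFERENCES records(record_id),
    file_name     TEXT NOT NULL,
    relative_path TEXT NOT NULL
);
CREATE INDEX idx_attachments_record ON attachments(record_id);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # SQLite leaves foreign key enforcement off unless it is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def dated_folder(upload_root: Path, uploaded_on: date) -> Path:
    # Buckets uploads by year and month, e.g. <upload_root>/200907 for July 2009.
    return upload_root / uploaded_on.strftime("%Y%m")


def attachments_for(conn: sqlite3.Connection, record_id: int) -> list:
    """Return (file_name, relative_path) for every attachment of one record."""
    rows = conn.execute(
        "SELECT file_name, relative_path FROM attachments "
        "WHERE record_id = ? ORDER BY attachment_id",
        (record_id,),
    )
    return rows.fetchall()
```

With the foreign key enforced, an attachment row cannot point at a record that does not exist, and the index keeps the per-record lookup cheap as the table grows.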

David Stratton
A: 

I'd make a decision based on what you want to do with them afterwards.

The sizes aren't really a concern, and neither are the counts. NTFS maxes out at 2^32 - 1 files per volume, so 100k isn't going to make it sweat. And SQL overhead will add little to the 200GB worth of data, so it's not likely that space will be a deciding factor.

The same arguments as we always have over whether you should store in the DB (locking, indexable/queryable attributes, ACID, DB security, known backup/recovery, etc.) versus the filesystem (simpler, slightly smaller, well known, can move to external storage easily, etc.) are going to be the deciding factor.

Mark Brackett