Hi,

I have an application that uses SQL FILESTREAM to store images. I insert a LOT of images (several million images per day).

After a while, the machine stops responding and seems to be out of memory. Looking at the memory usage of the PC, we don't see any process taking a lot of memory (neither SQL Server nor our application). We tried killing our process and it didn't restore the machine. We then killed the SQL Server services and that didn't restore the system either. As a last resort, we even killed all processes (except the system ones) and memory usage remained high (we are looking at the Task Manager's Performance tab). Only a reboot does the job at that point. We have tried on Win7, WinXP and Win2K3 Server, always with the same results.

Unfortunately, this isn't a one-shot deal; it happens every time.

Has anybody seen this kind of behaviour before? Are we doing something wrong in how we use SQL FILESTREAM?

+3  A: 

You say you insert a lot of images per day. What else do you do with the images? Do you update them? Do you do many reads?

Is your file system optimized for FILESTREAMs?

How do you read out the images?

If you do a lot of updates, remember that SQL Server will not modify the filestream object in place; it will create a new one and mark the old one for deletion by the garbage collector. At some point the GC will trigger and start cleaning up the old mess. The problem with FILESTREAM is that it doesn't log a lot to the transaction log, and thus the GC can be seriously delayed. If this is the problem, it might be solved by forcing GC more often to maintain responsiveness. This can be done using the CHECKPOINT statement.
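As a minimal sketch of the pattern (the table name, column names and payloads here are made up, and it assumes the database already has a FILESTREAM filegroup configured):

    -- Hypothetical FILESTREAM table; assumes the database already has a
    -- FILESTREAM filegroup configured.
    CREATE TABLE dbo.ImagesFs
    (
        ImageId   UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
        ImageData VARBINARY(MAX) FILESTREAM NULL
    );

    INSERT INTO dbo.ImagesFs (ImageId, ImageData)
    VALUES (NEWID(), 0x0A0B0C);          -- placeholder payload

    -- An update does not modify the existing NTFS file; it writes a new
    -- file and leaves the old one for the FILESTREAM garbage collector.
    UPDATE dbo.ImagesFs
    SET ImageData = 0x0D0E0F;            -- placeholder payload

    -- Forcing a checkpoint lets the garbage collector clean up orphaned
    -- files sooner than waiting for an automatic checkpoint.
    CHECKPOINT;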

UPDATE: You shouldn't use FILESTREAM for small files (less than 1 MB). Millions of small files will cause problems for the file system and the Master File Table. Use varbinary(max) instead. See also Designing and implementing FILESTREAM storage.
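A rough sketch of the in-row alternative, again with made-up names:

    -- Same logical schema, but the image bytes live inside the database
    -- data file instead of as individual NTFS files.
    CREATE TABLE dbo.ImagesInRow
    (
        ImageId   UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
        ImageData VARBINARY(MAX) NOT NULL
    );

    INSERT INTO dbo.ImagesInRow (ImageId, ImageData)
    VALUES (NEWID(), 0x0A0B0C);          -- placeholder payload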

UPDATE 2: If you still insist on using FILESTREAM for storage (you shouldn't for large amounts of small files), you must at least configure the file system accordingly.

Optimize the file system for a large number of small files (treat these as tips and make sure you understand what they do before you apply them):

  • Change the Master File Table reservation to the maximum in the registry (fsutil.exe behavior set mftzone 4)
  • Disable 8.3 file names (fsutil.exe behavior set disable8dot3 1)
  • Disable last access time updates (fsutil.exe behavior set disablelastaccess 1)
  • Reboot and create a new partition
  • Format the storage volumes using a block size that will fit most of the files (2k or 4k depending on your image files).
Christian Vik
We are able to reproduce this problem by only doing INSERTs: no reads, no updates. It does not look like a managed memory issue; we have forced the garbage collector and it didn't help. It looks like it might be an OS issue, since the memory of our process does not look high. The CHECKPOINT statement cleans up files that are no longer referenced by a FILESTREAM column, so it does not help us, since we still want to reference our files...
mdarsigny
Response to UPDATE #1: We use FILESTREAM for small files because we don't want a huge database file. We want the storage to be taken up on disk, not in the DB... We understand that using varbinary is more efficient, but we don't want a huge DB file...
mdarsigny
Response to UPDATE #2: We already tried the tips you mentioned. What we noticed is that they improve the file insertion time, which causes the problem to occur more rapidly... :( We suspect that we are creating our files too rapidly for the file system to handle (as you pointed out in your UPDATE #1).
mdarsigny
Try to monitor the number of open file handles. You can probably use Performance Monitor or Sysinternals Process Explorer to see this information. Perhaps SQL Server even has some internal commands to see FILESTREAM stats? Regarding the number of DB files and BLOBs: it is possible to use many files and filegroups in SQL Server 2005/2008.
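For example, queries along these lines could be a starting point on the SQL Server side (I'm not certain the handle DMV exists in every version, so verify before relying on it):

    -- Where the FILESTREAM containers of the current database live on disk.
    SELECT name, type_desc, physical_name
    FROM sys.database_files
    WHERE type_desc = 'FILESTREAM';

    -- Currently open transactional FILESTREAM handles, if this DMV is
    -- available in your SQL Server build.
    SELECT *
    FROM sys.dm_filestream_file_io_handles;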
Christian Vik
Excellent answer. I only want to add that, AFAIK, under the full recovery model (which you're most likely using) you need to truncate the part of the transaction log that contains the delete/update operation of a filestream blob in order for the GC to be able to delete it. Taking a log backup (actually, in most cases 2 log backups are necessary) will truncate the log, and during the next CHECKPOINT the GC will be able to physically delete the files.
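As a minimal sketch of that sequence (the database name and backup paths are placeholders):

    -- Under the full recovery model the log must be backed up (often
    -- twice) before the deleted FILESTREAM files become eligible for GC.
    BACKUP LOG [YourDatabase] TO DISK = N'D:\Backups\YourDatabase_log_1.trn';
    BACKUP LOG [YourDatabase] TO DISK = N'D:\Backups\YourDatabase_log_2.trn';

    -- The next checkpoint gives the FILESTREAM garbage collector a chance
    -- to physically delete the orphaned files from the container.
    CHECKPOINT;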
Pawel Marciniak
We have identified the problem (and are working on a solution). As mentioned in Christian's initial response, using FILESTREAM for small files can fill up the Master File Table... Our application was writing too many files through FILESTREAM, so we were filling up the file table. We reproduced the same behaviour with a tool that created files at the same rate as our application but without using FILESTREAM, which showed us that the problem was not with SQL. Thanks.
mdarsigny