I am storing documents in SQL Server in varbinary(max) fields, and I optionally use FILESTREAM when a user has:

(DB_Size + Docs_Size) ~> 0.8 * ExpressEdition_Max_DB_Size

I currently zip all the files. This is only because the document read/write code was developed 10 years ago, when storage was more expensive than it is now.

Many files are almost as big zipped as they are uncompressed (a zipped PDF is about 95% of the original size). Unzipping also has some overhead, and that overhead doubles when I need to "check in"/update the file, because I have to zip it again.

So I am thinking of letting users choose, per file type, whether files are zipped, and of providing some meaningful default values. From my experience, I would apply the following rules:

1) zip by default: txt, bmp, rtf

2) do not zip by default: jpg, jpeg, Microsoft Office files, Open Office files, png, tif, tiff
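The two default lists above could be sketched as a lookup by extension. This is only an illustration: the names `ZIP_BY_DEFAULT`, `STORE_BY_DEFAULT` and `should_zip` are hypothetical, the concrete Office and Open Office extensions are my assumption of what those categories cover, and zipping unknown types is assumed only because it matches the current zip-everything behaviour.

```python
from pathlib import Path

# Hypothetical defaults mirroring the two rules above; users could override them.
ZIP_BY_DEFAULT = {".txt", ".bmp", ".rtf"}
STORE_BY_DEFAULT = {
    ".jpg", ".jpeg", ".png", ".tif", ".tiff",
    # Assumed extensions for "Microsoft Office files" and "Open Office files".
    ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx",
    ".odt", ".ods", ".odp",
}


def should_zip(filename, overrides=None):
    """Return True if a file with this name should be zipped before storage.

    overrides is an optional dict such as {".jpg": True} holding the
    per-extension choices a user made; it wins over the defaults.
    """
    ext = Path(filename).suffix.lower()
    if overrides and ext in overrides:
        return overrides[ext]
    if ext in ZIP_BY_DEFAULT:
        return True
    if ext in STORE_BY_DEFAULT:
        return False
    # Assumption: unknown types keep the existing zip-everything behaviour.
    return True
```

With this shape, a user preference is just one more entry in `overrides`, so the defaults never need to be edited per installation.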

Could you suggest other common file types, or comment on the ones I listed here?

+3  A: 

.doc and .mdb files actually tend to compress rather well, if I remember correctly. The Office 2007 Open XML formats (.docx, .xlsx, .pptx), though, are zip files already, so compressing them is pretty much useless. Note that .accdb is not one of them: it is still a binary database format, not a zip container.

Don't forget HTML and XML files. Zip by default.
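Since formats like .docx are zip containers under another extension, one option is to look at the content rather than the file name. A minimal sketch using Python's standard `zipfile.is_zipfile`; the helper names `is_zip_container` and `needs_compression` are hypothetical, and the document bytes are assumed to be already loaded from the varbinary(max) column:

```python
import io
import zipfile


def is_zip_container(data: bytes) -> bool:
    """True if the bytes form a readable zip archive (as .docx files do)."""
    return zipfile.is_zipfile(io.BytesIO(data))


def needs_compression(data: bytes) -> bool:
    """Skip zipping for anything that is already a zip container."""
    return not is_zip_container(data)
```

This only catches zip-based containers; already-compressed formats such as jpg or rar would still need an extension rule or a measured compression ratio.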

cHao
Thanks for the answer. I didn't know about the Office 2007 formats, good point. I also have in mind a tool that gathers statistics: I loop through all the docs, unzip them one by one, and check the compression ratio. Then I will compute an average per file type and zip only the types whose savings go beyond a particular threshold.
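The statistics idea could look roughly like the sketch below. It assumes the documents are available as (filename, uncompressed bytes) pairs, uses `zlib` (deflate, the method zip uses by default) as a stand-in for the real zip step, and picks a threshold of 0.8 purely as an example. The function names are hypothetical.

```python
import zlib
from collections import defaultdict
from pathlib import Path


def average_ratios(documents):
    """Map each extension to its mean compressed/original size ratio.

    documents is an iterable of (filename, raw_bytes) pairs.
    A ratio near 1.0 means compression saves almost nothing.
    """
    totals = defaultdict(lambda: [0.0, 0])  # ext -> [sum of ratios, count]
    for name, data in documents:
        if not data:
            continue  # empty files carry no information about the type
        ext = Path(name).suffix.lower()
        totals[ext][0] += len(zlib.compress(data)) / len(data)
        totals[ext][1] += 1
    return {ext: total / count for ext, (total, count) in totals.items()}


def extensions_worth_zipping(documents, threshold=0.8):
    """Extensions whose average ratio is at or below the (assumed) threshold."""
    ratios = average_ratios(documents)
    return {ext for ext, ratio in ratios.items() if ratio <= threshold}
```

Averaging per-file ratios treats small and large files equally; weighting by total bytes instead would favour the types that actually dominate storage.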
+1  A: 
amphetamachine
Thanks for this. The idea of "flaccing" a wav and "zipping" a txt is very good, but in my application it doesn't really make sense, since people mainly use PDF/Office/txt/images, all of which compress fairly well with zip/rar algorithms. Anyway, in general your answer is very appropriate and can be useful to other users. I have also stopped zipping zip and rar files in my application: before I was zipping everything, and now I have improved this.