Looking for suggestions on file system management tools. We have several terabytes of images, PDFs, Excel sheets, etc.

We're looking for software that will help us manage and archive the images, among other things.

Information about some of the files, but not all of them, is stored in a database, and we were hoping to find an app that could help us integrate the archive process with the database.

Thank you!

A: 

If some of the data is saved on disk, perhaps a search application is more appropriate. You can use Google, Microsoft search, or a similar program.

Dror Helper
A: 

Some database products (e.g., Oracle) offer file system-like storage that you can put files into. Since it's an Oracle-managed file system, you have all the Oracle backup and management facilities. Since it's a file system, you just use ordinary OS tools like cp to move files in and out of it.

The best practice is to avoid wasting the RDBMS on large BLOBs of data that the RDBMS can't use. Use the database for names, dates, and other things it handles well. The actual image file or spreadsheet file can be left in the ordinary file system. If you do this, you avoid a lot of effort and complexity: you're just collecting essential information about your files.

You don't duplicate storage (the spreadsheet is only an ordinary file). You don't put large objects in the database that can't be processed by the database.

For bulk storage of files, the file system is faster, simpler, and more reliable than the database, so feel free to use it for that. The database has powerful search capabilities. Use the database for just that.
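A minimal sketch of that split, assuming Python with SQLite standing in for whatever database you actually run. The function name `index_tree`, the `files` table, and its columns are hypothetical choices for illustration; the files themselves stay on disk and only their metadata goes into the database.

```python
import os
import sqlite3
from datetime import datetime, timezone


def index_tree(root, db_path):
    """Record path, size, and modification time for every file under root.

    The table name and columns are hypothetical; adapt them to your schema.
    Returns the number of files indexed.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS files ("
            "path TEXT PRIMARY KEY, size_bytes INTEGER, modified_utc TEXT)"
        )
        count = 0
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                info = os.stat(full)
                modified = datetime.fromtimestamp(
                    info.st_mtime, tz=timezone.utc
                ).isoformat()
                # Re-indexing the same path overwrites the earlier row.
                conn.execute(
                    "INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
                    (full, info.st_size, modified),
                )
                count += 1
        conn.commit()
        return count
    finally:
        conn.close()
```

Calling `index_tree("/path/to/files", "index.db")` (both paths are placeholders) would then let you search by name, size, or date with ordinary SQL while the images and spreadsheets stay where they are.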

S.Lott
+2  A: 

I have always seen these guys in the trade magazines: http://www.dtsearch.com/. I believe they've been around since before Google. Not sure if this is what you're looking for.

Optimal Solutions
A: 

To clarify: all the files are on file servers, but the database holds references to some of them (upload logs, etc.). We were hoping there was a tool that could be configured so that when it archives a file in a certain directory, it runs some sort of SQL command to tell the database that the file was archived.

Thanks for the info, though. I think we're just going to have to roll our own in this case.
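A roll-your-own version of that idea could be sketched roughly as follows, assuming Python and a DB-API connection (SQLite is used here only as a stand-in). The `uploads` table and its `path`, `archived`, and `archive_path` columns are hypothetical, since the real schema isn't described here.

```python
import os
import shutil
import sqlite3


def archive_file(conn, src_path, archive_dir):
    """Move src_path into archive_dir, then mark its database row as archived.

    Hypothetical schema: uploads(path TEXT, archived INTEGER, archive_path TEXT).
    Returns the new location of the file.
    """
    os.makedirs(archive_dir, exist_ok=True)
    dest_path = os.path.join(archive_dir, os.path.basename(src_path))
    shutil.move(src_path, dest_path)
    # "with conn" commits on success and rolls back if the UPDATE raises.
    with conn:
        conn.execute(
            "UPDATE uploads SET archived = 1, archive_path = ? WHERE path = ?",
            (dest_path, src_path),
        )
    return dest_path
```

Note that the move and the update are not atomic together: if the UPDATE fails, the file has already been moved, so a real tool would want to log or retry in that case.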

AlanBarber
A: 

You could run a job periodically to list the files that have been added to the file system since the last time the job was run. On Windows, this batch file lists all files and folders in archivedirectory so that you can compare the new list with the one from the previous run.

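rem Rotate the previous listing, then write a fresh recursive listing.
cd /d archivedirectory
if exist oldlist.txt del oldlist.txt
if exist newlist.txt rename newlist.txt oldlist.txt
dir /s /b > newlist.txt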

If you install diffutils on Windows, you can use the standard diff tool to list the new files. To isolate the new files:

diff oldlist.txt newlist.txt > newfiles.txt

Any lines in newfiles.txt starting with > are the new files (lines starting with < are entries that have disappeared since the last run). You could use grep and sed to trim it down even more; Windows versions are available from GnuWin32. You can then run further operations on this file, perhaps in a language such as Python, C# or Java, to add the information to the database.
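If you would rather skip diff altogether, the comparison can also be done directly in Python. This is a minimal sketch that uses a plain set difference rather than a real diff; the function names are hypothetical, and it assumes the listing files are readable as UTF-8 (output redirected from dir may use a different code page, so adjust the encoding if needed).

```python
def read_listing(path):
    """Return the non-empty lines of a listing file as a set of paths."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        lines = (line.rstrip("\r\n") for line in handle)
        return {line for line in lines if line}


def new_entries(old_list_path, new_list_path):
    """Return, sorted, the paths in the new listing that are not in the old one."""
    return sorted(read_listing(new_list_path) - read_listing(old_list_path))
```

`new_entries("oldlist.txt", "newlist.txt")` gives you a list of paths that you can loop over to insert or update rows in the database.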

Liam