Friends,

We are undertaking a knowledge preservation project to scan more than 1 million books. We need suggestions on implementing a database for storing and retrieving metadata, which would also be used to track the scanning status of each object (book).

  1. Should we go for SQL or NoSQL? (The metadata could vary from project to project; this project, for example, could have 15 fields.)

  2. We are thinking of something based on Lucene/Solr or a scalable RDF database.

  3. Is there any open source solution that lets us define custom metadata fields and store the information with a search feature?

A: 

Disclaimer: I have never attempted this type of project.

I have seen very good performance from SQL Server's FILESTREAM storage. It uses the NTFS file APIs to store the binary data as files on disk, and keeps a pointer to each file in the rows of your table.

If the metadata has no fixed structure you could store it as XML, but if it does have a repeating structure, put it into relational tables; then you can use indexing etc. to get the performance you need.
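To make the relational option a bit more concrete, here is a rough sketch. It uses SQLite from Python purely as a stand-in for SQL Server, and every table, column, and function name in it (book, book_metadata, scan_status, add_book and so on) is made up for illustration. The fields every book shares live in one table along with the scanning status, and the per-project custom fields go into an indexed field/value table.

```python
import sqlite3


def create_schema(conn):
    """Create the hypothetical tables: fixed columns plus custom field/value rows."""
    conn.executescript("""
        CREATE TABLE book (
            id          INTEGER PRIMARY KEY,
            title       TEXT NOT NULL,
            scan_status TEXT NOT NULL DEFAULT 'pending'
        );
        CREATE TABLE book_metadata (
            book_id INTEGER NOT NULL REFERENCES book(id),
            field   TEXT NOT NULL,
            value   TEXT NOT NULL,
            PRIMARY KEY (book_id, field)
        );
        -- Indexes so status tracking and metadata lookups avoid full scans.
        CREATE INDEX idx_book_status ON book(scan_status);
        CREATE INDEX idx_metadata_field_value ON book_metadata(field, value);
    """)


def add_book(conn, title, metadata):
    """Insert a book and its custom metadata fields; return the new book id."""
    cur = conn.execute("INSERT INTO book (title) VALUES (?)", (title,))
    book_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO book_metadata (book_id, field, value) VALUES (?, ?, ?)",
        [(book_id, field, value) for field, value in metadata.items()],
    )
    return book_id


def set_status(conn, book_id, status):
    """Record the scanning status of one book."""
    conn.execute(
        "UPDATE book SET scan_status = ? WHERE id = ?", (status, book_id)
    )


def find_by_field(conn, field, value):
    """Return (id, title, scan_status) for books whose custom field matches."""
    rows = conn.execute(
        """
        SELECT b.id, b.title, b.scan_status
        FROM book AS b
        JOIN book_metadata AS m ON m.book_id = b.id
        WHERE m.field = ? AND m.value = ?
        ORDER BY b.id
        """,
        (field, value),
    )
    return rows.fetchall()
```

A new project with a different set of fields only adds rows to book_metadata, so the schema itself does not change. Full-text search over the values would still need something extra (Lucene/Solr as you mention, or the database's own full-text indexing).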

Filestream Type

Spence