We are building a solution for document storage and for each document we need to store a lot of extra metadata with it to comply with local regulations, ranging from basic data like title or description to dates of relevant events or disposition and classification rules.
I've seen different types of solutions, but none convinces me:
- Tables that grow in columns when a new metadata slot is added (so they have as many columns as metadata associated with the documents)
- Tables with a lot of spare generic columns. Very similar to 1. but the tables don't grow (less permissions)
- A table of document ids, metadata keys and metadata values.
- A table with metadata definitions and metadata keys in 3. are substituted by metadata ids. We used this solution in the past. The tables have millions of rows at the end.
- A text field in the document table or associated table that stores a XML or other structured information with all the metadata in key-value pairs.
I'm biased towards number 5, providing a parallel full-text index (Lucene.Net? Other?) to search by relevant metadata (not everything has to be "searchable").
Any suggestion? Similar experiences?