I have a dataset which contains a string key field and up to 50 keywords associated with that information. Once the data has been inserted into the database there will be very few writes (INSERTS) but mostly queries for one or more keywords.
I have read "Tagsystems: performance tests" which is MySQL based and it seems 2NF appears to be a good method for implementing this, however I was wondering if anyone had experience with doing this with SQL Server 2008 and very large datasets.
I am likely to initially have 1 million key fields which could have up to 50 keywords each.
Would a structure of
keyfield, keyword1, keyword2, ... , keyword50
be the best solution or two tables
keyid
keyfield
| 1
|
| M
keyid
keyword
Be a better idea if my queries are mostly going to be looking for results that have one or more keywords?