30 million distinct phrases, not documents, ranging from one word to a 10 word sentence and I need to support word/phrase searching. Basically what where contains(phrase, "'book' or 'stack overflow'") offers.
I have an instance of SQL Server 2005 (32 bit, 4 proc, 4gb) going against several full text catalogs and performance is awful for word searches with high cardinality.
Here are my thoughts to speed things up, perhaps someone can offer guidance--
1) Upgrade to 2008 iFTS, 64bit. Sql Server 2005 FTS's windows service is never more than 50mb. From what I have gathered, it uses the file system cache for looking up catalog indexes. My populated catalogs on disk are only around 300mb, so why can't this all be in memory? Might iFTS's new memory architecture, which is part of the sqlserver process help here?
2) Scale out the catalogs to several servers. Will the queries to the linked FTS servers run in parallel?
3) Since I'm searching phrases here and not documents, maybe Sql Server's Full Text Search isn't the answer. Lucene.NET? Put the catalog index on a ram drive?