ansaurus

Question

Answer 1

A:

Maybe check out Lucene (or Zend_Search_Lucene in PHP). It's a very nice full-text search (FTS) engine.

jspcal 2009-12-25 02:35:53

Answer 2

A:

For 150k articles, you must have a few hundred million rows in the words_articles table. This is manageable, as long as you configure MySQL properly.

A few tips:

Make sure your tables are MyISAM, not InnoDB.
Drop the id field in the words_articles table and make (word_id, article_id) the primary key. Also create a separate index on article_id. Lookups by word_id are already served by the leftmost column of the primary key, so a separate word_id index would be redundant:
```
ALTER TABLE words_articles
DROP COLUMN id,                          -- remove the surrogate key
DROP PRIMARY KEY,
ADD PRIMARY KEY (word_id, article_id),   -- also covers lookups by word_id
ADD INDEX (article_id);                  -- for lookups by article
```
(Doing everything in a single ALTER statement is much faster, since the table is rebuilt only once.)
Create an index for word in the words table:
```
ALTER TABLE words ADD INDEX (word);
```
Tweak my.cnf. Specifically, increase the buffer sizes (especially key_buffer_size). my-huge.cnf might be a good starting point.
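
To check that the tips above took effect, something like the following might help. It is only a sketch: it assumes the words table has an integer primary key named id (the thread only mentions the word column), and 'example' is a placeholder search term. EXPLAIN should report an index in the key column for both tables, and SHOW VARIABLES prints the current key buffer size in bytes so you can compare it before and after editing my.cnf.

```sql
-- Assumption: words.id is the primary key that words_articles.word_id refers to.
-- 'example' is a placeholder search term.
EXPLAIN
SELECT wa.article_id
FROM words AS w
JOIN words_articles AS wa ON wa.word_id = w.id
WHERE w.word = 'example';

-- Current MyISAM key buffer size, in bytes.
SHOW VARIABLES LIKE 'key_buffer_size';
```

If the key column in the EXPLAIN output is NULL for either table, that lookup is falling back to a full scan and the corresponding index is probably missing.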

Can Berk Güder 2009-12-25 03:12:30

Word lists for a lot of articles - document-term matrix