ansaurus

Question

Answer 1

+2 A:

Let's pretend there are 20 new topics & 100 new posts per day. Which would you choose? What if the number of topics/posts per day was twice that? Five times that? Ten times? Does your decision of one vs. the other change?

That's about 36,500 posts a year. At that volume it doesn't matter, and it probably doesn't matter at ten times that, even on a cheap machine.

However, you might want a third table containing an explicit tsvector combining topic and body-text together. You can then use the built-in weighting system and run one search to provide the sort of search people generally expect on forums etc. That will mean writing custom triggers to update your tsvector when either source table is changed.
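A minimal sketch of that third table and its triggers might look like the following. The schema is an assumption, not something given in the question: `topics(id, title)`, `posts(id, topic_id, body)`, and a hypothetical `post_search` table. `ON CONFLICT` needs PostgreSQL 9.5 or later.

```sql
-- Hypothetical schema: topics(id, title) and posts(id, topic_id, body).
CREATE TABLE post_search (
    post_id  integer PRIMARY KEY REFERENCES posts (id) ON DELETE CASCADE,
    document tsvector NOT NULL
);
CREATE INDEX post_search_document_idx ON post_search USING gin (document);

-- Rebuild one post's search document: topic title gets weight A, body weight B.
CREATE FUNCTION post_search_refresh_post() RETURNS trigger AS $$
BEGIN
    INSERT INTO post_search (post_id, document)
    SELECT NEW.id,
           setweight(to_tsvector('english', coalesce(t.title, '')), 'A') ||
           setweight(to_tsvector('english', coalesce(NEW.body, '')), 'B')
    FROM topics t
    WHERE t.id = NEW.topic_id
    ON CONFLICT (post_id) DO UPDATE SET document = EXCLUDED.document;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER posts_search_refresh
AFTER INSERT OR UPDATE OF body, topic_id ON posts
FOR EACH ROW EXECUTE PROCEDURE post_search_refresh_post();

-- When a topic title changes, rebuild the documents of all its posts.
CREATE FUNCTION post_search_refresh_topic() RETURNS trigger AS $$
BEGIN
    UPDATE post_search s
    SET document =
        setweight(to_tsvector('english', coalesce(NEW.title, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(p.body, '')), 'B')
    FROM posts p
    WHERE p.id = s.post_id AND p.topic_id = NEW.id;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER topics_search_refresh
AFTER UPDATE OF title ON topics
FOR EACH ROW EXECUTE PROCEDURE post_search_refresh_topic();

-- One weighted search over title and body together.
SELECT s.post_id, ts_rank(s.document, q) AS rank
FROM post_search s, plainto_tsquery('english', 'index speed') AS q
WHERE s.document @@ q
ORDER BY rank DESC;
```

With the default weights, `ts_rank` counts a match in the title (weight A) more heavily than one in the body (weight B).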

Richard Huxton 2009-10-30 10:42:54

Answer 2

+2 A:

Normally I'd go with storing the tsvector in a field, because that will also give you usable access to things like ts_headline() and ts_rank().

Magnus Hagander 2009-10-30 10:43:57

Answer 3

+2 A:

Using Option 1 will not make your searches any slower.

The GIN index will be used regardless of whether you created it on an instantiated column or a computed expression.

You just need to change the query syntax:

SELECT  *
FROM    posts
WHERE   TO_TSVECTOR('english', title) @@ myquery

in the first case, or

SELECT  *
FROM    posts
WHERE   title_vector @@ myquery

in the second case.
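For reference, the index definitions behind the two queries could look roughly like this. The index and trigger names are made up for illustration, and the `'english'` configuration is taken from the query above.

```sql
-- Option 1: expression index. The query has to repeat the same expression,
-- including the 'english' configuration, for the planner to match it.
CREATE INDEX posts_title_fts_idx
    ON posts USING gin (to_tsvector('english', title));

-- Option 2: stored tsvector column with its own index.
ALTER TABLE posts ADD COLUMN title_vector tsvector;
UPDATE posts SET title_vector = to_tsvector('english', coalesce(title, ''));
CREATE INDEX posts_title_vector_idx ON posts USING gin (title_vector);

-- Keep the stored column current with the built-in trigger function.
CREATE TRIGGER posts_title_vector_update
    BEFORE INSERT OR UPDATE ON posts
    FOR EACH ROW EXECUTE PROCEDURE
    tsvector_update_trigger(title_vector, 'pg_catalog.english', title);
```

Only one of the two options is needed; they are shown together here for comparison.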

You can probably save a little time when calling TS_RANK on the instantiated column.
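As a rough illustration of where that saving would come from, here are the two ranking queries side by side. The `id` column and the search terms are assumptions.

```sql
-- Stored column: ts_rank reads the precomputed tsvector for each matching row.
SELECT p.id, p.title, ts_rank(p.title_vector, q) AS rank
FROM posts p, to_tsquery('english', 'index & speed') AS q
WHERE p.title_vector @@ q
ORDER BY rank DESC
LIMIT 20;

-- Expression index: the WHERE clause can use the index, but ts_rank has to
-- run to_tsvector again on the title of every matching row.
SELECT p.id, p.title,
       ts_rank(to_tsvector('english', p.title), q) AS rank
FROM posts p, to_tsquery('english', 'index & speed') AS q
WHERE to_tsvector('english', p.title) @@ q
ORDER BY rank DESC
LIMIT 20;
```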

Quassnoi 2009-10-30 11:26:07

Hmm, maybe I misread something in the manual then, because I was left with the impression that it would be quicker to do something like option 2.

Matt Huggins 2009-10-30 19:02:41

If you are doing operations with the `TSVECTOR` itself, like `TS_RANK`, then option `2` will be faster. If you only use the `TSVECTOR` for searching, the performance will be the same. The persisted column will not even be referenced by the optimizer.

Quassnoi 2009-10-30 19:34:58

Good to know. I'm going to want to sort by rank, so I'll likely end up using option 2. Thanks!

Matt Huggins 2009-10-30 22:56:30

Dynamic or column-ized tsvector index?
