I'm trying to calculate a ts_rank for a full-text match where some of the terms in the query may not be in the tsvector it is being matched against. I would like the rank to be higher when more of the words match. Seems pretty simple?

Because not all of the terms have to match, I have to | the operands, giving a query such as to_tsquery('one|two|three') (if I used &, all of them would have to match).

The problem is, the rank value seems to be the same no matter how many words match. In other words, it's maxing rather than multiplying the clauses.

select ts_rank('one two three'::tsvector, to_tsquery('one')); gives 0.0607927.

select ts_rank('one two three'::tsvector, to_tsquery('one|two|three|four')); gives the expected lower value of 0.0455945 because 'four' is not in the vector.

But select ts_rank('one two three'::tsvector, to_tsquery('one|two'));

gives 0.0607927 and likewise

select ts_rank('one two three'::tsvector, to_tsquery('one|two|three'));

gives 0.0607927

I would like the result of ts_rank to be higher if more terms match.

Possible?

To counter one possible response: I cannot calculate all possible subsequences of the search query as intersections and then union them all in a query because I am going to be working with large queries. I'm sure there are plenty of arguments against this anyway!

Edit: I'm aware of ts_rank_cd but it does not solve the above problem.

+1  A: 

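The conclusion I have come to is to & the terms together for the ranking only. In my select query (the one that actually does the search) the terms are still |ed, so a row matches if any term is present, while the rank is calculated against the & version of the same terms. This seems to work.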
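Roughly, the query ends up shaped like the sketch below. The table docs and its columns id and body_tsv are made up for illustration; substitute your own. I have not checked the rank values this produces across different data, so treat it as a starting point rather than a guarantee.

```sql
-- Hypothetical table: docs(id integer, body_tsv tsvector).
SELECT id,
       -- AND form of the terms, used only to score the row
       ts_rank(body_tsv, to_tsquery('one & two & three')) AS rank
FROM docs
-- OR form of the same terms, used to decide which rows match at all
WHERE body_tsv @@ to_tsquery('one | two | three')
ORDER BY rank DESC;
```

The match condition and the ranking use two different tsquery values built from the same list of terms, so the filter stays permissive while the score is taken from the stricter query.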

Joe