views:

333

answers:

2

Hello all,

I'm lousy at SQL queries so this may be a silly question. However, here is roughly what i'd like to do:

table corpuses //stuff i'd like to search thru title body ...

table searches //list of search terms to be applied to corpuses term
...

The query i'd like to write is more or less as follows: I beleive I need some sort of a join, but I'm not sure just how to do that. Additionally, I'm not sure that the against() operator will take anything aside from a literal - the docs didn't seem to mention either way.

select * from corpuses where match (title, body) against (select term from searches);

I'm using MySQL 5

Any thoughts are greatly appreciated.

Thanks! Brian

+1  A: 

Sounds like you need to use a FULLTEXT matching expression in your join condition.

I've never used a fulltext match in a join condition, so I'm not sure this will work, but hypothetically this might do it:

SELECT DISTINCT c.*
FROM corpuses c JOIN searches s 
  ON (MATCH(c.title, c.body) AGAINST (s.term));


Okay I've tried it using your table definitions and some sample data from the MySQL manual. Here's a query that works (tested with MySQL 5.1.30):

SELECT *
FROM corpuses 
WHERE MATCH(title, body)
  AGAINST ( (SELECT GROUP_CONCAT(term SEPARATOR ' ') FROM searches) 
    IN BOOLEAN MODE);
Bill Karwin
upon further inspection this is not actually what i needed - group concat lumps all of the terms in as one; i need to distinguish them from one another, so i think i'm back to square one. thanks for the suggestions though.
sweeney
It pays to be clear about your goals when you ask questions. Good luck.
Bill Karwin
A: 

bill, thanks for the advice however im just now noticing that the argument for against(...) may not be a variable - it must be a constant string. let me refine this question a bit...

tables:

corpuses  CREATE TABLE `corpuses` (                                                                              
            `source-id` int(11) unsigned NOT NULL,                                                               
            `title` text collate utf8_bin NOT NULL,                                                              
            `body` text collate utf8_bin NOT NULL,                                                               
            PRIMARY KEY  (`source-id`),                                                                          
            FULLTEXT KEY `NewIndex1` (`title`,`body`)                                                            
          ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC 

searches  CREATE TABLE `searches` (                                
            `id` int(11) unsigned NOT NULL auto_increment,         
            `user-id` int(11) NOT NULL,                            
            `source-id` int(11) NOT NULL,                          
            `term` varchar(11) NOT NULL,                           
            PRIMARY KEY  (`id`)                                    
          ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1

given that the argument for against(...) must be a literal what would be an efficient way of comparing all of searches.terms with all of corpuses.title and corpuses.body and returning user-id along with title and body?

i realize that this is complex operation with several different problems going on at once, so advice regarding individual portions of the whole problem is still hugely appreciated.

additionally, the searches column is likely to be very very large and corpuses table is likely to remain very very small, albeit will be updated very frequently.

i really would like to avoid having to programatically iterate over each term in searches. any thoughts?

thanks again brian

update - the query that returns the data that i am interested in is of the following form:

select `user-id`, `title`, `body` from corpuses, searches where ( (match(corpuses.title, corpuses.body) against ('search term')) and `user-id`=uid);

now what i need to do is figure out how to make this efficient assuming that i will need to execute it for many many combinations of search-term and uid.

also, it seems that stored procedures are out of the question since you cannot pass a variable into against.

sweeney