views:

480

answers:

5

Hi all!,

I'm building a searchfunctionallity where a user can select the different libraries to search in with the provided keywords. Also the AND operator is supported. My problem is that when I try to search for information it gets somehow multiplied or it gets increased exponential??

I'm using a MySQL database and PHP to process the results.

A sample query looks like this, sample tablenames taken for readability:

SELECT table1.id AS table1_id,
       table2.id AS table2_id,
       table1.*,
       table2.*
FROM   table1, table2
WHERE  (table1.column1 LIKE '%key%' OR table1.column2 LIKE '%key%') OR
       (table2.column1 LIKE '%key%' OR table2.column2 LIKE '%key%')

Or when a user provides keywords like 'key AND yek' the queries will look like:

SELECT table1.id AS table1_id,
       table2.id AS table2_id,
       table1.*,
       table2.*
FROM   table1, table2
WHERE  ( (table1.column1 LIKE '%key%' AND table1.column2 LIKE '%key%') OR
        (table1.colum1 LIKE '%yek%' AND table1.colum2 LIKE '%yek%') ) 
OR
       (( table2.column1 LIKE '%key%' AND table2.column2 LIKE '%key%') OR
        (table2.colum1 LIKE '%yek%' AND table2.colum2 LIKE '%yek%') )

I've searched arround and came accross UNION, but that won't do the trick. Firstly because of the tables I'm querying are not equal and it's a pretty expensive query. So also because the searchfunctionallity will be used often it's not an option.

In fact I'm querying 5 tables and per table between 2 or 4 columns. I set up a testrow in one of the tables with a unique value so I should get 1 row as result but in fact I'm getting a lot more. When I select all 5 tables I'm getting over 10K resultrows. So my guess is the result gets multiplied some way or another. I've played with the operators AND and OR but so far without any luck.

Is there someone who can help me further in my 'quest' to build a userfriendly searchengine in a webapp?

Thanks in advance.

Ben Fransen

+1  A: 

you should take a look at a real full text search engine, there is many around:

  • lucene
  • sphinx
  • Mysql: MySQL has a full text search engine but it's kind of slow so I don't advise you to use it apart if you want a quick solution
  • couchDB: that's not really a full text search engine but in your case it might helps but you might be able to create a DB from your current MySQL database and use that only for search purpose.

depending how your table are done, you might want to create a table with all the information you need.

RageZ
I'm using MySQL. I'll update my post for clearity. My guess is that this should and could be done with a MySQL query.
Ben Fransen
+1  A: 

A join will solve the problem of your dataset appearing to 'multiply'.

http://en.wikipedia.org/wiki/Join%5F%28SQL%29

This requires a field from Table1 to also be present in Table2 and used as a foreign key.

For eg, Table1 and Table2 both have ProductID:

Select * From Table1 Left Outer Join Table2 on Table1.ProductID = Table2.ProductID

tbs
A join won't do here because the 5 tables are completely different. Besides, a uses provides the dynamic query with statements because he/she can select (via checkboxes) which libraries can be used. I'm still convinced that this should and could be done by a single query without joins because they don't fit for the purpose.
Ben Fransen
+1  A: 

Some suggestions:

  1. I would suggest using (INNER?) JOIN instead of SELECT FROM table1, table2. That will narrow the number of tables. This should limit the output considerably.
  2. Look into the DISTINCT keyword (or parallels like GROUP BY), it should further limit the output
  3. DON'T rule out UNION. I don't know about MySQL, but in Oracle's SQL it is often two to three times faster than using an "OR". If you have -- Admittedly, this query is a bit defective, but I'm trying to demonstrate. SELECT table1.data FROM table1, table2 WHERE table1.data = 7 UNION SELECT table1.data FROM table1, table2 WHERE table1.data = 8; that will be far faster than using: SELECT table1.data, table2.* FROM table1, table2 WHERE table1.data = 7 OR table1.data = 8;

Overall, I think that you'll see better results using something like this (this is Oracle syntax):

SELECT 
    ti.id, 
    t1.data,
    t2.*
FROM 
    table1 t1
    JOIN
       table2 t2
       ON(
          t1.id = t2.id
       )
WHERE 
    table1.data LIKE '%foo%'
UNION 
    SELECT
       ti.id, 
       t1.data,
       t2.*
    FROM 
        table1 t1
        JOIN
           table2 t2
           ON(
              t1.id = t2.id
           )
    WHERE 
        table1.data LIKE '%bar%';

------- EDIT -------
It has been protested that "JOIN will not work as the tables are completely different". So long as there exists some key which exists that can relate one of the tables to another, JOIN will be able to create (effectively) an "uber-table" which has access to all of the data of both tables. If this index does not exist, then you'll probably need some system to concatenate the information outside of the SQL query.

Christopher W. Allen-Poole
I can't use UNION because my tables aren't the same. It are 5 different information libraries where different types of information is stored for different types of informationobjects. One of the conditions for using UNION is that the tables must have the same structure or columnnames. This could be fixed by selecting AS but there still are different number of columns to search through between the 5 tables...
Ben Fransen
If the field count you need from each table isn't the same, you could simply use the maximum numer of fields among all tables, and then just let the SELECT's on the tables with less fields return some empty fields...
allanmc
A: 

I think what you want is to use UNION, which can combine the result of different SELECT queries.

(SELECT id FROM table1 WHERE  (table1.column1 LIKE '%key%' OR table1.column2 LIKE '%key%'))
UNION 
(SELECT id FROM table2 WHERE  (table2.column1 LIKE '%key%' OR table2.column2 LIKE '%key%'))
allanmc
I mentioned in my post UNION will not do because the tables aren't the same...
Ben Fransen
@Ben: union work on a row set so if there the same number of column displayed it's just fine.
RageZ
@Ben: from the mysql documention : `SELECT REPEAT('a',1) UNION SELECT REPEAT('b',10)`
RageZ
You have a point there, but i need all data from the result, so the entire tablestructure because the layout of the searchresult will differ because of the table-type.
Ben Fransen
A: 

I found the fulltext solutions to 'heavy' for my situation. I've created a query-array where queries per table are stored and later on I've used them for displaying the data I wanted. Everybody thanks for their effort!

Ben Fransen