views:

321

answers:

5

I have 6 tables, let's call them a,b,c,d,e,f. Now I want to search all the colums (except the ID columns) of all tables for a certain word, let's say 'Joe'. What I did was, I made INNER JOINS over all the tables and then used LIKE to search the columns.

INNER JOIN ... ON INNER JOIN ... ON.......etc. WHERE a.firstname ~* 'Joe' OR a.lastname ~* 'Joe' OR b.favorite_food ~* 'Joe' OR c.job ~* 'Joe'.......etc.

The results are correct, I get all the colums I was looking for. But I also get some kind of cartesian product, I get 2 or more lines with almost the same results.

How can i avoid this? I want so have each line only once, since the results should appear on a web search.

UPDATE

I first tried to figure out if the SELECT DISTINCT thing would work by using this statement: pastie.org/970959 But it still gives me a cartesian product. What's wrong with this?

+2  A: 

try SELECT DISTINCT?

oedo
Even if it is SQL Server?
hgulyan
ms sql server still has select distinct, does it not?
oedo
Thanks for the hint.
flhe
It seems that there is distinct.
hgulyan
+2  A: 

On what condition do you JOIN this tables? Do you have foreign keys or something?

Maybe you should find that word on each table separately?

hgulyan
I have relations between the tables, so I do INNER JOIN to the person_has_job table and from there to the job table.
flhe
A: 

What kind of server are you using? Microsoft SQL Server has a full-text index feature (I think others have something like this too) which lets you search for keywords in a much less resource-intensive way.

Also consider using UNION instead of joining the tables.

molnarm
It's a postgresql server.
flhe
A: 

Without seeing your tables, I can only really assume what's going on here is you have a one-to-many relationship somewhere. You probably want to do everything in a subquery, select out the distinct IDs, then get the data you want to display by ID. Something like:

SELECT a.*, b.*
FROM (SELECT DISTINCT a.ID
      FROM ...
      INNER JOIN ...
      INNER JOIN ...
      WHERE ...) x
INNER JOIN a ON x.ID = a.ID
INNER JOIN b ON x.ID = b.ID

A couple of things to note, however:

  • This is going to be sloooow and you probably want to use full-text search instead (if your RDBMS supports it).

  • It may be faster to search each table separately rather than to join everything in a Cartesian product first and then filter with ORs.

lc
That may be the solution, i'll try it. Thanks!
flhe
Okay, I'll have a look at postgresql full-text search then.
flhe
I first tried to figure out if the SELECT DISTINCT thing would work by using this statement:http://pastie.org/970959But it still gives me a cartesian product. What's wrong with this?
flhe
@flhe Take a look then at your JOINs - you probably have multiple matching records somewhere. For example maybe a categories table or something?
lc
A: 

If your tables are entity type tables, for example a being persons and b being companies, I don't think you can avoid a cartesian product if you search for the results in this way (single query).

You say you want to search all the tables for a certain word, but you probably want to separate the results into the corresponding types. Right? Otherwise a web search would not make much sense. So if you seach for 'Joe', you want to see persons containing the name 'Joe' and for example the company named 'Joe's gym'. Since you are searching for different entities so you should split the search into different queries.

If you really want to do this in one query, you will have to change your database structure to accommodate. You will need some form of 'search table' containing an entity ID (PK) and entity type, and a list of keywords you want that entity to be found with. For example:

EntityType, EntityID, Keywords
------------------------------
Person,     4,        'Joe', 'Doe'
Company,    12,       'Joe''s Gym', 'Gym'

Something like that?

However it's different when your search returns only one type of entity, say a Person, and you want to return the Persons for which you get a hit on that keyword (in any related table to that Person). Then you will need to select all the fields you want to show and group by them, leaving out the fields in which you are searching. Including them inevitably leads to a cartesian product.

I'm just brainstorming here, by the way. It hope it's helpful.

Cloud