tags:

views:

152

answers:

4

I am trying to fetch records in MySQL using a simple used submitted field. More precisely, the user inputs a name (firstname or lastname or fullname) and the server should return matched rows.

What I am doing so far is something like:

SELECT * FROM people 
WHERE 
   firstname LIKE '%user_submitted_data%' OR 
   lastname LIKE '%user_submitted_data%'

That works well for now, but that (obviously) won't work when a user submits the fullname. Is there a way to add a OR between the whole 'WHERE type conditions' and the 'HAVING type conditions'? This way I could do something like:

SELECT [some fields], CONCAT(firstname, ' ', 'lastname') as fullname 
FROM people 
WHERE 
   firstname LIKE '%user_submitted_data%' OR 
   lastname LIKE '%user_submitted_data%' OR 
   HAVING fullname LIKE '%user_submitted_data%'

I know I could just split the original string but that has some negative impact since you have to deal with names containing spaces such as 'De Gaule' and stuff like that.

+3  A: 

Do a subquery:

SELECT [some fields]
FROM
  SELECT firstname, lastname, CONCAT(firstname, ' ', lastname) as fullname
  FROM people) AS tmp
WHERE firstname LIKE '%user_submitted_data%'
OR lastname LIKE '%user_submitted_data%'
OR fullname LIKE '%user_submitted_data%'
Zed
"Do a barrel ro^U^U^U^U subquery" (Now I've met my useless comment quota for the day.)
Juliet
Thanks, it fixes it.
Jimmy
A: 

Let's consider some possible inputs:

John
Smith
John Smith

Your initial sample query is:

SELECT * FROM people 
WHERE 
   firstname LIKE '%user_submitted_data%' OR 
   lastname LIKE '%user_submitted_data%'

Now, when the user enters the first input, this query will pick all the people whose first name contains 'John'; it will also pick all the people whose last name contains 'John' (for example, all the Johnsons in the database). Similarly, the second input will pick all the people whose first name contains 'Smith'; it will also pick all the people whose last name contains 'Smith' (for example, the Smithsons and Smithers). So far, so good; it isn't perfect because of case-sensitivity issues (I will ignore case-sensitivity from here on, but you probably should not ignore it at all), but it will be OK.

The third input will only pick the people whose first name contains 'John Smith'; it will also pick those people whose last name contains 'John Smith'. However, it is rather likely that there are very few people who meet those criteria - those people called John Smith will have just John in the first name and just Smith in the last name. This is unlikely to be what you had in mind.

It is not clear whether you have a column called 'fullname' in the table. If you do, then you can just match against that column instead of matching against the first name and last name separately. If you don't, maybe you can manufacture such a column and then run the query against that.

SELECT *
  FROM (SELECT firstname || ' ' || lastname AS fullname, ... FROM people) AS t 
 WHERE t.fullname LIKE '%user_submitted_data%'

This works reasonably well.

However, if you are worried about names such as 'Charles De Gaulle' (or 'Charles de Gaulle') or 'Michael van den Berg'), then the matching will fail if someone enters 'Charles Gaulle' or 'Michael Berg', let alone Michael Vandenberg. You would probably need to replace any space characters in the user input with a '%' symbol too. Even then, you face the problem that the words must appear in exactly the sequence given by the user - which may not matter, but you should consciously decide that it doesn't matter. For example, if the input is 'Adam John Smith', then the query won't catch 'John Adam Smith'; if the input is 'Smith, John', then it won't pick up anyone (most likely).

If you want to manage this, you probably need to tokenize the user's input, and search on the separate words. Beware of someone asking about a sub-string of a word (for example, someone asks about 'de' as a name word) - none of the queries at the moment ensures that the user input words match whole words in the values (John vs Johnson), and doing so with the SQL standard LIKE operator is near enough impossible.

Jonathan Leffler
Thanks for the in depth awnser. For some reason, it seems like MySQL is case-insensitive on my system (OSX), so it's been working all fine so far. As far as it goes with word order, the LIKE operator fits the context very well. I do not need (nor want) Johh Smith to come out when a user is searching John Adam Smith. The search sample is quite limited.
Jimmy
A: 

You can reference a computed column in the WHERE clause if you define that column in a subquery:

SELECT p.*
FROM (
  SELECT [some fields], CONCAT(firstname, ' ', 'lastname') as fullname 
  FROM people
) p
WHERE 
   p.firstname LIKE '%user_submitted_data%' OR 
   p.lastname LIKE '%user_submitted_data%' OR 
   p.fullname LIKE '%user_submitted_data%';

But honestly, for the type of search you're doing, LIKE with wildcards is a terrible solution. You should think about using a FULLTEXT index:

CREATE FULLTEXT INDEX people_names ON people(firstname, lastname);

SELECT *
FROM people
WHERE MATCH(firstname, lastname) AGAINST( ? );

PS: FULLTEXT indexes work only with the MyISAM storage engine. Another solution, even more speedy, is to use Sphinx Search for fulltext indexing.

Bill Karwin
A: 

Although using a subquery works well, it will have an impact because you are not hitting any indexes.

What about adding a computed column (firstname || ' ' || lastname) to the table and an index to it? Surely it would be much faster.

If you cannot do that I think that querying like

WHERE firstname || ' ' || lastname LIKE '%user_submitted_data%'

should still work faster than two ORs and one subquery.

Sklivvz
I don't think there's a way to do an indexed compute column in MySQL, or so show my researches. If that's doable, I'd love to hear about it.
Jimmy
You might be right, I think there is no way of using computed columns at all!?
Sklivvz