Users will also search on part of a name, not only on whole words separated by spaces. If they type "Matt", I expect to retrieve "Matthew" too.

+9  A: 

You can use:

SELECT *
FROM mytable
WHERE name LIKE '%matt%'
Aaron
Isn't this very slow...
TIMEX
Not for 400 records (or even for 40,000, for that matter). Start worrying in the hundreds of thousands. Besides, there is no better simple option (I guess you could index everything and every word in a tree index :-DDD).
Itay Moav
@alex: Wildcarding the left side of the text you are looking for will mean that an index, if one exists on the column, can't be used.
OMG Ponies
@OMG, I would expect any decent SQL database to table-scan 400 or even 400,000 rows in less time than it takes to print out the results.
Paul Tomblin
If you're worried about it, create an index on that column. I don't know if that will help much, but it's worth a try. See if it makes things faster, if they truly are slow. However, I wouldn't worry about it unless that query actually proves to be slow in your tests. As has been mentioned by others, with 400 rows, it is very unlikely to be noticeable.
Aaron
Be aware too that just because an index exists, doesn't mean the optimizer will use it.
OMG Ponies
@OMG Ponies: he has 400 records. By the way, does your nick come from http://lfgcomic.com/page/42? And no, he didn't mean to address me. I wrote exactly what he (@Paul Tomblin) wrote; I just forgot to put the @alex at the beginning.
Itay Moav
@OMG, I thought you were the one expressing concern about the fact that the indexes wouldn't be used in the LIKE query. I think @Itay was postulating something a lot more complicated than a LIKE query.
Paul Tomblin
You could always try EXPLAIN (assuming that functionality is available in his DBMS) to see whether the index is being used.
Aaron
@Paul Tomblin: no, no, I meant exactly what you meant. I suggested that the next step to increase the performance of such a search is not easy, and not worth your time to implement until you have millions of users and records.
Itay Moav
@Itay Moav: Sorry, never seen the webcomic before. I chose the nick for the sake of internet humour.
OMG Ponies
@Paul Tomblin: I strive to provide answers that scale well regardless of record count. And be aware that we only have *assumptions* on the scope of searching that needs to take place.
OMG Ponies
+1  A: 

You have the following options:

  1. Full Text Search (FTS)
  2. Regular Expressions
  3. LIKE Using wildcards

...in that order of preference.

OMG Ponies
+11  A: 
SELECT *
FROM mytable
WHERE name LIKE 'matt%' OR name LIKE '%[ ,-/]matt%'

Notes:
1) Fancy wildcard. The reason for not using the simpler LIKE '%xyz%' form is that, depending on xyz, the database could return many irrelevant records, for example "Jeff Zermatt" in the case of the "Matt" search.
The brackets in the second pattern list the delimiters that may indicate a break between words. Bracketed character classes in LIKE are a SQL Server extension, so other databases may need a different construct. An alternative pattern would be [^A-Z0-9] (which may yield a few O'Brian when searching for "brian", but maybe that is not a bad thing).
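
For databases whose LIKE has no bracket classes, the word-start idea can be approximated with one plain pattern per delimiter. Below is a minimal sketch, assuming Python's sqlite3 module and an in-memory table purely for illustration; the function name, the sample rows and the delimiter list are made up, and the search term is not escaped for % or _.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT)")
conn.executemany(
    "INSERT INTO mytable (name) VALUES (?)",
    [("Matthew Jones",), ("Jeff Zermatt",), ("Anne-Matty Roe",), ("Smith, Matt",)],
)

def search_word_start(conn, term, delimiters=" ,-./"):
    # One pattern for "name starts with term", plus one per delimiter
    # for "term starts a later word". SQLite's LIKE ignores ASCII case.
    patterns = [term + "%"] + ["%" + d + term + "%" for d in delimiters]
    where = " OR ".join("name LIKE ?" for _ in patterns)
    rows = conn.execute(
        f"SELECT name FROM mytable WHERE {where} ORDER BY name", patterns
    )
    return [row[0] for row in rows]

print(search_word_start(conn, "matt"))
# ['Anne-Matty Roe', 'Matthew Jones', 'Smith, Matt']
```

"Jeff Zermatt" is left out because "matt" never follows the start of the string or a delimiter there.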

2) Performance. Because there are so few records in this table, the front wildcard approach is quite feasible, and certainly the easiest. No reason to search any further!
If the records happen to be very wide (many fields, some of them more than 30 characters long), you can create an index on name. The leading wildcard will still require a scan, but the scan will be on the index, which is narrower and hence fits more readily in the cache.
Indeed, if rather than a SELECT * this query targets only a few of the fields of mytable (and if the table's records are "wide"), you can create an index made of all these fields.
Should the number of records grow past, say, 50,000 (and, to a lesser degree, should the application hit the database with similar queries at a rate above, say, 40 per minute), you may consider introducing more efficient ways of dealing with keywords: a Full Text Catalog or a "hand made" table of the individual keywords.
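
Whether an index is actually used is easy to check rather than guess. The sketch below assumes SQLite through Python's sqlite3 module, chosen only because it runs anywhere; the table layout and index name are made up, and other DBMSs have their own EXPLAIN syntax and planner rules. The NOCASE collation is there because SQLite only considers an index for its case-insensitive LIKE when the index uses that collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, name TEXT COLLATE NOCASE, notes TEXT)"
)
conn.execute("CREATE INDEX idx_name ON mytable (name)")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

prefix_plan = plan("SELECT name FROM mytable WHERE name LIKE 'matt%'")
leading_plan = plan("SELECT name FROM mytable WHERE name LIKE '%matt%'")

print(prefix_plan)   # expected to describe an index SEARCH (range lookup)
print(leading_plan)  # expected to describe a SCAN (every entry is visited)
```

The exact wording of the plan text varies between SQLite versions, so treat the output as something to read, not to parse.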

3) Advantages of another approach. A solution whereby the application maintains a table of the individual keywords, readily parsed from the full name, not only provides better scaling (when the table and/or usage grows) but also improves the quality of the search.
For example, it may improve the effective recall by introducing common nicknames of first names (Bill, Will or Billy for William, Dick for Richard, Jack or Johnny for John, etc.). Another possibility opened up by a more sophisticated approach is a Soundex or modified Soundex encoding of the name tokens, allowing users to locate names even when they misspell or do not know the precise spelling (e.g. Wilmson vs. Wilmsen vs. Willmsonn).
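
A rough sketch of such a keyword table, again assuming SQLite via Python's sqlite3 module. The table names (people, keywords), the tiny NICKNAMES map and the tokenizing rule are all hypothetical; Soundex is left out to keep it short.

```python
import re
import sqlite3

# Hypothetical nickname map; a real one would be far larger.
NICKNAMES = {
    "william": ["bill", "will", "billy"],
    "richard": ["dick"],
    "john": ["jack", "johnny"],
}

def build_keyword_tables(conn, full_names):
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, full_name TEXT)")
    conn.execute("CREATE TABLE keywords (keyword TEXT, person_id INTEGER)")
    conn.execute("CREATE INDEX idx_keyword ON keywords (keyword)")
    for full_name in full_names:
        cur = conn.execute("INSERT INTO people (full_name) VALUES (?)", (full_name,))
        person_id = cur.lastrowid
        # Split on anything that is not a letter or digit, then lowercase.
        tokens = [t.lower() for t in re.split(r"[^A-Za-z0-9]+", full_name) if t]
        for token in tokens:
            # Store the token itself plus any known nicknames for it.
            for keyword in [token] + NICKNAMES.get(token, []):
                conn.execute(
                    "INSERT INTO keywords (keyword, person_id) VALUES (?, ?)",
                    (keyword, person_id),
                )

def search_keywords(conn, term):
    # Prefix match only: no leading wildcard is needed any more.
    rows = conn.execute(
        "SELECT DISTINCT p.full_name FROM people p "
        "JOIN keywords k ON k.person_id = p.id "
        "WHERE k.keyword LIKE ? ORDER BY p.full_name",
        (term.lower() + "%",),
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
build_keyword_tables(
    conn, ["William Gates", "Matthew O'Brian", "Jeff Zermatt", "Richard Roe"]
)
print(search_keywords(conn, "bill"))   # ['William Gates']
print(search_keywords(conn, "matt"))   # ["Matthew O'Brian"]
print(search_keywords(conn, "brian"))  # ["Matthew O'Brian"]
```

Because every lookup is now a prefix match on keyword, the query no longer depends on a leading wildcard.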

mjv
Aarrgghh soundex considered sophisticated aarrgghh bad luck about Wilmson/Wilson/Nilsen/Milson or Wilhelmsen/Vilhelmsen
John Machin
A: 

If you are searching for the names from application code in some programming language, you can use its regular expression package, for example java.util.regex in Java.
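
For illustration, here is the regular-expression idea sketched with Python's re module, not Java, to keep it short; java.util.regex.Pattern supports the same lookbehind construct. The function names and sample data are made up.

```python
import re

def word_start_matcher(term):
    # Match the term only at the start of the string or right after a
    # non-alphanumeric character; re.escape keeps user input literal.
    return re.compile(r"(?<![A-Za-z0-9])" + re.escape(term), re.IGNORECASE)

def search_regex(names, term):
    pattern = word_start_matcher(term)
    return [name for name in names if pattern.search(name)]

names = ["Matthew Jones", "Jeff Zermatt", "Sean O'Matty"]
print(search_regex(names, "matt"))
# ['Matthew Jones', "Sean O'Matty"]
```

This filters rows already loaded into the application, so it suits small tables such as the 400 records discussed here.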

harigm