Users will also search on part of a name, not only on whole words separated by spaces. If they type "Matt", I expect to retrieve "Matthew" too.
You have the following options:
- Full Text Search (FTS)
- Regular Expressions
- LIKE using wildcards
...in that order of preference.
SELECT *
FROM mytable
WHERE name LIKE 'matt%' OR name LIKE '%[ ,-/]matt%'
Notes:
1) Fancy wildcard. The reason for not using the simpler LIKE '%xyz%' form is that, depending on xyz, the database could return many irrelevant records, for example "Jeff Zermatt" for a "Matt" search.
The brackets in the second pattern list the delimiters that may indicate a break between words. An alternative pattern would be [^A-Z0-9] (which may return a few O'Brian when searching for brian, but that may not be a bad thing...).
2) Performance. Because there are so few records in this table, the front wildcard approach is quite feasible, and certainly the easiest approach. No reason to search any further!
If the records happen to be very wide (many fields, some of them more than 30 characters long), you can create an index on name. The leading wildcard will still require a scan, but the scan will run on the index, which is narrower and therefore fits more readily in the cache.
Indeed, if rather than a SELECT * this query targets only a few of the fields of the myTable table (and if this table's records are "wide"), you can create an index made of all these fields.
Should the number of records grow past, say, 50,000 (and, to a lesser degree, should the application "hit" the database with similar queries at a rate above, say, 40 per minute), you may consider more efficient ways of dealing with keywords: a Full Text Catalog or a "hand made" table with the individual keywords.
3) Advantages of another approach. A solution in which the application maintains a table of the individual keywords, already parsed from the full name, not only scales better (when the table and/or usage grows) but also allows improvements in the quality of the search.
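To make the keyword-table idea concrete, here is a rough in-memory sketch in Java. It is only an illustration under assumptions: the class `NameKeywordIndex`, its method names, and the choice to split on anything that is not a letter or digit are all hypothetical. In a real database this would be a separate, indexed table of (keyword, record id) rows.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for a "keywords" table: keyword -> record ids.
class NameKeywordIndex {
    // Sorted map, so all keywords sharing a prefix sit next to each other,
    // much like a range scan on an index.
    private final TreeMap<String, Set<Integer>> keywordToIds = new TreeMap<>();

    // Split the full name on anything that is not a letter or digit
    // and store each resulting token in lower case.
    void add(int recordId, String fullName) {
        for (String token : fullName.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                keywordToIds.computeIfAbsent(token, k -> new TreeSet<>()).add(recordId);
            }
        }
    }

    // Return the ids of records having at least one keyword that starts with the prefix.
    Set<Integer> searchByPrefix(String prefix) {
        String lo = prefix.toLowerCase(Locale.ROOT);
        Set<Integer> ids = new TreeSet<>();
        if (lo.isEmpty()) {
            return ids;
        }
        // Keywords starting with lo sort inside the half-open range [lo, lo + '\uffff').
        for (Set<Integer> hit : keywordToIds.subMap(lo, lo + Character.MAX_VALUE).values()) {
            ids.addAll(hit);
        }
        return ids;
    }
}
```

With "Matthew Smith" and "Jeff Zermatt" indexed, a search for "Matt" would return only the first record, because "zermatt" is stored as its own keyword and does not start with "matt".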
For example, it may improve the effective recall by introducing common nicknames of first names (Bill, Will or Billy for William, Dick for Richard, Jack or Johnny for John, etc.). Another possibility opened by a more sophisticated approach is a Soundex or modified Soundex encoding of the name tokens, allowing users to locate names even when they misspell or do not know the precise spelling (e.g. Wilmson vs. Wilmsen vs. Willmsonn).
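Both ideas can be sketched in a few lines of Java. Treat this as an illustration rather than a reference implementation: the class `NameTokenEncoder` and its method names are hypothetical, the nickname table holds only the examples mentioned above, and `soundex` follows the usual American Soundex rules as I understand them (first letter kept, three digits, H and W ignored between consonants).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper that normalizes a name token before it is stored or searched.
class NameTokenEncoder {
    // Tiny illustrative nickname table; a real one would be far larger.
    private static final Map<String, String> NICKNAMES = new HashMap<>();
    static {
        NICKNAMES.put("bill", "william");
        NICKNAMES.put("will", "william");
        NICKNAMES.put("billy", "william");
        NICKNAMES.put("dick", "richard");
        NICKNAMES.put("jack", "john");
        NICKNAMES.put("johnny", "john");
    }

    // Map a nickname to its canonical first name; unknown tokens are returned in lower case.
    static String canonicalFirstName(String token) {
        String key = token.toLowerCase(Locale.ROOT);
        return NICKNAMES.getOrDefault(key, key);
    }

    // American Soundex: first letter plus three digits, padded with zeros.
    static String soundex(String name) {
        // Soundex digit for each letter A..Z; '0' marks vowels and other uncoded letters.
        final String codes = "01230120022455012623010202";
        StringBuilder out = new StringBuilder();
        char prev = '0';
        for (char c : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c < 'A' || c > 'Z') {
                continue; // skip apostrophes, hyphens, digits, non-ASCII letters
            }
            char code = codes.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c); // the first letter is kept as-is
                prev = code;
                continue;
            }
            if (c == 'H' || c == 'W') {
                continue; // H and W do not separate two letters with the same code
            }
            if (code != '0' && code != prev) {
                out.append(code);
                if (out.length() == 4) {
                    break;
                }
            }
            prev = code; // a vowel resets prev, so a repeated code after it counts again
        }
        while (out.length() < 4) {
            out.append('0');
        }
        return out.toString();
    }
}
```

Storing `soundex(token)` next to each keyword lets a search compare codes instead of spellings: Wilmson, Wilmsen and Willmsonn all encode to W452.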
If you are searching for the names from application code rather than in SQL, you can use a regular expression library, for example the java.util.regex package in Java.
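A minimal sketch of what that could look like, assuming the names are already loaded in the application. The class `NamePrefixMatcher` and its `matches` method are made up for the example; `Pattern.compile`, `Pattern.quote` and the `\b` word boundary are standard `java.util.regex` features.

```java
import java.util.regex.Pattern;

// Hypothetical helper: does any word in the full name start with the typed text?
class NamePrefixMatcher {
    static boolean matches(String typed, String fullName) {
        // \b anchors the match at the start of a word; Pattern.quote stops
        // characters in the user's input from being read as regex syntax.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(typed),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        return p.matcher(fullName).find();
    }
}
```

Here `matches("Matt", "Matthew Smith")` is true while `matches("Matt", "Jeff Zermatt")` is false, and `matches("brian", "O'Brian")` is true because the apostrophe counts as a word break. Note that `\b` assumes the typed text begins with a letter, digit or underscore.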