Looking for a simple SQL (PostgreSQL) regular expression or similar solution (maybe soundex) that allows a flexible search, so that dashes, spaces and similar characters are ignored and only the raw characters are compared against the table.

Currently using:

SELECT * FROM Productions WHERE part_no ~* '%search_term%'

If user types UTR-1 it fails to bring up UTR1 or UTR 1 stored in the database.

The match fails whenever a part_no contains a dash and the user omits it (or vice versa).

Example: a search for part UTR-1 should find all of the values below.

UTR1 
UTR --1 
UTR  1

Any suggestions?

A: 
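SELECT  *
FROM    Productions
WHERE   REGEXP_REPLACE(part_no, '[^[:alnum:]]', '', 'g') = REGEXP_REPLACE('UTR-1', '[^[:alnum:]]', '', 'g')

The 'g' flag matters: without it, PostgreSQL's REGEXP_REPLACE replaces only the first match, so 'UTR --1' would not be reduced to 'UTR1'.

Create an index on REGEXP_REPLACE(part_no, '[^[:alnum:]]', '', 'g') for this to work fast.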
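
A minimal sketch of such an expression index, assuming the 'g' flag is used so that every non-alphanumeric character is stripped. The index name is hypothetical, and the planner can only use the index when the query repeats the indexed expression exactly:

```sql
-- Hypothetical index name; the indexed expression strips everything
-- that is not a letter or digit from part_no.
CREATE INDEX productions_part_no_norm_idx
    ON Productions (REGEXP_REPLACE(part_no, '[^[:alnum:]]', '', 'g'));

-- The left-hand side repeats the indexed expression verbatim,
-- so this lookup is eligible to use the index above.
SELECT  *
FROM    Productions
WHERE   REGEXP_REPLACE(part_no, '[^[:alnum:]]', '', 'g')
      = REGEXP_REPLACE('UTR-1', '[^[:alnum:]]', '', 'g');
```

If the case-insensitive behaviour of ~* should be kept, wrapping both sides (and the indexed expression) in UPPER(...) is one way to do it.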

Quassnoi
A: 

You may well find the official, built-in (from 8.3 at least) full-text search capabilities in PostgreSQL worth looking at:

http://www.postgresql.org/docs/8.3/static/textsearch.html

For example:

It is possible for the parser to produce overlapping tokens from the 
same piece of text. 

As an example, a hyphenated word will be reported both as the entire word 
and as each component: 

SELECT alias, description, token FROM ts_debug('foo-bar-beta1');

      alias      |               description                |     token     
-----------------+------------------------------------------+---------------
 numhword        | Hyphenated word, letters and digits      | foo-bar-beta1
 hword_asciipart | Hyphenated word part, all ASCII          | foo
 blank           | Space symbols                            | -
 hword_asciipart | Hyphenated word part, all ASCII          | bar
 blank           | Space symbols                            | -
 hword_numpart   | Hyphenated word part, letters and digits | beta1
davek