Hi,

I have a database table with five fields: name, address, idcard, town, street.

I then have a PHP search that accepts a single parameter. You can enter values for whichever fields you like in this parameter, and the results must still be accurate.

For example, if I enter "john doe skate street" in the search box, you will get all the records whose name is John Doe and who live in the street known as Skate Street.

Also, please note that the name is stored in one field in the order SURNAME NAME (I cannot change this because it is a requirement, so please don't tell me that it's better to change it... I know that :)). Obviously, I would like the name to be accepted in either order, be it NAME SURNAME or SURNAME NAME.

My idea is to treat the first two entered parameters as the name. I will tell my client that the first 2 parameters must be the name and surname, or the surname and name. Following the name, he may then enter any of the other attributes in whatever order he likes, i.e.

These will be accepted:

name surname idcard town street
surname name idcard town street
name surname town idcard street
surname name address street idcard

These will not be accepted:

idcard town name surname street
town name surname idcard

for the simple reason that accepting them would complicate the query too much (because the name and surname are in the same field).

If there is a way to support the latter without a lot of complications, I would like to hear it as well.

I would appreciate some help on this matter

Many thanks in advance

+3  A: 

The kind of search you seem to be after is not really the sweet spot for RDBMSs such as MySQL. And specifying the allowable search formats is not generally good for usability, unless it's a rather specific domain.

Keeping it very generic, the query to search three fields for "Persons Name Fake Street" could be something like this:

SELECT * FROM Users
WHERE (FirstName LIKE "%Persons%"
OR LastName LIKE "%Persons%"
OR Address LIKE "%Persons%")
AND (FirstName LIKE "%Name%"
OR LastName LIKE "%Name%"
OR Address LIKE "%Name%")
AND (FirstName LIKE "%Fake%"
OR LastName LIKE "%Fake%"
OR Address LIKE "%Fake%")
AND (FirstName LIKE "%Street%"
OR LastName LIKE "%Street%"
OR Address LIKE "%Street%")

This should find any user whose details match all of the given terms. However, it isn't very elegant, and it only gets worse with longer queries and more fields. It's also very inefficient and will quickly struggle as the table grows, because a LIKE pattern with a leading wildcard cannot use an ordinary index. Nor does it help you get the "best" matches at the top when there are numerous results.
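
A query of this shape can be generated from the search text rather than written by hand. The following is only a minimal sketch, written in Python against an in-memory SQLite database as a stand-in for MySQL; the `people` table, the sample rows and the function names are assumptions made for illustration. Each whitespace-separated term becomes one AND-ed group, each group ORs that term across the five columns from the question, and the terms are passed as bound parameters instead of being pasted into the SQL.

```python
import sqlite3

# Column names come from the question; the table name is hypothetical.
COLUMNS = ("name", "address", "idcard", "town", "street")


def build_query(search_text):
    """Return (sql, params): one AND-ed group per term, OR-ed across columns."""
    terms = search_text.split()
    if not terms:
        raise ValueError("search text is empty")
    group = "(" + " OR ".join(f"{col} LIKE ?" for col in COLUMNS) + ")"
    sql = "SELECT * FROM people WHERE " + " AND ".join([group] * len(terms))
    # One parameter per placeholder: each term is repeated once per column.
    # Note: % and _ inside a term still act as LIKE wildcards in this sketch.
    params = [f"%{term}%" for term in terms for _ in COLUMNS]
    return sql, params


def search(conn, search_text):
    sql, params = build_query(search_text)
    return conn.execute(sql, params).fetchall()


# Hypothetical sample data; names are stored as SURNAME NAME.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name, address, idcard, town, street)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
    [
        ("Doe John", "12", "1234M", "Springfield", "Skate Street"),
        ("Doe Jane", "7", "5678M", "Shelbyville", "Main Street"),
    ],
)

rows = search(conn, "john doe skate street")
print([row[0] for row in rows])
```

Because every term is checked against every column, the name can be typed as "john doe" or "doe john", and the remaining fields can appear in any position.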

A better solution might be to use MySQL's full-text indexing, creating a separate table that can be searched through the full-text index to find the relevant users. I don't know much about this solution.
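
As a rough idea of what the full-text route could look like: in boolean mode, MySQL's `MATCH ... AGAINST` treats a leading `+` as "this word must be present", so the search text can be turned into a boolean expression. The Python helper below only builds the SQL string and its parameter and does not connect to a database. The `people` table, the index name and the function names are assumptions, and for brevity the index is shown on the table itself rather than on a separate search table. Words shorter than the server's configured minimum length are not indexed, so very short terms may need separate handling.

```python
import re

# Assumed one-time setup in MySQL (hypothetical table and index names):
#   ALTER TABLE people
#     ADD FULLTEXT INDEX ft_people (name, address, idcard, town, street);
# The column list in MATCH() must be the same as the one in the index.

FULLTEXT_SQL = (
    "SELECT * FROM people "
    "WHERE MATCH(name, address, idcard, town, street) "
    "AGAINST (%s IN BOOLEAN MODE)"
)


def boolean_expression(search_text):
    """Turn 'john doe' into '+john +doe' so that every word is required."""
    # Keep only word characters, so anything the user types that would act
    # as a boolean-mode operator (+ - * " ( ) < > ~ @) is dropped.
    words = re.findall(r"\w+", search_text)
    return " ".join("+" + word for word in words)


def build_fulltext_query(search_text):
    """Return (sql, params) in the %s parameter style used by MySQL drivers."""
    return FULLTEXT_SQL, (boolean_expression(search_text),)


sql, params = build_fulltext_query("john doe skate street")
print(params[0])
```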

Another option may be to use an external indexing tool such as Lucene. While it adds more complexity, it allows extra functionality such as weighting of fields. So name could be seen as more important than address for example. This can also order the results in order of relevance.

The correct solution depends on your requirements (as always) but those are some ideas that might be worth investigating.

Brenton Alker
I will try to get the method you mentioned to work. As for usability and so on, this isn't an issue as this is an intranet site and therefore only their employees will be using it. Also, no additional fields will be added
Chris
Thanks for this. I have thousands of records, so it was taking too long for a search to complete. However, I read about full-text indexing and now the search takes only seconds. Many thanks!
Chris
A: 

This is less than perfect but so is your design so for what it's worth:

Tokenize the input parameter "name surname idcard town street", then concatenate the fields from the DB and do a series of LIKEs joined with ORs:

pseudocode
---------
where
name+address+idcard+town+street like %input_token1% or
name+address+idcard+town+street like %input_token2% or
...
name+address+idcard+town+street like %input_token5%
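
To make the pseudocode above concrete, here is a small sketch in Python against an in-memory SQLite database (standing in for the real one; the `people` table and the sample rows are made up). The fields are joined with a space so that a token cannot match across a field boundary. The `require_all` flag is an addition of this sketch: with OR, as in the pseudocode, a row matches when any token is found, while with AND every token must be found, which is closer to what the question asks for.

```python
import sqlite3

# SQLite concatenates with ||; in MySQL, CONCAT_WS(' ', ...) plays this role.
HAYSTACK = "name || ' ' || address || ' ' || idcard || ' ' || town || ' ' || street"


def search_concat(conn, search_text, require_all=False):
    tokens = search_text.split()
    if not tokens:
        return []
    joiner = " AND " if require_all else " OR "
    # One LIKE test on the concatenated fields per token.
    where = joiner.join(f"({HAYSTACK}) LIKE ?" for _ in tokens)
    params = [f"%{token}%" for token in tokens]
    sql = f"SELECT name FROM people WHERE {where}"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name, address, idcard, town, street)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
    [
        ("Doe John", "12", "1234M", "Springfield", "Skate Street"),
        ("Roe Mary", "7", "5678M", "Shelbyville", "Skate Street"),
    ],
)

print(search_concat(conn, "john skate"))                    # OR: any token
print(search_concat(conn, "john skate", require_all=True))  # AND: all tokens
```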
fupsduck
A: 

I would try to split the words using a regex and put them into a temporary table which you can then join with your Person table.

The regex might look something like this

preg_match_all('/\S+/', $input, $matches, PREG_PATTERN_ORDER);
foreach ($matches[0] as $word) {
    # $word holds one whitespace-separated search term, including the last one
}

A script that would find your results could look like this in SQL Server. I don't have enough MySQL knowledge to come up with an alternative in MySQL, but you should get the idea.

The gist of the idea is to join with the input table on every column you wish to search. The WHERE clause then checks that the input values are found in at least as many columns as there are input values.

DECLARE @Table TABLE (Name VARCHAR(20), Surname VARCHAR(20), Address VARCHAR(20), IDCard VARCHAR(20), Town VARCHAR(20), Street VARCHAR(20))
DECLARE @Inputs TABLE (Value VARCHAR(32))

INSERT INTO @Table VALUES ('Doe', 'John', 'Dontknow', 'Dontknow', 'US', 'Skate')
INSERT INTO @Inputs VALUES ('%John%')
INSERT INTO @Inputs VALUES ('%Doe%')
INSERT INTO @Inputs VALUES ('%Skate%')

SELECT  t.*
FROM    @Table t
        LEFT OUTER JOIN @Inputs i_name ON t.Name LIKE i_name.Value
        LEFT OUTER JOIN @Inputs i_surname ON t.Surname LIKE i_surname.Value
        LEFT OUTER JOIN @Inputs i_address ON t.Address LIKE i_address.Value
        LEFT OUTER JOIN @Inputs i_idcard ON t.IDCard LIKE i_idcard.Value
        LEFT OUTER JOIN @Inputs i_town ON t.Town LIKE i_town.Value
        LEFT OUTER JOIN @Inputs i_street ON t.Street LIKE i_street.Value
        CROSS APPLY (SELECT inputCount = COUNT(*) FROM @Inputs) cnt
WHERE   cnt.inputCount <= 
          CASE WHEN i_name.Value IS NULL THEN 0 ELSE 1 END
          + CASE WHEN i_surname.Value IS NULL THEN 0 ELSE 1 END
          + CASE WHEN i_address.Value IS NULL THEN 0 ELSE 1 END
          + CASE WHEN i_idcard.Value IS NULL THEN 0 ELSE 1 END
          + CASE WHEN i_town.Value IS NULL THEN 0 ELSE 1 END
          + CASE WHEN i_street.Value IS NULL THEN 0 ELSE 1 END
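
For readers who are not on SQL Server, a related idea can be sketched without table variables or CROSS APPLY: store the search terms in a temporary table and keep only the rows for which every term matches at least one column. The example below is a variant rather than a translation of the script above (it counts matched terms instead of matched columns). It uses Python with an in-memory SQLite database, and the `people` and `search_terms` names and the sample rows are assumptions.

```python
import sqlite3

# A row qualifies when the number of terms that match at least one of its
# columns equals the total number of terms.
QUERY = """
SELECT p.name
FROM people AS p
WHERE (SELECT COUNT(*)
       FROM search_terms AS s
       WHERE p.name    LIKE s.pattern
          OR p.address LIKE s.pattern
          OR p.idcard  LIKE s.pattern
          OR p.town    LIKE s.pattern
          OR p.street  LIKE s.pattern)
      = (SELECT COUNT(*) FROM search_terms)
"""


def search_with_terms_table(conn, search_text):
    terms = search_text.split()
    if not terms:
        return []
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS search_terms (pattern TEXT)")
    conn.execute("DELETE FROM search_terms")
    conn.executemany(
        "INSERT INTO search_terms VALUES (?)", [(f"%{t}%",) for t in terms]
    )
    return conn.execute(QUERY).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name, address, idcard, town, street)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
    [
        ("Doe John", "12", "1234M", "Springfield", "Skate Street"),
        ("Roe Mary", "7", "5678M", "Shelbyville", "Skate Street"),
    ],
)

print(search_with_terms_table(conn, "john doe skate"))
```

Counting terms rather than columns means a single term that happens to match two columns is still counted once.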
Lieven