ansaurus

Question

Answer 1

+2 A:

Don't totally follow the question and I'm a Postgres guy, but my best guess is that you need to wrap the parameter in quotes.

$q  = mysqli_query($id,"SELECT * FROM mixes WHERE `tracklist` LIKE '%{$_REQUEST['did']}%'");

Derek Gathright 2009-08-14 15:38:41

Actually, I'm pretty sure that what he wrote is valid php. The '.' just adds strings together, so if [did] contained 'test', his query would look like "... where `tracklist` LIKE %test%".

Michael Todd 2009-08-14 15:41:32

Since the parameter may have spaces in it, encapsulating in quotes is the proper approach, as shown above.

AvatarKava 2009-08-14 15:44:03

Ah. Never seen curly braces used in php before. Thanks for the info.

Michael Todd 2009-08-14 16:07:09

Answer 2

+1 A:

You might want to have a look at the full text indexes/search in MySQL. I think the natural language mode should be able to find results for the search string regardless of word order, punctuation etc.

Note that this only works with MyISAM tables.

Tom Haigh 2009-08-14 15:50:46

Trying a query like this: $q = mysqli_query($id,"SELECT * FROM mixes WHERE MATCH(tracklist) AGAINST ('".$_REQUEST['did']."')"); No success. The table is MyISAM as well.

2009-08-14 16:01:02

Is mixes.tracklist part of a FULLTEXT index?

outis 2009-08-14 19:42:40

"SELECT * FROM mixes WHERE MATCH(tracklist) AGAINST ('john hey johnnny');" works in my tests in that it found a row with "John - Hey (johnnny)", but it also had some false positives (eg it matched "The Ballad of John and Yoko, Johnny B. Goode, Hey Jude").

outis 2009-08-14 20:03:59

Answer 3

+1 A:

You need to wrap the LIKE parameter in single quotes. You have the following:

$q  = mysqli_query($id,"SELECT * FROM mixes WHERE `tracklist` LIKE %".$_REQUEST['did']."%");

It should be:

$q  = mysqli_query($id,"SELECT * FROM mixes WHERE `tracklist` LIKE '%".$_REQUEST['did']."%'");

Notice the two single quotes I added: one before the first % and one after the last %.

edit

So if you're getting the input 'john hey johnnny' and you're looking for 'John - Hey (johnnny)' in the database, then you're gonna have to manipulate the string you're searching for:

$search = str_replace(" ", "%", $_REQUEST['did']); // john%hey%johnnny

$query = "SELECT * FROM mixes WHERE `tracklist` LIKE '".$search."'";
$q  = mysqli_query($id, $query);

The % will match any spaces or other characters between the words you need. Note that LIKE has to match the whole column value, so the pattern as written won't match a row that ends in ). Add a % before and after the $search variable (i.e. '%".$search."%') to allow for the ending ) and any characters before the string you're looking for.
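
A minimal sketch of the pattern-building step described above, written as a hypothetical helper (buildLikePattern is not part of the original answer). It assumes the words are separated by whitespace. It also escapes LIKE's own wildcards (% and _) in the user's input so they match literally, which assumes MySQL's default backslash escape character:

```php
<?php
// Hypothetical helper (name and behavior are illustrative assumptions).
// Turns "john hey johnnny" into "%john%hey%johnnny%".
function buildLikePattern(string $input): string
{
    // Split on runs of whitespace, dropping empty pieces.
    $words = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);

    // Escape %, _ and backslash so user-typed wildcards match literally
    // (assumes MySQL's default LIKE escape character, the backslash).
    $escaped = array_map(function ($word) {
        return addcslashes($word, '%_\\');
    }, $words);

    // A % between words and at both ends lets anything surround them.
    return '%' . implode('%', $escaped) . '%';
}
```

The returned pattern still has to be quoted for SQL, so it is probably best bound to a LIKE ? placeholder in a prepared statement rather than concatenated into the query string.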

null 2009-08-14 17:08:25

Absolutely, but his query is still not going to work because 'john hey johnny' != 'John - Hey (johnnny)', which is actually what he's trying to search for. The only solutions I've found are pretty ugly (using sp's or functions to strip out non-alphanumeric chars from the string). Perhaps someone here will have a better solution than that.

Michael Todd 2009-08-14 18:28:11

Answer 4

A:

Some options are:

1. Write a function that strips non-alphanumeric characters and spaces, and apply it to the tracklist column in the search query. One big downside to this is your query won't be able to use any indices on tracklist.
2. Use some form of fuzzy matching. This will also most likely not take advantage of indices. There's Tom Haigh's suggestion, which will find the row in your example but may also return false positives. You could also use REGEXP/RLIKE, which don't perform fuzzy matching, but you can construct an expression that does:
```
$tracklist=preg_replace('/[^a-z]+/i', '[^[:alpha:]]+', $_REQUEST['did']);
$q  = mysqli_query($id,"SELECT * FROM `mixes` WHERE `tracklist` REGEXP '$tracklist'");
```
Using the example $_REQUEST['did'] of 'john hey johnnny', the resulting query is "SELECT * FROM `mixes` WHERE `tracklist` REGEXP 'john[^[:alpha:]]+hey[^[:alpha:]]+johnnny'". The above ignores non-alphabetic characters. If you want to ignore non-alphanumeric characters, try one of the following (the second doesn't ignore '_'):
```
$tracklist=preg_replace('/[^a-z0-9]+/i', '[^[:alnum:]]+', $_REQUEST['did']);


$tracklist=preg_replace('/\W+/', '[^[:alnum:]_]+', $_REQUEST['did']);
```
3. Restructure the tables and processing logic. This is the most work but will produce the conceptually cleanest implementation. I'm guessing that tracklist is a $delim (eg comma, semicolon, ...) separated list of track names. If so, mixes isn't even in first normal form. Create a new tracklist table:
```
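CREATE TABLE `tracklist` (
    mix INTEGER NOT NULL,
    position INTEGER NOT NULL,
    track INTEGER,
    PRIMARY KEY (mix, position),
    KEY (mix, track),
    KEY (track),
    -- MySQL has historically parsed but ignored inline REFERENCES on a
    -- column, so the foreign keys are declared as table constraints.
    FOREIGN KEY (mix) REFERENCES mixes (id),
    FOREIGN KEY (track) REFERENCES tracks (id)
) ENGINE=InnoDB;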
```
To find the mix for a given track, perform a join:
```
SELECT mixes.* FROM mixes 
    JOIN tracklist 
        ON mixes.id = tracklist.mix 
    WHERE tracklist.track=$trackid
```
Through the use of indices, the above query will be faster than the other two options. The main downside to this approach is you'll have to perform more queries to get the tracklist for a mix. An upside is it can improve tracklist editing. With appropriately defined indices, this option can still be faster than your current DB design. For example,
```
SELECT t2.mix, t2.position, tracks.*
    FROM tracklist AS t1 
    JOIN tracklist AS t2 
      ON t1.mix = t2.mix 
    JOIN tracks
      ON t2.track = tracks.id
    WHERE t1.track=:trackid
```
will be quite efficient if there are indices on tracklist.track, tracklist.mix and tracks.id, which is a very reasonable assumption, given the above definition of tracklist and that tracks.id is most likely a primary key column. MySQL only needs to examine (number of mixes containing track :trackid) + (total number of tracks in all mixes that contain :trackid) * 2 rows. If there are two mixes containing "John - Hey (johnnny)", with 8 and 10 tracks respectively, MySQL will examine 38 rows total (20 from tracklist and 18 from tracks). Options 1 and 2 will need to examine (number of mixes) rows. As long as the number of mixes is much larger than the average mix size and the track you're looking for doesn't appear in many of the mixes, this option is faster than the other two.
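
The row estimate above can be checked with a few lines of PHP. This is only a sketch of the arithmetic: estimateRowsExamined is a hypothetical name, and its input is assumed to be the track counts of the mixes that contain the track.

```php
<?php
// Hypothetical helper illustrating the estimate in the text:
// (mixes containing the track) + 2 * (total tracks in those mixes).
function estimateRowsExamined(array $mixSizes): int
{
    $mixesWithTrack = count($mixSizes);      // tracklist rows read via t1
    $tracksInMixes  = array_sum($mixSizes);  // read once via t2, once from tracks
    return $mixesWithTrack + 2 * $tracksInMixes;
}

echo estimateRowsExamined([8, 10]), "\n"; // 2 + 2 * 18 = 38
```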

Note that unless you sanitize $_REQUEST['did'] before constructing the query, your script is open to SQL injection ([2], [3]). The above example in option 2 that uses REGEXP is safe (sanitization is a side-effect of the preg_replace), but the safest, most general approach is to use prepared queries.

    // $tracklist is the pattern built with preg_replace in option 2
    $q = mysqli_prepare($id,"SELECT * FROM `mixes` WHERE `tracklist` REGEXP ?");
    $q->bind_param('s', $tracklist);
    $q->execute();
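
On the claim that the REGEXP example in option 2 is sanitized as a side effect of the preg_replace, here is a small sketch using the alphanumeric variant. buildTracklistRegex is a hypothetical wrapper name. PHP's preg_match is used only as a rough local stand-in for MySQL's REGEXP; the two regex engines are not identical.

```php
<?php
// Hypothetical wrapper around the preg_replace shown in option 2.
function buildTracklistRegex(string $input): string
{
    // Every run of non-alphanumeric characters (quotes included) is
    // replaced, so only letters, digits and the fixed class text remain.
    return preg_replace('/[^a-z0-9]+/i', '[^[:alnum:]]+', $input);
}

$pattern = buildTracklistRegex('john hey johnnny');
// $pattern is now: john[^[:alnum:]]+hey[^[:alnum:]]+johnnny

// Rough local check, with PCRE standing in for MySQL's REGEXP.
var_dump(preg_match('/' . $pattern . '/i', 'John - Hey (johnnny)') === 1);

// A quote in the input never reaches the SQL string.
var_dump(strpos(buildTracklistRegex("x' OR '1'='1"), "'") === false);
```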

You can also reuse prepared queries, which means the DB won't need to reparse the query, thus improving speed.

    function findMixFor($track) {
        global $id;
        static $q = Null;
        if (is_null($q)) {
           $q = mysqli_prepare($id,"SELECT * FROM `mixes` WHERE `tracklist` REGEXP ?");
        }
        $q->bind_param('s', $track);
        $q->execute();
        return $q;
    }

Note that findMixFor is just to illustrate reusing a prepared query. A proper implementation would isolate all the DB access in a data access layer and also handle errors.

You might also want to look into using PDO instead of mysqli. For one thing, its support for prepared queries is much richer. For another, it makes it easier to switch databases.

outis 2009-08-14 21:27:20

Can you tell I'm bored today?

outis 2009-08-14 22:20:35

Answer 5

A:

There's a broader question - why are you trying to match this and what is the problem you are trying to solve?

I think that your problem is much deeper - you are not solving the right problem. What are you really trying to do?

Larry Watanabe 2009-08-15 01:57:52


matching text to a row
