tags:

views:

52

answers:

4

In MySQL table, I have a field called "Tag", which may include multiple comma-delimited values.

I want to select rows where the Tag field contain the value of "1".

How would I write my MySQL statement so I select rows with Tag values of "1" and "1,Cars", but exclude the rows with Tag values of "17" and "17,Cars"?

The problem that I'm using the "LIKE" operator, but that causes all four of these rows to be selected.

Thanks.

+1  A: 

You would probably have to do something like

WHERE Tag LIKE "1,%"
OR Tag LIKE "%,1,%"
OR Tag LIKE "%,1"
OR Tag = "1"

This should then cover all the options.

You might want to rather have a look at using Regular Expressions

astander
+1  A: 

If you just want to select those with a 1 in them, you can use:

where colm like '1,%'
   or colm like '%,1,%'
   or colm like '%,1'
   or colm = '1'

But you should be aware that this will be a performance killer. If you ever find yourself needing to manipulate things that are smaller than a column, your database schema is set up badly. The reason why that query above will not perform well is that it is not possible to use indexes to quickly locate rows satisfying the query. It will either need a full table or index scan to get the rows.

You would be better off re-engineering the schema to break the comma-separated stuff out into rows in another table.

An example of that would be something like:

PrimaryTable:
    id          integer       primary key
    other_stuff varchar(250)
SecondaryTable:
    primary_id  integer       references PrimaryTable(id)
    int_val     integer
    char_val    varchar(20)
    primary key (primary_id,int_val)
    index       (int_val)

This will allow you to write blindingly fast queries as opposed to the slow stuff you're proposing:

select p.id, p.other_stuff
from PrimaryTable p, SecondaryTable s
where p.id = s.primary_id
  and s.int_val = 1;

(or the equivalent explicit join syntax).

The reason this solution works faster is because it can use an index on SecondaryTable.int_val to quickly retrieve the relevant rows and the primary key of both tables to cross-match.

paxdiablo
You forgot to cover the case where the list of tags contains only a single value. See @astander's answer.
Asaph
@Asaph: okay, I'll cover that one as well but my advice is really to _not_ do it that way since it's a less-than-optimal schema (being as tactful as possible there). You'd be better off bypassing that altogether and creating a decent one.
paxdiablo
@paxdiablo: Absolutely true. I didn't mean to imply that it was a good solution. I just wanted it to be a correct one. I wanted to upvote your answer but was unable to because of the omission. I think your answer could still be improved with an explanation of *why* the original way is slow and the alternative is "blindingly fast". If programmers don't have an appreciation for what it means to do a full table scan and how important indexes are in a database, they'll continue to produce schemas like the one in this question.
Asaph
A: 

You can use word boundaries with REGEX as:

SELECT TAG 
FROM TABLE
WHERE TAG REGEXP '[[:<:]]1[[:>:]]';
codaddict
+2  A: 

FIND_IN_SET will search a comma-separated list, e.g.:

SELECT FIND_IN_SET('b','a,b,c,d');

In this case, something along the lines of

SELECT tag FROM table
WHERE FIND_IN_SET('1', tag) > 0

should do the trick.

EDIT: It actually returns 0 when no match is found, so the NULL check was wrong.

bnaul
+1 for FIND_IN_SET
Rocket
Do you really need the `IS NOT NULL` at the end?
Rocket
You don't really, especially because it doesn't actually return NULL. Thanks for the heads up. But I assume you meant that you could just use WHERE FIND_IN_SET(...)? That works just as well, I just prefer the more explicit condition.
bnaul
Thanks! Works perfectly (and fast).
JMan
If that's working fast, then you have a small database :-) In which case it's a good solution. Just be aware of the limitations of per for functions in your selects as your tables grow larger.
paxdiablo