I have a MySQL table in which a column contains string prefixes. For instance, these prefixes could be top-level directories on a Unix file system:

my_table:    
+---------+
| prefix  |
+---------+
|  /usr/  |
|  /bin/  |
|  /var/  |
|  /lib/  |
+---------+

How can I write a query that efficiently finds all rows in this table where the value of the prefix column is the beginning of a given string?

For instance, given the string '/usr/bin/cat', how can I write a query that finds the row containing '/usr/', which is the beginning of '/usr/bin/cat'?

My first guess is to use LIKE this way:

SELECT * FROM my_table
WHERE '/usr/bin/cat' LIKE CONCAT(prefix, '%')

But I'm afraid this query won't be using the index I have on the prefix column.

I also came up with the following:

SELECT * FROM my_table
WHERE prefix <= '/usr/bin/cat' ORDER BY prefix DESC LIMIT 1

This retrieves the prefix equal to or immediately preceding '/usr/bin/cat' in lexicographical order. I can then verify whether '/usr/bin/cat' actually begins with that prefix.

But that only works with a single row and I wonder if that's the optimal solution.

Edit: I used root directories as an example, but I'd like to know if there's a way to deal with arbitrary strings as well. Perhaps these strings won't contain path separators, or the prefix could be several levels deep, say '/usr/lib'.

Edit: It seems that my second query is bogus. '/usr/' is smaller than '/usr/bin/cat' but so is '/usr/a'. That query is still much faster than a full table scan on a large table but to make it work I have to fetch more rows and go through them until I find the first actual prefix.

So it seems an index can help in this kind of prefix search but I still don't know the best way to take advantage of it.
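One way to act on this, as a minimal sketch rather than a definitive answer: every non-empty prefix of a string sorts between the string's first character and the string itself, so the search can be limited to that range (two constants the index can use) and each candidate checked afterwards. The example uses Python's sqlite3 with an in-memory table purely as a stand-in for MySQL. The function name `find_by_range` and the extra row '/usr/a' are hypothetical, and the ordering argument assumes a binary (byte-wise) collation.

```python
import sqlite3


def find_by_range(conn, s):
    """Return stored prefixes that s starts with, longest first."""
    sql = """
        SELECT prefix FROM my_table
        WHERE prefix BETWEEN substr(:s, 1, 1) AND :s   -- range on constants
          AND substr(:s, 1, length(prefix)) = prefix   -- keep real prefixes only
        ORDER BY prefix DESC
    """
    return [row[0] for row in conn.execute(sql, {"s": s})]


# In-memory stand-in for the MySQL table; '/usr/a' is a made-up decoy row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (prefix TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO my_table (prefix) VALUES (?)",
    [("/usr/",), ("/bin/",), ("/var/",), ("/lib/",), ("/usr/a",)],
)

print(find_by_range(conn, "/usr/bin/cat"))  # ['/usr/']
```

The range only excludes rows that sort outside [first character, string], so when most prefixes share a first character (as with paths starting with '/'), many candidates still have to be checked. In MySQL the second condition would presumably be written with LEFT and CHAR_LENGTH instead of substr and length.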

+1  A: 

Replace ? with your string.

SELECT *
FROM my_table
WHERE prefix = LEFT(?, LOCATE('/', ?, 2))

You're right in that you want to keep the column on the left side of the expression in order to use the index on your WHERE clause. You can do some manipulation on the string to get the constant to compare to.

Alternatively, can you truncate the string in your application?
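If truncating in the application is an option, a minimal sketch in Python could look like the following. It mirrors LEFT(?, LOCATE('/', ?, 2)) by cutting the string just after the second separator. The function name `top_level_prefix` is made up for illustration, and the application language is an assumption.

```python
def top_level_prefix(path, sep="/"):
    """Return path up to and including the first separator found after
    the first character, or '' if there is none."""
    end = path.find(sep, 1)   # -1 when no separator follows the first character
    return path[: end + 1]    # path[:0] == '' in that case, like LEFT(?, 0)


print(top_level_prefix("/usr/bin/cat"))  # /usr/
```

The result can then be passed as a plain constant, as in WHERE prefix = ?, which leaves the column alone on the left side of the comparison.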

Edit

Just one solution of many if you want it to work for any prefix:

SELECT *
FROM my_table
WHERE prefix = LEFT(?, CHAR_LENGTH(prefix))

However, since the right side of the WHERE clause is not a constant, but a function on the column, MySQL will have to scan every row. It won't use the index on prefix to satisfy the WHERE clause.

Ideally, you want a column on the left side and a constant on the right.
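One way to get that shape for arbitrary prefixes, offered as a sketch rather than a tested MySQL solution: build every non-empty prefix of the string in the application and pass them all as constants in an IN list. The example below uses Python's sqlite3 with an in-memory table as a stand-in for MySQL. The helper names `all_prefixes` and `find_prefix_rows` and the extra row '/usr/bin/' are hypothetical.

```python
import sqlite3


def all_prefixes(s):
    """Every non-empty prefix of s, shortest first."""
    return [s[:i] for i in range(1, len(s) + 1)]


def find_prefix_rows(conn, s):
    """Return stored prefixes that s starts with, using only constants."""
    candidates = all_prefixes(s)
    if not candidates:          # an empty IN () list is not valid in MySQL
        return []
    placeholders = ", ".join("?" for _ in candidates)
    sql = (
        "SELECT prefix FROM my_table "
        f"WHERE prefix IN ({placeholders}) ORDER BY prefix"
    )
    return [row[0] for row in conn.execute(sql, candidates)]


# In-memory stand-in for the MySQL table; '/usr/bin/' is a made-up extra row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (prefix TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO my_table (prefix) VALUES (?)",
    [("/usr/",), ("/bin/",), ("/var/",), ("/lib/",), ("/usr/bin/",)],
)

print(find_prefix_rows(conn, "/usr/bin/cat"))  # ['/usr/', '/usr/bin/']
```

The IN list grows with the length of the search string, not with the size of the table, so this is likely to suit short strings best. Whether MySQL actually uses the index for it is worth confirming with EXPLAIN.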

Marcus Adams
It's the perfect answer in the case of filesystem paths, but is there a good way of *finding columns containing the beginning of a string* in other cases as well? I may not know where to cut the string in advance.
Alexandre Jasmin
@Alexandre, I updated my answer.
Marcus Adams
@Marcus What about the second query in my question? I believe it uses indexes and it actually has a constant on the right side of the comparison in the WHERE clause.
Alexandre Jasmin