views:

226

answers:

2

I have a column on a mysql table that stores mostly numeric values, but sometimes strings. It's defined as VARCHAR(20). There is an index on this column for the first four characters.

ADD INDEX `refNumber` USING BTREE(`refNumber`(4));

Since the field is mostly numeric, it is useful for the user to be able to query for values that fall within an numeric range (e.g., 100 to 2000). If I use a numeric comparison, this index is not used.

WHERE refNumber BETWEEN 100 AND 2000

If I use a string comparison, I get some values I don't want (e.g., 10000 comes back when querying for a range of 100 to 2000).

WHERE refNumber BETWEEN '100' AND '2000'

Is there a good solution to this?

Note: there are some values that are recorded with zeros padded on the front like 0234, which should be returned when looking for values between 100 and 2000.

+1  A: 

Three possibilities

1) Separate the numeric values into their own column.

2) If you MUST keep things as they are, decide on a maximum length for the numbers, zero- or blank-pad the numbers to that length.

3) I don't know if MySQL supports function-based indexes, but that might be an option. if so, write a function that returns the extracted numeric value and use that as the basis of the index.

Jim Garrison
I suspect that padding or separating numeric values may be the options I'm faced with. I'm going to wait to see if there are other suggestions.
Bernard Chen
A: 

You can try using the string comparison first, so the index is used, and still do the numeric comparison afterwards. It shouldn't slow things too much, since the second filter will only apply to a small subset of the rows.

WHERE refNumber BEETWEEN '100' AND '2000' AND CAST(refNumber as SIGNED INTEGER) BEETWEEN 100 AND 2000

danielkza
What does `cast()` do with non-numeric values in MySQL?
Jim Garrison
It's an interesting suggestion. I think the CAST does not work for non-numeric values, and it ends up tossing out those rows when using the BETWEEN operator with numeric values. I did get some warnings for numeric non-integer values though.For my particular case, I found that I had some values like 0234, which did not match the first "BETWEEN" clause with the strings. I'll edit the original question to indicate there are some values that have been padded.
Bernard Chen