views:

503

answers:

7

I need to add the ability for users of my software to select records by character ranges.
How can I write a query that returns all widgets from a table whose name falls in the range Ba-Bi for example?

Currently I'm using greater than and less than operators, so the above example would become:

select * from widget
where name >= 'ba' and name < 'bj'

Notice how I have "incremented" the last character of the upper bound from i to j so that "bike" would not be left out.

Is there a generic way to find the next character after a given character based on the field's collation or would it be safer to create a second condition?

select * from widget
where name >= 'ba'
and (name < 'bi' or name like 'bi%')

My application needs to support localization. How sensitive is this kind of query to different character sets?

I also need to support both MSSQL and Oracle. What are my options for ensuring that character casing is ignored no matter what language appears in the data?

A: 

For MSSQL see this thread: http://bytes.com/forum/thread483570.html .

For Oracle, it depends on your Oracle version, as Oracle 10 now supports regex(p) like queries: http://www.psoug.org/reference/regexp.html (search for regexp_like ) and see this article: http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/rischert_regexp_pt1.html

HTH

Zsolt Botykai
A: 

Frustratingly, the Oracle substring function is SUBSTR(), whilst it SQL-Server it's SUBSTRING().

You could write a simple wrapper around one or both of them so that they share the same function name + prototype.

Then you can just use

MY_SUBSTRING(name, 2) >= 'ba' AND MY_SUBSTRING(name, 2) <= 'bi'

or similar.

cagcowboy
A: 

You could use this...

select * from widget
where name Like 'b[a-i]%'

This will match any row where the name starts with b, the second character is in the range a to i, and any other characters follow.

G Mastros
I believe, at least in MySQL, that would be "rlike", not "like".
Nouveau
What about if the user selects the range Ba-Ci or even Abc-Def?
Nathan Baulch
A: 

I think that I'd go with something simple like appending a high-sorting string to the end of the upper bound. Something like:

select * from widgetwhere name >= 'ba' and name <= 'bi'||'~'

I'm not sure that would survive EBCDIC conversion though

David Aldridge
+3  A: 

Let's skip directly to localization. Would you say "aa" >= "ba" ? Probably not, but that is where it sorts in Sweden. Also, you simply can't assume that you can ignore casing in any language. Casing is explicitly language-dependent, with the most common example being Turkish: uppercase i is İ. Lowercase I is ı.

Now, your SQL DB defines the result of <, == etc by a "collation order". This is definitely language specific. So, you should explicitly control this, for every query. A Turkish collation order will put those i's where they belong (in Turkish). You can't rely on the default collation.

As for the "increment part", don't bother. Stick to >= and <=.

MSalters
A: 

Let's skip directly to localization. Would you say "aa" >= "ba" ? Probably not, but that is where it sorts in Sweden. Also, you simply can't assume that you can ignore casing in any language. Casing is explicitly language-dependent, with the most common example being Turkish: uppercase i is İ. Lowercase I is ı.

Now, your SQL DB defines the result of <, == etc by a "collation order". This is definitely language specific. So, you should explicitly control this, for every query. A Turkish collation order will put those i's where they belong (in Turkish). You can't rely on the default collation.

As for the "increment part", don't bother. Stick to >= and <=.

MSalters
A: 

You could also do it like this:

select * from widget
where left(name, 2) between 'ba' and 'bi'

If your criteria length changes (as you seemed to indicate in a comment you left), the query would need to have the length as an input also:

declare @CriteriaLength int
set @CriteriaLength = 4
select * from widget
where left(name, @CriteriaLength) between 'baaa' and 'bike'
Carlton Jenke