views:

27

answers:

2

I have a column TypeCode varchar(20) that has values like '1', '2', '3', 'FOO', 'BAR'. I need to find the maximum integer which is less than the value of parameter. Something like this:

select max(TypeCode) TypeCode
    from table1 a
    left join table2 b on b.table1id = a.id
        and b.TypeCode not in ('FOO', 'BAR')
    where b.TypeCode < @MaxType

Which works most of the time, but in some queries SQL Server decides to convert it to something like this (according to the query plan).

select max(TypeCode) TypeCode
    from table1 a
    left join table2 b on b.table1id = a.id
        and b.TypeCode < @MaxType
        and b.TypeCode not in ('FOO', 'BAR')

That query obviously produces the following error:

Conversion failed when converting the varchar value 'FOO' to data type int.

I tried creating a view of table2 without the 'FOO' and 'BAR' values and joining the view instead but the query plan is still the same.

Do you know of a way to prevent the optimizer from changing the query?

PS: I know the design of the table is not the best but this is a legacy database and I can't change it.

A: 

There's no way to turn off the optimizer, and you probably don't really want to anyway. A better solution is probably to modify the query.

You can probably make this work with a CASE statement -- something like:

SELECT MAX(TypeCode) TypeCode
FROM table1 a
    LEFT JOIN table2 b ON b.table1id = a.id
WHERE CASE WHEN b.TypeCode IN ('FOO', 'BAR') THEN 99999999999 ELSE CAST(b.TypeCode AS int) END < @MaxType

You'll need to check the performance of this to see if it's acceptable, but it should work.

mwigdahl
+2  A: 

The real problem here is that you are combining character and integer semantics in the same query.

There are three possible solutions that I can think of:

  1. Change the @MaxType parameter to be a varchar(9). If you are only comparing against single-digit numbers, alphabetical order is OK. Otherwise, this won't work.

  2. Use a CASE or ISNULL(NULLIF(...)) construct in the WHERE statement. This will work, but it is not sargable and will cause the optimizer to ignore any indexes you have on the type code. Not great.

  3. Created a persisted, computed, nullable integer column (i.e. TypeCodeID) and index it separately. Put the CASE as the column's expression. This will take some extra data/index space, but if you want good performance, that is the best way to go. Then instead of writing NOT IN ('Foo', 'Bar') you can just write the first condition (TypeCode < @MaxType) because the rows with Foo and Bar in the TypeCode column will have NULL in the new TypeCodeID column.

I suppose there is also a fourth answer which is to change your design, and this would be the best idea, if it's possible. If a column can contain character data then you really shouldn't be trying to perform numerical comparisons. I have a strong suspicion that the character data exists in this column because it's coming from users and isn't validated/sanitized properly, so it's got garbage values like N/A or Unknown in there. If that is the case, your DB really should be enforcing data integrity, that's what a DBMS is for. I know you say you "can't change it", but I don't think this answer would be complete if I didn't recommend against such dangerous practices.

Aaronaught