views:

208

answers:

7

Specifically, Sql Server 2005/T-Sql. I have a field that is mostly a series of two characters, and they're all supposed to be upper case but there's some legacy data that predates the current DB/System, and I need to figure out which records are in violation of the upper casing covenant.

I thought this would work:

select * from tbl where ascii(field1) <> ascii(upper(field1))

And indeed it returned me a handful of records. They've since been corrected, and now that query returns no data. But I've got people telling me there is still mixed case data in the DB, and I just found an example: 'FS' and 'Fs' are both reporting the same ascii value.

Why is this approach flawed? What is a better way to go about this, or how can I fix this approach to work correctly?

+2  A: 

The ascii() function will only return the ascii number for the first character in an expression if you pass it a multiple character string. To do the comparison you want you need to look at individual characters, not entire fields.

cmsjr
beat me to it :)
Allain Lalonde
Gah, I misread the Ascii help. Thanks.
peacedog
np, but I think @tehvan has a good point that you could omit the usage of ASCII completely.
cmsjr
I need to find the specific records that are wrong so I can report back which ones are changed.
peacedog
You can also use COLLATE Latin1_General_CS_AS in the query to force case sensitivity for just that one column in that one query.
Ian Varley
+1  A: 

The ASCII() function returns only the ASCII code value of the leftmost character of a character expression. Use UPPER() instead.

tehvan
+5  A: 

if all the date should have been in upper case just do an update

update tbl
set field1 = upper(field1)

but to answer your original question this query should give you the results that you expect:

select * from tbl
where field1 COLLATE Latin1_General_CS_AS <> upper(field1)

Edit: just noticed that the suggestion to use COLLATE was also posted by Ian

kristof
I agree, that's a better approach and I didn't consider it.
peacedog
Thanks for the hat tip ... :)
Ian Varley
+3  A: 

ASCII is only comparing the first letter. You'd have to compare each letter, or change the database collation to be case sensitive.

You can change collation on an entire database level, or just on one column for a specific query, so:

SELECT myColumn 
  FROM myTable  
  WHERE myColumn COLLATE Latin1_General_CS_AS <> upper(myColumn)
Ian Varley
+1 I think that answers the original question
kristof
+1  A: 

This might work:

select * from tbl 
where cast(field1 as varbinary(256)) <> cast(upper(field1) as varbinary(256))
Chris Pebble
That does work as it turns out. Nice.
peacedog
+1  A: 

The methods described at Case sensitive search in SQL Server queries might be useful to you.

WileCau
Very much so, thanks!
peacedog
A: 

According to the documentation for ASCII(), it only returns the leftmost character.

I think you're going about this wrong. You could simply:

select * from tbl where field1 <> upper(field1)

if the collation rules were set correctly, so why not fix the collation rules? If you can't change them permanently, try:

select * from tbl where
          (field1 collate Latin1_General_CS_AS)
  <> upper(field1 collate Latin1_General_CS_AS)
geocar