I've read that it's unwise to use ToUpper and ToLower to perform case-insensitive string comparisons, but I see no alternative when it comes to LINQ-to-SQL. The ignoreCase and CompareOptions arguments of String.Compare are ignored by LINQ-to-SQL (if you're using a case-sensitive database, you get a case-sensitive comparison even if you ask for a case-insensitive comparison). Is ToLower or ToUpper the best option here? Is one better than the other? I thought I read somewhere that ToUpper was better, but I don't know if that applies here. (I'm doing a lot of code reviews and everyone is using ToLower.)

Dim s = From row In context.Table Where String.Compare(row.Name, "test", StringComparison.InvariantCultureIgnoreCase) = 0

This translates to an SQL query that simply compares row.Name with "test" and will not return "Test" and "TEST" on a case-sensitive database.

A: 

If you pass a string that is case-insensitive into LINQ-to-SQL it will get passed into the SQL unchanged and the comparison will happen in the database. If you want to do case-insensitive string comparisons in the database, all you need to do is create a lambda expression that does the comparison, and the LINQ-to-SQL provider will translate that expression into a SQL query with your string intact.

For example this LINQ query:

from user in Users
where user.Email == "[email protected]"
select user

gets translated to the following SQL by the LINQ-to-SQL provider:

SELECT [t0].[Email]
FROM [User] AS [t0]
WHERE [t0].[Email] = @p0
-- note that "@p0" is defined as nvarchar(11)
-- and is passed my value of "[email protected]"

As you can see, the string parameter will be compared in SQL which means things ought to work just the way you would expect them to.

Andrew Hare
I don't understand what you're saying. 1) Strings themselves can't be case-insensitive or case-sensitive in .NET, so I can't pass a "case-insensitive string". 2) A LINQ query basically IS a lambda expression, and that's how I'm passing my two strings, so this doesn't make any sense to me.
BlueMonkMN
I want to perform a CASE-INSENSITIVE comparison on a CASE-SENSITIVE database.
BlueMonkMN
What CASE-SENSITIVE database are you using?
Andrew Hare
Also, a LINQ query is not a lambda expression. A LINQ query is composed of several parts (most notably query operators and lambda expressions).
Andrew Hare
MS SQL Server 2005
BlueMonkMN
This answer doesn't make sense, as BlueMonkMN's comments point out.
Alf
+8  A: 

As you say, there are some important differences between ToUpper and ToLower, and only one is dependably accurate when you're trying to do case insensitive equality checks.

Ideally, the best way to do a case-insensitive equality check is:

String.Equals(row.Name, "test", StringComparison.OrdinalIgnoreCase)

Note the OrdinalIgnoreCase to make it security-safe. Exactly which type of case-(in)sensitive check you use depends on your purpose. In general, use Equals for equality checks and Compare when you're sorting, and then pick the right StringComparison for the job.

Michael Kaplan (a recognized authority on culture and character handling such as this) has relevant posts on ToUpper vs. ToLower:

He says "String.ToUpper – Use ToUpper rather than ToLower, and specify InvariantCulture in order to pick up OS casing rules"
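
As a minimal in-memory sketch of the two approaches named above (String.Equals with OrdinalIgnoreCase, and upper-casing both sides with the invariant culture), something like the following could be used. The class name and the sample values are made up for illustration, and this only shows how the comparisons behave inside .NET. It says nothing about how LINQ-to-SQL translates them.

```csharp
using System;

// Hypothetical demo class; the names and values are illustrative only.
public static class CaseInsensitiveDemo
{
    public static void Main()
    {
        string stored = "Test";
        string wanted = "test";

        // Ordinal, case-insensitive equality check.
        bool ordinalIgnoreCase =
            String.Equals(stored, wanted, StringComparison.OrdinalIgnoreCase);

        // Upper-case both sides with invariant-culture casing rules, then compare.
        bool upperInvariant =
            stored.ToUpperInvariant() == wanted.ToUpperInvariant();

        // Plain ordinal comparison, shown for contrast.
        bool ordinal =
            String.Equals(stored, wanted, StringComparison.Ordinal);

        Console.WriteLine(ordinalIgnoreCase); // True
        Console.WriteLine(upperInvariant);    // True
        Console.WriteLine(ordinal);           // False
    }
}
```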

Andrew Arnott
It seems this doesn't apply to SQL Server: print upper('Große Straße') returns GROßE STRAßE
BlueMonkMN
Also, the sample code you provided has the same problem as the code I provided as far as being case-sensitive when run via LINQ-to-SQL on an MS SQL 2005 database.
BlueMonkMN
I agree. Sorry I was unclear. The sample code I provided does not work with Linq2Sql as you pointed out in your original question. I was merely restating that the way you started was a great way to go -- if it only worked in this scenario. And yes, another Mike Kaplan soapbox is that SQL Server's character handling is all over the place. If you need case insensitive and can't get it any other way, I was suggesting (unclearly) that you store the data as uppercase, and then query it as uppercase.
Andrew Arnott
...that is, do the up-case conversion in .NET and pass the transformed string to SQL both for storage and for query.
Andrew Arnott
Is there any problem with performing the UPPER within SQL server (Use ToUpper in .NET and let it translate to UPPER() in SQL) on both sides and comparing the result? I wouldn't want to have all data always appear in all uppercase just to enforce case-insensitive compares on a database that, to be honest, in all likelihood *will* be case-insensitive; I just want this logic in place in case it isn't. Also, in this scenario, is there any significant difference between UPPER and LOWER? (Any demonstrations you can provide?)
BlueMonkMN
I think if you call UPPER() for SQL both for storing and querying then you're probably OK. I would shy away from LOWER since not all Unicode characters have lower-case equivalents, although all characters have uppercase representations. I don't know an example of such a case, but Michael Kaplan talks about them.
Andrew Arnott
We're not calling ToUpper or ToLower for storing data because we don't want all data to always be displayed in upper-case, but want to retain mixed case data display. Is this a problem?
BlueMonkMN
I don't understand the suggestion to use UPPER() for storing data. Why can't you just use Upper() when retrieving the data?
BlueMonkMN
Well, if you have a case sensitive database, and you store in mixed case and search in Upper case, you won't get matches. If you upcase both the data and the query in your search, then you're converting all the text you're searching over for every query, which isn't performant.
Andrew Arnott
I'm seeing some interesting results from the execution plan. When I execute the statement "select * from OITM where Upper(ItemCode) = '23-RED'" I see the execution plan is doing an index scan, a key lookup and a nested loop, which for some reason takes only 14% as long as a clustered index scan (alone) that occurs when I execute the very similar "select * from OITM where Upper(ItemCode) = '19-BLACK'".
BlueMonkMN
+1  A: 

LINQ to SQL will translate a Contains() to a SQL LIKE statement which is not case sensitive (at least by default, I'm not sure if there is a way to make it case sensitive). This doesn't quite do the job, but if you do Contains() and match the length on the 2 strings you are comparing then they would be the same. I'm not sure how efficient matching the lengths is though.

jarrett
This is not helpful -- first of all, "contains" will include strings that contain the text rather than exactly matching the text. Second, it is case-sensitive on a case-sensitive database, which is what I'm asking about.
BlueMonkMN
Sorry, I didn't notice the part about matching the length, but it's still case-sensitive.
BlueMonkMN
Out of curiosity, how would you make LIKE be case sensitive? This has not been the default for me ever so I'm wondering.
jarrett
Sorry, to be clear, I'm referring to MS SQL 2005 specifically and not just any database.
jarrett
I'm using an SQL Server instance with a case-sensitive collation. I do this so that all SQL code that I write will function on both case-insensitive and case-sensitive servers and databases. If I use a case-insensitive server or database during development, there's a good chance I'll write some SQL statement that will work on my system, but break if it ever runs on a case-sensitive system.
BlueMonkMN
+8  A: 

I used System.Data.Linq.SqlClient.SqlMethods.Like(row.Name, "test") in my query.

This performs a case-insensitive comparison.
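
For context, a query using this method might look roughly like the sketch below. The Row class, the "Rows" table name, and the FindByName helper are hypothetical placeholders, not part of the original question, and the code assumes a .NET Framework project that references System.Data.Linq. As a comment on this answer notes, whether the match is actually case-insensitive depends on the collation in effect on SQL Server, so treat this as an illustration of the call rather than a guaranteed fix.

```csharp
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Data.Linq.SqlClient;
using System.Linq;

// Hypothetical mapping; the table and column names are placeholders.
[Table(Name = "Rows")]
public class Row
{
    [Column] public string Name { get; set; }
}

public static class LikeSketch
{
    // Builds (but does not execute) a query that LINQ-to-SQL translates
    // to a SQL LIKE predicate on the Name column.
    public static IQueryable<Row> FindByName(DataContext context, string name)
    {
        // The pattern is passed through as given; case sensitivity is
        // decided by the database collation, not by .NET.
        return context.GetTable<Row>()
                      .Where(row => SqlMethods.Like(row.Name, name));
    }
}
```

Calling FindByName(context, "test") only composes the query; it runs against the database when the result is enumerated.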

Andrew Davey
ha! been using linq 2 sql for several years now but hadn't seen SqlMethods until now, thanks!
Carl Hörberg
Brilliant! Could use more detail, though. Is this one of the expected uses of Like? Are there possible inputs that would cause a false positive result? Or a false negative result? The documentation on this method is lacking, where's the documentation that *will* describe the operation of the Like method?
Task
I think it just relies on how SQL Server compares the strings, which is probably configurable somewhere.
Andrew Davey
System.Data.Linq.SqlClient.SqlMethods.Like(row.Name, "test") is the same as row.Name.Contains("test"). As Andrew is saying, this depends on sql server's collation. So Like (or contains) doesn't always perform a case-insensitive comparison.
doekman
A: 

To perform case-sensitive LINQ to SQL queries, declare ‘string’ fields to be case sensitive by specifying one of the following as the server data type:

varchar(4000) COLLATE SQL_Latin1_General_CP1_CS_AS or nvarchar(Max) COLLATE SQL_Latin1_General_CP1_CS_AS

Note: The ‘CS’ in the above collation types means ‘Case Sensitive’.

This can be entered in the “Server Data Type” field when viewing a property using Visual Studio DBML Designer.

For more details see http://yourdotnetdesignteam.blogspot.com/2010/06/case-sensitive-linq-to-sql-queries.html

John Hansen
That's the issue. Normally the field I use is case sensitive (the chemical formula CO [carbon monoxide] is different from Co [cobalt]). However, in a specific situation (search) I want co to match both Co and CO. Defining an additional property with a different "server data type" is not legal (linq to sql only allows one property per sql column). So still no go.
doekman
A: 

I guess the best way to do a case-sensitive search is to create a stored procedure and perform the case-sensitive search there. The link below provides details on how to handle a case-sensitive query in SQL.

http://www.a2zmenu.com/MySql/Case-sensitive-search-in-SQL.aspx

Experts Comment