views:

746

answers:

5

Given this linq query against an EF data context:

var customers = data.Customers.Where(c => c.EmailDomain.StartsWith(term))

You’d expect it to produce SQL like this, right?

SELECT {cols} FROM Customers WHERE EmailDomain LIKE @term+’%’

Well, actually, it does something like this:

SELECT {cols} FROM Customer WHERE ((CAST(CHARINDEX(@term, EmailDomain) AS int)) = 1)

Do you know why?

Also, replacing the Where selector to:

c => c.EmailDomain.Substring(0, term.Length) == term

it runs 10 times faster but still produces some pretty yucky SQL.

NOTE: Linq to SQL correctly translates StartsWith into Like {term}%, and nHibernate has a dedicated LikeExpression.

+2  A: 

The reason is that CharIndex is a lot faster and cleaner for SQL to perform than LIKE. The reason is, that you can have some crazy "LIKE" clauses. Example:

SELECT * FROM Customer WHERE EmailDomain LIKE 'abc%de%sss%'

But, the "CHARINDEX" function (which is basically "IndexOf") ONLY handles finding the first instance of a set of characters... no wildcards are allowed.

So, there's your answer :)

EDIT: I just wanted to add that I encourage people to use CHARINDEX in their SQL queries for things that they didn't need "LIKE" for. It is important to note though that in SQL Server 2000... a "Text" field can use the LIKE method, but not CHARINDEX.

Timothy Khouri
Whilst a Like can be slower than a CharIndex, it is much faster when used in the context of StartsWith (i.e. Like {term}%). In this case, the query optimiser is able to make use of existing indices.
thatismatt
+1  A: 

Performance seems to be about equal between LIKE and CHARINDEX, so that should not be the reason. See here or here for some discussion. Also the CAST is very weird because CHARINDEX returns an int.

edosoft
A: 

charindex returns the location of the first term within the second term.

sql starts with 1 as the first location (0 = not found)

http://msdn.microsoft.com/en-us/library/ms186323.aspx

i don't know why it uses that syntax but that's how it works

marshall
A: 

I agree that it is no faster, I was retrieving tens of thousands of rows from our database with the letter i the name. I did find however that you need to use > rather than = ... so use

{cols} FROM Customer WHERE ((CAST(CHARINDEX(@term, EmailDomain) AS int)) > 0)

rather than

{cols} FROM Customer WHERE ((CAST(CHARINDEX(@term, EmailDomain) AS int)) = 1)

Here are my two tests ....

select * from members where surname like '%i%' --12 seconds

select * from sc4_persons where ((CAST(CHARINDEX('i', surname) AS int)) > 0) --12 seconds

select * from sc4_persons where ((CAST(CHARINDEX('i', surname) AS int)) = 1) --too few results

A: 

I don't know about MS SQL server but on SQL server compact LIKE 'foo%' is thousands time faster than CHARINDEX, if you have INDEX on seach column. And now I'm sitting and pulling my hair out how to force it use LIKE.

http://social.msdn.microsoft.com/Forums/en-US/adodotnetentityframework/thread/1b835b94-7259-4284-a2a6-3d5ebda76e4b

Broken Pipe