views:

161

answers:

3

I've heard that in some programming languages it is faster to check if the length of a string is 0, than to check if the content is "". Is this also true for T-SQL?

Sample:

SELECT user_id FROM users WHERE LEN(user_email) = 0

vs.

SELECT user_id FROM users WHERE user_email = ''
+4  A: 

I just tested it in a very limited scenario and execution plan ever so slightly favours comparing it to an empty string. (49% to 51%). This is working with stuff in memory though so it would probably be different if comparing against data from a table.

DECLARE @testString nvarchar(max)
SET @testString = ''

SELECT
    1
WHERE
    @testString = ''

SELECT
    1
WHERE
    LEN(@testString) = 0

Edit: This is with SQL Server 2005.

Evil Pigeon
+10  A: 

Edit You've updated your question since I first looked at it. In that example I would say that you should definitely always use

SELECT user_id FROM users WHERE user_email = ''

Not

SELECT user_id FROM users WHERE LEN(user_email) = 0

The first one will allow an index to be used. As a performance optimisation this will trump some string micro optimisation every time! To see this

SELECT * into #temp FROM [master].[dbo].[spt_values]

CREATE CLUSTERED INDEX ix ON #temp([name],[number])

SELECT [number] FROM #temp WHERE [name] = ''

SELECT [number] FROM #temp WHERE LEN([name]) = 0

Execution Plans

Execution Plans

Original Answer

In the code below (SQL Server 2008 - I "borrowed" the timing framework from @8kb's answer here) I got a slight edge for testing the length rather than the contents below when @stringToTest contained a string. They were equal timings when NULL. I probably didn't test enough to draw any firm conclusions though.

In a typical execution plan I would imagine the difference would be negligible and if you're doing that much string comparison in TSQL that it will be likely to make any significant difference you should probably be using a different language for it.

DECLARE @date DATETIME2
DECLARE @testContents INT
DECLARE @testLength INT

SET @testContents = 0
SET @testLength = 0


DECLARE 
  @count INT,
  @value INT,
  @stringToTest varchar(100)


set @stringToTest = 'jasdsdjkfhjskdhdfkjshdfkjsdehdjfk'
SET @count = 1

WHILE @count < 10000000
BEGIN

  SET @date = GETDATE()
  SELECT @value = CASE WHEN @stringToTest = '' then 1 else 0 end
  SET @testContents = @testContents + DATEDIFF(MICROSECOND, @date, GETDATE())

  SET @date = GETDATE()
  SELECT @value = CASE WHEN len(@stringToTest) = 0 then 1 else 0 end
  SET @testLength = @testLength + DATEDIFF(MICROSECOND, @date, GETDATE())

  SET @count = @count + 1
END

SELECT 
  @testContents / 1000000. AS Seconds_TestingContents, 
  @testLength / 1000000. AS Seconds_TestingLength
Martin Smith
+3  A: 

I would be careful about using LEN in a where clause as it could lead to table scans / index scans

Also note that if the field is NULLable that LEN(NULL) = NULL, so you would need to define the behaviour

-- Cost .33
select * from [table]
where itemid = ''

-- Cost .53
select * from [table]
where len(itemid) = 0

-- And in the case of a NULLable (assuming that we want to treat NULL and emtpy as the same)
select * from [table]
where len(itemid) = 0 or itemid is NULL
nonnb
+1 I just noticed that as well.
Martin Smith