views:

68

answers:

4

I'm generating T-SQL SELECT statements for tables for which I have no data type information up-front. In these statements, I need to perform string manipulation operations that depend on the length of the original value of the tables' columns.

One example (but not the only one) is to insert some text at a specific position in a string, including the option to insert it at the end:

SELECT 
  CASE WHEN (LEN ([t0].[Product] = 8) 
    THEN [t0].[Product] + 'test' 
    ELSE STUFF ([t0].[Product], 8, 0, 'test') 
  END
FROM [OrderItem] [t0]

(The CASE WHEN + LEN is required because STUFF doesn't allow me to insert text at the end of a string.)

The problem is that LEN excludes trailing blanks, which will ruin the calculation. I know I can use DATALENGTH, which does not exclude trailing blanks, but I can't convert the bytes returned by DATALENGTH to the characters required by STUFF because I don't know whether the Product column is of type varchar or nvarchar.

So, how can I generate a SQL statement that depends on the exact length of a string in characters without up-front information about the string data type being used?

+1  A: 

Can't you look up the type information for the columns in the system tables?

If not then to determine whether or not a column is varchar or nvarchar this would do it.

create table #test
(
c varchar(50),
n nvarchar(50)
)

insert into #test values ('1,2,3,4    ',N'1,2,3,4,5      ')

SELECT
       CASE
              WHEN datalength(CAST(c AS nvarchar(MAX))) = datalength(c)
              THEN 'c is nvarchar'
              ELSE 'c is char'
       END,
       CASE
              WHEN datalength(CAST(n AS nvarchar(MAX))) = datalength(n)
              THEN 'n is nvarchar'
              ELSE 'n is char'
       END
FROM   #test
Martin Smith
Looking up the information: probably possible, but difficult. I'm also not sure how this would perform.
Fabian Schmied
The test is also interesting, but I think I like the REPLACE hack a little better with regards to the "readability" of the resulting statement.
Fabian Schmied
@Fabien - I'm pretty certain looking up the type information will perform better than doing either a replace on every string in the column or a cast on every string.
Martin Smith
Yes, sorry, that was unclear: it's mostly difficult to get into the statement generation. But I'll look into it.
Fabian Schmied
I've tested your approach, and it works and is very fast. Still, since I've found LEN ([t0].[Product] + '#') - 1 to be just as fast and better readable, im using that.
Fabian Schmied
+2  A: 

try this:

SELECT 
  CASE WHEN (LEN (REPLACE([t0].[Product],' ', '#') = 8) 
    THEN [t0].[Product] + 'test' 
    ELSE STUFF ([t0].[Product], 8, 0, 'test') 
  END
FROM [OrderItem] [t0]
ibram
Good idea, I should've had this one myself...
Fabian Schmied
This is very inefficient, as my measurements show.
Fabian Schmied
However, a fast variation of this is fast as lightning: LEN ([t0].[Product] + '#') - 1This is about the same speed as LEN ([t0].[Product]) alone...
Fabian Schmied
+1  A: 

Use DATALENGTH and SQL_VARIANT_PROPERTY:

SELECT 
  CASE 
    WHEN 8
      = DATALENGTH([t0].[Product]) 
      / CASE SQL_VARIANT_PROPERTY([t0].[Product],'BaseType') WHEN 'nvarchar' THEN 2 ELSE 1 END
    THEN [t0].[Product] + 'test' 
    ELSE STUFF ([t0].[Product], 8, 0, 'test') 
  END
FROM [OrderItem] [t0]
Peter
I've tested this, and while it looks exactly like what I need, it doesn't work with varchar(max)/nvarchar(max) - because sql_variant can't hold such values.
Fabian Schmied
+2  A: 

Here's what I ended up using:

SELECT   
  CASE WHEN ((LEN ([t0].[Product] + '#') - 1) = 8)   
    THEN [t0].[Product] + 'test'   
    ELSE STUFF ([t0].[Product], 8, 0, 'test')   
  END  
FROM [OrderItem] [t0]  

Measurements indicate that the LEN (... + '#') - 1 trick is about the same speed as LEN (...) alone.

Thanks for all the good answers!

Fabian Schmied
+1 for a solution that works!
Peter
+1 That could be a useful trick.
Martin Smith