ansaurus

Question

Validate a string in a table in SQL Server - CLR function or T-SQL (Question updated)

Answer 1

+2 A:

CLR is faster than a T-SQL UDF. For this situation I would use CLR so that I could run regular expressions for the comparisons. PATINDEX, however, supports a limited pattern syntax (wildcards and single-character classes rather than full regular expressions), so you could use:

WHERE PATINDEX('%[regex]%', t.column) > 0

...to return rows that satisfy the pattern. PATINDEX returns the 1-based position of the first match in the string it is testing; if the value is zero, the pattern does not occur in the string.
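As a rough illustration of the PATINDEX behaviour described above, here is a minimal sketch. The table variable @t, the column val and the sample values are made up for this example and do not come from the question; the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data; @t and val are illustrative names only.
DECLARE @t TABLE (val varchar(50));
INSERT INTO @t (val) VALUES ('abc123'), ('abcdef'), ('12 apples');

-- PATINDEX returns the 1-based position of the first match, or 0 if there is none.
-- [0-9] matches exactly one digit; % matches any run of characters.
SELECT val,
       PATINDEX('%[0-9]%', val) AS first_digit_pos  -- 4, 0 and 1 for the rows above
FROM @t;

-- Keep only the rows that contain at least one digit.
SELECT val
FROM @t
WHERE PATINDEX('%[0-9]%', val) > 0;
```

The same pattern works with LIKE when only a yes/no answer is needed and the position is not.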

OMG Ponies 2010-03-13 07:19:37

Thank you for the quick response. Actually, I am stuck on writing that regex in PATINDEX. I found this in a post at http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=27205: "In a PATINDEX, or LIKE, you can use "%" for 0 or more characters, "_" for exactly one any-character, and [0-9] and [^0-9] as you would for a RegEx. But you cannot use "[0-9]*" or "[0-9]+", the use of "[0-9]" matches exactly one character. So you can use "[0-9][0-9][0-9][0-9][0-9]" to find the location of a 5 digit number, but you will struggle to NOT get mismatches on EARLIER 5 digit numbers within the string."

ydobonmai 2010-03-13 07:30:53

Answer 2

+4 A:

WHERE
    ASCII(LEFT(column, 1)) BETWEEN ASCII('a') AND ASCII('z')
    AND
    column COLLATE LATIN1_GENERAL_BIN NOT LIKE '%[^-_a-zA-Z0-9]%'

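You need the COLLATE clause because, under most default collations, ranges such as a-z also match accented characters (ä, à, ö, etc.); a binary collation restricts the ranges to the plain ASCII letters.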
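To see how the two conditions combine, here is a small sketch that applies the same WHERE clause to a few made-up values. The table variable @t, the column val and the sample strings are assumptions for illustration, not data from the question.

```sql
-- Hypothetical sample data; @t and val are illustrative names only.
DECLARE @t TABLE (val varchar(50));
INSERT INTO @t (val) VALUES
    ('abc-123_X'),  -- intended to pass: starts with a-z, only allowed characters
    ('1abc'),       -- intended to be rejected: first character is a digit
    ('ab c'),       -- intended to be rejected: contains a space
    ('Abc');        -- rejected by the first test as written: 'A' is outside a-z

SELECT val
FROM @t
WHERE
    -- First character must fall in the lowercase ASCII range a-z.
    ASCII(LEFT(val, 1)) BETWEEN ASCII('a') AND ASCII('z')
    AND
    -- No character anywhere in the string may fall outside the allowed set.
    -- The position of a character within the string does not matter.
    val COLLATE Latin1_General_BIN NOT LIKE '%[^-_a-zA-Z0-9]%';
```

The double negative (NOT LIKE combined with a negated set) is what turns a single-character class into a whole-string check: the row is kept only if no disallowed character can be found anywhere.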

gbn 2010-03-13 11:15:49

@gbn, thank you for your time and the answer. I have a question about the expression you wrote above. Does it mean that the first character must not be a '-', followed by a '_', followed by a-z and so on? Or does it handle the characters in any order?

ydobonmai 2010-03-13 11:27:22

@Ashish Gupta: it's evaluated as - then _ then a-z then A-Z then 0-9. Finally ^ makes it negative.

gbn 2010-03-13 11:46:04

@gbn, thanks again. The thing is, the characters in my column values can appear in any order, so I am not sure how I can make use of this expression. That said, thank you for "ASCII(LEFT(column, 1)) BETWEEN ASCII('a') AND ASCII('z')".

ydobonmai 2010-03-13 12:23:16

@Ashish Gupta: This does exactly what was asked for. The second part simply makes sure that the whole string contains only '_', '-', numbers and letters, in any order.

Qtax 2010-03-13 12:50:31

This is not the answer that solved my problem, but it is close and I appreciate the effort, so I am choosing it as the answer.

ydobonmai 2010-03-23 10:32:20
