I'm querying across two databases separated by a legacy application. When the app encounters characters like 'ü', '’' or 'ó', it replaces them with a '?'.
So to match messages, I've been using a bunch of 'replace' calls like so:

(replace(replace(replace(replace(replace(replace(lower(substring([Content],1,153)) , '’', '?'),'ü','?'),'ó','?'), 'é','?'),'á','?'), 'ñ','?'))

Over a couple thousand records, this is (as you'd expect) very slow. There is probably a better way to do this. Thanks for telling me what it is.

A: 

Why not first do the same replace (chars to "?") on the string you are searching for, on the app side, using regular expressions? That is, instead of passing a raw search string to your SQL Server query and wrapping it in nested replace() calls, your app code passes in a search string that already contains the "?"s.

DVK
That doesn't really answer his question. If anything, you have to do the same replace on both sides: the server column and the passed-in string.
Codewerks
@Codewerks - my understanding of the OP is that the server column already has "replaced" values.
DVK
+1  A: 

One thing you can do is implement a RegEx Replace function as a SQL assembly and call it as a user-defined function on your column instead of the Replace() calls. It could be faster. You will probably also want to do the same RegEx Replace on your passed-in query values. TSQL Regular Expression

Codewerks
I like the idea of doing the column first. I mean, I could even keep the million replaces but do them once, store the result in a temp table, and then compare against the other database's table via the temp table. It should reduce the number of compares significantly.
Irwin
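The temp-table idea in the comment above could look roughly like the following. This is only a sketch under assumptions: dbo.Messages, MessageId, LegacyDb.dbo.LegacyMessages and LegacyMessageId are hypothetical names, [Content] is assumed to be an nvarchar column, and the legacy side is assumed to be compared on its lowercased first 153 characters. The point is that the nested replaces run once per row rather than once per comparison.

```sql
-- Hypothetical table and column names; adjust to the real schema.
-- Step 1: run the nested REPLACE calls once per row and keep the result.
SELECT
    m.MessageId,
    CAST(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
        LOWER(SUBSTRING(m.[Content], 1, 153)),
        N'’', N'?'), N'ü', N'?'), N'ó', N'?'),
        N'é', N'?'), N'á', N'?'), N'ñ', N'?') AS nvarchar(153)) AS ContentKey
INTO #Normalized
FROM dbo.Messages AS m;

-- Step 2: index the normalized key so the join below can seek on it.
CREATE INDEX IX_Normalized_ContentKey ON #Normalized (ContentKey);

-- Step 3: compare against the legacy table (assumed to already contain '?').
SELECT n.MessageId, l.LegacyMessageId
FROM LegacyDb.dbo.LegacyMessages AS l
JOIN #Normalized AS n
    ON n.ContentKey = LOWER(SUBSTRING(l.[Content], 1, 153));
```

If the two databases use different collations, the join may need an explicit COLLATE clause.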
A: 

Could you convert the strings to varbinary before comparing? Something like the below:

declare 
   @Test varbinary (100)
   ,@Test2 varbinary (100)
select 
   @Test = convert(varbinary(100),'abcu')
   ,@Test2 = convert(varbinary(100),'abcü')

select 
   case
      when @Test <> @Test2 then 'NO MATCH'
      else 'MATCH'
   end
Sylvia
How is this going to help with the per-character compare?
Irwin
A: 

You could create a persisted computed column on the same table as the [Content] column, so the replacement is computed once when a row is written rather than on every query. Alternatively, you can probably speed up the replace by creating a user-defined function in C# using a StringBuilder. You can even combine both of these solutions.

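using System;
using System.Data.SqlTypes;
using System.Text;
using Microsoft.SqlServer.Server;

// The class name is arbitrary; a SQL CLR function just needs to live in a public class.
public static class LegacyFunctions
{
    [SqlFunction(IsDeterministic = true, IsPrecise = true)]
    public static SqlString LegacyReplace(SqlString value)
    {
        if (value.IsNull) return value;
        string s = value.Value;
        // Only the first 153 characters take part in the comparison.
        int length = Math.Min(s.Length, 153);
        var sb = new StringBuilder(s, 0, length, length);
        sb.Replace('’', '?');
        sb.Replace('ü', '?');
        // etc...
        return new SqlString(sb.ToString());
    }
}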
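
For the persisted computed column on its own, a minimal sketch using the question's nested replaces might look like this. The table name dbo.Messages, the column name ContentKey and the index name are hypothetical, and [Content] is assumed to be nvarchar. The CAST keeps the column narrow enough to index.

```sql
-- Hypothetical names: dbo.Messages, ContentKey, IX_Messages_ContentKey.
-- The expression is deterministic, so it can be PERSISTED and indexed.
ALTER TABLE dbo.Messages
ADD ContentKey AS CAST(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
        LOWER(SUBSTRING([Content], 1, 153)),
        N'’', N'?'), N'ü', N'?'), N'ó', N'?'),
        N'é', N'?'), N'á', N'?'), N'ñ', N'?') AS nvarchar(153)) PERSISTED;

-- Queries that match on ContentKey can then use this index.
CREATE INDEX IX_Messages_ContentKey ON dbo.Messages (ContentKey);
```

Once the C# function is deployed, the same column could presumably call it instead of the nested replaces, which is the "combine both" option.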
Pent Ploompuu
I think this answer is pretty cool and I've used some elements of it in my solution. Thanks.
Irwin