views:

30

answers:

2

I have a Companies table with a multilingual CompanyName column (defined as nvarchar(512)).

I also have a stored procedure to search through and return a list of companies, it takes 2 nvarchar parameters - one is the search term, and the other is an ISO language code.

What I would like to be able to do is return the results of the search sorted with a collation appropriate to the language code supplied in the second parameter, for example:

SELECT * FROM dbo.Companies WHERE CompanyName LIKE '%searchstring%'
ORDER BY
    CASE @lang
        WHEN 'sv' THEN CompanyName COLLATE Sami_Sweden_Finland_100_CI_AI
        WHEN 'ch' THEN CompanyName COLLATE Chinese_Simplified_Pinyin_100_CI_AI
                ELSE CompanyName
    END

however I get the following error when I try to run it:

Cannot resolve the collation conflict between "Chinese_Simplified_Pinyin_100_CI_AI" and "Sami_Sweden_Finland_100_CI_AI" in the CASE operation.

This makes no sense to me - it's not like I'm sorting across collations, why would there be a collation conflict? It's a mutually exclusive choice.

I've tried not to be too clever and just use dynamic sql, unfortunately that appears to be leaving the database unable to cache an execution plan, hence query times are taking upwards of 20 seconds (as opposed to 1) when the table contains around 2 million rows.

I'm sure that culture sensitive sorting must be a common problem, does anyone know of a good solution that doesn't involve altering the current schema (i.e. like having to create additional columns)?

A: 

Acolumn can only have one collation. The CASE measn you're tryingt to have multiple that conflicts with each other. Try this:

SELECT * FROM dbo.Companies WHERE CompanyName LIKE '%searchstring%'
ORDER BY
    CASE @lang
        WHEN 'sv' THEN CompanyName ELSE '' END 
    END COLLATE Sami_Sweden_Finland_100_CI_AI,
    CASE @lang
        WHEN 'ch' THEN CompanyName ELSE '' END 
    END Chinese_Simplified_Pinyin_100_CI_AI
gbn
Thanks, this negates the need to use dynamic SQL (although I'm still confused as to why different collations in a single CASE statement would cause a conflict - it will only ever sort on one!). The problem now is SQL doesn't seem to pick up the indexes as part of the execution plan. In the above example I have created 2 computed columns (indexed) on the CompanyName column with Swedish and Pinyin collations respectively, which get picked up when doing an ORDER BY CompanyName with an explicit collation, but not in your query above. If there a way to explicitly tell SQL to use the indexes?
Jonathan Hoult
@Jonathan Hoult: unfortunately, no. The index is invalidated by the COLLATE clause as you'd expect.
gbn
A: 

Okay I've finally worked out the answer to this, thanks to @gbn, your solution was definitely correct given my initial definition of the question.

However it turns out that I was working under the mistaken assumption that Dynamic SQL should be avoided. To get this to work I had to add a computed (and indexed) column with a value of

CompanyName COLLATE [Collation_Name]

for every collation I needed to sort on (I realise I said in my original question that I didn't want to alter the schema but what I really meant was that I didn't want to move existing data).

When running the following query

EXEC('SELECT TOP 10 * FROM dbo.Companies
      WHERE CompanyName LIKE ''%searchstring%''
      ORDER BY CompanyName COLLATE ' + @collation)

the query engine picks up the index on the associated computed column and I get a quick response. Unfortunately when using the query provided by @gbn, the indexes created on the computed columns are ignored, with a consequent performance hit.

This is all very surprising to me as I've always believed that Dynamic SQL is an evil that should be avoided unless absolutely necessary (i.e when your query varies in a way that you can't parameterise)

Jonathan Hoult