views:

133

answers:

5

Hello,

I’ve got such a KeyWord table (MS SQL):

  • KeyGuid Qualifier PrimitiveKey
  • DA7E4E27-FDE5-4D43-A365-8A789164A816 tit kirkäna
  • EED58875-FE41-4A18-A93C-A44AA62CEEEE htit kirkänbh
  • A0EB795E-EE23-4990-BAB9-897C93C70CE3 htit kirkänah
  • F7F4632B-AC82-4DEB-B966-BBA8EF4D2C9E tit kirkänb
  • C0EB795E-EE23-4990-BAB9-897C93C70CE3 nam kirkänas
  • E2F4632B-AC82-4DEB-B966-BBA8EF4D2C9E nam kirkänbs
  • A222795E-EE23-4990-BAB9-897C93C70CE3 tit kirkacb
  • B333632B-AC82-4DEB-B966-BBA8EF4D2C9E tit kirkaca
  • 1222795E-EE23-4990-BAB9-897C93C70C81 htit kirkacbh
  • E533632B-AC82-4DEB-B966-BBA8EF4D2C82 htit kirkacah

This simplest query properly returns all relevant records:

select * from KeyWord where PrimitiveKey like 'kirkän%'
  • DA7E4E27-FDE5-4D43-A365-8A789164A816 tit kirkäna
  • EED58875-FE41-4A18-A93C-A44AA62CEEEE htit kirkänbh
  • A0EB795E-EE23-4990-BAB9-897C93C70CE3 htit kirkänah
  • F7F4632B-AC82-4DEB-B966-BBA8EF4D2C9E tit kirkänb
  • C0EB795E-EE23-4990-BAB9-897C93C70CE3 nam kirkänas
  • E2F4632B-AC82-4DEB-B966-BBA8EF4D2C9E nam kirkänbs

I use such a query to limit the resutls to match specific qualifiers:

select * from KeyWord where Qualifier IN ('tit', 'htit') and PrimitiveKey Like 'kirkac%'

which works perfectly:

  • A222795E-EE23-4990-BAB9-897C93C70CE3 tit kirkacb
  • B333632B-AC82-4DEB-B966-BBA8EF4D2C9E tit kirkaca
  • 1222795E-EE23-4990-BAB9-897C93C70C81 htit kirkacbh
  • E533632B-AC82-4DEB-B966-BBA8EF4D2C82 htit kirkacah

However when the phrase contains a special character like ä it does not return results:

select * from KeyWord where Qualifier IN ('tit', 'htit') and PrimitiveKey Like 'kirkän%'

nor does it with the qualifiers limited like this:

select * from KeyWord where (Qualifier = 'tit' OR Qualifier = 'htit') and PrimitiveKey Like 'kirkän%'

However it does work like this:

select * from KeyWord where (Qualifier like 'tit' OR Qualifier like 'htit') PrimitiveKey Like 'kirkän%'
  • DA7E4E27-FDE5-4D43-A365-8A789164A816 tit kirkäna
  • EED58875-FE41-4A18-A93C-A44AA62CEEEE htit kirkänbh
  • A0EB795E-EE23-4990-BAB9-897C93C70CE3 htit kirkänah
  • F7F4632B-AC82-4DEB-B966-BBA8EF4D2C9E tit kirkänb

What is wrong with the IN approach?

A: 

This is probably best handled with a specific collation setting that copes with the language-specific characters.

Here's an article on SQL Server collation that may help: http://msdn.microsoft.com/en-us/library/aa174903(SQL.80).aspx

Neil Barnwell
Collation has been checked. I have the same collation for the DB and columns of the table I am querying. Server collation is different but I checked that it is not the problem.
danme0
A: 

I think you should take a look at Full-Text Search. I know changing the collation would help here, as Neil says, but you might get some benefit out of using FTS depending on how scalable your implementation needs to be.

Codesleuth
Full-Text Search might be an option but we need a quick solution.
danme0
A: 

Maybe you need to use unicode compatible datatypes. Declaring the PrimitiveKey column as nvarchar, try prefixing the string you want to match with an 'N' like this: select * from KeyWord where (Qualifier like 'tit' OR Qualifier like 'htit') and PrimitiveKey Like N'kirkän%' .

Evan V.
Prefixing does not help. Columns are declared as 'nvarchar'.
danme0
A: 

I have done more investigation on the problem. Here's what I found.

A. The problematic query actually returns results but containing 'ae' only:

select * from KeyWord where Qualifier IN ('tit', 'htit') and PrimitiveKey Like 'kirkän%'

returns for example 'kirkaeni'.

B. If another % is included in the query (for example: Like 'ki%rkän%') the results include those expected! (this is weird) (but also those not wanted, matching the other %, of course).

C. I have tried to reproduce the problem - creating a simple DB only with two tables (the one with 'kirk' stuff has a foreign key to the other one), I have used queries creating the problematic DB as well those creating tables, I have set the same collation (German_PhoneBook_CI_AI) + I have created indexes as in the problematic DB. However the problem did not occur, so I cannot yet actually reproduce it.

Any new ideas with these symptoms?

danme0
A: 

take a look at http://msdn.microsoft.com/en-us/library/ms179886.aspx

basically LIKE operand has it:s own collation that overrides the server and column settings. However i haven't been able to figure out where or if there is a way to change this setting. The above article is a pretty tough read, but i think the most detailed explanation is at the bottom.

mark