Hi everyone. I am using SQL Server 2008 and I'm trying to build a query that displays some summary results from a single table. I want to display COUNT(fieldname) for each date. For example, I want to know how often the name "izla" appears in the table on each date, but it could also be stored as "IZLA" or "Izla", so I need a way to group these spellings together and get one count for all three.

The problem is that if I convert everything to uppercase or lowercase so that the spellings are treated as the same, I run into this: when "izla" is converted to uppercase it becomes "İZLA", and when "IZLA" is converted to lowercase it is displayed as "ızla".

The big question is: how can I group this data together? Maybe the problem comes from using nvarchar, but I need the column to stay that type (I can't change it).

A: 

Try replacing ı and similar characters with their English equivalents after lowercasing.

Ray
Well, I'm looking for a way to resolve that automatically, because it could be a letter other than "i", so you can't just write down every possible case.
Izabela
+2  A: 

When you group, you should use an accent-insensitive collation. You can add it directly to your GROUP BY clause. The following is an example:

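-- Sample data: four spellings of the same name
Declare @Temp Table(Data nvarchar(100))

Insert Into @Temp Values(N'izla')
Insert Into @Temp Values(N'İZLA')
Insert Into @Temp Values(N'IZLA')
Insert Into @Temp Values(N'Izla')

-- Query 1: group using the column's own collation
Select  Data,
        Count(*) As Occurrences
From    @Temp
Group By Data

-- Query 2: group using a case- and accent-insensitive collation
Select  Data Collate Latin1_General_CI_AI As Data,
        Count(*) As Occurrences
From    @Temp
Group By Data Collate Latin1_General_CI_AI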

When you run this example, you will see that the first query creates two rows (with count 3 and count 1), assuming the default collation is case insensitive but accent sensitive. The second query uses an accent-insensitive collation for the grouping, so all 4 items are grouped together.
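The question asks for counts per date, so here is a rough sketch of how the same Collate clause could be combined with a date column. The table variable @Visits and its columns VisitorName and VisitedAt are made-up names for illustration only; substitute your own table and columns.

```sql
-- Hypothetical sample data; the names @Visits, VisitorName and VisitedAt are assumptions
Declare @Visits Table(VisitorName nvarchar(100), VisitedAt datetime)

Insert Into @Visits Values(N'izla', '20100105 09:00')
Insert Into @Visits Values(N'IZLA', '20100105 14:30')
Insert Into @Visits Values(N'İZLA', '20100105 18:45')
Insert Into @Visits Values(N'Izla', '20100106 08:15')

-- One row per calendar day and per name, ignoring case and accents.
-- Cast(... As date) strips the time portion (the date type exists from SQL Server 2008 on).
Select  Cast(VisitedAt As date) As VisitDate,
        Min(VisitorName)        As SampleSpelling,  -- any one of the grouped spellings
        Count(*)                As Occurrences
From    @Visits
Group By Cast(VisitedAt As date),
         VisitorName Collate Latin1_General_CI_AI
Order By VisitDate
```

Min(VisitorName) is only there to show one representative spelling per group; you could equally select the collated expression itself, as in the second query above.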

I used Latin1_General_CI_AI in my example. I suggest you check the collation of the column you are using and then pick the collation that most closely matches it, changing the AS on the end to AI.
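If you are not sure which collation your column currently has, something along these lines should tell you. The names dbo.YourTable and fieldname are placeholders, not names taken from the question.

```sql
-- Look up the collation of one column in the catalog view sys.columns.
-- dbo.YourTable and fieldname are placeholder names; replace them with your own.
Select  c.name           As ColumnName,
        c.collation_name As CollationName
From    sys.columns As c
Where   c.object_id = Object_Id(N'dbo.YourTable')
  And   c.name = N'fieldname'
```

If the result ends in _CI_AS, the matching accent-insensitive variant would usually be the same name ending in _CI_AI.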

G Mastros
Thank you! It helps a lot.
Izabela
Don't you mean Case Insensitive collation? Or am I missing something about accents?
pjp
@pjp, you're right. I changed the explanation. Thanks for pointing this out.
G Mastros
A: 

This all comes down to collation, which is the way that the system sorts string data.

You could say something like:

SELECT *,
       COUNT(*) OVER (PARTITION BY fieldname COLLATE Latin1_General_CI_AI) AS CountIgnoringAccents,
       COUNT(*) OVER (PARTITION BY fieldname COLLATE Latin1_General_CI_AS) AS CountRespectingAccents
FROM yourtable

This shows, next to every row, how many times that name appears when accents are ignored and how many times when they are respected. There are many collations, and you can search Books Online for a complete list. You may also be interested in Latin1_General_BIN, for example.
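As an alternative to browsing Books Online, the server can list its supported collations itself. This is just a sketch; the name filter is my own choice and you may want to widen or drop it.

```sql
-- List the collations this server supports, with a short description of each.
-- The LIKE filter is only an illustration; remove it to see the full list.
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE N'Latin1_General%'
   OR name LIKE N'Turkish%'
ORDER BY name
```

The description column spells out whether each collation is case sensitive and accent sensitive, which makes it easier to pick the right suffix.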

Rob

Rob Farley