I'd like to run a query that, for a given field, counts how many times each character occurs.

For example, if I had a table called 'Friends' with a 'Names' field containing the rows Edward, James and Mike, I'd get the output:

A 2, D 2, E 3, I 1, and so on...

+2  A: 

The generic answer is that you need to split each name into a table of constituent characters, then count those.

You don't mention which RDBMS you're using. The answer varies a bit by database engine.

For example, this will work on SQL Server 2005 and later:

DECLARE @friends TABLE (NAMES VARCHAR(30))
DECLARE @maxLen INT

INSERT @friends (NAMES)
SELECT 'Edward'
UNION ALL SELECT 'James'
UNION ALL SELECT 'Mike'

SELECT @maxLen = MAX(LEN(NAMES)) FROM @friends

;WITH numsCte 
AS 
( 
        --dynamic numbers table. If you have a numbers table in your database
        --use that instead, as it will be more efficient.
        SELECT 1 AS n 
        UNION ALL 
        SELECT n+1 FROM numsCte 
        WHERE n < @maxLen
)
,charCTE
AS
( 
        --split the string into a dataset
        SELECT *
        FROM numsCte AS nm
        CROSS APPLY (SELECT NAMES, SUBSTRING(NAMES, n, 1) AS splitChar 
                     FROM @friends
                    ) AS st
        WHERE splitChar > ''
)
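SELECT  UPPER(splitChar) AS letter
        ,COUNT(1) AS cnt
FROM charCTE
--group on the upper-cased value so that 'E' and 'e' are counted together
--even under a case-sensitive collation
GROUP BY UPPER(splitChar)
ORDER BY letter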

This relies on T-SQL-specific syntax such as table variables and CROSS APPLY, so it almost certainly won't work unchanged on other database engines.
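If a numbers table is available, as the comment in the CTE suggests, the recursive CTE can be dropped altogether. Here is a minimal sketch of that variant. The table name #Numbers and its column n are made up for illustration, and it is built here as a temp table holding 1 to 30 (enough for VARCHAR(30)) only so the batch runs on its own; a real numbers table would normally be permanent and filled once.

```sql
-- Sample data from the question.
DECLARE @friends TABLE (NAMES VARCHAR(30))
INSERT @friends (NAMES)
SELECT 'Edward'
UNION ALL SELECT 'James'
UNION ALL SELECT 'Mike'

-- Stand-in numbers table (hypothetical name and column).
-- 30 rows are enough because NAMES is VARCHAR(30).
CREATE TABLE #Numbers (n INT NOT NULL PRIMARY KEY)
DECLARE @i INT
SET @i = 1
WHILE @i <= 30
BEGIN
    INSERT #Numbers (n) VALUES (@i)
    SET @i = @i + 1
END

-- One row per (name, character position), then count per upper-cased letter.
-- LEN ignores trailing spaces, so padding is not counted.
SELECT  UPPER(SUBSTRING(f.NAMES, nm.n, 1)) AS letter
        ,COUNT(*) AS cnt
FROM @friends AS f
INNER JOIN #Numbers AS nm
        ON nm.n <= LEN(f.NAMES)
GROUP BY UPPER(SUBSTRING(f.NAMES, nm.n, 1))
ORDER BY letter

DROP TABLE #Numbers
```

For the three sample names this should give A 2, D 2, E 3, I 1 and so on, as in the question. Joining on the position also avoids the recursion limit that applies to recursive CTEs.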

Ed Harper
+1 You could omit NAMES from the CROSS APPLY subquery
Andomar
I'm using SQL 2005 so that's perfect. Thanks!
theaxe
A: 

One way is to use a table variable and populate it in a WHILE loop, one character position per iteration:

declare @letters table (letter varchar(1))
declare @pos int
set @pos = 1
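while 1=1
    begin
    -- collect the character at position @pos from every name long enough to have one
    insert into @letters
    select substring(name,@pos,1)
    from @names
    where len(name) >= @pos

    -- stop once no name reaches this position
    if @@rowcount = 0
     break

    set @pos = @pos + 1
    end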

select letter, count(*)
from @letters
group by letter

Another way is to build a list of valid character positions, either in a temporary table or, as in this example, with a recursive common table expression (CTE):

declare @maxLen int
select @maxLen = max(len(name)) from @names
;WITH CharPositions (i) AS  ( 
    select 1
    union all
    select i+1 
    from CharPositions
    where i < @maxLen
)
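select substring(n.name,cp.i,1), count(*)
from @names n
inner join CharPositions cp on cp.i <= len(n.name)
group by substring(n.name,cp.i,1)
-- the default recursion limit is 100; lift it if names can be longer than 100 characters
option (maxrecursion 0)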

I've tested the code samples against this dataset (declare it in the same batch, ahead of the query, since table variables are scoped to the batch):

declare @names table (name varchar(max))
insert into @names values ('abc')
insert into @names values ('def')
insert into @names values ('def')
insert into @names values ('g')
insert into @names values ('g')
insert into @names values ('g')
Andomar