ansaurus

Question

SQL query for selecting only first occurrences of rows with same data in the first column

Answer 1

+1 A:

I think this does what you want, but I'm not 100% sure. (It is based on MS SQL Server syntax.)

create table #t
(
PKCol int identity(1,1),
Col1 varchar(200)
)

Insert Into #t
Values ('blah something')
Insert Into #t
Values ('blah something else')
Insert Into #t
Values ('foo blah')
Insert Into #t
Values ('bar blah')
Insert Into #t
Values ('foo hello')


-- For each distinct first word of Col1, keep the row with the lowest PKCol
Select t.*
From #t t
Join (
     Select min(PKCol) as IDToSelect
     From #t
     Group By Left(Col1, CharIndex(space(1), Col1))
) q on t.PKCol = q.IDToSelect

drop table #t
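
For readers without SQL Server at hand, here is a minimal sketch of the same idea run against an in-memory SQLite database from Python. It is an illustration, not part of the original answer: the function name first_rows_by_prefix is hypothetical, and SQLite's substr/instr stand in for Left/CharIndex. As in the T-SQL version, values without a space all fall into one group, because the prefix is then empty.

```python
import sqlite3

def first_rows_by_prefix(values):
    """Return the first-inserted row for each distinct first word of Col1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (PKCol INTEGER PRIMARY KEY AUTOINCREMENT, Col1 TEXT)"
    )
    conn.executemany("INSERT INTO t (Col1) VALUES (?)", [(v,) for v in values])
    # substr(Col1, 1, instr(Col1, ' ')) mirrors Left(Col1, CharIndex(' ', Col1))
    rows = conn.execute("""
        SELECT t.PKCol, t.Col1
        FROM t
        JOIN (
            SELECT MIN(PKCol) AS IDToSelect
            FROM t
            GROUP BY substr(Col1, 1, instr(Col1, ' '))
        ) q ON t.PKCol = q.IDToSelect
        ORDER BY t.PKCol
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    sample = ["blah something", "blah something else",
              "foo blah", "bar blah", "foo hello"]
    for row in first_rows_by_prefix(sample):
        print(row)
```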

Barry 2010-06-29 08:46:49

Answer 2

+2 A:

If you meant alphabetically on column 2, here is some SQL to get those rows:

create table #tmp (
    c1 char(20),
    c2 char(20)
)
insert #tmp values ('blah','something')
insert #tmp values ('blah','somethingelse')
insert #tmp values ('foo','ahhhh')
insert #tmp values ('foo','blah')
insert #tmp values ('bar','blah')
insert #tmp values ('foo','hello')

select c1, min(c2) c2 from #tmp
group by c1

Jonathan 2010-06-29 08:54:42

Answer 3

+2 A:

An analytic (window) function could do the trick.

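Select *
from (
    -- rank() takes no arguments and needs an ORDER BY to define "first";
    -- ordering by c2 is an assumption, use whichever column defines row order
    Select rank() over (partition by c1 order by c2) as myRank, t.*
    from myTable t ) ranked
where myRank = 1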

But in H2 this is only a priority 2 item on the roadmap for version 1.3.x:

http://www.h2database.com/html/roadmap.html?highlight=RANK&search=rank#firstFound
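
On a database that does support window functions, the approach can be tried out roughly as follows. This is a sketch under assumptions: it uses SQLite (3.25 or later) through Python's sqlite3 module, adds a hypothetical id column to record insertion order, and swaps rank() for row_number() so that ties cannot return two rows for the same c1. The function name first_row_per_group is made up for the example.

```python
import sqlite3

def first_row_per_group(rows):
    """rows: list of (c1, c2) tuples, in the order they should be inserted."""
    conn = sqlite3.connect(":memory:")
    # id is an assumed surrogate key; it records insertion order
    conn.execute(
        "CREATE TABLE myTable (id INTEGER PRIMARY KEY, c1 TEXT, c2 TEXT)"
    )
    conn.executemany("INSERT INTO myTable (c1, c2) VALUES (?, ?)", rows)
    result = conn.execute("""
        SELECT c1, c2
        FROM (
            SELECT row_number() OVER (PARTITION BY c1 ORDER BY id) AS myRank,
                   t.*
            FROM myTable t
        ) ranked
        WHERE myRank = 1
        ORDER BY id
    """).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    data = [("blah", "something"), ("blah", "somethingelse"),
            ("foo", "ahhhh"), ("foo", "blah"),
            ("bar", "blah"), ("foo", "hello")]
    for row in first_row_per_group(data):
        print(row)
```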

Scorpi0 2010-06-29 09:17:32

Answer 4

+1 A:

If you are interested in the fastest possible query, it is important to have an index on the first column of the table, so that the query processor can scan the values from that index. The fastest solution is then probably an 'outer' query that gets the distinct c1 values, plus an 'inner' (nested) query that gets one of the possible values of the second column:

drop table if exists test;
create table test(c1 char(20), c2 char(20));
create index idx_c1 on test(c1);

-- insert some data (H2 specific)
insert into test select 'bl' || (x/1000), x from system_range(1, 100000); 

-- the fastest query (64 ms)
select c1, (select i.c2 from test i where i.c1=o.c1 limit 1) from test o group by c1;

-- the shortest query (385 ms)
select c1, min(c2) c2 from test group by c1;
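
The two queries are not strictly equivalent: the nested query returns some c2 value for each c1 (which one is not specified without an ORDER BY), while min(c2) returns the alphabetically smallest. The sketch below illustrates this with SQLite from Python instead of H2; it makes no attempt to reproduce the timings above, and the function name compare_queries is hypothetical.

```python
import sqlite3

def compare_queries(rows):
    """Run both query shapes on (c1, c2) rows and return their results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (c1 TEXT, c2 TEXT)")
    conn.execute("CREATE INDEX idx_c1 ON test(c1)")
    conn.executemany("INSERT INTO test VALUES (?, ?)", rows)
    # Nested query: picks an arbitrary c2 per c1 (no ORDER BY inside)
    any_value = conn.execute("""
        SELECT c1, (SELECT i.c2 FROM test i WHERE i.c1 = o.c1 LIMIT 1)
        FROM test o
        GROUP BY c1
        ORDER BY c1
    """).fetchall()
    # Aggregate query: always picks the smallest c2 per c1
    min_value = conn.execute(
        "SELECT c1, MIN(c2) FROM test GROUP BY c1 ORDER BY c1"
    ).fetchall()
    conn.close()
    return any_value, min_value

if __name__ == "__main__":
    data = [("blah", "something"), ("blah", "somethingelse"),
            ("foo", "ahhhh"), ("foo", "blah"),
            ("bar", "blah"), ("foo", "hello")]
    any_value, min_value = compare_queries(data)
    print(any_value)
    print(min_value)
```

Both results contain exactly one row per c1; only the choice of c2 may differ.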

Thomas Mueller 2010-07-03 16:44:00
