I have five rows like this:

userid, col
--------------
1, a
1, b
2, c
2, d
3, e

How would I write a query so that the result looks like this?

userid, combined
1, a b
2, c d
3, e
A: 

Use the GROUP_CONCAT aggregate function:

  SELECT yt.userid,
         GROUP_CONCAT(yt.col SEPARATOR ' ') AS combined
    FROM YOUR_TABLE yt
GROUP BY yt.userid

The default separator is a comma (","), so you need to specify a SEPARATOR of a single space to get the output you want.

If you want to ensure the order of the values in the GROUP_CONCAT, use:

  SELECT yt.userid,
         GROUP_CONCAT(yt.col ORDER BY yt.col SEPARATOR ' ') AS combined
    FROM YOUR_TABLE yt
GROUP BY yt.userid
OMG Ponies
I am currently using Hive, so I can't do the GROUP BY part because GROUP_CONCAT is not recognized as an aggregate function. Is there any way around it?
haoxu
@haoxu: The next option (from a SQL perspective) would be a cursor and string concatenation, but it would require a MySQL function. I'm not familiar with Hive - is there no way of using a native query?
OMG Ponies
Filed a JIRA request for GROUP_CONCAT in HiveQL: https://issues.apache.org/jira/browse/HIVE-1689
Jeff Hammerbacher
A: 

I'm pretty sure that you can't do this using HiveQL. However, it should be possible if you write your own Map/Reduce scripts - see this tutorial to get started.
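
As a rough illustration of the reduce step, here is a minimal Python sketch. It assumes the script receives tab-separated "userid<TAB>col" lines that are already sorted or clustered by userid, which is what a reducer normally sees. The names parse, combine and main are hypothetical and chosen only for this example, and how the script is wired into Hive (for instance through a TRANSFORM clause with CLUSTER BY userid) is left as an assumption to check against your Hive version.

```python
import sys
from itertools import groupby


def parse(line):
    """Split one 'userid<TAB>col' input line into a (userid, col) pair."""
    userid, col = line.rstrip("\n").split("\t", 1)
    return userid, col


def combine(lines, separator=" "):
    """Yield 'userid<TAB>combined' for each run of consecutive equal userids.

    Assumes the input is already grouped by userid; groupby only merges
    adjacent rows, so unsorted input would produce repeated userids.
    """
    rows = (parse(line) for line in lines if line.strip())
    for userid, group in groupby(rows, key=lambda row: row[0]):
        yield userid + "\t" + separator.join(col for _, col in group)


def main():
    """Hypothetical entry point when run as a streaming script."""
    for out in combine(sys.stdin):
        print(out)
```

Calling main() at the bottom of the file would turn this into a script that reads rows from standard input and prints one line per userid. With the five sample rows from the question, combine produces the rows "1", "a b"; "2", "c d"; and "3", "e".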

Matthew Hegarty