views:

45

answers:

2

Suppose I have two tab-delimited files that share a column. Both files have a header line that gives a label to each column. What's an easy way to take the union of the two tables, i.e. take the columns from A and B, but do so according to the value of column K?

for example, table A might be:

employee_id  name
123   john
124   mary

and table B might be:

employee_id  age
124  18
123  22

then the union based on column 1 of table A ("employee_id") should yield the table:

employee_id  name  age
123  john  22
124  mary  18

i'd like to do this using Unix utilities, like "cut" etc. how can this be done?

+4  A: 

here's a start. I leave you to format the headers as needed

$ awk 'NR>1{a[$1]=a[$1]" "$2}END{for(i in a)print a[i],i}' tableA.txt tableB.txt
 age employee_id
 john 22 123
 mary 18 124

another way

$ join <(sort tableA.txt) <(sort tableB.txt)
123 john 22
124 mary 18
employee_id name age

experiment with the join options when needed (see info page or man page)

ghostdog74
+4  A: 

you can use the join utility, but your files need to be sorted first.

join file1 file2

man join for more information

bluesmoon