I have input like this:

Input:

a,b,c
d,e,f
g,h,i
k,l,m
n,o,p
q,r,s

I want to be able to concatenate every three lines, using a separator such as "|".

Output:

a,b,c|d,e,f|g,h,i
k,l,m|n,o,p|q,r,s

The file has 1 million lines, and I want to concatenate them as in the example above.

Any ideas about how to approach this?

A: 

gawk:

BEGIN {
  state=0
}

state==0 {
  line=$0
  state=1
  next
}

state==1 {
  line=line "|" $0
  state=2
  next
}

state==2 {
  print line "|" $0
  state=0
  next
}

# Flush a final group of fewer than three lines.
END {
  if (state != 0) print line
}
Ignacio Vazquez-Abrams
A: 

If Perl is fine, you can try:

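my $i = 0;
my $line = "";
while (<>) {
        chomp;
        $line .= $_;
        # Every third line completes a group; otherwise add the separator.
        if (++$i % 3 == 0) { print "$line\n"; $line = ""; }
        else { $line .= "|"; }
}
# Flush a final group of fewer than three lines, minus its trailing "|".
if ($line ne "") { $line =~ s/\|$//; print "$line\n"; }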

To run:

perl perlfile.pl 1millionlinesfile.txt
codaddict
A: 
$ paste -sd'|' input | sed -re 's/([^|]+\|[^|]+\|[^|]+)\|/\1\n/g'

With paste, we join all the lines into a single pipe-separated line, and then sed splits it up again. The pattern matches a run of three fields followed by a pipe and replaces that final pipe with a newline.

With Perl:

#! /usr/bin/perl -ln

push @a => $_;
if (@a == 3) {
  print join "|" => @a;
  @a = ();
}

END { print join "|" => @a if @a }
Greg Bacon
Not saying it will happen, but what if the OP's data itself contains "|"? Then the sed regex will break.
ghostdog74
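One way to sidestep that concern, offered as a sketch rather than a definitive fix: let paste consume three lines at a time, so no regex ever inspects the data. The function name join3_paste is hypothetical. Note that a final group of fewer than three lines is padded with empty fields, so it ends in trailing "|" characters.

```shell
# join3_paste: hypothetical helper; reads lines on stdin, writes joined lines.
join3_paste() {
  # Each "-" makes paste take one line from standard input, so three of
  # them consume three input lines per output line.
  paste -d'|' - - -
}

# The second input line contains a "|" of its own and passes through untouched.
printf '%s\n' 'a,b,c' 'd|e,f' 'g,h,i' | join3_paste
```

Run against a file with: join3_paste < input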
+1  A: 

@OP, if you want to group every 3 records:

$ awk 'ORS=(NR%3==0)?"\n":"|"' file
a,b,c|d,e,f|g,h,i
k,l,m|n,o,p|q,r,s

with Perl,

$ perl -lne 'print $_ if $\ = ($. % 3 == 0) ? "\n" : "|"' file
a,b,c|d,e,f|g,h,i
k,l,m|n,o,p|q,r,s
ghostdog74
Righteous awk fu. +1
Norman Ramsey
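The awk one-liner above picks the output record separator per line. When the line count is not a multiple of three, its output ends with a "|" and no final newline. A longer sketch that avoids this, assuming a partial last group should still be printed on its own line (join3_awk is a hypothetical name):

```shell
# join3_awk: hypothetical helper; reads lines on stdin, writes joined lines.
join3_awk() {
  awk '
    # Put "|" before every line except the first line of each group.
    { printf "%s%s", (NR % 3 == 1 ? "" : "|"), $0 }
    # Close the group after every third line.
    NR % 3 == 0 { print "" }
    # Close a final group of fewer than three lines.
    END { if (NR % 3) print "" }
  '
}

printf '%s\n' 'a,b,c' 'd,e,f' 'g,h,i' 'k,l,m' | join3_awk
```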
+1  A: 

Since your tags include sed, here is a way to use it:

sed 'N;N;s/\n/|/g' datafile
Dennis Williamson
Will there be a problem if there are only 2 lines?
Dyno Fu
I'm not sure I understand, but if you mean you want the result to be that every two lines (instead of three) are merged into one so you get "a,b,c|d,e,f" then just use one "N" like this: `sed 'N;s/\n/|/g' datafile`
Dennis Williamson
@Dyno, if the OP wants to concatenate every 3 lines and the file has only 2 lines, the above will have no effect (a problem if the OP still wants those 2 lines joined with "|").
ghostdog74
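For the short-file case discussed in these comments, a possible variant is to guard each N with $! so that sed only appends another line when it is not already on the last one. This is a sketch under the assumption that a leftover group of one or two lines should still be joined; join3_sed is a hypothetical name.

```shell
# join3_sed: hypothetical helper; reads lines on stdin, writes joined lines.
join3_sed() {
  # $!N appends the next input line unless the current line is the last,
  # so the substitution still runs on a short final group.
  sed '$!N;$!N;s/\n/|/g'
}

printf '%s\n' 'a,b,c' 'd,e,f' 'g,h,i' 'k,l,m' 'n,o,p' | join3_sed
```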