Hi

I have a file like the following:

1,
cake:01351
12,
bun:1063
scone:13581
biscuit:1931
14,
jelly:1385

I need to convert it so that when a number is read at the start of a line it is combined with the line beneath it, but if there is no number at the start the line is left as is. This would be the output that I need:

1,cake:01351
12,bun:1063
scone:13581
biscuit:1931
14,jelly:1385

I'm having a lot of trouble achieving this with sed; it may not be the best tool for what I think should be quite simple. Any suggestions greatly appreciated, thanks!

+9  A: 

A very basic sed implementation:

sed -e '/^[0-9]/{N;s/\n//;}'

This relies on the 'number' lines being the only lines that start with a digit (as you specified).

It

  • matches lines starting with a number, ^[0-9]
  • brings in the next line, N
  • deletes the embedded newline, s/\n//
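
As a quick check, here is the same command run against the sample input from the question. The function name `join_numbered` is made up for illustration; it only wraps the one-liner so it can read from a pipe.

```shell
# Hypothetical wrapper around the one-liner above; reads stdin, writes stdout.
join_numbered() {
    sed -e '/^[0-9]/{N;s/\n//;}'
}

# Feed it the sample input from the question.
printf '%s\n' '1,' 'cake:01351' '12,' 'bun:1063' 'scone:13581' \
    'biscuit:1931' '14,' 'jelly:1385' | join_numbered
# Prints:
# 1,cake:01351
# 12,bun:1063
# scone:13581
# biscuit:1931
# 14,jelly:1385
```

Note that a number line is always joined with whatever follows it, so two number lines in a row would be merged with each other.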
martin clayton
vote up for a nice explanation.
Anders
A: 

This is a file on my intranet. I can't recall where I found the handy sed one-liner. You might find something if you search for 'sed one-liner'.

Have you ever needed to combine lines of text, but found it too tedious to do by hand?

For example, imagine that we have a text file with hundreds of lines which look like this:

14/04/2003,10:27:47,0
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.011,0.975,0.005
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.011,0.975,0.005
14/04/2003,10:30:51,600
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.011,0.975,0.005
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.010,0.975,0.005
14/04/2003,10:34:02,600
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.011,0.975,0.005
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.010,0.975,0.005

Each date (14/04/2003) is the start of a data record, and it continues on the next four lines.

We would like to input this to Excel as a 'comma separated value' file, and see each record in its own row.

In our example, we need to append any line starting with a G or I to the preceding line, and insert a comma, so as to produce the following:

14/04/2003,10:27:47,0,IdVg,3.000,-1.000,0.050,0.006,GmMax,0.011,0.975,0.005,IdVg,3.000,...
14/04/2003,10:30:51,600,IdVg,3.000,-1.000,0.050,0.006,GmMax,0.011,0.975,0.005,IdVg,3.000,...
14/04/2003,10:34:02,600,IdVg,3.000,-1.000,0.050,0.006,GmMax,0.011,0.975,0.005,IdVg,3.000,...

This is a classic application of a 'regular expression' and, once again, sed comes to the rescue.

The editing can be done with a single sed command:

sed -e :a -e '$!N;s/\n\([GI]\)/,\1/;ta' -e 'P;D' filename >newfilename

I didn't say it would be obvious, or easy, did I?

This is the kind of command you write down somewhere for the rare occasions when you need it.
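
For anyone who wants to see what the pieces do, the same command can be laid out as a commented script. This is only a sketch of my reading of it; the function name `join_records` is made up for illustration, and it reads from a pipe rather than from `filename`.

```shell
# Hypothetical wrapper; same sed commands as the one-liner, one per line.
join_records() {
    sed '
:a
# Unless this is the last line, append the next input line to the pattern space.
$!N
# If the appended line starts with G or I, replace the newline before it with a comma.
s/\n\([GI]\)/,\1/
# If that substitution succeeded, jump back to :a and pull in another line.
ta
# Otherwise the record is complete: print the pattern space up to the first newline,
P
# then delete that part and start over with the remainder, without reading new input.
D
'
}

printf '%s\n' '14/04/2003,10:27:47,0' 'IdVg,3.000' 'GmMax,0.011' \
    '14/04/2003,10:30:51,600' 'IdVg,3.000' | join_records
# Prints:
# 14/04/2003,10:27:47,0,IdVg,3.000,GmMax,0.011
# 14/04/2003,10:30:51,600,IdVg,3.000
```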

pavium
A: 

Try a regular expression, such as:

sed '/^[0-9]\+,/{N;s/\n//;}'

That matches a line starting with one or more digits followed by a comma, appends the next line with N, then replaces the embedded newline with nothing, removing it. Note that `\+` is a GNU sed extension.

pagboy
That only checks for a single-digit number. You need `[0-9]\+`
Dennis Williamson
Ah, didn't catch that. Fixed.
pagboy
A: 
$ awk 'ORS= /^[0-9]+,$/?" ":"\n"' file
1, cake:01351
12, bun:1063
scone:13581
biscuit:1931
14, jelly:1385
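
The output above has a space after the comma, which the requested output does not. Simply changing `" "` to `""` would not work here, because an empty string is false as an awk pattern and the number lines would then not be printed at all. A minimal sketch of a variant that avoids the space, assuming the number lines consist only of digits followed by a comma (the function name `join_numbered_awk` is made up for illustration):

```shell
# Hypothetical wrapper; reads stdin, writes stdout.
join_numbered_awk() {
    # Number lines are printed without a trailing newline, so the next line
    # lands directly after them; every other line is printed as usual.
    awk '/^[0-9]+,$/ { printf "%s", $0; next } { print }'
}

printf '%s\n' '1,' 'cake:01351' '12,' 'bun:1063' 'scone:13581' \
    'biscuit:1931' '14,' 'jelly:1385' | join_numbered_awk
# Prints:
# 1,cake:01351
# 12,bun:1063
# scone:13581
# biscuit:1931
# 14,jelly:1385
```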
ghostdog74