I have a text file and it requires some formatting.
I know that if you want to add a blank line above every line that matches your regexp, you can use:

sed '/regexp/{x;p;x;}'

But I'd like to add a blank line, not one line above, but two lines above the line which matches my regexp.

The pattern I'll be matching is a postal code, in the address line.

Here is (part of) the format that the file has.

dynamic line (belongs to previous business)
name of a new business
address of new business

And an example:

Languages Spoken: English
Arnold's Cove, Nfld (sub To Clarenville)
Nile Road, Arnolds Cove, NL, A0B1N0

I'd like to have the new line above the business name. Thus:

Languages Spoken: English

Arnold's Cove, Nfld (sub To Clarenville)
Nile Road, Arnolds Cove, NL, A0B1N0

Solutions in Python or Perl are good as well.

A: 

I tried

sed '/regexp/a\\n'

but it inserted two newlines. If that does not bother you, take it.

echo -e "a\nb\nc" | sed '/^a$/a\n'
a

b
c

Edit: Now that you state that you need the blank line two lines above the line matching the regexp, the suggested command won't work.

I am not even sure it can be done with sed at all, as you need to remember past lines. Sounds like a job for a higher-level language like Python or Perl :-)

lothar
That inserts two new lines below the pattern...
Dennis
Yup, that's what I said :-)
lothar
Thanks, I'll edit my question to include Python and Perl tags (I know a little of Python and no Perl, so unfortunately I'm still stuck.)
Dennis
+1  A: 

Here's an approach that works for Python.

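import re
import sys

# Assumed pattern for a Canadian postal code such as A0B1N0; adjust to your data.
regex = re.compile(r"[A-Za-z]\d[A-Za-z]\s?\d[A-Za-z]\d")

def address_change(aFile):
    address = []  # lines seen since the last address line
    for line in aFile:
        if regex.search(line):
            # end of the address: the business name is the last buffered line
            sys.stdout.writelines(address[:-1])
            sys.stdout.write("\n")
            sys.stdout.writelines(address[-1:])
            sys.stdout.write(line)
            address = []
        else:
            address.append(line)
    # print whatever follows the last address line
    sys.stdout.writelines(address)

address_change(sys.stdin)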

This allows you to reformat a complete address to your heart's content. You can expand this to define an Address class if your formatting is complex.

S.Lott
+2  A: 
perl -ne 'END{print @x} push@x,$_; if(@x>2){splice @x,1,0,"\n" if /[[:alpha:]]\d[[:alpha:]]\s?\d[[:alpha:]]\d/;print splice @x,0,-2}'

If I cat your file into this, I get what you want... it's ugly, but you wanted shell (i.e., one-liner) :-) If I were to do this in full perl, I'd be able to clean it up a lot to make it approach readable. :-)

Tanktalus
+7  A: 

More readable Perl, and handles multiple files sanely.

#!/usr/bin/env perl
use constant LINES => 1;  # lines kept between the inserted blank line and the matching line
my @buffer = ();
while (<>) {
    /pattern/ and unshift @buffer, "\n";
    push @buffer, $_;
    print splice @buffer, 0, -LINES;
}
continue {
    if (eof(ARGV)) {
        print @buffer;
        @buffer = ();
    }
}
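
For readers more at home in Python, roughly the same buffering idea might look like the sketch below. It is not a translation of the Perl above, just an illustration: the names blank_line_above and lookback are made up for this example. It keeps the newest lookback lines pending, and when a line matches it puts a blank line in front of whatever is still pending.

```python
from collections import deque
import re

def blank_line_above(lines, pattern, lookback=1):
    """Yield lines, inserting a blank line `lookback` lines above each match."""
    regex = re.compile(pattern)
    pending = deque()  # the most recent lines, not yet emitted
    for line in lines:
        if regex.search(line):
            pending.appendleft("\n")  # goes out ahead of the buffered lines
        pending.append(line)
        # Emit everything except the newest `lookback` lines.
        while len(pending) > lookback:
            yield pending.popleft()
    yield from pending  # flush what is left at end of input
```

With lookback=1 the blank line lands directly above the business name, which is the layout the question asks for. With lookback=0 it lands directly above the matching line itself.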
ephemient
+5  A: 

Something a bit like your original approach in sed:

sed '/regexp/i\

$H
x'

The basic idea is to print everything delayed by one line (xchange the hold and pattern spaces - printing is implicit). That needs to be done because until we check whether the next line matches the regexp we don't know whether to insert a newline or not.

(The $H there is just a trick to make the last line print. It appends the last line into the hold buffer so that the final implicit print command outputs it too.)
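
The same one-line delay can be sketched in Python, for anyone who finds the hold-space juggling hard to follow. This is only an illustration under assumptions: the function name insert_blank_before_previous and the sample postal-code pattern are not part of the sed command above, and unlike that command this sketch does not emit a leading empty line.

```python
import re

def insert_blank_before_previous(lines, pattern):
    """Yield lines, adding a blank line above the line that precedes each match."""
    regex = re.compile(pattern)
    previous = None  # the line being held back, much like sed's hold space
    for line in lines:
        if previous is not None:
            if regex.search(line):
                yield "\n"  # current line matches, so separate the held line first
            yield previous
        previous = line
    if previous is not None:
        yield previous  # flush the last held line

sample = [
    "Languages Spoken: English\n",
    "Arnold's Cove, Nfld (sub To Clarenville)\n",
    "Nile Road, Arnolds Cove, NL, A0B1N0\n",
]
# Hypothetical pattern: a Canadian postal code at the end of the line.
result = list(insert_blank_before_previous(sample, r"[A-Z]\d[A-Z]\s?\d[A-Z]\d$"))
print("".join(result), end="")
```

Each line is only printed once the next line has been read, because that is the first moment we know whether a blank line belongs in front of it.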

Jukka Matilainen
Nice, you made it look so easy.
Dennis
It prints one empty line before all of the other lines.
Hynek -Pichi- Vychodil
Yes, it does print an empty line at the beginning, since it outputs the hold space content for all lines, and this is empty for the first line. Adding '1d' as the last command gets rid of this.
Jukka Matilainen
...as well as eliminating all output if the input is only a single line long.
ephemient
+3  A: 

Simple:

sed '1{x;d};$H;/regexp/{x;s/^/\n/;b};x'

Explanation:

#!/bin/sed -f

# The trick is to juggle the previous and current lines between the hold and pattern spaces.

1 {         # on the first line
  x         # move the first line to the hold space
  d         # end this cycle without printing
}
$H          # append the last line to the hold space to force it to print
/regexp/ {  # regexp found in the current line (pattern space)
  x         # swap: previous line into the pattern space, current line into the hold space
  s/^/\n/   # prepend a line break to the previous line
  # jump to the end of the script, which prints the previous line
  b
}
x           # no match: just swap the previous and current lines to print the previous one

Edit: A slightly simpler version:

sed '$H;/regexp/{x;s/^/\n/;b};x;1d'
Hynek -Pichi- Vychodil
Hi Hynek, I tried it and it works, but can you explain it?
Dennis
It just holds two lines in memory: the previous line in the `hold space` and the current line in the `pattern space`. When the regexp matches, it swaps them and prepends a newline to the previous line.
Hynek -Pichi- Vychodil
If the input is only a single line long, this sadly prints nothing out at all.
ephemient
@ephemient: For this pathological input use this sed '1{$!{x;d};b};$H;/c\|e/{x;s/^/\n/;b};x'
Hynek -Pichi- Vychodil