How can I delete all lines below a given word, except the last line, in a file? Suppose I have a file which contains:

| 02/04/2010 07:24:20 | 20-24 |         26 |       13 |        2.60 | 
| 02/04/2010 07:24:25 | 25-29 |          6 |        3 |        0.60 | 
+---------------------+-------+------------+----------+-------------+

02-04-2010-07:24 --- ER GW 03

+---------------------+-------+------------+----------+-------------+
| date                | sec   | BOTH_MO_MT | MO_or_MT | TPS_PER_SEC |
+---------------------+-------+------------+----------+-------------+
| 02/04/2010 07:00:00 | 00-04 |         28 |       14 |        2.80 | 
| 02/04/2010 07:00:05 | 05-09 |         27 |       14 |        2.70 | 
...
...
...
...
END OF TPS PER 5 REPORT

I need to delete everything from the line "02-04-2010-07:24 --- ER GW 03" onward, except the final line "END OF TPS PER 5 REPORT", and save the file. This has to be done for around 700 files. All files have the same format, with datemonthday file names.

+1  A: 
sed -ni '/ER GW/ b end; p; d; :end $p; n; b end' "$file"

$file is the name of the file to edit (-i rewrites it in place). For example, to process every .txt file in the current directory:

for file in *.txt ; do
    sed -ni '/ER GW/ b end; p; d; :end $p; n; b end' "$file"
done
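
The same idea can also be phrased as a sed address range. This is a sketch of an alternative, not the command above: from the marker line to the end of the file, delete every line except the last. The scratch file `sample.txt` and its contents are made up for the demonstration, and the pattern `--- ER GW 03` is taken from the question. The output goes to a separate file rather than being edited in place.

```shell
# Build a small throwaway sample (hypothetical contents modelled on the question).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
| 02/04/2010 07:24:25 | 25-29 |          6 |        3 |        0.60 |
+---------------------+-------+------------+----------+-------------+

02-04-2010-07:24 --- ER GW 03

| 02/04/2010 07:00:00 | 00-04 |         28 |       14 |        2.80 |
END OF TPS PER 5 REPORT
EOF

# From the marker line to the end of the file ($), delete every line
# that is not the last one ($!d); all other lines are printed as usual.
sed '/--- ER GW 03/,$ { $!d; }' "$tmpdir/sample.txt" > "$tmpdir/sample.out"
cat "$tmpdir/sample.out"
```

Writing to a separate file first makes it easy to inspect the result before replacing anything.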
Matthew Flaschen
I'm getting "sed: can't read -: No such file or directory"
akvikram
Thanks a lot, it's working, but I don't want the line "02-04-2010-07:24 --- ER GW 03" to be printed in the file.
akvikram
@akvikram, I modified it.
Matthew Flaschen
Great, thanks a lot. It is working fine. :)
akvikram
A: 

The following awk script will do it:

awk '
    /^02-04-2010-07:24 --- ER GW 03$/ {skip=1}
                                      {ln=$0;if (skip!=1){print}}
    END                               {if (skip==1){print ln}}'

as shown in the following transcript:

$ echo '| 02/04/2010 07:24:20 | 20-24 |         26 |       13 |        2.60 |
| 02/04/2010 07:24:25 | 25-29 |          6 |        3 |        0.60 |
+---------------------+-------+------------+----------+-------------+

02-04-2010-07:24 --- ER GW 03

+---------------------+-------+------------+----------+-------------+
| date                | sec   | BOTH_MO_MT | MO_or_MT | TPS_PER_SEC |
+---------------------+-------+------------+----------+-------------+
| 02/04/2010 07:00:00 | 00-04 |         28 |       14 |        2.80 |
| 02/04/2010 07:00:05 | 05-09 |         27 |       14 |        2.70 |
...
...
...
...
END OF TPS PER 5 REPORT' | awk '
    /^02-04-2010-07:24 --- ER GW 03$/ {skip=1}
    {ln=$0;if (skip!=1){print}}
    END {if (skip==1){print ln}}'

which produces:

| 02/04/2010 07:24:20 | 20-24 |         26 |       13 |        2.60 |
| 02/04/2010 07:24:25 | 25-29 |          6 |        3 |        0.60 |
+---------------------+-------+------------+----------+-------------+

END OF TPS PER 5 REPORT

as requested.

Breaking it down:

  • skip is initially 0 (false).
  • if you find a line you want to start skipping from, set skip to 1 (true) - change this pattern where necessary.
  • if skip is false, output the line.
  • regardless of skip, store the last line.
  • at the end, if skip is true, output the last line (the skip check prevents a double print).
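
The steps above can be sketched as a commented script. The function name `trim_after_marker` and the variable name `last` are made up for this illustration, and the sample input lines are invented. The pattern `--- ER GW 03` is the constant part of the marker line.

```shell
# trim_after_marker: hypothetical helper; reads stdin, writes stdout.
trim_after_marker() {
    awk '
        /--- ER GW 03/ { skip = 1 }              # marker seen: stop printing from here on
                       { last = $0 }             # always remember the current line
        !skip          { print }                 # before the marker: pass the line through
        END            { if (skip) print last }  # emit the saved final line once
    '
}

printf '%s\n' 'keep 1' 'keep 2' '02-04-2010-07:24 --- ER GW 03' 'drop' 'END OF TPS PER 5 REPORT' |
    trim_after_marker
```

Input that never contains the marker passes through unchanged, because skip is never set and the END block prints nothing extra.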

To process multiple files, you can use a for loop, writing each result to a new file:

for fspec in *.txt ; do
    awk 'blah blah' <"${fspec}" >"${fspec}.new"
done
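
A fuller sketch of that loop follows. It is one possible way to do it, under the assumption that the originals should be replaced once the trimmed copy has been written. It runs in a scratch directory with two invented sample files (`a.txt`, `b.txt`), so it touches no real data. The `.new` suffix comes from the loop above; the `mv` step is an addition.

```shell
# Work in a scratch directory so the demonstration touches no real data.
workdir=$(mktemp -d)
printf '%s\n' 'data A' '01-04-2010-07:24 --- ER GW 03' 'old' 'END OF TPS PER 5 REPORT' > "$workdir/a.txt"
printf '%s\n' 'data B' '02-04-2010-09:10 --- ER GW 03' 'old' 'END OF TPS PER 5 REPORT' > "$workdir/b.txt"

for fspec in "$workdir"/*.txt ; do
    # Write to a .new file first; replace the original only if awk succeeded.
    awk '/--- ER GW 03/ {skip=1}
         {ln=$0; if (skip!=1) {print}}
         END {if (skip==1) {print ln}}' "$fspec" > "$fspec.new" &&
        mv "$fspec.new" "$fspec"
done

cat "$workdir/a.txt" "$workdir/b.txt"
```

Dropping the `mv` keeps the originals and leaves the trimmed copies next to them, which may be the safer first pass over 700 files.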

The command required for your update in the comment (searching for "--- ER GW 03") is:

awk '
    /--- ER GW 03/ {skip=1}
                   {ln=$0;if (skip!=1){print}}
    END            {if (skip==1){print ln}}'
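
If the constant text could also appear inside an ordinary data line, a stricter pattern may be safer. The sketch below assumes the marker line has exactly the shape given in the comments ("??-04-2010-??:?? --- ER GW 03"), with digits for the question marks and no trailing whitespace. The variable names `marker` and `result` and the sample lines are made up for the illustration.

```shell
# Stricter marker: two digits, "-04-2010-", HH:MM, then the constant text.
marker='^[0-9][0-9]-04-2010-[0-9][0-9]:[0-9][0-9] --- ER GW 03$'

result=$(
    printf '%s\n' 'row' 'note: see --- ER GW 03 below' \
        '15-04-2010-07:24 --- ER GW 03' 'old row' 'END OF TPS PER 5 REPORT' |
    awk -v marker="$marker" '
        $0 ~ marker {skip=1}                      # only a full marker line starts skipping
                    {ln=$0; if (skip!=1) {print}}
        END         {if (skip==1) {print ln}}'
)
printf '%s\n' "$result"
```

Here the second sample line contains the constant text but is kept, because it does not match the whole-line pattern.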
paxdiablo
Thank you, it is working, but only for the "02-04-2010-07:24 --- ER GW 03" condition. The timestamp varies between files, i.e. "??-04-2010-??:?? --- ER GW 03"; the constant part in all files is "--- ER GW 03". Is there a way to match this constant part and delete all lines from it?
akvikram
@akvikram : yes, see the update.
paxdiablo