I have nested subdirectories containing HTML files. For each of these files I want to delete everything from the top of the file up to the pattern `<div id="left-`. This is my attempt from the OS X terminal:

find . -name "*.html" -exec sed "s/.*?<div id=\"left-col/<div id=\"left-col/g" '{}' \;

I get a lot of HTML output in the terminal, but none of the files contain the substitution or are written.

Thanks

A: 

You're not storing the output of sed anywhere; that's why it's spitting out the html.

dplass
+1  A: 

You're outputting the result of the sed regex to stdout, the console, when you want to be writing it to the file.

To make sed edit each file in place instead of printing the result, use the -i flag:

find . -name "*.html" -exec sed -i "s/.*?<div id=\"left-col/<div id=\"left-col/g" '{}' \;

Make sure you back up your files before running this command, if possible. Otherwise you risk data loss from a mistyped regex.

fmark
hmm, I'm getting a long list of filenames and `sed: 1: "./foo/bar ...": invalid command code .` with that
Dr. Frankenstein
+3  A: 

There are two problems with your command. The first problem is that you aren't selecting an output location for sed. The second is that your sed script is not doing what you want it to do: the script you posted will look at each line and delete everything ON THAT LINE before the <div>. Lines without the <div> will be unaffected. You may want to try:

find . -name "*.html" -exec sed -i.BAK -n "/<div id=\"left-col/,$ p" {} \;

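This will also automatically back up your files by appending .BAK to the names of the original versions. If this is undesirable, change -i.BAK to -i '' on OS X, where the BSD sed requires a suffix argument (an empty one means no backup); with GNU sed, plain -i is enough.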
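
To see what the range address does before running it on real files, here is a minimal sketch on a throwaway file. The scratch directory, the file name sample.html, and its contents are made up for illustration; the sed invocation is the same as the one above, only written with single quotes so the inner double quotes need no escaping.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)

# Hypothetical sample page: two header lines, then the div we want to keep from.
printf '%s\n' '<html>' '<body>' '<div id="left-col">' '<p>content</p>' '</div>' > "$tmpdir/sample.html"

# -n turns off automatic printing; the range /pattern/,$ with the p command
# prints from the first matching line through the last line of the file.
# -i.BAK writes the result back to the file and keeps the original as sample.html.BAK.
sed -i.BAK -n '/<div id="left-col/,$ p' "$tmpdir/sample.html"

cat "$tmpdir/sample.html"       # trimmed file, starting at the div line
cat "$tmpdir/sample.html.BAK"   # untouched original
```

Checking the .BAK copy against the trimmed file on one sample like this is a cheap way to confirm the pattern matches before letting find apply it to every file.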

VeeArr
Thanks, jackpot, out of interest, is there a simple way of switching this to delete after, and not before?
Dr. Frankenstein
Yes, just change the `sed` string to `"1,/yourpattern/ p"`. You can do both at the same time with `"/startpattern/,/stoppattern/ p"`.
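
A small sketch of both variants, run on stdin rather than in place so nothing is modified. The five-line input and the START and STOP markers are placeholders for illustration, not patterns from the question.

```shell
# Hypothetical five-line input; START and STOP stand in for real patterns.
sample=$(printf '%s\n' 'header' 'START' 'body' 'STOP' 'footer')

# Keep everything from line 1 through the first line matching STOP
# (that is, delete what comes after the pattern).
head_part=$(printf '%s\n' "$sample" | sed -n '1,/STOP/ p')

# Keep only the block from the first START line through the next STOP line.
middle_part=$(printf '%s\n' "$sample" | sed -n '/START/,/STOP/ p')

printf '%s\n' "$head_part"
printf '%s\n' '--'
printf '%s\n' "$middle_part"
```

Both ends of a sed range are inclusive, so the lines that match the patterns are themselves kept in the output.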
VeeArr
THANKS - top drawer
Dr. Frankenstein