I'm having some problems using sed in combination with html. The following sample illustrates the problem:

HTML="<html><body>ENTRY</body></html>"
TABLE="<table></table>"
echo $HTML | sed -e s/ENTRY/$TABLE/

This outputs:

sed: -e expression #1, char 18: unknown option to `s'

If I leave out the / from $TABLE so that it becomes <table><table>, it works OK.

Any ideas on how to fix it?

Update
Here's a sample that can reproduce the problem:

template.html:

<html>
    <body>
        <table>
            ENTRIES
        </table>
    </body>
</html>

gui_template:

<tr>
  <td class="td_tut_title">TITLE</td>
  <td class="td_tut_content">
    <a href="../tutorials/GUI/FILENAME"><img src="img/bbp.png" alt="bbp" /></a>
  </td>
</tr>

genhtml.sh:

#!/bin/bash
HTML=`cat template.html`
ENTRIES=`cat gui_template | sed -e s/FILENAME/test/ | sed -e s/TITLE/title/`
DELIM=$'\377'
echo $HTML | sed -e "s${DELIM}ENTRIES${DELIM}$ENTRIES${DELIM}"

Output:

~/svn/byteblower/packages/ByteBlowerCD/htmlgen $ ./genhtml.sh 
sed: -e expression #1, char 14: unterminated `s' command
+3  A: 

Use a different delimiter, @ for example:

echo $HTML | sed -e s@ENTRY@$TABLE@ 
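To see why the original command fails, it can help to print the expression sed actually receives. With `/` as the delimiter, the `/` inside `</table>` ends the replacement early, so sed reads `table>/` as flags and stops at character 18, which matches the error in the question. The following is a minimal sketch, not a definitive fix: the names `broken` and `result` are illustrative, and the `case` guard is an added assumption that simply refuses to run when the value contains the chosen delimiter.

```shell
HTML="<html><body>ENTRY</body></html>"
TABLE="<table></table>"

# The expression sed is handed when "/" is the delimiter:
# the "/" in </table> terminates the replacement too early.
broken="s/ENTRY/$TABLE/"
printf '%s\n' "$broken"

# Guard (illustrative): stop if the value contains the chosen delimiter.
case $TABLE in
  *@*) echo "TABLE contains @, pick another delimiter" >&2; exit 1 ;;
esac

# Double quotes let the shell expand $TABLE before sed sees the expression.
result=$(printf '%s\n' "$HTML" | sed -e "s@ENTRY@$TABLE@")
printf '%s\n' "$result"
```

Quoting the expression also keeps it a single argument to sed if the value ever contains spaces.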
Anton
That didn't work on my FreeBSD box, however, until I put it in single quotes `sed 's@ENTRY@$TABLE@'` or escaped the dollar sign `sed s@ENTRY@\$TABLE@`.
Yasir Arsanukaev
It works for the sample above, but it failed when processing an entire html file.
StackedCrooked
@StackedCrooked: How did it fail?
Anton
With exactly the same error message: "unterminated `s' command".
StackedCrooked
@StackedCrooked: Your variable contains a character that is used as the sed delimiter. Try using a delimiter character that does not appear inside your file.
Anton
FYI I updated my post with a code sample.
StackedCrooked
@StackedCrooked: This time you have a problem with newlines in the replacement. This fixed it: ENTRIES=`cat gui_template | sed -e s/FILENAME/test/ | sed -e s/TITLE/title/ | sed 's/$/\\\\n/' | tr -d '\n'` If you are not tied to bash, it may be easier to use some other scripting language. BTW, in case you don't know, you can use #!/bin/bash -x to see what bash is really doing.
Anton
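The newline problem from the last comment can also be handled by escaping the replacement text before it reaches sed. The sketch below is a variant of that fix, not the commenter's exact command: instead of inserting a literal `\n` (which not every sed interprets in a replacement), it ends every line except the last with a backslash, which sed accepts as an escaped newline. It also escapes backslashes, `&`, and the delimiter. The sample strings and the names `html`, `entries`, `escaped`, and `result` are hypothetical stand-ins for the files in the question.

```shell
# Hypothetical stand-ins for template.html and the filled-in gui_template.
html='<table>
ENTRIES
</table>'
entries='<tr>
  <td>a & b</td>
</tr>'

# 1st expression: put a backslash before each backslash, ampersand, and "#".
# 2nd expression: append a backslash to every line except the last, so each
#                 embedded newline is escaped in the final replacement.
escaped=$(printf '%s\n' "$entries" | sed -e 's/[\\&#]/\\&/g' -e '$!s/$/\\/')

# "#" is the delimiter here; occurrences inside the text were escaped above.
result=$(printf '%s\n' "$html" | sed -e "s#ENTRIES#${escaped}#")
printf '%s\n' "$result"
```

Because the delimiter itself is escaped, this approach does not depend on finding a character that never occurs in the inserted text.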
+1  A: 

Issuing these lines on a FreeBSD console:

HTML="<html><body>ENTRY</body></html>"
TABLE="<table></table>"
echo $HTML | sed -e "s#ENTRY#$TABLE#"

results in:

<html><body><table></table></body></html>
Yasir Arsanukaev
Thanks, that works fine.
StackedCrooked
Oops, sorry I jumped to conclusions too fast. With the single quotes the variables aren't dereferenced anymore.
StackedCrooked
@StackedCrooked: `echo $HTML | sed -e "s#ENTRY#$TABLE#"` works for me.
Yasir Arsanukaev
It seems to work like this in a smaller sample, but not in the final script, which processes a bigger HTML file. I'll do some more research.
StackedCrooked
A: 

You need to use a delimiter that can't appear in $TABLE, and if $TABLE is unpredictable enough this can be tricky. I'd suggest using a nonprinting character as a delimiter; it's easier to find one that's not going to show up in $TABLE and break everything. The only problem is they're harder to type in, so I'd suggest putting it in a variable and using that in the sed command:

DELIM=$'\377'
HTML="<html><body>ENTRY</body></html>"
TABLE="<table></table>"
echo "$HTML" | sed -e "s${DELIM}ENTRY${DELIM}$TABLE${DELIM}"

Note that the $'...' construct is a bash-only feature; if you need this to run under generic sh you'll have to do something messier, like DELIM="$(printf "\377")". Also, I chose \377 (that's FF in hex) because it's illegal in the UTF-8 encoding, so it should be safe if you're using UTF-8 for your HTML; if you're using something else, like Windows-1252, then \177 (the 'DEL' character) might be a safer choice.

Oh, yeah, and if you ever try to debug this with bash -x, be prepared for comedy.
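For reference, the generic-sh variant mentioned above might look like the following. This is only a sketch under assumptions: it uses `printf` to build the delimiter and picks `\177` (DEL) rather than `\377`, since DEL is a plain single-byte character in any locale. The name `result` is illustrative.

```shell
# Build the delimiter without the bash-only $'...' syntax.
# Assumes the DEL character never occurs in TABLE.
DELIM=$(printf '\177')
HTML="<html><body>ENTRY</body></html>"
TABLE="<table></table>"

result=$(printf '%s\n' "$HTML" | sed -e "s${DELIM}ENTRY${DELIM}${TABLE}${DELIM}")
printf '%s\n' "$result"
```

Choosing a safe delimiter only solves the delimiter collision. An `&` or a newline inside the replacement text is still special to sed and needs separate handling.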

Gordon Davisson
I still can't get it to work, I updated my post with a code sample.
StackedCrooked