Hello,

I have a grep expression that I run with Cygwin grep on Windows:

grep -a "\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u" all_fbs.txt > rockon_fbs.txt

Once I have identified the emoticon class, I want to strip the emoticons out of the data. However, the same regexp inside a sed substitution results in a syntax error. (Yes, I realize I could use /d instead of //g, but that makes no difference; I still get the error.)

sed "s/\(\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u\)*//g"

The full line is:

grep -a "\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u" all_fbs.txt | sed "s/\(\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u\)*//g" | sed "s/^/ROCKON\t/" > rockon_fbs.txt

The result is:

sed: -e expression #1, char 14: unknown option to `s'

I know the error comes from the sed regexp I'm asking about, because if I remove that portion of the full line I get no error (but, of course, the emoticons are not filtered out).

Thanks in advance,

Steve

+1  A: 

You need to escape each / inside the pattern. Because / is the delimiter of the s command, the first unescaped / prematurely terminates the expression.

s/\(\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u\)*//g
        ^     ^     ^      ^   ^
          These need escaping.
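
An alternative that avoids the escaping altogether is to pick a different delimiter for the s command. The sketch below assumes GNU sed (which is what Cygwin provides, and which is needed for \| anyway). The choice of # and the function name strip_emoticons are illustrative, not taken from the question.

```shell
# Hypothetical helper: strip the emoticons using '#' as the s delimiter,
# so the '/' characters in the pattern need no escaping.
# Assumes GNU sed: '\|' alternation in a basic regex is a GNU extension.
strip_emoticons() {
    sed 's#\(\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u\)*##g'
}

# printf '%s\n' is used instead of echo so backslashes are passed through as-is.
printf '%s\n' 'foo \m/ bar \,,/ baz' | strip_emoticons
```

This should print the same "foo  bar  baz" as the escaped version. Any character that does not occur in the pattern can serve as the delimiter; | and , would clash here.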

You should also use single-quoted strings instead of double-quoted strings to prevent the backslashes being interpreted by the shell:

$ echo "\\,"
\,
$ echo '\\,'
\\,
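
The same check can be applied to the sed script from the question to see what sed actually receives. This is a minimal sketch, assuming a POSIX-style shell such as bash; the variable names dq and sq are only for illustration.

```shell
# The sed script from the question, once double-quoted and once single-quoted.
dq="s/\(\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u\)*//g"
sq='s/\(\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u\)*//g'

# Inside double quotes the shell collapses each \\ to a single \ ;
# inside single quotes the text is passed through unchanged.
printf 'double-quoted: %s\n' "$dq"
printf 'single-quoted: %s\n' "$sq"
```

In the double-quoted form the script that reaches sed begins s/\(\,,/\|\m/\ and the unescaped slashes at characters 2, 8 and 13 already complete the s command. Character 14 is then read as a flag, which would explain the "char 14: unknown option to `s'" message.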

So try this:

$ echo 'foo \m/ bar \,,/ baz' | sed 's/\(\\,,\/\|\\m\/\|\\m\/\\>\.<\/\\m\/\|:u\)*//g'
foo  bar  baz
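
Applied to the full line from the question, the pipeline might look like the sketch below. It assumes GNU grep and GNU sed (for \| in the patterns and for \t in the replacement). The function name tag_rockon and the sample lines are made up for illustration. The grep pattern is single-quoted as well, so that its \\ reaches grep intact.

```shell
# Hypothetical helper: keep only lines containing an emoticon,
# strip the emoticon, and prefix each remaining line with "ROCKON<TAB>".
# Reads from standard input and writes to standard output.
tag_rockon() {
    grep -a '\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u' |
        sed -e 's/\(\\,,\/\|\\m\/\|\\m\/\\>\.<\/\\m\/\|:u\)*//g' \
            -e 's/^/ROCKON\t/'
}

# Made-up sample input; the middle line has no emoticon and is dropped.
printf '%s\n' 'rock on \m/ tonight' 'nothing here' 'great show \,,/' | tag_rockon
```

With the real data this would be run as: tag_rockon < all_fbs.txt > rockon_fbs.txt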
Mark Byers
Thanks so much! The non-escape-age of the / was the problem. Much appreciated!
Steve