I'm trying to use egrep with a regex pattern to match whitespace. I've used regexes with Perl and C# before, and they both support the pattern \s to search for whitespace. egrep (or at least the version I'm using) does not seem to support this pattern. In a few articles online I've come across the shorthand [[:space:]], but this does not seem to work either. Any help is appreciated.

Using: SunOS 5.10

A: 
$ cat > file
this line has whitespace
thislinedoesnthave
$ egrep [[:space:]] file 
this line has whitespace

Works under Debian.

For Solaris, isn't there something like "eselect" (see Gentoo) or an alternatives file to set your default egrep version?

Have you tried grep -E? If the egrep on your path is not the right one, maybe grep is.

Aif
You might get some credit if you explained where 'here' was. It presumably wasn't Solaris 10. Or, if it was Solaris 10, then it probably wasn't /usr/bin/egrep that you used.
Jonathan Leffler
A: 

Maybe you should protect the pattern with quotes (if bash, or anything equivalent for the shell you are using).

[ and ] may have special meaning for the shell.

rjack
No, not the issue.
Jonathan Leffler
+2  A: 

I see the same issue on SunOS 5.10. /usr/bin/egrep does not support POSIX character classes such as [[:space:]].

Try using /usr/xpg4/bin/egrep:

$ echo 'this line has whitespace
thislinedoesnthave' | /usr/xpg4/bin/egrep '[[:space:]]'
this line has whitespace
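
Rather than typing the full path each time, putting /usr/xpg4/bin at the front of PATH should have the same effect. A minimal sketch, assuming that directory holds the POSIX-conforming utilities as it does on Solaris 10 (on a system without it, the extra entry is simply skipped during lookup):

```shell
# Assumption: /usr/xpg4/bin contains the POSIX-conforming grep/egrep (Solaris 10 layout).
# Prepending it makes the shell find those versions before the ones in /usr/bin.
PATH=/usr/xpg4/bin:$PATH
export PATH

# Same two sample lines as above; grep -E is the POSIX spelling of egrep.
matches=$(printf 'this line has whitespace\nthislinedoesnthave\n' | grep -E '[[:space:]]')
echo "$matches"
```

Putting the assignment in a shell startup file would make it persistent, at the cost of changing which versions of other utilities are picked up as well.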

Another option might be to just use perl:

$ echo 'this line has whitespace
thislinedoesnthave' | perl -ne 'chomp;print "$_\n" if /[[:space:]]/'
this line has whitespace
Jon Ericson
I don't understand why you say you have the same issue; it looks like it works with egrep?
Aif
The default egrep does not support advanced character sets like [[:space:]]. You need to either change your PATH or call out the absolute path as I did above.
Jon Ericson
Works when calling out the full path. Thank you for your help!
+3  A: 

If you're using 'degraded' versions of grep (I quote the term because most UNIXes I work on still use the original REs, not those fancy ones with "\s" or "[[:space:]]" :-), you can just revert to the lowest form of RE.

For example, if :space: is defined as spaces and tabs, just use:

egrep '[ ^I]' file

That ^I is an actual tab character, not the two characters ^ and I.

This is assuming :space: is defined as tabs and spaces, otherwise adjust the choices within the [] characters.
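
Typing a literal tab inside quotes can be awkward, since many interactive shells treat the Tab key as completion. One way around that, sketched here under the assumption that printf is available, is to store the tab in a variable first. The variable name tab and the third sample line are only for illustration, and plain grep is used because a bracket expression like this needs no extended syntax:

```shell
# Build a literal tab character without having to type one.
tab=$(printf '\t')

# The bracket expression contains exactly two characters: a space and a tab.
# Double quotes are needed so that $tab is expanded inside the pattern.
matches=$(printf 'this line has whitespace\nthislinedoesnthave\nhas\ttab\n' | grep "[ $tab]")
echo "$matches"
```

The first and third sample lines should be printed, since one contains spaces and the other a tab.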

The advantage of using degraded REs is that they should work on all platforms (at least for ASCII; Unicode or non-English languages may have different rules but I rarely find a need).

paxdiablo