Why can't I match the string

"1234567-1234567890"

with the given regular expression

\d{7}-\d{10}

with egrep from the shell like this:

egrep \d{7}-\d{10} file

?

+5  A: 

egrep doesn't recognize the \d shorthand for the digit character class, so you need to use a bracket expression such as [0-9] instead.

Moreover, while it's not absolutely necessary in this case, it's a good habit to quote the regex to prevent misinterpretation by the shell. Thus, something like this should work:

egrep '[0-9]{7}-[0-9]{10}' file
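As a quick check of the command above, here is a minimal sketch. The sample file and its contents are made up for illustration, and `grep -E` is used as the equivalent spelling of `egrep`. The `[[:digit:]]` variant is the POSIX character class, shown only as an alternative way to write the same thing.

```shell
# Build a small sample file (hypothetical contents, for illustration only).
sample=$(mktemp)
printf '%s\n' '1234567-1234567890' 'abc-def' '123-456' > "$sample"

# POSIX ERE: a bracket expression instead of \d, with the pattern quoted.
matches=$(grep -E '[0-9]{7}-[0-9]{10}' "$sample")
printf '%s\n' "$matches"

# The POSIX character class is an alternative spelling of the same idea.
matches_class=$(grep -E '[[:digit:]]{7}-[[:digit:]]{10}' "$sample")
printf '%s\n' "$matches_class"

rm -f "$sample"
```

Only the first line of the sample file has seven digits, a hyphen, and ten digits, so it should be the only line printed by either command.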


polygenelubricants
Actually he only needs to quote the regex if it contains shell meta-characters. And now that it no longer contains backslashes, it doesn't, so quoting is optional.
sepp2k
Tried; doesn't work
@sepp2k: do you need quotes for a space? I think you do. I guess you can argue that a space is a shell metacharacter. Anyway, I think it's best to always quote, à la it's best to always use curly braces.
polygenelubricants
Then how would it be with grep instead? I'm interested in the \d prefix.
@persistent: according to comparison chart I linked, neither POSIX ERE (egrep) nor POSIX BRE (grep) knows `\d`, `\s`, `\w`, `\b`, etc. Also `\d` is not a prefix; it's a shorthand for the digit character class supported by many but not all flavors.
polygenelubricants
Well, that's odd; then where are they specified, if not inside (e)grep?
It's not a prefix; inside the syntax it must be specified as a prefix before the brackets; thanks for the correction anyway ;)
@persistent: different flavors of regex do things differently; that's why it's important to mention which flavor you're using when asking regex questions. I'd guess that Perl popularized the `\d` shorthand, and everyone else followed later.
polygenelubricants
@polygenelubricants: Yes, you need quotes with spaces (or put a backslash before every space). And sure, it doesn't hurt to always quote.
sepp2k
So I should skip \d
@persistent: you can't use `\d` with grep/egrep; you can use its expanded form `[0-9]`, which is practically the same thing but slightly longer. In some flavors that support Unicode, `\d` is not the same as `[0-9]` because it also includes some other Unicode digit characters.
polygenelubricants
Well mate; I already knew for the [block]{} form ; I was interested in \d; thx
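To make the quoting discussion in these comments concrete, here is a small sketch of what the shell passes on when the original pattern is left unquoted. The variable names are only for illustration; `printf '%s\n'` is used so that the argument is printed exactly as the shell delivered it.

```shell
# Unquoted: the shell removes each backslash before the command sees the word.
unquoted=$(printf '%s\n' \d{7}-\d{10})

# Single-quoted: the pattern reaches the command unchanged.
quoted=$(printf '%s\n' '\d{7}-\d{10}')

printf '%s\n' "$unquoted"   # prints d{7}-d{10}
printf '%s\n' "$quoted"     # prints \d{7}-\d{10}
```

So with the unquoted form, egrep would never even receive `\d`; it would be asked to match seven literal `d` characters, a hyphen, and ten more.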
+4  A: 

Use [0-9] instead of \d. egrep doesn't know \d.

sepp2k
+1; incorporating this into my answer.
polygenelubricants
How would it be with grep?
@persistent: The same.
sepp2k
@sepp2k: According to http://www.regular-expressions.info/gnu.html, repetition in `grep` is `\{7\}`.
polygenelubricants
@polygenelubricants: Sure, I thought he was asking about \d.
sepp2k
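The difference raised in these comments, between plain grep (BRE) and egrep (ERE), can be sketched as follows. The test line is the string from the question and the variable names are illustrative. In BRE the interval braces are backslash-escaped, in ERE they are not, and `\d` is avoided in both.

```shell
line='1234567-1234567890'

# BRE (plain grep): the interval is written \{n\}.
bre=$(printf '%s\n' "$line" | grep '[0-9]\{7\}-[0-9]\{10\}')

# ERE (grep -E, i.e. egrep): the interval is written {n}.
ere=$(printf '%s\n' "$line" | grep -E '[0-9]{7}-[0-9]{10}')

printf '%s\n' "$bre"
printf '%s\n' "$ere"
```

Both commands should print the input line, since the two patterns describe the same match in different syntaxes.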
+1  A: 

try this one:

egrep '(\d{7}-\d{10})' file
Nikhil Jain
Unfortunately nope :(
Traditional egrep did not support the { metacharacter, and some egrep implementations support \{ instead, so portable scripts should avoid { in egrep patterns and should use [{] to match a literal {.
Nikhil Jain
@Nikhil: However, neither traditional egrep nor GNU egrep supports \d, and that's why this does not work, not because of the {. Though it'd be useful to keep the { thing in mind if you ever have to be compatible with traditional egrep.
sepp2k