So I'm looking for a pattern like this:

size='0x0'

in a log file, but I'm only interested in large sizes (4 digits or more). The following regex works great in EditPadPro (nice tool, BTW):

size='0x[0-9a-fA-F]{4,}

But the same regex does not work in awk; it seems the repetition {4,} is messing it up. Same with WinGrep. Any ideas from the regex gurus? Thanks!

+4  A: 

Hi Jeff, I don't know of any elegant alternatives to the {4,} syntax, but if it is not working in your desired environment you could resort to this ugly hack:

size='0x[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]+
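As a quick sanity check, the expanded pattern can be tried with awk on a small sample log. This is only a sketch: the log lines and the variable names (log, matches) are made up for illustration, and the closing apostrophe is included so the whole value has to match.

```shell
# Build a small, made-up sample log (hypothetical contents, for illustration only).
log=$(mktemp)
printf '%s\n' \
  "alloc size='0x0'" \
  "alloc size='0x1F4'" \
  "alloc size='0x1000'" \
  "alloc size='0xABCDE'" > "$log"

# Four explicit hex-digit classes, the last one repeated with +, stand in for {4,}.
matches=$(awk "/size='0x[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]+'/" "$log")
printf '%s\n' "$matches"
rm -f "$log"
```

Only the lines whose size has four or more hex digits should come back, and since no interval expression is involved this should behave the same in any awk.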

Hope this helps!

Adam

Adam Alexander
nice - ugly hack worked great, and that's what counts, right? The nice thing for me was learning a new tool and not writing another C app!
Jeff
A: 

Don't forget the last apostrophe.

'
Keng
A: 

In awk, don't you need to escape the apostrophe in your regex? Try without it to see if that is the case.

Keng
+4  A: 

You can in fact use awk, with a caveat.

As mentioned on the following page, gawk needs a special command-line option (--re-interval) to make this work, since interval expressions (the {4,}) were not traditionally part of awk and older gawk versions leave them disabled by default:

http://kansai.anesth.or.jp/gijutu/awk/gawk/gawk_28.html

So in the end, you'll want something that looks like:

awk --re-interval "/size='0x[0-9a-fA-F]{4,}'/" thefile

This will print out the lines that match.
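If you are not sure which awk is installed, one possible approach is to probe for interval support first and fall back to the expanded pattern otherwise. This is a rough sketch rather than anything taken from the gawk documentation; the variable names (pattern, result) and the sample input are invented for illustration. An older gawk run without --re-interval would simply take the fallback branch.

```shell
# Probe: does this awk treat {4,} as an interval expression?
# If it takes the braces literally, "aaaa" will not match and awk exits non-zero.
if printf 'aaaa\n' | awk '/^a{4,}$/ { found = 1 } END { exit !found }'; then
  pattern="size='0x[0-9a-fA-F]{4,}'"
else
  # Fallback that avoids intervals entirely.
  pattern="size='0x[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]+'"
fi

# Made-up sample input: only the second line has four or more hex digits.
result=$(printf "size='0x0'\nsize='0xBEEF'\n" | awk "/$pattern/")
printf '%s\n' "$result"
```

Either branch should select the same lines, so the rest of a script does not need to care which awk it ended up with.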

Dan Fego
That's not 'awk'; it is GNU 'gawk', which is not the only version. Having said that, on Windows, the 'awk' is most likely from GNU, especially as it was the accepted answer, but that was not automatic (MKS has a version of awk, I believe).
Jonathan Leffler
Considering the ubiquity of the GNU utilities, I thought it was at least a good place to start. And since it worked, it looks like my assumption was right. ;-)
Dan Fego