I have lines like this:

NF419andZNF773 (e=10^-92,). 
ZNF571 (e=2 10^-14,)

What regex would extract the fields from the lines above, so that it gives:

NF419andZNF773 - 10^-92
ZNF571 - 2 10^-14

I tried this, but it fails:

$line =~ /(\w+)\s\(e=\s(.*),\)/;
print "$1 - $2\n";
+3  A: 

You're close. Your regex fails because the `\s` after `e=` requires a whitespace character there, and your input has none. Try this:

$line =~ / (\w+) \s+ \( e= ([^,]+) /x;
Eric Strom
@ES: Doesn't seem to work.
neversaint
@neversaint => I forgot the `x` modifier, which is needed because I added spaces to the regex to make it more readable. Try it now.
Eric Strom
Thanks a million Eric.
neversaint
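
For anyone who wants to try the match above end to end, here is a minimal sketch. It assumes the two sample lines from the question; the helper name `extract_evalue` is made up for illustration and is not part of the original answer.

```perl
use strict;
use warnings;

# Sample lines taken from the question.
my @lines = (
    'NF419andZNF773 (e=10^-92,). ',
    'ZNF571 (e=2 10^-14,)',
);

# Hypothetical helper: returns "name - value", or undef when the line does not match.
sub extract_evalue {
    my ($line) = @_;
    # The /x modifier makes the literal spaces in the pattern insignificant.
    if ( $line =~ / (\w+) \s+ \( e= ([^,]+) /x ) {
        return "$1 - $2";
    }
    return;
}

for my $line (@lines) {
    my $result = extract_evalue($line);
    print "$result\n" if defined $result;
}
```

Checking the match before using `$1` and `$2` matters here: after a failed match those variables still hold values from the previous successful match.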
A: 

Actually, you could do this all in a single substitution. Try:

$line =~ s/\(\s*e\s*=\s*([^,]+),\)/- $1/;

The regex matches the (e=num^exponent,) portion of your string. While doing so, it captures num^exponent in $1, and then replaces the entire match with a dash followed by $1.
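
As a rough sketch of that substitution in use (the helper name `rewrite_evalue` is hypothetical, and the space after the dash is my assumption to match the output format asked for in the question):

```perl
use strict;
use warnings;

# Hypothetical helper: rewrites "(e=VALUE,)" to "- VALUE" and returns the new string.
sub rewrite_evalue {
    my ($line) = @_;
    # Match only the parenthesised part; $1 holds the text between "e=" and the comma.
    $line =~ s/\(\s*e\s*=\s*([^,]+),\)/- $1/;
    return $line;
}

print rewrite_evalue('ZNF571 (e=2 10^-14,)'), "\n";    # prints "ZNF571 - 2 10^-14"
```

Note that text outside the match is left alone, so the first sample line from the question keeps its trailing ". " after the value.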

Jasmeet
Your regex is not returning the desired results. This will do it: $line =~ s/^(\w+) .+? = ([^,]+) , .+/$1 - $2/x
Greg
Oh, I overlooked the desired '-' insertion. To do that, simply add '-' before $1 (and as many spaces as desired for formatting). I'll edit my answer to reflect that. Btw, your method of capturing text that does not need to be modified is less efficient. It is sufficient to focus on the part you need to replace. Thanks for the correction though :-).
Jasmeet
"...method of capturing text..." If you mean anchoring with "^", you can gain a lot of speed by anchoring at either the start or the end of the pattern. I ran benchmarks on various ways of searching, including index(), and anchoring helped a lot.
Greg
No, I meant just focus on the part of the string you want to modify. This is common advice in several regex-related texts. In this case, your regex also tries to match the text preceding (e=...), while for the substitution it is sufficient to focus on the (e=...) part alone: capture the required text and replace the match with the text we want. Please try the regex given; it does work. Having said that, I do agree with you that anchors like ^ and $ can improve performance.
Jasmeet
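
If anyone wants to measure the two approaches discussed in these comments rather than argue about them, a minimal sketch using the core Benchmark module could look like this. The sub names `anchored` and `focused` and the single test string are my own assumptions, and timings will vary by machine and Perl version, so no results are claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $input = 'ZNF571 (e=2 10^-14,)';

# Variant 1: anchored pattern that captures both the name and the value.
sub anchored {
    my $line = $input;
    $line =~ s/^(\w+) .+? = ([^,]+) , .+/$1 - $2/x;
    return $line;
}

# Variant 2: pattern that only touches the "(e=...,)" part.
sub focused {
    my $line = $input;
    $line =~ s/\(\s*e\s*=\s*([^,]+),\)/- $1/;
    return $line;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, { anchored => \&anchored, focused => \&focused } );
```

Both variants produce the same string for this input, so the comparison is only about speed. A realistic test would use a larger and more varied set of lines.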