The regex string is as follows:

px\">(.+)</SPAN

When I use this expression in C# and match it against the body of an HTML document, I get back a short string:

Match match = Regex.Match( fullText, regExString, RegexOptions.IgnoreCase );

... gets

px">Cart is empty </span><a href="http://www.somesite.co.uk/shop/cart.aspx"><span style="font-size:10px">(Refresh)</span

When I use the same expression in JavaScript on the same string, I get pretty much the whole HTML document back.

var re = new RegExp(regExInnerString, "i");
var m = re.exec(fullText);

... gets

THE ENTIRE HTML DOC!

Now, I know that the expression is not very specific, and I am expecting several matches back. But I don't understand why C# and JavaScript are returning such different strings.

Can anyone help me to control the output of the expression results so they are more consistent?

Thanks

A: 

The regex syntax differs between those two languages. Unfortunately, you have to use two different expressions.

Bruno Brant
A: 

Your .+ is being greedy. Try using .+? instead, which makes it lazy so that it grabs the least amount possible. This way it will stop at the first </span and not the last one.

Seattle Leonard