I have been trying to use a simple jQuery operation to dynamically match and store all anchor tags and their text on the page, but I have run into some odd behavior. When I use match() or exec() and supply the needle as a separate RegExp object or a pattern variable, the query matches only one instance out of dozens in the haystack.

And if you designate the pattern like this

match(/needle/gi)

then it matches every instance of the needle.

Here is my code.

You can even fire up Firebug and try this code right here on this page.

var a = {'text':'','parent':[]}; 

$("a").each(function(i,n) {

    var module = $.trim($(n).text());
    a.text += module.toLowerCase() + ',' + i + ','; 

    a.parent.push($(n).parent().parent()); 

});

var stringLowerCase = 'b';

var regex = new RegExp(stringLowerCase, "gi");
//console.log(a.text);
console.log("regex 1: ", regex.exec(a.text));

var regex2 = "/" + stringLowerCase + "/";
console.log("regex 2: ", a.text.match(regex2));

console.log("regex 3: ", a.text.match(/b/gi));

For me it is returning:

regex 1:  ["b"]
regex 2: null
regex 3: ["b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b"]

Can anyone explain the root of this behavior?

EDIT: I forgot to mention that for regex1, it doesn't make any difference whether you add the flags "gi" for global and case insensitive matching. It still returns only one match.

EDIT2: Solved my own problem. I still don't know why regex1 matches only one instance with exec(), but I managed to match all instances by passing regex1 to match().

So this matches every instance, and dynamically!

var regex = new RegExp(stringLowerCase, "gi");
console.log("regex 2: ", a.text.match(regex));
A: 

regex2 is a string, not a RegExp. I had trouble with this kind of syntax too, though I'm not really sure of the exact behavior.

Edit: Remembered: for regex2, JS looks for "/b/" as the needle, not "b".
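
A minimal sketch of that difference, using a made-up haystack (the `text` variable is hypothetical and only stands in for a.text from the question). When match() is given a string, it converts it as if by new RegExp(string), so the slashes become literal characters to find and no flags are applied:

```javascript
// Hypothetical haystack, standing in for a.text from the question.
var text = "bob,0,abby,1,/b/,2";

// String with slashes: converted via new RegExp("/b/"),
// so the slashes are literal characters that must appear in the haystack.
var withSlashes = text.match("/b/");

// The same string pattern against a haystack without "/b/" finds nothing.
var noSlashesInHaystack = "bob,0,abby,1".match("/b/");

// String without slashes: converted via new RegExp("b"), no flags,
// so only the first match is returned.
var plainString = text.match("b");

// Regex literal with flags: every match, case-insensitively.
var literal = text.match(/b/gi);

console.log(withSlashes[0]);        // "/b/"
console.log(noSlashesInHaystack);   // null
console.log(plainString[0]);        // "b"
console.log(literal.length);        // 5
```

This is why regex 2 in the question returns null: the anchor texts on the page presumably never contain the three characters "/b/".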

Clement Herreman
But what about regex1? That should have worked. How did you end up solving your trouble?
picardo
+4  A: 

This is not unusual behaviour at all. In regex 1 you call exec(), which returns only one match per call, whereas in regex 3 you call match() with the g flag, which tells it to return all instances of the item.

In regex 2 you are assuming that "/b/" === /b/, which is not the case: "/b/" !== /b/. "/b/" is a string, so it only matches if your haystack contains the literal characters "/b/". /b/ is a regex literal, meaning the pattern is whatever sits between the slashes, so you could have "abc" and it will return "b".

I hope that helps.

EDIT:

Looking into it a little more, the exec method returns only the first match it finds rather than all the matches.

EDIT:

var myRe = /ab*/g;
var str = "abbcdefabh";
var myArray;
while ((myArray = myRe.exec(str)) != null)
{
  var msg = "Found " + myArray[0] + ".  ";
  msg += "Next match starts at " + myRe.lastIndex;
  console.log(msg);
}

Having another look, it definitely returns only the first instance it finds on each call. If you loop over it, as in the code above, it returns more.

Why does it do this? I have no idea... my JavaScript kung fu clearly isn't strong enough to answer that part.
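
For what it's worth, the loop works because a RegExp with the g flag remembers where its last match ended in its lastIndex property, and each exec() call resumes from there, returning null (and resetting lastIndex to 0) once nothing is left. match() with a g-flagged RegExp just does that walk for you. A rough sketch, where `execAll` and `sample` are made-up names for illustration and not part of any library:

```javascript
// Hypothetical helper: collect every match by calling exec() repeatedly.
// Only meaningful for a RegExp created with the g flag.
function execAll(regex, text) {
  if (!regex.global) {
    throw new Error("execAll needs a RegExp with the g flag");
  }
  var found = [];
  var m;
  regex.lastIndex = 0; // start from the beginning of the text
  while ((m = regex.exec(text)) !== null) {
    found.push(m[0]);
    // Guard against zero-length matches, which would otherwise loop forever.
    if (m[0].length === 0) {
      regex.lastIndex++;
    }
  }
  return found;
}

var sample = "Bob,0,abby,1"; // made-up stand-in for a.text
var regex = new RegExp("b", "gi");

var first = regex.exec(sample);    // one match per call, even with "g"
var resumeAt = regex.lastIndex;    // where the next exec() call would resume
var all = execAll(regex, sample);  // every match, collected in a loop

console.log(first[0], resumeAt);   // "B" 1
console.log(all);                  // ["B", "b", "b", "b"]
console.log(sample.match(regex));  // ["B", "b", "b", "b"]
```

So the g flag does take effect in regex 1; it changes where exec() starts searching, not how many results a single call returns.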

AutomatedTester
I should have made it clear: for regex1, it doesn't make a difference whether I add "gi" to the definition of the RegExp object. Try it yourself.
picardo
Do you know why it does that? I did add the g flag in the RegExp definition, so it should do a global match, right?
picardo
+2  A: 

The reason regex 2 is returning null is that you're passing "/b/" as the pattern parameter, while "b" is the only thing that is actually part of the pattern. The slashes are literal syntax for a regex, just as [ ] is for an array. So if you were to replace that with just new RegExp("b"), you'd get a match, but only one, since you're omitting the global and ignore-case flags in that example. To get the same results for #2 and #3, modify accordingly:

// match() takes only a pattern, so the flags go on the RegExp itself
var regex2 = new RegExp(stringLowerCase, "gi");
console.log("regex 2: ", a.text.match(regex2));
console.log("regex 3: ", a.text.match(/b/gi));
David Hedlund