views:

223

answers:

5

Why can't I output my regex to a variable, and then run regex on it a second time?

I'm writing a Greasemonkey script in JavaScript that grabs some raw data, runs a regex on it, then runs another regex on the result to refine it:

// I tried this on :: http://stackoverflow.com/
var tagsraw = (document.getElementById("subheader").innerHTML);
alert(tagsraw);

Getting the raw data (above code) works.

var trimone = tagsraw.match(/title\W\W\w+\s\w+\s\w+\s\w+\s\w+/g);
alert(trimone);

Running the regex once (above code) works, but running a second regex on the result (code below) doesn't:

var trimtwo = trimone.match(/\s\w+\s\w+\s\w+\s\w+/g);
alert(trimtwo);

Can someone advise me as to what is wrong with my code/approach?

A: 

.match should be returning an array, not a string.
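
A quick way to see this for yourself is sketched below. It is only an illustration: the `tagsraw` string is a made-up stand-in for the page's innerHTML, and `console.log` is used in place of `alert` so it can run outside a browser.

```javascript
// Hypothetical sample standing in for the page's innerHTML.
var tagsraw = '<a title="show questions tagged with javascript">js</a>';

var trimone = tagsraw.match(/title\W\W\w+\s\w+\s\w+\s\w+\s\w+/g);

console.log(Array.isArray(trimone)); // true: match with the g flag gives an array
console.log(typeof trimone.match);   // "undefined": arrays have no match method
console.log(typeof trimone[0]);      // "string": each element is a plain string

// Running the second regex on an element of the array works.
var trimtwo = trimone[0].match(/\s\w+\s\w+\s\w+\s\w+/g);
console.log(trimtwo[0]);
```

So `alert(trimone)` appears to work only because the array is converted to text for display; calling `trimone.match(...)` fails because the array itself has no such method.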

lod3n
So you are saying alert(trimone); will display OK, but when I want to run some more regex on it, I can't because it is an array?
A: 

The problem is that match() returns an array, and arrays have no built-in method for running a regular expression.

Instead, you can use the exec function of the RegExp object. It returns an array whose first element is the matched string, or null if nothing matched. You can take the matched string from the first regex and run the second regex on it.

So it'd be something like this:

var patt1 = new RegExp(/title\W\W\w+\s\w+\s\w+\s\w+\s\w+/g);
var trimone = patt1.exec(tagsraw);

if (trimone != null) // might be null if no match is found
{
  alert(trimone[0]); // element 0 holds the matched string

  var patt2 = new RegExp(/\s\w+\s\w+\s\w+\s\w+/g);
  var trimtwo = patt2.exec(trimone[0]);
  alert(trimtwo);
}

Note that exec returns null if no match is found so be sure to handle that in your code like I do above.
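
As a rough, self-contained sketch of the same idea, the snippet below runs both patterns on a made-up `tagsraw` string (an assumption, not the real page content) and checks for null at each step. The g flag is left off here on purpose, since a global regex remembers its `lastIndex` between exec calls, which is easy to trip over when a pattern is reused.

```javascript
// Hypothetical sample standing in for the page's innerHTML.
var tagsraw = '<a title="show questions tagged with javascript">js</a>';

var patt1 = /title\W\W\w+\s\w+\s\w+\s\w+\s\w+/;
var first = patt1.exec(tagsraw);

var result = null;
if (first !== null) {
  // first[0] is the matched text; first.index is where the match starts.
  var second = /\s\w+\s\w+\s\w+\s\w+/.exec(first[0]);
  if (second !== null) {
    result = second[0];
  }
}
console.log(result);

// exec returns null when the pattern does not occur in the input.
var miss = patt1.exec("no attribute here");
console.log(miss);
```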

Steve Wortham
I can't get this to work on stackoverflow.com:

// try this on :: http://stackoverflow.com/
var tagsraw = (document.getElementById("subheader").innerHTML);
//alert(tagsraw);
var patt1 = new Regexp(/title\W\W\w+\s\w+\s\w+\s\w+\s\w+/g);
var trimone = patt1.exec(tagsraw);
if (trimone != null) // might be null if no match is found
{
// alert(trimone);
 var patt2 = new Regexp(/\s\w+\s\w+\s\w+\s\w+/g);
 var trimtwo = patt2.exec(trimone);
 alert(trimtwo);
A: 

Your case is better suited to using .exec. You could even chain the two if you don't care about the intermediate result:

/\s\w+\s\w+\s\w+\s\w+/g.exec(/title\W\W\w+\s\w+\s\w+\s\w+\s\w+/g.exec(tagsraw));
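
The chained form relies on exec converting its argument to a string. The sketch below, using a made-up `tagsraw` value and `console.log` in place of `alert`, shows what that conversion does, including the less obvious case where the inner match fails.

```javascript
// Hypothetical sample standing in for the page's innerHTML.
var tagsraw = '<a title="show questions tagged with javascript">js</a>';

var inner = /title\W\W\w+\s\w+\s\w+\s\w+\s\w+/.exec(tagsraw);

// exec converts its argument to a string; an array holding a single
// match (no capture groups) converts to that match.
console.log(String(inner) === inner[0]); // true

var chained = /\s\w+\s\w+\s\w+\s\w+/.exec(inner);
console.log(chained[0]);

// If the inner exec finds nothing it returns null, and the outer exec
// then searches the literal text "null" instead of failing loudly.
var failed = /\w+/.exec(/title/.exec("nothing to see"));
console.log(failed[0]); // "null"
```

If the intermediate match might be missing, checking it for null before the second exec is probably the safer choice.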
Brandon Belvin
+2  A: 

The first match works because innerHTML returns a string.

However, match returns an array, so treat it as one and run the second regex on each element:

for (var i=0; i<trimone.length; i++)
{
    var trimtwo = trimone[i].match(/\s\w+\s\w+\s\w+\s\w+/g);
    alert(trimtwo);
}
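
One thing to watch for: match returns null rather than an empty array when nothing matches, so `trimone.length` would throw in that case. A guarded variant might look like the sketch below; the two-link `tagsraw` string and the `refined` array are made up for illustration.

```javascript
// Hypothetical sample with two title attributes.
var tagsraw =
  '<a title="show questions tagged with javascript">js</a>' +
  '<a title="show questions tagged with regex">regex</a>';

// Fall back to an empty array so the loop is skipped when nothing matches.
var trimone = tagsraw.match(/title\W\W\w+\s\w+\s\w+\s\w+\s\w+/g) || [];

var refined = [];
for (var i = 0; i < trimone.length; i++) {
  var trimtwo = trimone[i].match(/\s\w+\s\w+\s\w+\s\w+/);
  if (trimtwo !== null) {
    refined.push(trimtwo[0]); // keep only the refined text
  }
}
console.log(refined);
```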

Edit:

Try this code instead; I think it is a bit closer to what you want to achieve:

var trimone = tagsraw.match(/title\s*=\s*".*"/g);
alert(trimone);

for (var i=0; i<trimone.length; i++)
{
    alert(trimone[i]);
}
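
A caveat on the pattern above: `.*` is greedy, so if another quoted attribute follows on the same line, the match can run on to the last quote. The sketch below compares it with a bounded `[^"]*` alternative on a made-up `tagsraw` string; whether this matters depends on the real markup, which I haven't checked.

```javascript
// Hypothetical sample with a second quoted attribute on the same line.
var tagsraw = '<a title="show questions tagged with javascript" href="/tags">js</a>';

// Greedy: .* runs to the last double quote on the line.
var greedy = tagsraw.match(/title\s*=\s*".*"/g);
console.log(greedy[0]);

// Bounded: [^"]* stops at the first closing quote.
var bounded = tagsraw.match(/title\s*=\s*"[^"]*"/g);
console.log(bounded[0]);
```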
Yannick M.
I don't believe this is the desired result. What he really needs is the matched string, and that's exactly what the exec function of the RegExp object is for.
Steve Wortham
OK, this works, and returns an alert for each refinement. Thanks, Yannick.
Sure thing, check out the edit. Also, please accept someone's answer if it helped you, so this topic does not remain unanswered.
Yannick M.
Thank you, Yannick. Your help is very much appreciated.
+1  A: 

You could do something like this:

var str = "<title> foo bar baz quux blah</title>",
    re = [
        /title\W\W\w+\s\w+\s\w+\s\w+\s\w+/g,
        /\s\w+\s\w+\s\w+\s\w+/g
    ],
    tmp = [str];
for (var i=0, n=re.length; i<n; ++i) {
    tmp = tmp.map(function(val) {
        return val.match(re[i])[0];
    });
}
alert(tmp);
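
To relate this to the question: `str` plays the role of `tagsraw`, `re[0]` is the first pattern and `re[1]` is the refining pattern. Each pass of the loop replaces every string in `tmp` with what the current pattern matched in it. The sketch below is a traced variant of the same loop; the `steps` array and the null guard are additions for illustration, and the g flags are dropped since only the first match is used.

```javascript
var str = "<title> foo bar baz quux blah</title>";
var re = [
  /title\W\W\w+\s\w+\s\w+\s\w+\s\w+/, // first pass, like trimone
  /\s\w+\s\w+\s\w+\s\w+/              // second pass, like trimtwo
];

var tmp = [str];
var steps = [tmp.slice()]; // hypothetical trace of tmp after each pass

for (var i = 0; i < re.length; i++) {
  tmp = tmp.map(function (val) {
    var m = val.match(re[i]);
    return m === null ? "" : m[0]; // empty string when a pass finds nothing
  });
  steps.push(tmp.slice());
}

console.log(steps);
// steps[0]: the original string
// steps[1]: ["title> foo bar baz quux blah"]
// steps[2]: [" foo bar baz quux"]
```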
Gumbo
Thanks, Gumbo. I don't understand how this works in relation to my example.