I need to find the most efficient way of matching multiple regular expressions on a single block of text. To give an example of what I need, consider a block of text:

"Hello World what a beautiful day"

I want to replace "Hello" with "Bye" and "World" with "Universe". I can always do this in a loop, of course, using something like the String.replace functions available in various languages.

However, I could have a huge block of text with multiple string patterns that I need to match and replace.

I was wondering whether I can use regular expressions to do this efficiently, or whether I have to use a parser such as an LALR parser.

I need to do this in JavaScript, so if anyone knows of tools that can get it done, it would be appreciated.

+2  A: 

You can pass a function to replace:

var hello = "Hello World what a beautiful day";
// $0 is the full match; captures, if any, follow as $1, $2, ... $n
var result = hello.replace(/Hello|World/g, function ($0)
{
    if ($0 == "Hello")
        return "Bye";
    else if ($0 == "World")
        return "Universe";
});

// result: "Bye Universe what a beautiful day"
Andy E
Just a note, Andy E: You need a `)` before your `;` on the last line :)
macek
@smotchkkiss: Yeah, I noticed that as I was typing the comment at the bottom and completely forgot about it by the time I finished! Thanks :-)
Andy E
Thanks, this is really helpful. However, are regex matches limited to $1..$9, or can we also have $10, $11, etc.?
VikrantY
@VikrantY: I probably confused you by using the same naming convention for the arguments as the properties you find on the `RegExp` object. There is no limit (as far as I am aware) on the number of arguments passed to the replace callback. Also, if your regexp contains no capture groups, then the only match argument you're working with is `$0`; it is followed by the match offset and the full input string.
Andy E
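To make the callback arguments concrete, here is a minimal sketch that records what the callback receives for each match. The parameter names (`match`, `p1`, `p2`, `offset`, `input`) and the `seen` array are only illustrative; what matters is the order: full match first, then one argument per capture group, then the offset and the input string.

```javascript
// Record the arguments passed to the replace callback for every match.
var seen = [];
"Hello World".replace(/(H)ello|(W)orld/g, function (match, p1, p2, offset, input) {
  // p1 and p2 correspond to the two capture groups; a group that did not
  // take part in the match is passed as undefined.
  seen.push({ match: match, p1: p1, p2: p2, offset: offset, input: input });
  return match; // leave the text unchanged
});
// seen[0] describes "Hello" (p1 is "H", offset 0)
// seen[1] describes "World" (p2 is "W", offset 6)
```

A pattern with more than nine groups simply gets more capture arguments in the same order.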
+2  A: 

Andy E's answer can be modified to make adding replacement definitions easier.

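var text = "Hello World what a beautiful day";
// Lookup table: each key is a matched string, each value its replacement
var index = {
  'Hello': 'Bye',
  'World': 'Universe'
};
text.replace(/Hello|World/g, function ($0) {
  // Fall back to the original match if it has no entry in the table
  return index[$0] != undefined ? index[$0] : $0;
});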

Output:

"Bye Universe what a beautiful day"
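If the table of replacements is likely to grow, one option is to build the pattern from the table's keys so the two cannot drift apart. This is only a sketch under a few assumptions: the keys are literal strings (not patterns), matching is case-sensitive, and the helper names `escapeRegExp` and `replaceMany` are made up for this example.

```javascript
// Hypothetical helper: escape characters that are special in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: apply every key -> value replacement in a single pass.
function replaceMany(text, table) {
  var keys = Object.keys(table);
  if (keys.length === 0) return text; // an empty alternation would match everywhere
  // Try longer keys first so that a key such as "Hello World" wins over "Hello".
  keys.sort(function (a, b) { return b.length - a.length; });
  var pattern = new RegExp(keys.map(escapeRegExp).join("|"), "g");
  return text.replace(pattern, function (match) {
    return table[match]; // every match is one of the keys by construction
  });
}

var result = replaceMany("Hello World what a beautiful day", {
  "Hello": "Bye",
  "World": "Universe"
});
// result is "Bye Universe what a beautiful day"
```

Because the text is scanned once, a replacement value is never itself replaced again, which can happen when calling replace repeatedly in a loop.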
macek
Thanks Andy/smotchkiss, you have both completely sorted my problem and saved me from having to write my own algorithm for multiple replacement.
VikrantY