Hello, I have a sentence structure along the lines of

[word1]{word2} is going to the [word3]{word4}

I'm trying to use a JavaScript regex to match the words for replacement later. To do this, I'm working towards getting the following multi-dimensional array:

[["word1", "word2"],["word3","word4"]]

I'm currently using this regex for the job:

\[(.*?)\]\{(.*?)\}

However, it comes up with results like:

["[word1]{word2}", "word1", "word2"]

or worse. I don't really understand why, because this regex seems to work just fine in Ruby, and I'm not enough of a regex expert to understand what's going on. I'm curious whether there are any JavaScript regex experts out there to whom the answer is clear and who can explain what's happening here. I appreciate any help!

Edit:

This is the code I'm using just to test the matching:

function convertText(stringText) {
    var regex = /\[(.*?)\]\{(.*?)\}/;
    console.log(stringText.match(regex));
}
A: 

What you're seeing is Japanese hiragana. Make sure your input is in English maybe?

Edited to say: Upon further review, it looks like a dictionary entry in Japanese. The 私 is kanji and the わたし is hiragana, a phonetic pronunciation of the kanji. FWIW, the word is "Watashi" which is one of the words for "I" (oneself) in Japanese.

Robusto
Sorry! I've now edited it to use the word1/word2, etc. format. It is Japanese that I am trying to match with a regex (in case that's important), but for the purposes of discussion and readability for everyone, I switched it to match the above.
japancheese
LOL. I thought you were confused because you were getting results in Japanese.
Robusto
Haha, no, but doing work with Japanese is certainly confusing enough already :P
japancheese
+1  A: 

I assume you are using the exec method of the regular expression.

What you are doing is almost correct. exec returns an array where the first element is the entire match and the remaining elements are the groups. You want only the elements at indexes 1 and 2. Try something like this, but of course store the results into an array instead of using an alert:

var string = '[word1]{word2} is going to the [word3]{word4}';
var pattern = /\[(.*?)\]\{(.*?)\}/g;
var m;
while(m = pattern.exec(string)) {
    alert(m[1] + ',' + m[2]);
} 

This displays two alerts:

  • word1,word2
  • word3,word4
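
To store the results in an array instead of alerting them, the loop above could be adapted roughly as follows. This is a minimal sketch: the function name `extractPairs` is hypothetical and does not appear in the question.

```javascript
// Hypothetical helper: collects every [a]{b} pair into a nested array.
function extractPairs(text) {
    // The g flag makes exec resume from pattern.lastIndex on each call.
    var pattern = /\[(.*?)\]\{(.*?)\}/g;
    var pairs = [];
    var m;
    while ((m = pattern.exec(text)) !== null) {
        // m[0] is the whole match; m[1] and m[2] are the captured groups.
        pairs.push([m[1], m[2]]);
    }
    return pairs;
}

var pairs = extractPairs('[word1]{word2} is going to the [word3]{word4}');
console.log(JSON.stringify(pairs));
// [["word1","word2"],["word3","word4"]]
```

The regex is created inside the function so that each call starts matching from the beginning of the string.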
Mark Byers
Ah, that sort of does what I'm looking for, and I could certainly work with those results. I don't really understand the loop part though; every time I call exec on that string, does it try to find the next match? If so, is there some way for it to find all matches within one call instead of a loop? (Not a deal breaker for what I need to do, I'm just asking out of curiosity.)
japancheese
@japancheese: Yes, exec uses lastIndex to remember how far it got. It's not possible to do this in one call in JavaScript that I'm aware of.
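
The way exec advances can be observed by reading `lastIndex` between calls. The snippet below is a small sketch with illustrative variable names. Its last part assumes an ES2020-or-later engine, where `String.prototype.matchAll` is available and does return every match, with its groups, from a single call.

```javascript
var text = '[word1]{word2} is going to the [word3]{word4}';
var pattern = /\[(.*?)\]\{(.*?)\}/g;

var positions = [];
positions.push(pattern.lastIndex);   // 0 before any call
pattern.exec(text);
positions.push(pattern.lastIndex);   // just past "[word1]{word2}"
pattern.exec(text);
positions.push(pattern.lastIndex);   // just past "[word3]{word4}"
var last = pattern.exec(text);       // no further match, so this is null
positions.push(pattern.lastIndex);   // reset to 0 after the failed match

console.log(positions);              // [0, 14, 45, 0]

// Assumes ES2020+: matchAll needs the g flag and yields one match per item.
var allPairs = Array.from(
    text.matchAll(/\[(.*?)\]\{(.*?)\}/g),
    function (m) { return [m[1], m[2]]; }
);
console.log(JSON.stringify(allPairs));
// [["word1","word2"],["word3","word4"]]
```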
Mark Byers
Ah I see, well that's ok, I can still definitely work with this then. Thanks for the help Mark!
japancheese