In JavaScript, I want to match strings that begin with a certain phrase. However, I want to match the start of any word in the phrase, not just the beginning of the phrase.

For example:

Phrase: "This is the best"

Need to Match: "th"

Result: Matches Th and th

EDIT: \b works great; however, it raises another issue:

It will also match characters after foreign (non-ASCII) ones. For example, if my string is "Männ" and I search for "n", it will match the n after "Mä". Any ideas?

+1  A: 

Use this:

string.match(/^th|\sth/gi);

Examples:

'is this is a string'.match(/^th|\sth/gi);


'the string: This is a string'.match(/^th|\sth/gi);

Results:

[" th"]

["th", " Th"]

Michael Robinson
Since the OP mentions `any word`, it may not be safe to assume that a space is the word boundary. Your regex doesn't match anything in `Here-is-the-sentence!`. This is why `\b` is better as a word boundary.
Peter Ajtai
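To make the difference in the comment above concrete, here is a small sketch comparing the whitespace-based pattern from this answer with `\b` on the hyphenated sentence. The variable names are only for illustration.

```javascript
// Hyphen-separated words: there is no whitespace for \s to match.
var sentence = 'Here-is-the-sentence!';

// Whitespace-based pattern from the answer: finds nothing here.
var bySpace = sentence.match(/^th|\sth/gi);

// \b matches the position between "-" and "t", so the start of "the" is found.
var byBoundary = sentence.match(/\bth/gi);

console.log(bySpace);    // null
console.log(byBoundary); // ["th"]
```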
+1  A: 
var matches = "This is the best".match(/\bth/ig);

returns:

["Th", "th"]

The regular expression means: match "th", ignoring case and globally (meaning, don't stop at just one match), wherever "th" starts a word, that is, at the start of the string or right after a non-word character such as a space or a hyphen.

Vivin Paliath
Since the OP mentions `any word`, it may not be safe to assume that a space is the word boundary. Your regex doesn't match anything in `Here-is-the-sentence!`. This is why `\b` is better as a word boundary.
Peter Ajtai
@Peter Thanks! Didn't know about `\b`!
Vivin Paliath
@Vivin - Your example still only matches "Th" because of the beginning-of-line character `^`. A global search anchored to the beginning of the line still returns only one match on a single-line string ;) - http://jsfiddle.net/NHcLx/
Peter Ajtai
DOH! :p .......
Vivin Paliath
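A quick way to see the point about `^`: without the `m` (multiline) flag it only matches at the very start of the string, so adding `g` cannot produce more than one hit. The two-line sample string below is made up for illustration.

```javascript
var phrase = 'This is the best';

// ^ anchors to the start of the string, so g has nothing more to find.
var anchored = phrase.match(/^th/gi);

// With the m flag, ^ also matches right after each line break.
var perLine = 'this\nthat'.match(/^th/gm);

// \b matches at the start of every word instead.
var perWord = phrase.match(/\bth/gi);

console.log(anchored); // ["Th"]
console.log(perLine);  // ["th", "th"]
console.log(perWord);  // ["Th", "th"]
```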
A: 

Use the g flag in the regex. It stands for "global" and makes the search return all matches instead of only the first one.

You should also use the i flag for case-insensitive matching.

You add flags to the end of the regex (/<regex>/<flags>) or as a second parameter to new RegExp(pattern, flags)

For instance:

var matches = "This is the best".match(/\bth/gi);

or, using RegExp objects:

var re = new RegExp("\\bth", "gi");
// match() with a global regex returns every match; re.exec() returns only one match per call
var matches = "This is the best".match(re);

EDIT: Use \b in the regex to match the boundary of a word. Note that it does not match any specific character; it matches the position at the beginning or end of a word.
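If the search text comes from user input, the RegExp-object form can be wrapped in a small helper. This is only a sketch: `findWordStarts` and `escapeRegExp` are hypothetical names, not built-in functions, and the escaping step assumes the prefix should be matched literally.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: return every occurrence of `prefix` at the start of a word.
function findWordStarts(phrase, prefix) {
  var re = new RegExp('\\b' + escapeRegExp(prefix), 'gi');
  // match() returns null when nothing is found; normalise that to [].
  return phrase.match(re) || [];
}

console.log(findWordStarts('This is the best moth', 'th')); // ["Th", "th"]
console.log(findWordStarts('a.b ab', 'a.'));                // ["a."]
```

Note that the leading `\b` only behaves as a word start when the prefix itself begins with a letter, digit or underscore.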

Frxstrem
But this will also match inside words in the string, which I don't want.
abadaba
This will also match `moth` in the string.
Peter Ajtai
This will match all occurrences of 'th', whether they are at the start of a word or not.
Michael Robinson
abadaba: just use \b in the beginning of the regex. \b is for "boundary", which does not match any particular character, but it matches the beginning/end of a word and the string.
Frxstrem
This works great, except that it will also match characters after foreign ones. For example, if my string is "Männ" and I search for "n", it will match the n after "Mä". Any ideas?
abadaba
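The "Männ" case happens because `\b` in JavaScript is defined in terms of `\w`, which only covers ASCII letters, digits and underscore, so "ä" counts as a non-word character. One possible workaround, assuming an engine with ES2018 lookbehind and Unicode property escapes (the `u` flag), is to spell out the boundary yourself. This is a sketch under those assumptions, not a solution taken from the thread:

```javascript
var word = 'M\u00e4nn'; // "Männ"

// \b sees "ä" as a non-word character, so the first "n" looks like a word start.
var ascii = word.match(/\bn/gi);

// Require that the match is NOT preceded by a letter, combining mark, digit or underscore.
var wordStartN = /(?<![\p{L}\p{M}\p{N}_])n/giu;
var unicode = word.match(wordStartN);
var mixed = 'M\u00e4nn is nice'.match(wordStartN);

console.log(ascii);   // ["n"]
console.log(unicode); // null
console.log(mixed);   // ["n"]
```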
+4  A: 
Peter Ajtai
+1 thanks for introducing me to \b :)
Michael Robinson
@Michael - YW! This is a great reference for regex - http://www.regular-expressions.info/reference.html
Peter Ajtai
This works great, except that it will also match characters after foreign ones. For example, if my string is "Männ" and I search for "n", it will match the n after "Mä". Any ideas?
abadaba
@abadaba - Added the possibility of using `\s` to the answer.
Peter Ajtai
But then this will only match words that are preceded by a space.
abadaba
Also, if the word matches anything following a space, can you have it so the array of matches returns the word without the space?
abadaba
@abadaba - You use parentheses for that. Hang on, I'll edit it.
Peter Ajtai
Hmm, but let's say I have "The boat thought". If I match for "th", it will return ["Th"," th"] because it matches the space before "thought". How do I get it to return ["Th","th"] instead?
abadaba
@abadaba - yes, that's true if you do not use `\b`. `\b` has no width, so it doesn't have the problem. Otherwise you have to use parentheses to match only the part you want.
Peter Ajtai
Peter, your new example with: "This is the best moth".match(/\s(th[^\s]*)|^(th[^\s]*)/gi); still returns ["This", " the"]
abadaba
@abadaba - hmmm... not sure what's going on - I'm looking through http://www.javascriptkit.com/javatutors/redev2.shtml
Peter Ajtai
@abadaba - Pretty complicated if you want to ignore spaces and not use `\b` - http://stackoverflow.com/questions/3508327/how-do-you-use-non-captured-elements-in-a-javascript-regex/3508365#3508365
Peter Ajtai
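For the whitespace-based variant discussed above, one way to get the matches without the leading space is to put the wanted part in a capturing group and collect that group with an `exec` loop, since `match` with the `g` flag returns only whole matches. A minimal sketch, with `wordStartsBySpace` as a hypothetical helper name:

```javascript
// Hypothetical helper: collect capture group 1 from every match.
function wordStartsBySpace(phrase) {
  // (?:^|\s) is a non-capturing group: start of string or one whitespace character.
  var re = /(?:^|\s)(th)/gi;
  var found = [];
  var m;
  // With the g flag, each exec() call resumes from re.lastIndex.
  while ((m = re.exec(phrase)) !== null) {
    found.push(m[1]);
  }
  return found;
}

console.log(wordStartsBySpace('The boat thought'));      // ["Th", "th"]
console.log(wordStartsBySpace('This is the best moth')); // ["Th", "th"]
```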