I am trying to find words that start with a specific character, for example:

Lorem ipsum #text Second lorem ipsum. How #are You. It's ok. Done. Something #else now.

I need to get all words that start with "#", so my expected results are #text, #are, #else.

Any ideas?

+4  A: 

Search for:

  • a position that is not preceded by a word character, then
  • #
  • some word characters

So try this:

/(?<!\w)#\w+/

Or in C# it would look like this:

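using System;
using System.Text.RegularExpressions;

string s = "Lorem ipsum #text Second lorem ipsum. How #are You. It's ok. Done. Something #else now.";
// (?<!\w) is a negative lookbehind: the "#" must not be preceded by a word character.
foreach (Match match in Regex.Matches(s, @"(?<!\w)#\w+"))
{
    Console.WriteLine(match.Value);
}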

Output:

#text
#are
#else
Mark Byers
Ah, good catch... The word boundary won't work before the #, will it? But in JavaScript you can't do negative lookbehinds, can you?
Jeff B
@JeffB: You are right. This will work in C# though.
Mark Byers
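The word-boundary point raised in the comments above can be checked directly. The following is a minimal JavaScript sketch, with sample strings made up for illustration: \b only matches between a word character and a non-word character, and a space followed by # has a non-word character on both sides.

```javascript
// \b matches only between a word character and a non-word character.
// A space and "#" are both non-word characters, so there is no boundary between them.
const spaced = "Lorem ipsum #text".match(/\b#\w+/g); // null: no boundary before "#"
const glued = "foo#bar".match(/\b#\w+/g);            // matches, because "o" precedes "#"

console.log(spaced); // null
console.log(glued);  // the single match "#bar"
```

So \b#\w+ finds exactly the cases the question wants to exclude, which is why the answer uses a lookbehind instead.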
+2  A: 

Match a word starting with # after whitespace or at the beginning of a line. The trailing word boundary may not be necessary, depending on your usage.

/(?:^|\s)\#(\w+)\b/

The parentheses will capture your word in a group. Now, it depends on the language how you apply this regex.

The (?:...) is a non-capturing group.
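
As one possible way to apply this pattern, here is a small JavaScript sketch. The helper name findHashtags is hypothetical, and the g flag is added so that all matches are found. Because the whole match can include the leading whitespace, the word is read from capture group 1 and the # is added back.

```javascript
// Hypothetical helper: collects every "#word" that follows whitespace or starts the string.
function findHashtags(text) {
  const re = /(?:^|\s)\#(\w+)\b/g;
  const tags = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    // m[0] may include the leading whitespace; m[1] is the word without "#".
    tags.push("#" + m[1]);
  }
  return tags;
}

const found = findHashtags(
  "Lorem ipsum #text Second lorem ipsum. How #are You. It's ok. Done. Something #else now."
);
console.log(found.join(", ")); // #text, #are, #else
```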

Jeff B
+2  A: 

Try this: #(\S+)\s?

Petoj
Way simpler, good job!
Fábio Batista
This will return "#word " instead of "#word". The \s? isn't necessary.
zincorp
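The trailing-space behaviour described in the comment above can be seen in a short JavaScript sketch (the sample string is taken from the question; the variable names are only for illustration). The whole match of #(\S+)\s? includes the optional whitespace, while capture group 1 does not.

```javascript
const sample = "How #are You";

// Whole match includes the space consumed by \s?; group 1 holds only the word.
const withSpace = sample.match(/#(\S+)\s?/);
console.log(JSON.stringify(withSpace[0])); // "#are "
console.log(JSON.stringify(withSpace[1])); // "are"

// Dropping \s? leaves no trailing space in the whole match.
const withoutSpace = sample.match(/#\S+/);
console.log(JSON.stringify(withoutSpace[0])); // "#are"
```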
A: 
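var text = "Lorem ipsum #text Second lorem ipsum. How #are You. It's ok. Done. Something #else now";
var re = /#\S+/g; // "#" followed by one or more non-whitespace characters

var matches = [];
var match;
// With the g flag, each exec() call resumes after the previous match and returns null at the end.
while ((match = re.exec(text)) !== null)
{
    matches.push(match[0]);
}

alert(matches);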

Output: #text,#are,#else
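
One thing to be aware of with \S+ is that it also consumes punctuation attached to the word. A brief sketch comparing it with \w+ follows, using a made-up sentence rather than the one from the question:

```javascript
const sentence = "Done with #tags, really #done.";

const nonSpace = sentence.match(/#\S+/g); // keeps the trailing "," and "."
const wordOnly = sentence.match(/#\w+/g); // stops at the first non-word character

console.log(nonSpace.join(" ")); // #tags, #done.
console.log(wordOnly.join(" ")); // #tags #done
```

Which of the two is preferable depends on whether the words may contain characters outside \w, such as hyphens.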

zincorp