I have a string that could contain any sentence, but somewhere in that string there may be an @ symbol followed by an attached word, like the @username you see on some sites.

So maybe the string is "hey how are you", or it's "@john hey how are you".

If there's an "@" in the string, I want to pull what comes immediately after it into its own new string.

In this instance, how can I pull "john" into a different string so I could theoretically notify this person of his new message? I'm trying to play with string.Contains or .Replace, but I'm pretty new and having a hard time.

This, by the way, is in C# / ASP.NET.

+1  A: 

The best way to solve this is using Regular Expressions. You can find a great resource here.

Using RegEx, you can search for the pattern you are after. I always have to refer to some documentation to write one...

Here is a pattern to start with: "@(\w+)". The @ is matched literally, and the parentheses form a capture group, so the text that follows the @ can be read back on its own. The "\w" matches a single word character (a letter, a digit, or an underscore), and the "+" means one or more word characters in a row.
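As a rough sketch of how that pattern could be used from C#: the class and method names below (`MentionExample`, `FirstMention`) are made up for illustration, and returning null when there is no @ is just one possible choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MentionExample
{
    // Hypothetical helper: returns the word right after the first '@',
    // or null when the text contains no "@word" at all.
    public static string FirstMention(string text)
    {
        // Group 1 is the part inside the parentheses, i.e. the name without the '@'.
        Match match = Regex.Match(text, @"@(\w+)");
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

With this, `MentionExample.FirstMention("@john hey how are you")` gives "john", and a string without an @ gives null.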

davisoa
The regex would work, but it is kind of overkill and too advanced for a beginner. Thoughts?
Damian Schenkelman
I agree it is advanced, but using `Substring` and `IndexOf` is prone to exceptions and issues that won't be found without a lot of testing.
davisoa
I don't think regexes are too advanced. OP will have to learn them at some point... learning regular expressions has a bigger payoff. String processing and pattern matching are what regexes were made for (we programmers tend to forget that), and the question is a text processing problem.
Timothy
@Damian: I disagree. Perhaps deciding which regex to use may be advanced, but the use of them is not advanced (at least, not in .NET)
John Saunders
+6  A: 

Hi, You can use the Substring and IndexOf methods together to achieve this.
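A minimal sketch of that idea, assuming the name ends at the next space or at the end of the string. The names `SubstringExample` and `NameAfterAt` are hypothetical, and punctuation directly after the name (as in "@john,") is not handled here.

```csharp
using System;

public static class SubstringExample
{
    // Hypothetical helper: returns the text between the first '@' and the
    // next space (or the end of the string); returns "" when there is no '@'.
    public static string NameAfterAt(string text)
    {
        int at = text.IndexOf('@');
        if (at == -1) return "";

        int start = at + 1;                  // first character after the '@'
        int end = text.IndexOf(' ', start);  // -1 when the name runs to the end

        return end == -1
            ? text.Substring(start)
            : text.Substring(start, end - start);
    }
}
```

For "@john hey how are you" this returns "john"; for "hey how are you" it returns an empty string.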

I hope this helps.

Thanks, Damian

Damian Schenkelman
+1...A surprising number of answers here are recommending the unnecessary complexity of Regular Expressions, when these simple functions will do.
Robert Harvey
+1 for providing resources. Thus answering the full question.
Caspar Kleijne
@robert, why not show it with a PLINQ sample?
Caspar Kleijne
@Robert Complexity of execution (performance), yes. But in my opinion `if (s.Contains("@")) var name = s.Substring(s.IndexOf....` is less readable. For example, take a look at egrunin's answer.
lasseespeholt
These simple functions may work...for now, with a simple case like `@joe hello` where an `@` denotes the start and a ` ` the end. But they are more fragile than a regex. What happens when the "string" has characters before the space, like `@joe, hey!`? Do you have to do a bunch of substrings looking for the ',' or any one of the other non-alphanumeric characters? Regexes aren't as complex as people make them seem. A simple regex like `@[\w-_]+` takes this into account, and also allows `-` and `_` in the name, matching only the `@joe` part. It's much simpler to customize.
Chad
@Chad, There are no what-if scenarios for beginners, just a "this is it". wax on, wax off ;)
Caspar Kleijne
@lasseespeholt But if any of your coworkers ever come across your regular expressions, they'll hate you, at least at most places I've worked...I've found that it's best to assume a macro-writing accountant will be the next person to sift through my code.
smoore
@Caspar Kleijne, I disagree, I think "beginners" should be taught the proper way the first time, not the *hackish* way
Chad
@smoore Why? If they know regular expressions and the expressions are reasonably simple, they should be fine :) but if they don't, then they have something to learn :D - seriously
lasseespeholt
@smoore, regexes are often simple. And a tool like Expresso breaks it down no matter the complexity into something anybody can comprehend.
Chad
@lasseespeholt That's true, and I wish my co-workers would look at things that way, but unfortunately they don't. Many are still kicking and screaming about being dragged into the "new" base language (C#) my company has adopted :)
smoore
@Chad: `\w` already captures `_`, no need to specify it separately.
Fredrik Mörk
@Fredrik Mörk, indeed it does.
Chad
@Chad: he's new at this, why show him the chainsaw before the ripsaw?
egrunin
anyone have a link to a sort of regular expressions for dummies article or resource site i can check out? i'd like to learn them
korben
can anyone advise how i should award the right answer if you guys think it matters? you all kind of said just about the same thing i don't know who should get credit
korben
@korben I have put a few links in my answer :) Just accept the answer that helped you most. If you like, you can upvote the rest of the answers that helped you.
lasseespeholt
@korben, http://www.regular-expressions.info/ is a good one
Chad
@egrunin, because if you're cutting down a tree, you use a chainsaw. The right tool for the job...though, some would argue, depending on the total scope of what korben is attempting, the right tool might be a lexer/parser.
Chad
thanks everyone
korben
+1  A: 

Regular expressions. I don't know C#, but the regex would be

/(@[\w]+) / - Everything in the parens is captured in a special variable, or attached to the RegEx object.

Sean
+1  A: 

You can try Regex...

I think it will be something like this:

string userName = Regex.Match(yourString, "@(.+)\\s").Groups[1].Value;
Zote
It matches too much: `hey @john hey how are you` => `john hey how are`. `.` is everything besides line-breaks.
lasseespeholt
he's right i'm getting this problem, in the case of @john hey how are you i just want "john" not "john hey how" which is what i'm getting?
korben
+3  A: 

You should really learn regular expressions. This will work for you:

using System.Text.RegularExpressions;

var res = Regex.Match("hey @john how are you", @"@(\S+)");

if (res.Success)
{
    //john
    var name = res.Groups[1].Value;
}

This finds the first occurrence. If you want to find all of them, you can use Regex.Matches. \S means any non-whitespace character. This means `hey @john, how are you` gives `john,` (with the comma) and `@john123` gives `john123`, which may be wrong. Maybe [a-zA-Z] or similar would suit you better (it depends on which characters usernames are made of). If you give more examples, I could tune it :)
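A sketch of both suggestions together, Regex.Matches plus a stricter character class. It assumes usernames consist only of ASCII letters, which may not be true for your site, and the names `AllMentionsExample` and `FindMentions` are invented for the example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AllMentionsExample
{
    // Hypothetical helper: collects every name that follows an '@'.
    // Assumes usernames are made of ASCII letters only.
    public static List<string> FindMentions(string text)
    {
        var names = new List<string>();
        foreach (Match m in Regex.Matches(text, @"@([a-zA-Z]+)"))
        {
            names.Add(m.Groups[1].Value); // group 1 excludes the '@'
        }
        return names;
    }
}
```

For "hi @matt, have you seen @john?" this returns the two names "matt" and "john", without the trailing comma or question mark.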

I can recommend this page:

http://www.regular-expressions.info/

and this tool where you can test your statements:

http://regexlib.com/RESilverlight.aspx

lasseespeholt
thanks for this!
korben
I would suggest this regex as an update: `@([\w-]+)`. It will allow alphanumerics plus dashes and underscores to be part of the match. Spaces, commas, colons, etc. will not be part of the match. You can also look at naming the group, `@(?<name>[\w-]+)`, so that you can use `res.Groups["name"].Value` instead of `[1]`.
Chad
@Chad Partly agree :) I would make the reg. exp. to match EXACTLY what characters a username is permitted to consist of. Yup, but I wouldn't make the reg. exp. overly complicated to look at. An alternative is `res.Value.Substring(1)`.
lasseespeholt
+5  A: 

Here's how you do it without regex:

string s = "hi there @john how are you";

string getTag(string s)
{
    int atSign = s.IndexOf('@');

    if (atSign == -1) return "";

    // start just after the @, stop at sentence or phrase end
    // I'm assuming this is English, of course
    // so we leave in ' and -
    int start = atSign + 1;

    // IndexOfAny takes a char[], not a string
    int wordEnd = s.IndexOfAny(" .,;:!?".ToCharArray(), start);

    if (wordEnd > -1)
        return s.Substring(start, wordEnd - start);
    else
        return s.Substring(start);
}
egrunin
Won't work. What about other punctuation marks, such as commas, exclamation marks, etc.?
Timothy
@Timothy: You can add all of those to the `IndexOfAny` call.
Robert Harvey
But not an apostrophe or comma, right? So this has to be explicit, even in a regex.
egrunin
So, how about when there are more than 1 @? for example - "hi @matt, have you seen @john?"
Ivan Ferić
@Ivan: he didn't say what behavior he wants there, so I left it out. I could adapt this to that case, but I'm @work :)
egrunin
@Robert that's an awful lot of possible characters to specify...
Timothy
A: 

Use this:

var r = new Regex(@"@(\w+)");
foreach (Match m in r.Matches(stringToSearch))
    DoSomething(m.Groups[1].Value);

DoSomething(string foundName) is a function that handles the name (found after the @).
This will find all @names in stringToSearch.

Ivan Ferić