tags:

views:

113

answers:

4

Take these examples

Smith John
Smith-Crane John
Smith-Crane John-Henry
Smith-Crane John Henry

I would like to get "John", the first word after the space. The word might not run to the end of the line; it can end at any non-alpha character. How would I do this in Java 1.5?

+1  A: 

Hi,

You will want to use a regular expression like the following:

\s[A-Za-z]+

Enjoy!

Doug
What will you capture? This may not work when a space or hyphen is present in the name.
Pran
Hi Pran, on the contrary: this expression reads as "find the first occurrence of a single space, then one or more alpha characters following the space". So it will look for this signature whether or not a hyphen follows in the alpha string. As you can see, Mark Byers used my expression in his answer.
Doug
+2  A: 

You could use String.split:

line.split(" ");

Which for the first line would yield:

{ "Smith", "John" }

You could then iterate over the array to find it. You can also use regular expressions as the delimiter if necessary.
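
As a rough sketch of how that could look for the examples in the question: split on the space, take the second token, then split that token again with a regular expression as the delimiter so that only the leading letters remain. The class and method names here (SplitFirstName, firstNameAfterSpace) are made up for illustration, and the code assumes the wanted word is always in the second space-separated token.

```java
class SplitFirstName {
    // Hypothetical helper: returns the leading run of ASCII letters in the
    // second space-separated token, or null if there is no such run.
    static String firstNameAfterSpace(String line) {
        String[] parts = line.split(" ");   // e.g. { "Smith-Crane", "John-Henry" }
        if (parts.length < 2) {
            return null;
        }
        // A regular expression as the delimiter: split on any run of non-letters.
        String[] pieces = parts[1].split("[^A-Za-z]+");
        if (pieces.length == 0 || pieces[0].length() == 0) {
            return null;                    // token was empty or started with a non-letter
        }
        return pieces[0];
    }

    public static void main(String[] args) {
        String[] examples = {
            "Smith John",
            "Smith-Crane John",
            "Smith-Crane John-Henry",
            "Smith-Crane John Henry"
        };
        for (String example : examples) {
            System.out.println(firstNameAfterSpace(example)); // prints John each time
        }
    }
}
```

This only uses String.split, so it should be fine on Java 1.5.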

Is this good enough, or do you need something more robust?

Justin Ethier
Fails on the 3rd example though. He wants to get `John`, not `John-Henry`.
BalusC
It would be good if it could recognize only John from John-Henry.
Pentium10
Not sure if the original author intended to grab a particular item in the input, but if you use split as Justin suggested, use equalsIgnoreCase to test whether an item is the one you are looking for.
predhme
+3  A: 

You can use regular expressions and the Matcher class:

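import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Capture the run of letters that follows the first whitespace character.
String s = "Smith-Crane John-Henry";
Pattern pattern = Pattern.compile("\\s([A-Za-z]+)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1)); // group(1) leaves out the leading space
}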

Result:

John
Mark Byers
+1, but I'd use `group(1)`; otherwise it will return the entire match, including the leading space.
BalusC
Thanks.... updated.
Mark Byers
What if the person's name is for example Hélenè? Use `\p{L}` to match all Unicode letters instead of using `[A-Za-z]`.
jasonmp85
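
A minimal sketch of that suggestion, assuming the same "first word after whitespace" rule as the answer above and only swapping `[A-Za-z]` for `\p{L}`. The class and method names (UnicodeFirstName, firstWordAfterSpace) and the sample name are hypothetical. The accented letters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeFirstName {
    // \p{L} matches any Unicode letter, so accented names are not cut short.
    private static final Pattern AFTER_SPACE = Pattern.compile("\\s(\\p{L}+)");

    // Hypothetical helper: returns the letters after the first whitespace
    // character, or null if nothing matches.
    static String firstWordAfterSpace(String s) {
        Matcher matcher = AFTER_SPACE.matcher(s);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // "Dupont Hélène-Marie", with the accented letters escaped
        String name = "Dupont H\u00e9l\u00e8ne-Marie";
        System.out.println(firstWordAfterSpace(name));
        System.out.println(firstWordAfterSpace("Smith-Crane John-Henry")); // prints John
    }
}
```

Note that `\p{L}` covers precomposed letters such as é; a name typed with separate combining accents would probably also need `\p{M}` in the character class.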
+1  A: 

Personally I really like the string tokenizer. I know it's out of style these days with split being so easy and all, but...

(Pseudocode because of the high probability of homework)

create new string tokenizer using (" -") as separators, and tell it to return separators as tokens
iterate over each token
    if token is " "
        return next token;

done.
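
For anyone who wants it spelled out, the pseudocode could translate into something like the sketch below. The class and method names (TokenizerFirstName, firstTokenAfterSpace) are invented for the example; the three-argument StringTokenizer constructor with `true` is what makes the separators come back as tokens.

```java
import java.util.StringTokenizer;

class TokenizerFirstName {
    // Hypothetical helper: returns the token that follows the first space,
    // or null if the line has no space followed by another token.
    static String firstTokenAfterSpace(String line) {
        // " -" lists the separator characters; true returns them as tokens too.
        StringTokenizer tokens = new StringTokenizer(line, " -", true);
        while (tokens.hasMoreTokens()) {
            if (" ".equals(tokens.nextToken()) && tokens.hasMoreTokens()) {
                return tokens.nextToken();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstTokenAfterSpace("Smith John"));             // prints John
        System.out.println(firstTokenAfterSpace("Smith-Crane John-Henry")); // prints John
    }
}
```

One caveat: with two spaces in a row, the token after the first space is the second space, so real input may need an extra check.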

Bill K
Perhaps put it in a blockquote instead of a code sample to avoid syntax highlighting
Patrick
StringTokenizer is so 2006. :-) As per the Javadoc, `StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.`
glowcoder
@glowcoder I can't figure out how to get split to do what tokenizer does naturally--return separators as tokens.
Bill K
@Bill you don't with split. Split isn't intended to return the delimiters. You can use the Pattern/Matcher paradigm to return delimiters. I didn't completely vet the code, but `http://snippets.dzone.com/posts/show/6453` seemed to have an example of doing so.
glowcoder
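
To illustrate the Pattern/Matcher idea, here is one possible sketch that returns the delimiters as tokens. It is not taken from the linked snippet; the class name, method name, and the pattern itself are assumptions made for this example. Each match is either a single delimiter (space or hyphen) or a run of non-delimiter characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimiterTokens {
    // Either one delimiter character, or a run of characters that are not delimiters.
    private static final Pattern TOKEN = Pattern.compile("[ -]|[^ -]+");

    // Hypothetical helper: splits the line into words and delimiters, in order.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<String>();
        Matcher matcher = TOKEN.matcher(line);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Yields the five tokens "Smith", "-", "Crane", " ", "John"
        System.out.println(tokenize("Smith-Crane John"));
    }
}
```

From there, finding the word after the first space is the same loop as with the tokenizer: look for the " " token and take the next one.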
But then you are using REs. I admit that it's possibly a better solution, but personally I'll stick with tokenizer, it does exactly what I need and does it the way I prefer (Personally I'm quite anti regular expressions, admittedly an unpopular stance, but until I'm somehow actually presented with a way they will make my life better rather than worse, I'll stick with it.)
Bill K