How would I search for the word "hi" in the sentence "hi, how are you?", or for the word "how" in "how are you?"?

Example of what I want, in code:

String word = "hi";
String word2 = "how";
Scanner scan = new Scanner(System.in).useDelimiter("\n");
String s = scan.nextLine();
if (s.equals(word)) {
    System.out.println("Hey");
}
if (s.equals(word2)) {
    System.out.println("Hey");
}
+6  A: 

To just find the substring, you can use contains or indexOf or any other variant:

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html

if( s.contains( word ) ) {
   // ...
}

if( s.indexOf( word2 ) >=0 ) {
   // ...
}
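
A minimal sketch of the substring checks above, for illustration only. The class name SubstringDemo and the helper containsIgnoreCase are made up for this example; only String.contains, String.indexOf and String.toLowerCase come from the JDK. Note that a plain substring search also matches inside longer words, such as "this".

```java
class SubstringDemo {
    // Hypothetical helper: true if `word` occurs anywhere in `s`, ignoring case.
    // This is a plain substring search, so it does not respect word boundaries.
    static boolean containsIgnoreCase(String s, String word) {
        return s.toLowerCase().contains(word.toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(containsIgnoreCase("Hi, how are you?", "hi")); // true
        System.out.println("how are you?".indexOf("how"));                // 0
        System.out.println(containsIgnoreCase("this", "hi"));             // true (match inside a longer word)
    }
}
```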

If you care about word boundaries, then StringTokenizer is probably a good approach.

http://java.sun.com/j2se/1.4.2/docs/api/java/util/StringTokenizer.html

You can then perform a case-insensitive check (equalsIgnoreCase) on each word.

Ryan Emerle
Wow! That's exactly what I was looking for! By the way, in my actual version I had already converted it to lowercase; I just simplified it for the question. Thanks again!
Custard
+4  A: 

Looks like a job for Regular Expressions. Contains would give a false positive on, say, "hire-purchase".

if (Pattern.compile("\\bhi\\b").matcher(stringToMatch).find()) { //...
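
A slightly fuller sketch of the word-boundary check, under a few assumptions: the class name WordBoundaryDemo and the helper containsWord are invented for illustration, and case-insensitive matching is a choice made here, not something the answer specifies. Two details matter in Java: the backslash in \b must be doubled inside a string literal, and Pattern.matches tests the whole input, so find() on a Matcher is used to look for the word anywhere in the text.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Hypothetical helper: true if `word` appears in `text` as a whole word.
    // Pattern.quote keeps regex metacharacters in `word` from being interpreted.
    static boolean containsWord(String text, String word) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b",
                                    Pattern.CASE_INSENSITIVE);
        return p.matcher(text).find(); // find() searches anywhere in the text
    }

    public static void main(String[] args) {
        System.out.println(containsWord("Hi, how are you?", "hi")); // true
        System.out.println(containsWord("hire-purchase", "hi"));    // false
    }
}
```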
Anon.
A hit-and-run downvote with no explanation? Are you really trying to improve SO, or just throwing away your own rep to try and hurt others'?
Anon.
Hey, sorry, I didn't see that there were other answers down here :p I tried it, but it doesn't seem to work at all... anything I might have done wrong? By the way, it gives me an error when I use "match", so I use "matches".
Custard
A: 

I'd go for a tokenizer instead. Set space and other elements such as commas and full stops as delimiters, and remember to compare in case-insensitive mode.

This way you can find "hi" in "Hi, how is his test going" without getting a false positive on "his" or a false negative on "Hi" (which starts with an uppercase H).
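
One possible sketch of this idea, using String.split with a regular-expression delimiter set in place of a tokenizer class (that substitution is a choice made for this example, not part of the answer). The class name TokenMatchDemo, the helper hasToken and the exact delimiter list are assumptions.

```java
class TokenMatchDemo {
    // Hypothetical helper: splits on runs of whitespace and common punctuation,
    // then compares each token to `word` without regard to case.
    static boolean hasToken(String sentence, String word) {
        for (String token : sentence.split("[\\s,.:;?!]+")) {
            if (token.equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasToken("Hi, how is his test going", "hi")); // true
        System.out.println(hasToken("how is his test going", "hi"));     // false ("his" is not "hi")
    }
}
```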

p.marino
+2  A: 

I'd go for the java.util.StringTokenizer: http://java.sun.com/j2se/1.4.2/docs/api/java/util/StringTokenizer.html

StringTokenizer st = new StringTokenizer(
    "Hi, how are you?",
    ",.:?! \t\n\r"       // whitespace and punctuation as delimiters
);
while (st.hasMoreTokens()) {
    // equalsIgnoreCase so that the token "Hi" also counts as a match
    if (st.nextToken().equalsIgnoreCase("hi")) {
        // matches "hi"
    }
}

Alternatively, take a look at java.util.regex and use regular expressions.

Roland Bouman
The javadoc for `StringTokenizer` contains the sentence: "`StringTokenizer` is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of `String` or the `java.util.regex` package instead."
Simon Nickerson
Simon Nickerson: thanks for pointing that out, I didn't realize. Pity they favour `split` since that seems to do all the work up front
Roland Bouman
What would happen if the user just typed in "hi"? There is no " " after it anymore.
Custard
@Custard: have you tried? For me, the string tokenizer correctly passes "hi" on `nextToken()`
Roland Bouman
I haven't (sorry), but I am interested! I'll get to it tomorrow!
Custard
@Custard: don't hold your breath, it works :)
Roland Bouman
A: 

You can pass a regular expression to the next() method of Scanner. So you can iterate through each word in the input (Scanner delimits on whitespace by default) and perform the appropriate processing if you get a match.
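
A rough sketch of how this might look. It uses hasNext(String pattern) to test each token, because next(String pattern) throws an InputMismatchException when the next token does not match. The class name ScannerPatternDemo, the helper countMatches and the example pattern are assumptions for illustration. Since the default delimiter is whitespace, a token such as "Hi," keeps its trailing comma, so the pattern has to allow for punctuation.

```java
import java.util.Scanner;

class ScannerPatternDemo {
    // Hypothetical helper: counts whitespace-delimited tokens that fully match `tokenRegex`.
    static int countMatches(String input, String tokenRegex) {
        int count = 0;
        try (Scanner scan = new Scanner(input)) {
            while (scan.hasNext()) {
                if (scan.hasNext(tokenRegex)) { // the whole next token must match
                    count++;
                }
                scan.next(); // consume the token whether or not it matched
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "hi" in any case, optionally followed by punctuation
        String hiToken = "(?i)hi\\p{Punct}*";
        System.out.println(countMatches("Hi, how are you?", hiToken)); // 1
    }
}
```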

Michael Hackner