Hello,

I need to match certain things from lines of an input text. The lines look like this:

 to be/ Σ _ Σ  [1pos, 1neg] {0=1, 2=1}

I am using the Scanner class to read each line of the text, and I have written the following code. However, something is not working properly: the pattern "to" is not matched against the line, even though "to" is contained in the line. (I have also tried matching things other than "to", but nothing matches.)

 Scanner scanner = new Scanner(file);
 while(scanner.hasNext()) {
      String line = scanner.nextLine();
      System.out.println("line: " + line);
      Pattern p_pos = Pattern.compile("to");
      Matcher m_pos = p_pos.matcher(line);
      String match = m_pos.group(0);
      System.out.println("match: " + match);
      boolean b_pos = m_pos.matches();
      if(b_pos) {
          System.out.println(match);
      }
 }

Output:

line:    to be/ Σ _ Σ  [1pos, 1neg] {0=1, 2=1}
Exception in thread "main" java.lang.IllegalStateException: No match found
    at java.util.regex.Matcher.group(Matcher.java:485)
    at lady.PhrasesFromFile.readFile(PhrasesFromFile.java:31)
    at lady.PhrasesFromFile.main(PhrasesFromFile.java:17)

I have one more question: how can I process the line so that I store everything from the beginning of the line up to the first "/" symbol? I couldn't find any method for that in the API. Is it possible to do so? I basically want to go through the line consecutively, store pieces of the line in different variables, and then use the values of these variables. Since I do not know how many tokens there are before the first "/" symbol, I cannot call next() a fixed number of times.

Thank you in advance.

+1  A: 

.matches() tries to match the entire input string. Use .find() if you want to match a portion of the input string, or .lookingAt() if you want to match the beginning of the input string.

http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Matcher.html
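
To make the difference concrete, here is a minimal sketch that runs the three methods against the sample line from the question. The class name MatchModes and its helper methods are made up for illustration, and the single leading space in the sample line is an assumption based on the posted output.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative, not from the question.
class MatchModes {
    // Sample line from the question (a single leading space is assumed).
    static final String LINE = " to be/ Σ _ Σ  [1pos, 1neg] {0=1, 2=1}";

    // True only if the regex matches the ENTIRE input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the regex matches ANY portion of the input.
    static boolean findsAnywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // True if the regex matches starting at the BEGINNING of the input.
    static boolean matchesPrefix(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("to", LINE));       // false
        System.out.println(findsAnywhere("to", LINE));      // true
        System.out.println(matchesPrefix("to", LINE));      // false (line starts with a space)
        System.out.println(matchesPrefix("\\s*to", LINE));  // true
    }
}
```

With the code from the question, swapping matches() for find() and only calling group() inside the if block should be enough to get "to" printed.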

Also, if you expand your pattern to include capturing groups (see a general regex reference for more details on how groups work), you can call .group() after a successful match to retrieve the substring matched by a particular group within the pattern. Calling .group() before a successful match attempt is what throws the IllegalStateException in your output.
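
As a rough sketch of that idea applied to the second question (everything up to the first "/"), something like the following could work. The class name PrefixBeforeSlash, the method beforeFirstSlash, and the choice to trim the result and return null when there is no "/" are all assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names and the null-on-no-match convention are illustrative.
class PrefixBeforeSlash {
    // Group 1 captures everything from the start of the input up to,
    // but not including, the first '/'.
    private static final Pattern BEFORE_SLASH = Pattern.compile("^([^/]*)/");

    // Returns the trimmed text before the first '/', or null if the line has no '/'.
    static String beforeFirstSlash(String line) {
        Matcher m = BEFORE_SLASH.matcher(line);
        if (!m.find()) {
            return null; // group() may only be called after a successful match
        }
        return m.group(1).trim();
    }

    public static void main(String[] args) {
        String line = " to be/ Σ _ Σ  [1pos, 1neg] {0=1, 2=1}";
        System.out.println(beforeFirstSlash(line)); // to be
    }
}
```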

Amber
+1  A: 

You could extract the part you need for the tokens by using:

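Matcher sectionMatcher = Pattern.compile("(to\\s+.*?)/").matcher(line);
String tokenSection = sectionMatcher.find() ? sectionMatcher.group(1) : "";  // find() returns a boolean, so check it before calling group(1)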

and then looping over that to extract the tokens using

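Matcher tokenMatcher = Pattern.compile("\\w+").matcher(tokenSection);
while (tokenMatcher.find()) { String token = tokenMatcher.group(); /* use token here */ }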

Obviously, you wouldn't plug the above pieces of code right in.
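
Put together, the two steps might look roughly like this. It is only a sketch: the class TokenExtractor and the method tokensBeforeSlash are invented names, and the first pattern is generalized from "(to\\s+.*?)/" to "(.*?)/" on the assumption that lines do not always start with "to".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper combining both steps; names are illustrative.
class TokenExtractor {
    // Assumption: capture everything lazily up to the first '/', not just text starting with "to".
    private static final Pattern SECTION = Pattern.compile("(.*?)/");
    private static final Pattern TOKEN = Pattern.compile("\\w+");

    // Returns the word tokens found before the first '/', or an empty list if there is no '/'.
    static List<String> tokensBeforeSlash(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher section = SECTION.matcher(line);
        if (!section.find()) {
            return tokens; // no '/' in the line
        }
        // Each successful find() advances to the next token in the section.
        Matcher token = TOKEN.matcher(section.group(1));
        while (token.find()) {
            tokens.add(token.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = " to be/ Σ _ Σ  [1pos, 1neg] {0=1, 2=1}";
        System.out.println(tokensBeforeSlash(line)); // [to, be]
    }
}
```

Returning a list avoids needing to know in advance how many tokens appear before the "/".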

Michael Deardeuff