So I'm reading in a file (like java program < trace.dat) which looks something like this:

58
68
58
68
40
c
40
48
FA

That is the lucky case. More often, each line has several whitespace characters before and after the value.

These are hexadecimal addresses that I'm parsing. I need to read each line (with a Scanner, a BufferedReader, or whatever works) and then convert the hexadecimal value to an integer. This is what I have so far:

Scanner scanner = new Scanner(System.in);
int address;
String binary;
Pattern pattern = Pattern.compile("^\\s*[0-9A-Fa-f]*\\s*$", Pattern.CASE_INSENSITIVE);
while(scanner.hasNextLine()) {
    address = Integer.parseInt(scanner.next(pattern), 16);
    binary = Integer.toBinaryString(address);
    //Do lots of other stuff here
}
//DO MORE STUFF HERE...

I've traced all my errors to the input parsing, so I'm trying to figure out what regex or approach I need to get this working the way I want.

+1  A: 

s.next() takes care of the whitespace: the default delimiter splits tokens on whitespace, so it is skipped automatically.

import java.util.Scanner;
public class Test {
    public static void main(String... args) {
        Scanner s = new Scanner(System.in);
        while (s.hasNext())
            System.out.println(Integer.parseInt(s.next(), 16));
    }
}

If you'd really like to stick with the Pattern approach, I would recommend using the XDigit character class:

\p{XDigit} A hexadecimal digit: [0-9a-fA-F]

Furthermore, scanner.next(pattern) returns the entire matched text, not just the hex digits, so any whitespace the pattern matches is included. You need to work with capturing groups. Try the pattern

^\\s*(\\p{XDigit}+)\\s*$

Then get the actual hex number with matcher.group(1).
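As a minimal sketch of the capturing-group approach described above, the following reads standard input line by line, matches each line against the suggested pattern, and parses group 1 as base 16. The class name HexLineParser and the helper parseAddresses are made up for illustration, and skipping non-matching lines (such as blank ones) is an assumption about the desired behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HexLineParser {
    // Optional whitespace, one or more hex digits (captured as group 1), optional whitespace.
    private static final Pattern HEX_LINE = Pattern.compile("^\\s*(\\p{XDigit}+)\\s*$");

    // Hypothetical helper: returns the parsed addresses in input order.
    // Lines that do not match (for example blank lines) are skipped; that is an assumption.
    static List<Integer> parseAddresses(Scanner scanner) {
        List<Integer> addresses = new ArrayList<>();
        while (scanner.hasNextLine()) {
            Matcher m = HEX_LINE.matcher(scanner.nextLine());
            if (m.matches()) {
                // group(1) holds only the hex digits, without surrounding whitespace.
                addresses.add(Integer.parseInt(m.group(1), 16));
            }
        }
        return addresses;
    }

    public static void main(String[] args) {
        for (int address : parseAddresses(new Scanner(System.in))) {
            System.out.println(address);
        }
    }
}
```

It can be run the same way as in the question, for example java HexLineParser < trace.dat.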

aioobe
Here is what I get when I try this:
Exception in thread "main" java.lang.NumberFormatException: For input string: ""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Integer.parseInt(Integer.java:470)
    at Cache.access(Cache.java:82)
    at Cache.main(Cache.java:136)
Ranman
Is that when you try the code I suggested?
aioobe
I am trying to fix a bug, sorry, I keep getting an InputMismatchException somewhere else in the code.
Ranman
OK. Note that I changed `[0-9A-Fa-f]*` to `[0-9A-Fa-f]+`. Don't forget that `*` also matches zero characters. That is, `^\\s*[0-9A-Fa-f]*\\s*$` matches the empty line (in which `\\s*`, `[0-9A-Fa-f]*`, and `\\s*` all correspond to the empty string). `+` means "at least one character", which may be closer to what you want.
aioobe
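To make the difference between `*` and `+` concrete, here is a small sketch that runs both patterns against an empty line, a whitespace-only line, and a padded hex value. The class name StarVersusPlus and its field names are made up for illustration.

```java
import java.util.regex.Pattern;

class StarVersusPlus {
    // The pattern from the question: zero or more hex digits.
    static final Pattern STAR = Pattern.compile("^\\s*[0-9A-Fa-f]*\\s*$");
    // The suggested fix: at least one hex digit.
    static final Pattern PLUS = Pattern.compile("^\\s*[0-9A-Fa-f]+\\s*$");

    public static void main(String[] args) {
        String[] lines = {"", "   ", " c ", "FA"};
        for (String line : lines) {
            // The star variant accepts empty and whitespace-only lines; the plus variant rejects them.
            System.out.println("[" + line + "] star=" + STAR.matcher(line).matches()
                    + " plus=" + PLUS.matcher(line).matches());
        }
    }
}
```

With the `*` variant, an empty line passes the match, and if the digits were in a capturing group that group would be the empty string, which Integer.parseInt rejects with a NumberFormatException for input string "".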
With this one I get an IllegalStateException
Ranman
OK, so with your code, for some reason this still fails at any non-integer character. Here is what I have:
while (scanner.hasNextLine()) {
    m = pattern.matcher(scanner.nextLine());
    if (m.matches()) {
        address = Integer.parseInt(m.group(1), 16);
        binary = Integer.toBinaryString(address);
        cache.access(binary);
    }
}
And this is the error I get:
Exception in thread "main" java.lang.NumberFormatException: For input string: ""
Ranman
Make sure you changed `*` to a `+` in the pattern as I described above. If the problem remains, post the input in your question, or even better, post what input-line you get the exception for.
aioobe
Yes, I changed it. Here is what I'm using: `^\\s*(\\p{XDigit}+)\\s*$` The line it fails on is: `c` The code: `while (scanner.hasNextLine()) { m = pattern.matcher(scanner.nextLine()); if (m.matches()) { address = Integer.parseInt(m.group(1), 16); binary = Integer.toBinaryString(address); cache.access(binary); } }`
Ranman
I just realized the error was somewhere else!
Ranman
The problem, the entire time, was that the binary string for `c` came out 4 bits long rather than 8.
Ranman
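For reference, Integer.toBinaryString does not emit leading zeros, which would explain a 4-bit string for `c`. Below is a minimal sketch of one way to left-pad the result to a fixed width. The helper name toPaddedBinary is hypothetical, and the width of 8 is taken from the comment above rather than from the actual Cache code, which is not shown.

```java
class BinaryPadding {
    // Hypothetical helper: left-pads the binary form of value with zeros
    // until it is at least `width` characters long. Longer values are not truncated.
    static String toPaddedBinary(int value, int width) {
        String bits = Integer.toBinaryString(value);
        StringBuilder padded = new StringBuilder();
        for (int i = bits.length(); i < width; i++) {
            padded.append('0');
        }
        return padded.append(bits).toString();
    }

    public static void main(String[] args) {
        int address = Integer.parseInt("c", 16);
        System.out.println(Integer.toBinaryString(address)); // prints 1100
        System.out.println(toPaddedBinary(address, 8));      // prints 00001100
    }
}
```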