ansaurus

Question

Answer 1

+1 A:

Just open the file via the java.io methods. Show us what you've tried first, eh?

Borealid 2010-08-11 01:22:16

Answer 2

+3 A:

Text-based files should be opened with a java.io.Reader. The easiest way is to use a BufferedReader and read the file line by line in a loop.

Here's a kickoff example:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

BufferedReader reader = null;

try {
    reader = new BufferedReader(new FileReader("/path/to/file.txt"));
    // readLine() returns null once the end of the file is reached.
    for (String line; (line = reader.readLine()) != null;) {
        // Do your thing with the line. This example is just printing it.
        System.out.println(line);
    }
} finally {
    // Always close resources in finally!
    if (reader != null) try { reader.close(); } catch (IOException ignore) {}
}

To break the file content down further into tokens, you may find a Scanner more useful.
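As a rough sketch of the Scanner idea, something like the following would split the input into whitespace-separated tokens. The class name TokenizeExample and the helper tokens() are made up for illustration, and a StringReader stands in for a FileReader so the snippet runs without a file on disk.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical example class, not part of the original answer.
class TokenizeExample {

    // Collects every whitespace-separated token from the source.
    static List<String> tokens(Readable source) {
        List<String> result = new ArrayList<>();
        // Closing the Scanner also closes the underlying reader.
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // For a real file, pass new FileReader("/path/to/file.txt") instead.
        System.out.println(tokens(new StringReader("alpha 42 beta\n3.14 gamma")));
    }
}
```

If the tokens are numbers, Scanner's hasNextInt()/nextInt() and similar methods can parse them directly instead of returning strings.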

It's probably enormous overkill to include the library for this one thing. However, there are lots of other useful things in there. This approach also has the downside that it reads the entire file into memory at once, which might be unpalatable depending on the size of your file. Guava also has alternative APIs for streaming lines that are slightly more convenient than using the java.io readers directly.

Brian Duff 2010-08-11 02:10:01

@Brian - in general it is a bad idea to hard-wire the assumption that the file is always UTF-8 encoded. If the file is really ASCII (as stated in the question), it probably won't make much difference, but it is better to either use the platform default encoding, or make the encoding a command-line parameter or configuration option.

Stephen C 2010-08-11 02:28:57

This is client code, so you can pass in whatever you want here based on your knowledge of the encoding, or you can pass in Charset.defaultCharset() as the second parameter if you want to use the default platform encoding (or indeed use a flag as you suggest). In general, it's also a bad idea to assume the platform default encoding when working with code that may read and write files on different platforms. Companies at which I've worked generally have standards that say, "read and write all text files specifically in encoding X, don't use the platform default encoding".

Brian Duff 2010-08-11 02:45:25

FWIW, this is my understanding of why Google doesn't have a Files.toString() that takes no Charset argument in Guava. We think people often just use the platform encoding without really thinking about the fact that it may be different between platforms, so we encourage you to always be explicit and specify your encoding (even if you're explicitly indicating the platform encoding via Charset.defaultCharset()).

Brian Duff 2010-08-11 02:48:18
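For what it's worth, the "always be explicit about the encoding" point from the comments above can be sketched with plain java.io as well, without Guava. This assumes Java 7 or later for StandardCharsets; the class name CharsetReadExample and the helper readLines() are made up for illustration, and a ByteArrayInputStream stands in for a FileInputStream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not part of the original thread.
class CharsetReadExample {

    // Reads all lines, decoding the bytes with the charset the caller chose.
    static List<String> readLines(InputStream in, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            for (String line; (line = reader.readLine()) != null;) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "first line\nsecond line\n".getBytes(StandardCharsets.US_ASCII);

        // The caller states the encoding explicitly.
        System.out.println(readLines(new ByteArrayInputStream(bytes), StandardCharsets.US_ASCII));

        // Opting in to the platform default is still an explicit choice.
        System.out.println(readLines(new ByteArrayInputStream(bytes), Charset.defaultCharset()));
    }
}
```

Unlike new FileReader(path), which silently picks the platform default, the InputStreamReader constructor used here makes the caller name the charset, which is the same design choice described for Guava above.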

Opening and Analyzing ASCII files
