




Hi guys, in an Android application I want to use the Scanner class to read a list of floats from a text file (it's a list of vertex coordinates for OpenGL). The exact code is:

Scanner in = new Scanner(new BufferedInputStream(getAssets().open("vertexes.off")));
final float[] vertexes = new float[nrVertexFloats]; // sized to match the loop bound below
for (int i = 0; i < nrVertexFloats; i++) {
    vertexes[i] = in.nextFloat();
}

It seems, however, that this is incredibly slow (it took 30 minutes to read 10,000 floats!), as tested on the 2.1 emulator. What's going on? I don't remember Scanner being that slow when I used it on the PC (truth be told, I never read more than 100 values before). Or is it something else, like reading from an asset input stream?

Thanks for the help!

Don't know about Android, but at least in JavaSE, Scanner is slow.

Internally, Scanner decodes the input bytes into characters through a charset conversion, which is wasted effort on a file that contains nothing but floats.

Since all you want to do is read floats from a file, you should go with the java.io package.
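As a rough sketch of what the java.io route could look like: read the text line by line with a BufferedReader and parse each token with Float.parseFloat. The class name FloatTextReader and the readFloats helper are made up for illustration, and the sketch assumes the stream contains only whitespace-separated floats (any header in the file would have to be skipped first).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.StringTokenizer;

// Hypothetical helper, not part of any library.
final class FloatTextReader {
    /** Reads exactly count whitespace-separated floats from a text stream. */
    static float[] readFloats(InputStream in, int count) throws IOException {
        float[] values = new float[count];
        int n = 0;
        // Assumption: the file is plain ASCII, so a fixed charset is enough.
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, "US-ASCII"), 8192);
        String line;
        while (n < count && (line = reader.readLine()) != null) {
            // StringTokenizer splits on whitespace without using regular expressions.
            StringTokenizer tokens = new StringTokenizer(line);
            while (n < count && tokens.hasMoreTokens()) {
                values[n++] = Float.parseFloat(tokens.nextToken());
            }
        }
        if (n < count) {
            throw new IOException("Expected " + count + " floats, found " + n);
        }
        return values;
    }
}
```

In the question's code this would be called as FloatTextReader.readFloats(getAssets().open("vertexes.off"), nrVertexFloats). Whether it is fast enough on the emulator is something you would still have to measure.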

The guys on SPOJ struggle with I/O speed. It is a Polish programming contest site with very hard problems. What sets it apart is that it accepts a wider array of programming languages than other sites, and in many of its problems the input is so large that if you don't write efficient I/O, your program will exceed the time limit.

Check their forums, for example, here, for an idea of a custom parser.

Of course, I advise against writing your own float parser, but if you need speed, that's still a solution.

Even if Scanner is slow, 30 minutes for 10,000 floats is nowhere near a reasonable time, even if Scanner did 10 useless charset-conversions.
It seems that Scanner is indeed VERY slow on the device/emulator! It might be because of the huge number of memory allocations. On the emulator it takes 30 minutes to read 10,000 floats. On the PC it takes 1 second to read 20,000 floats (with Scanner).

As a solution I found the following to work very well: first I parse my input file on the PC and transform it into binary data, then I read it on the device byte by byte (buffered) and reconstruct the numbers. This is MUCH faster. It takes 1.5s to read 20,000 floats. I'd say that's an enormous improvement over 1 hour :) Thanks for all the help!
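The device-side half of that approach could be sketched roughly as follows. This is not the poster's actual code: the class name BinaryFloatReader is made up, and it assumes the binary file holds nothing but big-endian 4-byte IEEE 754 floats (the layout DataOutputStream.writeFloat produces) and that the number of floats is known in advance.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of any library.
final class BinaryFloatReader {
    /** Reads count big-endian 4-byte floats from a buffered binary stream. */
    static float[] readFloats(InputStream in, int count) throws IOException {
        DataInputStream data = new DataInputStream(new BufferedInputStream(in));
        float[] values = new float[count];
        for (int i = 0; i < count; i++) {
            // readFloat reads 4 bytes and reassembles them into a float;
            // it throws EOFException if the stream ends early.
            values[i] = data.readFloat();
        }
        return values;
    }
}
```

No text parsing, charset decoding, or regular expressions are involved here, which is presumably where the speedup comes from.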
Yes, I'm not seeing anything like this. I can read about 10M floats this way in 4 seconds on the desktop, and it just can't be that different on the device.

I'm trying to think of other explanations -- is it perhaps blocking in reading the input stream from getAssets()? I might try reading that resource fully, timing that, then seeing how much additional time is taken to scan.
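Something along these lines would separate the two costs. It is only a sketch: the class name ReadVersusScanTiming is made up, and it assumes the input is ASCII text with '.' as the decimal separator.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical helper, not part of any library.
final class ReadVersusScanTiming {
    /** Drains the stream into memory so that I/O can be timed on its own. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    /** Returns { nanoseconds spent reading, nanoseconds spent scanning }. */
    static long[] time(InputStream in, int count) throws IOException {
        long t0 = System.nanoTime();
        byte[] bytes = readFully(in);
        long t1 = System.nanoTime();

        // Scan from memory, so no stream I/O is included in this part.
        Scanner scanner = new Scanner(new ByteArrayInputStream(bytes), "US-ASCII");
        scanner.useLocale(Locale.US); // make sure '.' is the decimal separator
        float[] values = new float[count];
        for (int i = 0; i < count; i++) {
            values[i] = scanner.nextFloat();
        }
        long t2 = System.nanoTime();

        return new long[] { t1 - t0, t2 - t1 };
    }
}
```

Called with getAssets().open("vertexes.off") and nrVertexFloats from the question, a large first number would point at the asset stream and a large second number at Scanner itself.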

Scanner may be part of the problem, but you need to profile your code to know. Alternatives may be faster. Here is a simple benchmark comparing Scanner and StreamTokenizer.
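A minimal sketch of such a comparison, run on generated input rather than a real file. The class and method names are made up for illustration. Note that StreamTokenizer parses plain decimal numbers only (no exponent notation such as 1e5) and returns them as double, so the sample data deliberately avoids exponents.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical benchmark, not part of any library.
final class ScannerVsTokenizer {
    /** Builds count plain-decimal floats, one per line. */
    static String sampleInput(int count) {
        StringBuilder sb = new StringBuilder(count * 8);
        for (int i = 0; i < count; i++) {
            sb.append((i % 1000) * 0.25f).append('\n');
        }
        return sb.toString();
    }

    static float[] withScanner(String text, int count) {
        Scanner in = new Scanner(text);
        in.useLocale(Locale.US); // '.' as the decimal separator
        float[] out = new float[count];
        for (int i = 0; i < count; i++) {
            out[i] = in.nextFloat();
        }
        return out;
    }

    static float[] withTokenizer(String text, int count) throws IOException {
        // Number parsing is enabled by default in StreamTokenizer.
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        float[] out = new float[count];
        for (int i = 0; i < count; i++) {
            if (st.nextToken() != StreamTokenizer.TT_NUMBER) {
                throw new IOException("Expected a number at token " + i);
            }
            out[i] = (float) st.nval;
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        int count = 10000;
        String text = sampleInput(count);
        long t0 = System.nanoTime();
        withScanner(text, count);
        long t1 = System.nanoTime();
        withTokenizer(text, count);
        long t2 = System.nanoTime();
        System.out.println("Scanner:         " + (t1 - t0) / 1000000 + " ms");
        System.out.println("StreamTokenizer: " + (t2 - t1) / 1000000 + " ms");
    }
}
```

A single cold run like this is a crude measurement; repeating each variant several times would give more trustworthy numbers.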


I had exactly the same problem. It took 10 minutes to read my 18 KB file. In the end I wrote a desktop application that converts those human-readable numbers into a machine-readable format, using DataOutputStream.

The result was astonishing.
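A desktop-side converter of that kind might look roughly like this. It is a sketch, not the poster's actual tool: the class name FloatFileConverter is made up, and the output layout (just the floats back to back, big-endian, no header) is an assumption. Scanner is used here because its speed on the desktop is not a problem.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical desktop tool, not part of any library.
final class FloatFileConverter {
    /** Writes every float found in the text stream as 4 big-endian bytes; returns the count. */
    static int convert(InputStream text, OutputStream binary) throws IOException {
        List<Float> values = new ArrayList<Float>();
        Scanner in = new Scanner(text, "US-ASCII");
        in.useLocale(Locale.US); // '.' as the decimal separator
        while (in.hasNextFloat()) {
            values.add(in.nextFloat());
        }
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(binary));
        for (float v : values) {
            out.writeFloat(v);
        }
        out.flush(); // push buffered bytes through to the underlying stream
        return values.size();
    }

    // Usage: java FloatFileConverter <text input> <binary output>
    public static void main(String[] args) throws IOException {
        try (InputStream text = new FileInputStream(args[0]);
             OutputStream binary = new FileOutputStream(args[1])) {
            System.out.println("Wrote " + convert(text, binary) + " floats");
        }
    }
}
```

On the device, the resulting file can then be read back with DataInputStream.readFloat, which reverses writeFloat byte for byte.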

Btw, when I traced it, most of the Scanner method calls involve regular expressions, whose implementation is provided by the com.ibm.icu.** packages (IBM ICU project). It's really overkill.

The same goes for String.format. Avoid it in Android!
