Hi guys, in an Android application I want to use the Scanner class to read a list of floats from a text file (it's a list of vertex coordinates for OpenGL). The exact code is:

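Scanner in = new Scanner(new BufferedInputStream(getAssets().open("vertexes.off")));
// Size the array by the number of floats read below, not by the vertex count.
final float[] vertexes = new float[nrVertexFloats];
for (int i = 0; i < nrVertexFloats; i++) {
    vertexes[i] = in.nextFloat();
}
in.close();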

It seems, however, that this is incredibly slow: it took 30 minutes to read 10,000 floats on the 2.1 emulator. What's going on? I don't remember Scanner being that slow when I used it on the PC (truth be told, I never read more than 100 values before). Or is it something else, like reading from an asset input stream?

Thanks for the help!

+2  A: 

Don't know about Android, but at least in JavaSE, Scanner is slow.

Internally, Scanner does UTF-8 conversion, which is useless in a file with floats.

Since all you want to do is read floats from a file, you should go with the java.io package.

The guys on SPOJ struggle with I/O speed. It's a Polish programming contest site with very hard problems. What sets it apart is that it accepts a wider array of programming languages than other sites, and in many of its problems the input is so large that if you don't write efficient I/O, your program will exceed the time limit.

Check their forums, for example, here, for an idea of a custom parser.

Of course, I advise against writing your own float parser, but if you need speed, that's still a solution.

Leonel
Even if Scanner is slow, 30 minutes for 10,000 floats is nowhere near a reasonable time, even if Scanner did 10 useless charset-conversions.
Joachim Sauer
It seems that Scanner is indeed VERY slow on the device/emulator! It might be because of the huge number of memory allocations. On the emulator it takes 30 minutes to read 10,000 floats. On the PC it takes 1 second to read 20,000 floats (with Scanner). As a solution I found the following to work very well: first I parse my input file on the PC and transform it into binary data, then I read it on the device byte by byte (buffered) and reconstruct the numbers. This is MUCH faster. It takes 1.5s to read 20,000 floats. I'd say it's an enormous improvement from 1 hour :) Thanks for all the help!
Cristian Vrabie
A: 

Yes, I'm not seeing anything like this. I can read about 10M floats this way in 4 seconds on the desktop. A desktop is faster, of course, but it just can't be that different.

I'm trying to think of other explanations -- is it perhaps blocking in reading the input stream from getAssets()? I might try reading that resource fully, timing that, then seeing how much additional time is taken to scan.
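
For what it's worth, the "read fully, then scan" experiment could look roughly like the sketch below. The class and method names are made up for illustration, and an in-memory stream stands in for getAssets().open("vertexes.off") so it runs off-device. The US-ASCII charset and Locale.US are assumptions about the file format, not something stated in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;
import java.util.Scanner;

public class ReadVsScanTiming {

    // Drain the stream into memory so the raw I/O cost can be timed on its own.
    public static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Parse floats from bytes already in memory, so only Scanner's cost is measured.
    public static float[] scanFloats(byte[] data, int count) {
        Scanner sc = new Scanner(new ByteArrayInputStream(data), "US-ASCII");
        sc.useLocale(Locale.US); // assumption: '.' is the decimal separator in the file
        float[] values = new float[count];
        for (int i = 0; i < count; i++) {
            values[i] = sc.nextFloat();
        }
        sc.close();
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the asset stream; on the device this would come from getAssets().
        InputStream in = new ByteArrayInputStream("0.5 1.25 -3.0".getBytes("US-ASCII"));

        long t0 = System.nanoTime();
        byte[] data = readFully(in);
        long t1 = System.nanoTime();
        float[] values = scanFloats(data, 3);
        long t2 = System.nanoTime();

        System.out.println("read: " + (t1 - t0) / 1000 + " us, scan of "
                + values.length + " floats: " + (t2 - t1) / 1000 + " us");
    }
}
```

If the first number dominates, the asset stream is the bottleneck; if the second does, it's Scanner.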

Sean Owen
A: 

Scanner may be part of the problem, but you need to profile your code to know. Alternatives may be faster. Here is a simple benchmark comparing Scanner and StreamTokenizer.
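
This is not the linked benchmark, just a minimal sketch of what reading floats with StreamTokenizer might look like. The class and method names are hypothetical, and a StringReader stands in for the asset; on the device you would presumably wrap the asset stream in an InputStreamReader and BufferedReader. One caveat: StreamTokenizer's built-in number parsing does not understand exponent notation such as 1e-5, so this only fits files with plain decimals.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class FloatTokenizerSketch {

    // Reads exactly `count` whitespace-separated numbers from the reader.
    public static float[] readFloats(Reader reader, int count) throws IOException {
        StreamTokenizer st = new StreamTokenizer(reader); // number parsing is on by default
        float[] values = new float[count];
        for (int i = 0; i < count; i++) {
            if (st.nextToken() != StreamTokenizer.TT_NUMBER) {
                throw new IOException("Expected a number at index " + i);
            }
            values[i] = (float) st.nval; // nval is a double, so narrow it
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in input; the real data would come from the asset file.
        float[] v = readFloats(new StringReader("0.5 1.25\n-3.0"), 3);
        System.out.println(v[0] + " " + v[1] + " " + v[2]);
    }
}
```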

trashgod
A: 

I had exactly the same problem. It took 10 minutes to read my 18 KB file. In the end I wrote a desktop application that converts those human-readable numbers into a machine-readable format, using DataOutputStream.

The result was astonishing.
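
This isn't the actual converter, only a rough sketch of the idea: write the floats with DataOutputStream on the desktop and read them back with DataInputStream on the device. The class name, the method names, and the leading int that stores the float count are all assumptions made for illustration. Both halves run in memory here so the round trip can be checked without a device.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BinaryFloatsSketch {

    // Desktop side: write the count (assumed header), then each float as 4 big-endian bytes.
    public static void writeFloats(float[] values, OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(new BufferedOutputStream(out));
        data.writeInt(values.length);
        for (float v : values) {
            data.writeFloat(v);
        }
        data.flush(); // push buffered bytes through to the underlying stream
    }

    // Device side: read the count, then the floats, with no text parsing involved.
    public static float[] readFloats(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(new BufferedInputStream(in));
        float[] values = new float[data.readInt()];
        for (int i = 0; i < values.length; i++) {
            values[i] = data.readFloat();
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // In-memory round trip; on the device the input would be the asset stream.
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        writeFloats(new float[] {0.5f, 1.25f, -3.0f}, file);
        float[] back = readFloats(new ByteArrayInputStream(file.toByteArray()));
        System.out.println(back.length + " floats read back");
    }
}
```

DataOutputStream and DataInputStream use the same byte order, so a file written on the desktop can be read on the device without any byte swapping.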

Btw, when I traced it, most of the Scanner method calls involve regular expressions, whose implementation is provided by the com.ibm.icu.** packages (IBM ICU project). It's really overkill.

The same goes for String.format. Avoid it in Android!

yuku