I have a structure like this that I want to both read from and write to a file, and I want to do it as fast as possible.

class Map
{
  String name;
  int tiles[][];
}

What is the best way to do this? I'm mostly a C++ programmer, and I don't know the best way to do this in Java. It seems like it should be really simple, but I don't know how to do binary I/O in Java.

This is what I have created so far:

void Write(String fileName) throws IOException
{
  final ObjectOutputStream oos = new ObjectOutputStream(new BufferedOutputStream(new FileOutputStream(fileName)));

  oos.writeUTF(name);
  oos.writeInt(tiles.length);     // number of rows
  oos.writeInt(tiles[0].length);  // number of columns; assumes a non-empty, rectangular array
  for(int i = 0; i < tiles.length; i++)
    for(int j = 0; j < tiles[0].length; j++)
      oos.writeInt(tiles[i][j]);

  oos.close();
}

void Read(String fileName) throws IOException
{
  final ObjectInputStream ois = new ObjectInputStream(new BufferedInputStream(new FileInputStream(fileName)));

  name = ois.readUTF();

  // Read the dimensions in the order Write stored them: rows first, then columns.
  int h = ois.readInt();
  int w = ois.readInt();

  tiles = new int[h][w];

  for(int i = 0; i < h; i++)
    for(int j = 0; j < w; j++)
      tiles[i][j] = ois.readInt();

  ois.close();
}

Is this about as fast as I can get?

A: 

If you are looking for the fastest way to do I/O in Java, have a look at the NIO ("New I/O") classes in java.nio. I'm quite confident you can do binary I/O with them as well.

http://download.oracle.com/javase/1.4.2/docs/api/java/nio/package-summary.html
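
As a rough illustration of binary I/O with NIO, here is a minimal sketch that writes and reads the map through a FileChannel and a ByteBuffer. The class name NioTileIo, the Loaded holder, and the file layout (name length, UTF-8 name bytes, rows, columns, then the ints) are assumptions made for this example, and it assumes a rectangular array.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the question's code.
final class NioTileIo {

    // Simple holder for what was read back (hypothetical).
    static final class Loaded {
        final String name;
        final int[][] tiles;
        Loaded(String name, int[][] tiles) { this.name = name; this.tiles = tiles; }
    }

    // Assumed layout: name length, UTF-8 name bytes, rows, cols, then all ints.
    static void write(Path file, String name, int[][] tiles) throws IOException {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        int rows = tiles.length;
        int cols = rows == 0 ? 0 : tiles[0].length;
        ByteBuffer buf = ByteBuffer.allocate(4 + nameBytes.length + 8 + 4 * rows * cols);
        buf.putInt(nameBytes.length).put(nameBytes).putInt(rows).putInt(cols);
        for (int[] row : tiles) {
            for (int v : row) buf.putInt(v);
        }
        buf.flip(); // switch the buffer from filling to draining
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buf.hasRemaining()) ch.write(buf);
        }
    }

    static Loaded read(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            // A single read may return fewer bytes than requested, so loop.
            while (buf.hasRemaining() && ch.read(buf) != -1) { }
            buf.flip();
            byte[] nameBytes = new byte[buf.getInt()];
            buf.get(nameBytes);
            int rows = buf.getInt();
            int cols = buf.getInt();
            int[][] tiles = new int[rows][cols];
            for (int[] row : tiles) {
                for (int c = 0; c < cols; c++) row[c] = buf.getInt();
            }
            return new Loaded(new String(nameBytes, StandardCharsets.UTF_8), tiles);
        }
    }
}
```

ByteBuffer defaults to big-endian byte order, so the file layout here does not depend on the platform.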

phineas
+2  A: 

http://download.oracle.com/javase/6/docs/api/java/io/ObjectOutputStream.html

// change the class to support serialization
class Map implements Serializable
{
    String name;
    int tiles[][];
}

Code snippet to write the object:

FileOutputStream fos = new FileOutputStream("t.tmp");
ObjectOutputStream oos = new ObjectOutputStream(fos);

Map m = new Map();

// set Map properties ...

oos.writeObject(m);
oos.close();
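
Reading the object back is the mirror image. The following is a minimal sketch, not part of the original answer: the class is named SerializableMap here (a hypothetical name, chosen to avoid clashing with java.util.Map), and the streams are wrapped in buffered streams on the assumption that this helps with file I/O.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the Map class above.
final class SerializableMap implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    int[][] tiles;
}

final class SerializableMapIo {

    static void save(SerializableMap m, File file) throws IOException {
        // try-with-resources closes (and flushes) the stream even on failure
        try (ObjectOutputStream oos = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            oos.writeObject(m);
        }
    }

    static SerializableMap load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            // readObject returns Object, so a cast is needed
            return (SerializableMap) ois.readObject();
        }
    }
}
```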
Aaron Saunders
`Map` should be made to implement `Serializable`. Also, link to a newer version of the docs.
Bozho
The class Map needs to implement the interface Serializable for this to work.
Yanick Rochon
+3  A: 

If all you want to do is write that one structure, then you should hand-code serialization and deserialization. I'd recommend writing a count, then the string chars, then the dimensions of the array, and then all the integers. You will have to worry about byte order yourself, taking two bytes for each char and four for each int.

Do not use Java serialization if you need speed, and do not use NIO just for a single-threaded disk file I/O situation. Do use a buffered stream.
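
A minimal sketch of that hand-coded layout, using DataOutputStream and DataInputStream over buffered streams. The class name HandCodedTileIo and the Loaded holder are hypothetical, and the code assumes a rectangular array. DataOutputStream always writes big-endian, which takes care of the byte-order question as long as the reader is also Java or knows the convention.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper illustrating the layout described above.
final class HandCodedTileIo {

    // Simple holder for what was read back (hypothetical).
    static final class Loaded {
        final String name;
        final int[][] tiles;
        Loaded(String name, int[][] tiles) { this.name = name; this.tiles = tiles; }
    }

    static void write(OutputStream raw, String name, int[][] tiles) throws IOException {
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(raw));
        out.writeInt(name.length());   // count of chars
        out.writeChars(name);          // two bytes per char
        int rows = tiles.length;
        int cols = rows == 0 ? 0 : tiles[0].length;
        out.writeInt(rows);            // dimensions
        out.writeInt(cols);
        for (int[] row : tiles) {
            for (int v : row) out.writeInt(v); // four bytes per int
        }
        out.flush(); // push buffered bytes through; the caller owns and closes raw
    }

    static Loaded read(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(new BufferedInputStream(raw));
        char[] chars = new char[in.readInt()];
        for (int i = 0; i < chars.length; i++) chars[i] = in.readChar();
        int rows = in.readInt();
        int cols = in.readInt();
        int[][] tiles = new int[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) tiles[r][c] = in.readInt();
        }
        return new Loaded(new String(chars), tiles);
    }
}
```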

Are you sure that this is really a performance-critical operation? How big is that array, anyhow?

Have you considered the memory mapping capability of NIO? With it, the kernel does the heavy lifting. Making many tiny files is probably going to give you heartburn in any case. Note that I'm distinguishing two things you can do with NIO. First, you can use channels and buffers to just plain read and write; I'm quite doubtful that this has any performance advantage when all you do is read in a slug of data from a file. Second, you can memory map files and let the kernel page the data in and out. That might work well for you, depending on the total volume of data and the memory configuration involved.
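
A minimal sketch of the memory-mapping variant, assuming a file that holds only ints: the row count, the column count, then the tiles row by row. The class name MappedTiles and this layout are assumptions for illustration; whether mapping pays off for files this small would need measuring.

```java
import java.io.IOException;
import java.nio.IntBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; assumes a rectangular array.
final class MappedTiles {

    static void write(Path file, int[][] tiles) throws IOException {
        int rows = tiles.length;
        int cols = rows == 0 ? 0 : tiles[0].length;
        long size = 4L * (2 + (long) rows * cols); // two header ints plus the tiles
        // READ_WRITE mapping requires the channel to be open for reading and writing.
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
            IntBuffer ints = mapped.asIntBuffer();
            ints.put(rows).put(cols);
            for (int[] row : tiles) ints.put(row); // bulk copy of each row
            mapped.force(); // ask the OS to write the mapped pages to disk
        }
    }

    static int[][] read(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            IntBuffer ints = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size()).asIntBuffer();
            int rows = ints.get();
            int cols = ints.get();
            int[][] tiles = new int[rows][cols];
            for (int[] row : tiles) ints.get(row); // the kernel pages data in on access
            return tiles;
        }
    }
}
```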

bmargulies
Reading the files in will be time critical. Each file will be one part of the total map, so each one may be only 30x30 to 50x50 tiles in size, but as a character moves around I will have to read them in on the fly.
gamernb
@gamernb Maybe you should ask whether your general approach is good or not. :) You can keep a lot of 30x30 or 50x50 maps in memory or in a cache.
InsertNickHere
+1  A: 

I have a very specific technique I use for this kind of thing. It's a hybrid approach that I find results in the most performant basic I/O code, yet maintains readability and compatibility with plain Java Serialization.

The reflection used in Java Serialization was the part that was historically thought to be slow, and it was slow. But since the addition of sun.misc.Unsafe, that part is actually incredibly fast. There's still the initial hit of the very first call to clazz.getDeclaredFields() and other 'getDeclared' type methods of java.lang.Class, but these are cached at the VM level, so are low cost after the first (very noticeable) hit.

The remaining overhead of Java Serialization is the writing of class descriptor data; the class name, what fields it has and what types they are, etc. If java objects were xml, it would be like first writing the xsd so the structure is known, then writing the xml data without the tags. It is actually pretty performant in some situations, for example if you need to write 100+ instances of the same class type to the same stream -- you'll never really feel the hit of the class descriptor data being written the one time at the start of the stream.

But if you just need to write one instance of said class and maybe not much else, there is a way to invert things to your advantage. Instead of passing your object to the stream, which results in the class descriptor being first written followed by the actual data, pass the stream to the object and go straight to the data writing part. Bottom line is you take responsibility for the structure part in your code rather than having the ObjectOutput/ObjectInput do it.

Note, I also renamed your class from Map to TileMap. As BalusC points out, it's not a good class name.

import java.io.*;

public class TileMap implements Externalizable {

    private String name;
    private int[][] tiles;

    public TileMap(String name, int[][] tiles) {
        this.name = name;
        this.tiles = tiles;
    }

    // no-arg constructor required for Externalization
    public TileMap() {
    }

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(tiles.length);
        for (int x = 0; x < tiles.length; x++) {
            out.writeInt(tiles[x].length);
            for (int y = 0; y < tiles[x].length; y++) {
                out.writeInt(tiles[x][y]);
            }
        }
    }

    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        this.name = in.readUTF();
        this.tiles = new int[in.readInt()][];
        for (int x = 0; x < tiles.length; x++) {
            tiles[x] = new int[in.readInt()];
            for (int y = 0; y < tiles[x].length; y++) {
                tiles[x][y] = in.readInt();
            }
        }
    }

}

A write would look like this:

public static void write(TileMap tileMap, OutputStream out) throws IOException {
    // creating an ObjectOutputStream costs exactly 4 bytes of overhead... nothing really
    final ObjectOutputStream oos = new ObjectOutputStream(new BufferedOutputStream(out));

    // Instead of oos.writeObject(tileMap) we do this...
    tileMap.writeExternal(oos);

    oos.close();
}

And a read would look like this:

public static TileMap read(InputStream in) throws IOException, ClassNotFoundException {
    final ObjectInputStream ois = new ObjectInputStream(new BufferedInputStream(in));

    // instantiate TileMap yourself
    TileMap tileMap = new TileMap();

    // again, directly call the readExternal method
    tileMap.readExternal(ois);

    // close the stream, mirroring what write() does
    ois.close();

    return tileMap;
}
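
To see the class descriptor overhead for yourself, the following sketch serializes the same object both ways into memory and reports the byte counts. TileMapSketch is a compact, hypothetical stand-in for the TileMap class above, and the two size helpers are made up for this comparison; the exact numbers depend on the class name and the data, so none are quoted here.

```java
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Compact, hypothetical stand-in for the TileMap class above.
final class TileMapSketch implements Externalizable {
    String name;
    int[][] tiles;

    // public no-arg constructor, as Externalizable requires
    public TileMapSketch() { }

    TileMapSketch(String name, int[][] tiles) {
        this.name = name;
        this.tiles = tiles;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(tiles.length);
        for (int[] row : tiles) {
            out.writeInt(row.length);
            for (int v : row) out.writeInt(v);
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        name = in.readUTF();
        tiles = new int[in.readInt()][];
        for (int x = 0; x < tiles.length; x++) {
            tiles[x] = new int[in.readInt()];
            for (int y = 0; y < tiles[x].length; y++) tiles[x][y] = in.readInt();
        }
    }

    // Bytes produced when the stream drives serialization (descriptor + data).
    static int sizeViaWriteObject(TileMapSketch m) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(m);
        }
        return bytes.size();
    }

    // Bytes produced when the object writes itself to the stream (data only).
    static int sizeViaDirectCall(TileMapSketch m) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            m.writeExternal(oos);
        }
        return bytes.size();
    }
}
```

Calling both helpers on the same small map should show the direct call producing fewer bytes, since it skips the class descriptor.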
David Blevins