Hello, I plan to store an array of numbers in a file and read it back when needed. What would be a good way to do this? I can think of a few ways, such as storing each element on its own line in a text file, or serializing the array and storing/loading it that way. Speed is my first concern.

Thanks

+5  A: 

If the file doesn't need to be human-readable, then serializing it would be the better approach performance-wise. If you saved every array entry as a line in a file, you would need to traverse the array, do some IO, and save the file; later, to restore it as the exact same array, you would need to do all those steps in reverse. Furthermore, IO operations are fairly expensive.

The built-in serialization mechanism does all this for you and arguably in the most effective way possible.
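For reference, here is a minimal sketch of what the built-in mechanism looks like for an int[]. The class name SerializedIntArray and the method names save/load are made up for illustration; ObjectOutputStream and ObjectInputStream are the standard java.io classes.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: stores an int[] with Java's built-in serialization.
final class SerializedIntArray {

    // Writes the whole array as a single serialized object.
    static void save(Path file, int[] values) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(values);
        }
    }

    // Reads the object back and casts it to int[].
    static int[] load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (int[]) in.readObject();
        }
    }
}
```

The try-with-resources blocks make sure the stream is flushed and closed even if writing fails. Whether this is fast enough for your case is something only a measurement can tell you.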

nkr1pt
+2  A: 

Speed in this context is a secondary issue. Why? Because you're reading a file anyway and I/O is slow (compared to in-memory operations). I would just store them one number per line so it's human-readable.
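A rough sketch of the one-number-per-line approach, assuming int values and Java 8 or later. IntArrayTextFile and its save/load methods are illustrative names, not anything from a library.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper: one decimal number per line, human-readable.
final class IntArrayTextFile {

    // Writes each value on its own line.
    static void save(Path file, int[] values) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (int v : values) {
                w.write(Integer.toString(v));
                w.newLine();
            }
        }
    }

    // Parses every non-blank line back into an int.
    static int[] load(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.map(String::trim)
                        .filter(s -> !s.isEmpty())
                        .mapToInt(Integer::parseInt)
                        .toArray();
        }
    }
}
```

A file written this way can be opened and edited in any text editor. A line that is not a valid integer makes load throw NumberFormatException.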

cletus
There's a lot to be said for text-based storage. And if storage is a problem, you can always zip it.
hbunny
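To illustrate the compression idea, here is a sketch that keeps the one-number-per-line text format but wraps the streams in gzip. Note that it uses GZIPOutputStream/GZIPInputStream from java.util.zip (gzip rather than a .zip archive), and that GzipIntText is a made-up class name.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: text storage, gzip-compressed on disk.
final class GzipIntText {

    // Writes one number per line through a gzip stream.
    static void save(Path file, int[] values) throws IOException {
        try (Writer w = new OutputStreamWriter(
                new GZIPOutputStream(Files.newOutputStream(file)), StandardCharsets.UTF_8)) {
            for (int v : values) {
                w.write(Integer.toString(v));
                w.write('\n');
            }
        }
    }

    // Decompresses and parses each non-empty line.
    static int[] load(Path file) throws IOException {
        try (BufferedReader r = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(file)), StandardCharsets.UTF_8))) {
            return r.lines()
                    .filter(s -> !s.isEmpty())
                    .mapToInt(Integer::parseInt)
                    .toArray();
        }
    }
}
```

The data is still plain text once decompressed (for example with gunzip), so the readability argument still holds, at the cost of some CPU time for compression.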
A: 

Serialization is the best way in general. But if your concern is speed, serialization is not the right option (its performance is poor).

Madhu
Poor compared to what?
jarnbjo
+1  A: 

If you only ever want to store an array of numbers, then writing your own manual serialization/deserialization routine would work. It'll teach you something about IO operations.

When you get to more complex types (strings, even), using the built-in serialization methods will probably serve you better in the long run, as they're generally more reliable for the vast majority of use cases.

Although I'm not a Java dev, it looks fairly simple to use serialization in Java. Sun seems to have a good introduction to Java serialization.

http://java.sun.com/developer/technicalArticles/Programming/serialization/

Will Hughes
+1  A: 

There isn't enough information about your use case to know the best approach speed-wise (will this be multi-threaded, how often will this be done, what is the size of the array, and so on).

That being said, the only real way to know is to profile them. Serialization is trivial, and writing one number per line is pretty trivial as well, so you can try those two, profile them in the type of scenario you need, see which one is faster, and see if either of them hits your performance target.
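A very crude way to compare candidates, as a sketch only: time each one with System.nanoTime after a few warm-up runs. RoughTimer, Task and bestNanos are invented names. For serious measurements a benchmark harness such as JMH is the more trustworthy tool, since hand-rolled timing is easily skewed by JIT compilation and disk caching.

```java
// Hypothetical helper for quick, rough timing comparisons.
final class RoughTimer {

    // A piece of work that may throw, e.g. a save or load call.
    interface Task {
        void run() throws Exception;
    }

    // Runs the task `warmups` times untimed, then `runs` times timed,
    // and returns the fastest single timed run in nanoseconds.
    static long bestNanos(int warmups, int runs, Task task) throws Exception {
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }
}
```

You would call it once per candidate, for example RoughTimer.bestNanos(5, 20, () -> saveAsText(file, data)) where saveAsText stands in for whichever routine you are testing, and compare the results using an array of realistic size.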

Yishai
+1  A: 

If speed is your primary concern, use DataOutputStream and DataInputStream to serialize it in binary form. Something like:

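import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.IOException;

// Writes the length first, then each element as a 4-byte int.
public void write(DataOutput dout, int[] arr) throws IOException
{
    dout.writeInt(arr.length);
    for (int a : arr) dout.writeInt(a);
}

// Reads the length, then that many ints, in the order they were written.
public int[] readArray(DataInputStream din) throws IOException
{
    int[] arr = new int[din.readInt()];
    for (int i = 0; i < arr.length; i++)
        arr[i] = din.readInt();

    return arr;
}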

If even this is not fast enough, consider using IntBuffer for bulk operations.

The advantages of binary form are:

  1. You read and write less data, because binary data is usually more compact than human-readable text, which makes the IO less significant.
  2. You save the CPU cycles of parsing the data from text format to integers.
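The IntBuffer suggestion could look roughly like the sketch below. It assumes the whole array fits in memory and that the file is smaller than 2 GB; BulkIntFile and its methods are illustrative names. ByteBuffer is big-endian by default, so the layout (length first, then the values as 4-byte ints) is the same one writeInt produces.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: bulk binary IO through an IntBuffer view.
final class BulkIntFile {

    static void save(Path file, int[] values) throws IOException {
        ByteBuffer bytes = ByteBuffer.allocate(Integer.BYTES * (values.length + 1));
        IntBuffer ints = bytes.asIntBuffer();
        ints.put(values.length);
        ints.put(values); // bulk copy of the whole array
        // The view has its own position, so `bytes` still spans the full buffer.
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (bytes.hasRemaining()) {
                ch.write(bytes);
            }
        }
    }

    static int[] load(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer bytes = ByteBuffer.allocate((int) ch.size());
            while (bytes.hasRemaining() && ch.read(bytes) != -1) {
                // keep reading until the buffer is full or the file ends
            }
            bytes.flip();
            IntBuffer ints = bytes.asIntBuffer();
            int[] values = new int[ints.get()]; // first int is the length
            ints.get(values);                   // bulk copy into the array
            return values;
        }
    }
}
```

An empty or truncated file makes load throw BufferUnderflowException, so real code would want to validate the length before trusting it.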
Omry
+1  A: 

A novel approach: If your array consists of unique integers, you could write them out as a run-length encoded "bit set". This can give a very compact representation when the values cluster into long runs, meaning less I/O. I would suggest this approach for storing very large arrays of unique integers.

For example, suppose your array contains the values [1, 2, 3, 5, 9]. Your bit set, written from position 9 down to position 1, would look like this:

[1, 0, 0, 0, 1, 0, 1, 1, 1]

... and your RLE encoded bit-set would be:

013113

... which is interpreted as "0 zeros, 1 one, 3 zeros, 1 one, 1 zero, 3 ones".

You could choose to either persist the RLE encoded string as characters or using a binary format.
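A sketch of the encoding and decoding steps, following the example above (positions scanned from the largest value down to 1, with the first run counting zeros). RleBitSet and its methods are invented for illustration, and the code assumes all values are unique and at least 1. Runs are kept as ints here, since writing them as single digits only works while every run is shorter than 10.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper: run-length encodes a set of unique positive ints.
final class RleBitSet {

    // Returns alternating run lengths: zeros, ones, zeros, ones, ...
    static int[] encode(int[] values) {
        BitSet bits = new BitSet();
        for (int v : values) {
            if (v < 1) throw new IllegalArgumentException("values must be >= 1");
            bits.set(v);
        }
        List<Integer> runs = new ArrayList<>();
        boolean current = false; // the first run counts zeros
        int length = 0;
        for (int pos = bits.length() - 1; pos >= 1; pos--) {
            if (bits.get(pos) == current) {
                length++;
            } else {
                runs.add(length);
                current = !current;
                length = 1;
            }
        }
        runs.add(length);
        return runs.stream().mapToInt(Integer::intValue).toArray();
    }

    // Rebuilds the values (in ascending order) from the run lengths.
    static int[] decode(int[] runs) {
        int pos = 0;
        for (int r : runs) pos += r; // total number of positions
        BitSet bits = new BitSet();
        boolean ones = false;
        for (int r : runs) {
            if (ones) bits.set(pos - r + 1, pos + 1);
            pos -= r;
            ones = !ones;
        }
        return bits.stream().toArray();
    }
}
```

For the array [1, 2, 3, 5, 9], encode returns [0, 1, 3, 1, 1, 3], which matches the "013113" string above, and decode turns that back into [1, 2, 3, 5, 9]. The original ordering of the array is not preserved, only the set of values.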

Adamski
+1  A: 
new ObjectOutputStream(new FileOutputStream("s")).writeObject(new ArrayList());

File Saved.

Nico
Obviously you should implement it a little better than this, but that's the gist of it.
Nico
+1  A: 

This might be overkill, but you might also want to consider how neatly JSON handles its key:value, array-based data. You could save your arrays into a single file like this:

{ "myArrays":{
   "1" : "[0 1 2 3 4 5]",
   "2" : "[0 1 2 3 4 5]",
   ...
   "n" : "[0 1 2 3 4 5]"
 }
}

To retrieve the arrays, read the file contents into a StringBuffer, parse them into a JSON object (e.g. with net.sf.json.JSONSerializer), and iterate through each set of arrays conveniently.