views:

130

answers:

4

I have what amounts to an Iterator<Integer>... actually it's a class Thing that accepts a Visitor<SomeObject> and calls visit() for a subset of the SomeObjects it contains, and I have to implement Visitor<SomeObject> so it does something like this:

// somehow get all the Id's from each of the SomeObject that Thing lets me visit
public int[] myIdExtractor(Thing thing)
{
    SomeCollection c = new SomeCollection();
    thing.visitObjects(new Visitor<SomeObject>()
         {
              public void visit(SomeObject obj) { c.add(obj.getId()); }
         }
    );
    return convertToPrimitiveArray(c);
}

I need to extract an int[] containing the results, and I'm not sure what to use for SomeCollection and convertToPrimitiveArray. The number of results is unknown ahead of time and will be large (10K-500K). Is there anything that would be a better choice than using ArrayList<Integer> for SomeCollection, and this:

public int[] convertToPrimitiveArray(List<Integer> ints)
{
    int N = ints.size();
    int[] array = new int[N];
    int j = 0;
    for (Integer i : ints)
    {
        array[j++] = i;
    }
    return array;
}

Efficiency and memory usage are of some concern.

A: 

Instead of convertToPrimitiveArray, you can use List.toArray(T[] a):

ArrayList<int> al = new ArrayList<int>();
// populate al
int[] values = new int[al.size()];
al.toArray(values);

For your other concerns, LinkedList might be slightly better than ArrayList, given that you don't know the size of your result set in advance.

If performance is really a problem, you may be better off hand-managing an int[] yourself, and using System.arraycopy() each time it grows; the boxing/unboxing from int to Integer that you need for any Collection could hurt.

As with any performance-related question, of course, test and make sure it really matters before spending too much time optimizing.

Sbodd
`ArrayList<int>` is illegal. And therefore you can't pass an `int[]` into `toArray(T[])`.
Michael Myers
You can't have an ArrayList of primitive ints. It has to be Integers.
Russell Leggett
I almost suggested the same thing. But then I remembered mmyers point.
jjnguy
+2  A: 

It's not too difficult to come up with a class that collects ints in an array (even if you are not using some library which does it for you).

public class IntBuffer {
    private int[] values = new int[10];
    private int size = 0;
    public void add(int value) {
        if (!(size < values.length)) {
            values = java.util.Arrays.copyOf(values, values.length*2);
        }
        values[size++] = value;
    }
    public int[] toArray() {
        return java.util.Arrays.copyOf(values, size);
    }
}

(Disclaimer: This is stackoverflow, I have not even attempted to compile this code.)

As an alternative you could use DataOutputStream to store the ints in a ByteArrayOutputStream.

final ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
final DataOutputStream out = new DataOutputStream(byteOut);
...
    out.writeInt(value);
...
out.flush();
final byte[] bytes = byteOut.toByteArray();
final int[] ints = new int[bytes.length/4];
final ByteArrayInputStream byteIn = new ByteArrayInputStream(bytes);
final DataInputStream in = new DataOutputStream(byteIn);
for (int ct=0; ct<ints.length; ++ct) {
    ints[ct] = in.readInt();
}

(Disclaimer: This is stackoverflow, I have not even attempted to compile this code.)

Tom Hawtin - tackline
Ah... the old reallocate-the-buffer x2 trick. Good one.
Jason S
you have a typo, btw: "values[size++] = value", not "- value"
Jason S
Thanks Jason, done.
Tom Hawtin - tackline
+1  A: 

You could look at something like pjc to handle this. That is a collections framework made for primitives.

Yishai
A: 

for benchmarking's sake I put together a test program using an LFSR generator to prevent the compiler from optimizing out test arrays. Couldn't download pjc but I assume timing should be similar to Tom's IntBuffer class, which is by far the winner. The ByteArrayOutputStream approach is about the same speed as my original ArrayList<Integer> approach. I'm running J2SE 6u13 on a 3GHz Pentium 4, and with approx 220 values, after JIT has run its course, the IntBuffer approach takes roughly 40msec (only 40nsec per item!) above and beyond a reference implementation using a "forgetful" collection that just stores the last argument to visit() (so the compiler doesn't optimize it out). The other two approaches take on the order of 300msec, about 8x as slow.

edit: I suspect the problem with the Stream approach is that there is the potential for exceptions which I had to catch, not sure.

(for arguments run PrimitiveArrayTest 1 2)

package com.example.test.collections;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PrimitiveArrayTest {
    interface SomeObject {
     public int getX();
    }
    interface Visitor {
     public void visit(SomeObject obj);
    }

    public static class PlainObject implements SomeObject
    {
     private int x;
     public int getX() { return this.x; }
     public void setX(int x) { this.x = x; }  
    }

    public static class Thing
    {
     /* here's a LFSR  
      * see http://en.wikipedia.org/wiki/Linear_feedback_shift_register
      * and http://www.ece.cmu.edu/~koopman/lfsr/index.html
      */
     private int state;
     final static private int MASK = 0x80004;
     private void _next()
     {
      this.state = (this.state >>> 1) 
      ^ (-(this.state & 1) & MASK);
     }
     public Thing(int state) { this.state = state; }  
     public void setState(int state) { this.state = state; }

     public void inviteVisitor(Visitor v, int terminationPoint)
     {
      PlainObject obj = new PlainObject();
      while (this.state != terminationPoint)
      {
       obj.setX(this.state);
       v.visit(obj);
       _next();
      }
     }
    }

    static public abstract class Collector implements Visitor
    {
     abstract public void initCollection();
     abstract public int[] getCollection();
     public int[] extractX(Thing thing, int startState, int endState)
     {
      initCollection();
      thing.setState(startState);
      thing.inviteVisitor(this, endState);
      return getCollection();
     }
     public void doit(Thing thing, int startState, int endState)
     {
      System.out.printf("%s.doit(thing,%d,%d):\n",
        getClass().getName(),
        startState,
        endState);
      long l1 = System.nanoTime();
      int[] result = extractX(thing,startState,endState);
      long l2 = System.nanoTime();
      StringBuilder sb = new StringBuilder();
      sb.append(String.format("%d values calculated in %.4f msec ",
        result.length, (l2-l1)*1e-6));
      int N = 3;
      if (result.length <= 2*N)
      {
       sb.append("[");
       for (int i = 0; i < result.length; ++i)
       {
        if (i > 0)
         sb.append(", ");
        sb.append(result[i]);
       }
       sb.append("]");
      }
      else
      {
       int sz = result.length;
       sb.append(String.format("[%d, %d, %d... %d, %d, %d]",
         result[0], result[1], result[2], 
         result[sz-3], result[sz-2], result[sz-1]));
      }
      System.out.println(sb.toString());   
     }
    }

    static public class Collector0 extends Collector
    {
     int lastint = 0;
     @Override public int[] getCollection() { return new int[]{lastint}; }
     @Override public void initCollection() {}
     @Override public void visit(SomeObject obj) {lastint = obj.getX(); }
    }
    static public class Collector1 extends Collector
    {
     final private List<Integer> ints = new ArrayList<Integer>();

     @Override public int[] getCollection() {
      int N = this.ints.size();
      int[] array = new int[N];
      int j = 0;
      for (Integer i : this.ints)
      {
       array[j++] = i;
      }
      return array;   
     }
     @Override public void initCollection() { }
     @Override public void visit(SomeObject obj) { ints.add(obj.getX()); }
    }

    static public class Collector2 extends Collector
    {
     /*
      * adapted from http://stackoverflow.com/questions/1167060
      * by Tom Hawtin
      */
     private int[] values;
     private int size = 0;
     @Override public void visit(SomeObject obj) { add(obj.getX()); }
     @Override public void initCollection() { values = new int[32]; }
     private void add(int value) {
      if (!(this.size < this.values.length)) {
       this.values = java.util.Arrays.copyOf(
         this.values, this.values.length*2);
      }
      this.values[this.size++] = value;
     }
     @Override public int[] getCollection() {
      return java.util.Arrays.copyOf(this.values, this.size);
     }  
    }

    static public class Collector3 extends Collector
    {
     /*
      * adapted from http://stackoverflow.com/questions/1167060
      * by Tom Hawtin
      */
     final ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
     final DataOutputStream out = new DataOutputStream(this.byteOut);
     int size = 0;
     @Override public int[] getCollection()  {
      try
      {
       this.out.flush();
       final int[] ints = new int[this.size];
       final ByteArrayInputStream byteIn 
        = new ByteArrayInputStream(this.byteOut.toByteArray());
       final DataInputStream in = new DataInputStream(byteIn);

       for (int ct=0; ct<ints.length; ++ct) {
        ints[ct] = in.readInt();
       }
       return ints;
      }
      catch (IOException e) { /* gulp */ }

      return new int[0]; // failure!?!??!
     }

     @Override public void initCollection() { }
     @Override public void visit(SomeObject obj) {
      try {
       this.out.writeInt(obj.getX());
       ++this.size;
      }
      catch (IOException e) { /* gulp */ }
     }  
    }
    public static void main(String args[])
    {
     int startState = Integer.parseInt(args[0]);
     int endState = Integer.parseInt(args[1]);
     Thing thing = new Thing(0);
     // let JIT do its thing
     for (int i = 0; i < 20; ++i)
     {
      Collector[] collectors = {new Collector0(), new Collector1(), new Collector2(), new Collector3()};
      for (Collector c : collectors)
      {
       c.doit(thing, startState, endState);
      }
      System.out.println();
     }
    }
}
Jason S