views:

191

answers:

5

I am looking for a pre-built Java data structure with the following characteristics:

  1. It should look something like an ArrayList but should allow indexing via double-precision rather than integers. Note that this means that it's likely that you'll see indicies that don't line up with the original data points (i.e., asking for the value that corresponds to key "1.5"). EDIT: For clarity, based on the comments, I'm not looking to change the ArrayList implementation. I'm looking for a similar interface and developer experience.

  2. As a consequence, the value returned will likely be interpolated. For example, if the key is 1.5, the value returned could be the average of the value at key 1.0 and the value at key 2.0.

  3. The keys will be sorted but the values are not ensured to be monotonically increasing. In fact, there's no assurance that the first derivative of the values will be continuous (making it a poor fit for certain types of splines).

  4. Freely available code only, please.

For clarity, I know how to write such a thing. In fact, we already have an implementation of this and some related data structures in legacy code that I want to replace due to some performance and coding issues.

What I'm trying to avoid is spending a lot of time rolling my own solution when there might already be such a thing in the JDK, Apache Commons or another standard library. Frankly, that's exactly the approach that got this legacy code into the situation that it's in right now....

Is there such a thing out there in a freely available library?

+3  A: 

Allowing double values as indices is a pretty large change from what ArrayList does.

The reason for this is that an array or list with double as indices would almost by definition be a sparse array, which means it has no value (or depending on your definition: a fixed, known value) for almost all possible indices and only a finite number of indices have an explicit value set.

There is no prebuilt class in Java SE that supports all that.

Personally I'd implement such a data structure as a skip-list (or similar fast-searching data structure) of (index, value) tuples with appropriate interpolation.

Edit: Actually there's a pretty good match for the back-end storage (i.e. everything except for the interpolation): Simply use a NavigableMap such as a TreeMap to store the mapping from index to value.

With that you can easily use ceilingEntry() and (if necessary) higherEntry() to get the closest value(s) to the index you need and then interpolate from those.

Joachim Sauer
For clarity, I'm not looking for a change to the ArrayList. I'm looking for something like the interface, not a change to the original implementation.
Bob Cross
@Bob: that's what I thought, I just wanted to point out that "something like ArrayList" implies a level of similarity that probably can't be reached.
Joachim Sauer
@Joachim, really? That's the code that I'm looking at right now but I'm trying to replace with a better implementation. The front-end interface isn't hard to achieve. It's the under-the-covers implementation that requires actual work.
Bob Cross
A: 

That's a huge change from ArrayList.

Same as Joachim's response above, but I'd probably implement this as a binary tree, and when I didn't find something I was looking for, average the value of the next smallest and largest values, which should be quick to traverse to.

Dean J
A: 

Your description that it should be "like an ArrayList" is misleading, since what you've described is a one dimensional interpolator and has essentially nothing in common with an ArrayList. This is why you're getting suggestions for other data structures which IMO are sending you down the wrong path.

I don't know of any available in Java (and couldn't easily find one one google), but I think you should have a look at GSL - GNU Scientific Library which includes a spline interpolator. It may be a bit heavy for what you're looking for since it's a two dimensional interpolator, but it seems like you should be looking for something like this rather than something like an ArrayList.

If you'd like it to "look like an ArrayList" you can always wrap it in a Java class which has access methods similar to the List interface. You won't be able to actually implement the interface though, since the methods are declared to take integer indices.

Graphics Noob
+1  A: 

In the Apache commons-math library, if you implement the UnivariateRealInterpolator and the return value of its interpolate method which is typed UnivariateRealFunction you'll be most of the way there.

The interpolator interface takes two arrays, x[] and y[]. The returned function has a method, value() that takes an x' and returns the interpolated y'.

Where it fails to provide an ArrayList-like experience is in the ability to add more values to the range and domain as if the List is growing.

Additionally, they look to be in need of some additional interpolation functions. There are only 4 implementations in the library for the stable release. As a commenter pointed out, it seems to be missing 'linear' or something even simpler like nearest neighbor. Maybe that's not really interpolation...

Jay R.
+1  A: 

If your current implementation has complexity O(log N) for interpolating a value, the implementation I just made up may be for you:

package so2675929;

import java.util.Arrays;

public abstract class AbstractInterpolator {
  private double[] keys;
  private double[] values;
  private int size;

  public AbstractInterpolator(int initialCapacity) {
    keys = new double[initialCapacity];
    values = new double[initialCapacity];
  }

  public final void put(double key, double value) {
    int index = indexOf(key);
    if (index >= 0) {
      values[index] = value;
    } else {
      if (size == keys.length) {
        keys = Arrays.copyOf(keys, size + 32);
        values = Arrays.copyOf(values, size + 32);
      }
      int insertionPoint = insertionPointFromIndex(index);
      System.arraycopy(keys, insertionPoint, keys, insertionPoint + 1, size - insertionPoint);
      System.arraycopy(values, insertionPoint, values, insertionPoint + 1, size - insertionPoint);
      keys[insertionPoint] = key;
      values[insertionPoint] = value;
      size++;
    }
  }

  public final boolean containsKey(double key) {
    int index = indexOf(key);
    return index >= 0;
  }

  protected final int indexOf(double key) {
    return Arrays.binarySearch(keys, 0, size, key);
  }

  public final int size() {
    return size;
  }

  protected void ensureValidIndex(int index) {
    if (!(0 <= index && index < size))
      throw new IndexOutOfBoundsException("index=" + index + ", size=" + size);
  }

  protected final double getKeyAt(int index) {
    ensureValidIndex(index);
    return keys[index];
  }

  protected final double getValueAt(int index) {
    ensureValidIndex(index);
    return values[index];
  }

  public abstract double get(double key);

  protected static int insertionPointFromIndex(int index) {
    return -(1 + index);
  }
}

The concrete interpolators will only have to implement the get(double) function.

For example:

package so2675929;

public class LinearInterpolator extends AbstractInterpolator {

  public LinearInterpolator(int initialCapacity) {
    super(initialCapacity);
  }

  @Override
  public double get(double key) {
    final double minKey = getKeyAt(0);
    final double maxKey = getKeyAt(size() - 1);
    if (!(minKey <= key && key <= maxKey))
      throw new IndexOutOfBoundsException("key=" + key + ", min=" + minKey + ", max=" + maxKey);

    int index = indexOf(key);
    if (index >= 0)
      return getValueAt(index);

    index = insertionPointFromIndex(index);
    double lowerKey = getKeyAt(index - 1);
    double lowerValue = getValueAt(index - 1);
    double higherKey = getKeyAt(index);
    double higherValue = getValueAt(index);

    double rate = (higherValue - lowerValue) / (higherKey - lowerKey);
    return lowerValue + (key - lowerKey) * rate;
  }

}

And, finally, a unit test:

package so2675929;

import static org.junit.Assert.*;

import org.junit.Test;

public class LinearInterpolatorTest {

  @Test
  public void simple() {
    LinearInterpolator interp = new LinearInterpolator(2);
    interp.put(0.0, 0.0);
    interp.put(1.0, 1.0);

    assertEquals(0.0, interp.getValueAt(0), 0.0);
    assertEquals(1.0, interp.getValueAt(1), 0.0);
    assertEquals(0.0, interp.get(0.0), 0.0);
    assertEquals(0.1, interp.get(0.1), 0.0);
    assertEquals(0.5, interp.get(0.5), 0.0);
    assertEquals(0.9, interp.get(0.9), 0.0);
    assertEquals(1.0, interp.get(1.0), 0.0);

    interp.put(0.5, 0.0);

    assertEquals(0.0, interp.getValueAt(0), 0.0);
    assertEquals(0.0, interp.getValueAt(1), 0.0);
    assertEquals(1.0, interp.getValueAt(2), 0.0);
    assertEquals(0.0, interp.get(0.0), 0.0);
    assertEquals(0.0, interp.get(0.1), 0.0);
    assertEquals(0.0, interp.get(0.5), 0.0);
    assertEquals(0.75, interp.get(0.875), 0.0);
    assertEquals(1.0, interp.get(1.0), 0.0);
  }

  @Test
  public void largeKeys() {
    LinearInterpolator interp = new LinearInterpolator(10);
    interp.put(100.0, 30.0);
    interp.put(200.0, 40.0);

    assertEquals(30.0, interp.get(100.0), 0.0);
    assertEquals(35.0, interp.get(150.0), 0.0);
    assertEquals(40.0, interp.get(200.0), 0.0);

    try {
      interp.get(99.0);
      fail();
    } catch (IndexOutOfBoundsException e) {
      assertEquals("key=99.0, min=100.0, max=200.0", e.getMessage());
    }
    try {
      interp.get(201.0);
      fail();
    } catch (IndexOutOfBoundsException e) {
      assertEquals("key=201.0, min=100.0, max=200.0", e.getMessage());
    }
  }

  private static final int N = 10 * 1000 * 1000;

  private double measure(int size) {
    LinearInterpolator interp = new LinearInterpolator(size);
    for (int i = 0; i < size; i++)
      interp.put(i, i);
    double max = interp.size() - 1;
    double sum = 0.0;
    for (int i = 0; i < N; i++)
      sum += interp.get(max * i / N);
    return sum;
  }

  @Test
  public void speed10() {
    assertTrue(measure(10) > 0.0);
  }

  @Test
  public void speed10000() {
    assertTrue(measure(10000) > 0.0);
  }

  @Test
  public void speed1000000() {
    assertTrue(measure(1000000) > 0.0);
  }
}

So the functionality seems to work. I only measured speed in some simple cases, and these suggest that scaling will be better than linear.

Update (2010-10-17T23:45+0200): I made some stupid mistakes in checking the key argument in the LinearInterpolator, and my unit tests didn't catch them. Now I extended the tests and fixed the code accordingly.

Roland Illig