I have 20+ tables similar to Table 1, where all letters represent actual values.

Table 1:
$ / cars |<1 | 2 | 3 | 4+
<10,000  | a | b | c | d
20,000   | e | f | g | h
30,000   | i | j | k | l
40,000+  | m | n | o | p

A user input could be, for example, (2.4, 24594), which is a value that falls between f, g, j, and k. My Python function definition and pseudo-code to calculate this bilinear interpolation are as follows:

def bilinear_interpolation(x_in, y_in, x_high, x_low, y_low, y_high):
    # interpolate with respect to x
    # interpolate with respect to y
    # return result

Q: How should I store the data from Table 1 (a file, a dict, a tuple of tuples, or a dict of lists) so that I can perform the bilinear interpolation most efficiently and correctly?

A: 

There's nothing special about bilinear interpolation that makes your use case particularly odd; you just have to do two lookups (for storage as full rows/columns) or four lookups (for array-type storage). The most efficient method depends on your access patterns and the structure of the data.

If your example is truly representative, with 16 total entries, you can store it however you want and it'll be fast enough for any kind of sane loads.
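For instance, here is a minimal sketch of the four-lookup version using a tuple of tuples; the breakpoint tuples, the bracket helper, and the placeholder values standing in for a..p are illustrative assumptions, not part of the question:

prices = (10000.0, 20000.0, 30000.0, 40000.0)   # row breakpoints
cars = (1, 2, 3, 4)                             # column breakpoints
table = ((1.0,  2.0,  3.0,  4.0),               # placeholder values for a..p
         (5.0,  6.0,  7.0,  8.0),
         (9.0,  10.0, 11.0, 12.0),
         (13.0, 14.0, 15.0, 16.0))

def bracket(values, x):
    # index i such that values[i] <= x <= values[i + 1], clamped at the edges
    i = 0
    while i < len(values) - 2 and values[i + 1] <= x:
        i += 1
    return i

def bilinear(price, car):
    i, j = bracket(prices, price), bracket(cars, car)
    tx = (price - prices[i]) / (prices[i + 1] - prices[i])
    ty = (car - cars[j]) / (cars[j + 1] - cars[j])
    # blend the four surrounding corner values
    return (table[i][j] * (1 - tx) * (1 - ty) + table[i + 1][j] * tx * (1 - ty) +
            table[i][j + 1] * (1 - tx) * ty + table[i + 1][j + 1] * tx * ty)

print(bilinear(24594, 2.4))   # uses the f, g, j, k cell from the question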

kquinn
+3  A: 

I'd keep a sorted list of the first column, and use the bisect module in the standard library to look for the values -- it's the best way to get the immediately-lower and immediately-higher indices. Every other column can be kept as another list parallel to this one.
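A rough sketch of that lookup step, assuming the first column is kept as a sorted list called prices (the bracket helper and its edge-clamping are my own additions, not from the answer):

import bisect

prices = [10000, 20000, 30000, 40000]   # sorted first column

def bracket(sorted_vals, x):
    # bisect_right returns the insertion point, i.e. the index of the first
    # entry greater than x; clamp it so that both neighbours always exist
    hi = bisect.bisect_right(sorted_vals, x)
    hi = min(max(hi, 1), len(sorted_vals) - 1)
    return hi - 1, hi                   # immediately-lower, immediately-higher

lo, hi = bracket(prices, 24594)         # -> (1, 2): the 20,000 and 30,000 rows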

Alex Martelli
+5  A: 

If you want the most computationally efficient solution I can think of, and you are not restricted to the standard library, then I would recommend scipy/numpy. First, store the a..p table as a 2D numpy array, and the $10k-40k and 1-4 breakpoints as 1D numpy arrays. Use scipy's interpolate.interp1d if both 1D arrays are monotonically increasing, or interpolate.bisplrep (bivariate spline representation) if they are not and your arrays are as small as in your example. Or simply write your own and don't bother with scipy. Here are some examples:

# This follows your pseudocode most closely, but it is *not*
# the most efficient approach, since it rebuilds the interpolation
# functions on every call to bilinterp.
from scipy import interpolate
import numpy

data = numpy.arange(0., 16.).reshape((4, 4))    # 2D array of table values (a..p)
prices = numpy.arange(10000., 50000., 10000.)   # row breakpoints: 10k..40k
cars = numpy.arange(1., 5.)                     # column breakpoints: 1..4

def bilinterp(price, car):
    # interpolate down the price axis first, then along the car axis
    return interpolate.interp1d(cars, interpolate.interp1d(prices, data, axis=0)(price))(car)

print(bilinterp(22000, 2))

The last time I checked (a version of scipy from 2007-ish), interp1d only worked for monotonically increasing arrays of x and y.

For small arrays like this 4x4 example, I think you want to use http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.bisplrep.html#scipy.interpolate.bisplrep , which will handle more interestingly shaped surfaces, and the function only needs to be created once. For larger arrays, I think you want http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp2d.html#scipy.interpolate.interp2d (not sure if this has the same restrictions as interp1d), but both require a different and more verbose data structure than the three arrays in the example above.
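A sketch of the bisplrep route, reusing the arrays from the example above; passing kx=ky=1 requests a (bi)linear surface, and building the grid coordinates with repeat/tile is my own choice, not something taken from the answer:

from scipy import interpolate
import numpy

data = numpy.arange(0., 16.).reshape((4, 4))
prices = numpy.arange(10000., 50000., 10000.)
cars = numpy.arange(1., 5.)

# build the spline representation once, then evaluate it for every query
xs = numpy.repeat(prices, len(cars))   # price coordinate of every grid point
ys = numpy.tile(cars, len(prices))     # car coordinate of every grid point
tck = interpolate.bisplrep(xs, ys, data.ravel(), kx=1, ky=1, s=0)

print(interpolate.bisplev(22000, 2, tck))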

bpowah
Please give some examples; I have a similar problem but can't crack it in O(log n).
Reef
I like this since I'm already using numpy in my application :D thank you
dassouki