views:

3073

answers:

12

Hi,

I have a little Java problem I want to translate to Python. Therefore I need a multidimensional array. In Java it looks like:

double dArray[][][] = new double[x.length()+1][y.length()+1][x.length()+y.length()+3];
dArray[0][0][0] = 0;
dArray[0][0][1] = POSITIVE_INFINITY;

Further values will be created by loops and written into the array.

How do I instantiate the array?

PS: There is no matrix multiplication involved...

+3  A: 

For numeric data, Numpy Arrays:

>>> from numpy import array
>>> matrix1 = array(([0,1],[1,3]))
>>> print(matrix1)
[[0 1]
 [1 3]]

For general data (e.g. strings), you can use a list of lists, list of tuples, ...

matrix2 = [['a','b'], ['x','y']]
gimel
Can I initialize it with a given number of rows and columns like in C?
Mutable objects (e.g. list) can change - so initialization is not always relevant. In numpy, an Array has a Shape (dimensions) that can be modified without changing the data.
gimel
@Popper: Yes. zeros( (3,4) ) will create a 3x4 array of zeros.
S.Lott
+18  A: 

If you restrict yourself to the Python standard library, then a list of lists is the closest construct:

arr = [[1,2],[3,4]]

gives a 2d-like array. The rows can be accessed as arr[i] for i in {0,..,len(arr)-1}, but column access is difficult.
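
Column access on a list of lists is awkward rather than impossible. A minimal sketch using only the standard library (the helper name `column` is made up for illustration):

```python
arr = [[1, 2], [3, 4]]

def column(rows, j):
    """Return column j of a list of lists as a new list."""
    return [row[j] for row in rows]

first_col = column(arr, 0)

# zip(*arr) transposes the rows, giving every column at once
columns = [list(col) for col in zip(*arr)]
```

Both forms build new lists, so assigning into the result does not change arr.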

If you are willing to add a library dependency, the NumPy package is what you really want. You can create a fixed-length array from a list of lists using:

import numpy
arr = numpy.array([[1,2],[3,4]])

Row access is the same as for the list-of-lists, but column access is easy: arr[:,i] for i in {0,..,arr.shape[1]-1} (arr.shape[1] is the number of columns).

In fact NumPy arrays can be n-dimensional.

Uninitialized arrays (their contents are arbitrary until you assign them) can be created with

numpy.empty(shape)

where shape is a tuple of size in each dimension; shape=(1,3,2) gives a 3-d array with size 1 in the first dimension, size 3 in the second dimension and 2 in the 3rd dimension.

If you want to store objects in a NumPy array, you can do that as well:

arr = numpy.empty((1,), dtype=object)  # the numpy.object alias was removed in NumPy 1.24
arr[0] = 'abc'

For more info on the NumPy project, check out the NumPy homepage.

Barry Wark
There is no need to use tuple. `numpy.empty([1])` works as well.
J.F. Sebastian
The simple "random" number generator: `noise = lambda: numpy.empty([1])[0]`
J.F. Sebastian
True, any iterable (list, tuple, etc.) works as the dimensions argument. Your "random" number generator relies on the uninitialized memory being "random" and should *never* be used as a source of randomness for anything that matters (e.g. cryptography).
Barry Wark
@Barry: True, that's why I've put quotes around `random`
J.F. Sebastian
@JFSebastian: I'm sure you know the difference. Just wanted to make sure the Google did too, for posterity and all that.
Barry Wark
@Barry: For posterity it should be noted that `numpy.empty()` expects a sequence as the first argument (not just any iterable).
J.F. Sebastian
@JFSebastian: Thanks ;)
Barry Wark
+7  A: 

To create a standard python array of arrays of arbitrary size:

a = [[0]*cols for _ in [0]*rows]

It is accessed like this:

a[0][1] = 5 # set cell at row 0, col 1 to 5

A small python gotcha that's worth mentioning: It is tempting to just type

a = [[0]*cols]*rows

but that makes every row a reference to the same list, resulting in unwanted behaviour. Namely:

>>> a[0][0] = 5
>>> print(a[1][0])
5
Deestan
I'd rather use: `[[0]*cols for _ in xrange(rows)]` (`range(rows)` in Python 3)
J.F. Sebastian
+10  A: 

Multidimensional arrays are a little murky. There are few reasons for using them and many reasons for thinking twice and using something else that more properly reflects what you're doing. [Hint. your question was thin on context ;-) ]

If you're doing matrix math, then use numpy.

However, some folks have worked with languages that force them to use multi-dimensional arrays because it's all they've got. If you're as old as I am (I started programming in the 70's) then you may remember the days when multidimensional arrays were the only data structure you had. Or, your experience may have limited you to languages where you had to morph your problem into multi-dimensional arrays.

Say you have a collection of n 3D points. Each point has an x, y, z, and time value. Is this an n x 4 array? Or a 4 x n array? Not really.

Since each point has 4 fixed values, this is more properly a list of tuples.

a = [ ( x, y, z, t ), ( x, y, z, t ), ... ]

Better still, we could represent this as a list of objects.

class Point( object ):
    def __init__( self, x, y, z, t ):
        self.x, self.y, self.z, self.t = x, y, z, t

a = [ Point(x,y,z,t), Point(x,y,z,t), ... ]
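
A lighter-weight variant of the same idea, sketched here as an assumption rather than as part of the original answer, is `collections.namedtuple` from the standard library. The sample coordinates are invented for illustration:

```python
from collections import namedtuple

# Field names mirror the Point class above
Point = namedtuple("Point", ["x", "y", "z", "t"])

points = [Point(0.0, 1.0, 2.0, 0.5), Point(3.0, 4.0, 5.0, 1.5)]

# Fields are readable by name or by position
xs = [p.x for p in points]
latest = max(points, key=lambda p: p.t)
```

Unlike the class, a namedtuple is immutable, which suits points that never change after creation.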
S.Lott
+1. I'd mention `namedtuple`s
J.F. Sebastian
A: 

Another option is to use a dictionary:

>>> from collections import defaultdict
>>> array = defaultdict(int) # replace int with the default-factory you want
>>> array[(0,0)]
0
>>> array[(99,99)]
0

You'll need to keep track of the upper & lower bounds as well.
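
One possible way to track the bounds is to derive them from the keys that have actually been assigned. This is only a sketch; the sample keys and the variable names are made up:

```python
from collections import defaultdict

array = defaultdict(int)
array[(2, 5)] = 10
array[(7, 1)] = 20

# Derive the bounds from the keys that are actually stored
rows = [r for r, _ in array]
cols = [c for _, c in array]
bounds = ((min(rows), max(rows)), (min(cols), max(cols)))

# Caution: indexing a missing key inserts it, which would widen the bounds.
# dict.get() reads without inserting.
peek = array.get((99, 99), 0)
size_after_peek = len(array)
```

Note that the interactive example above, which reads array[(99,99)], does add that key to the dictionary as a side effect.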

Matthew Trevor
+9  A: 

You can create it using nested lists:

matrix = [[a,b],[c,d],[e,f]]

If it has to be dynamic, it's more complicated. Why not write a small class yourself?

class Matrix(object):
    def __init__(self, rows, columns, default=0):
        self.m = []
        for i in range(rows):
            self.m.append([default for j in range(columns)])

    def __getitem__(self, index):
        return self.m[index]

This can be used like this:

m = Matrix(10,5)
m[3][4] = 7         # valid column indices are 0..4
print(m[3][4])      # -> 7

I'm sure one could implement it much more efficiently. :)

If you need multidimensional arrays you can either create an array and calculate the offset or you'd use arrays in arrays in arrays, which can be pretty bad for memory. (Could be faster though…) I've implemented the first idea like this:

class Matrix(object):
    def __init__(self, *dims):
        self._shortcuts = [i for i in self._create_shortcuts(dims)]
        self._li = [None] * (self._shortcuts.pop())
        self._shortcuts.reverse()

    def _create_shortcuts(self, dims):
        dimList = list(dims)
        dimList.reverse()
        number = 1
        yield 1
        for i in dimList:
            number *= i
            yield number

    def _flat_index(self, index):
        if len(index) != len(self._shortcuts):
            raise TypeError()

        flatIndex = 0
        for i, num in enumerate(index):
            flatIndex += num * self._shortcuts[i]
        return flatIndex

    def __getitem__(self, index):
        return self._li[self._flat_index(index)]

    def __setitem__(self, index, value):
        self._li[self._flat_index(index)] = value

Can be used like this:

m = Matrix(4,5,2,6)
m[2,3,1,3] = 'x'
m[2,3,1,3]  # -> 'x'
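
The offset arithmetic behind the class above can be checked by hand. The sketch below recomputes the per-dimension multipliers for the 4x5x2x6 example; the function names `strides` and `flat_index` are illustrative only:

```python
def strides(dims):
    """Row-major multipliers: the last dimension varies fastest."""
    result = []
    step = 1
    for size in reversed(dims):
        result.append(step)
        step *= size
    result.reverse()
    return result

def flat_index(index, dims):
    """Map an n-d index tuple to a position in the flat list."""
    return sum(i * s for i, s in zip(index, strides(dims)))

dims = (4, 5, 2, 6)
multipliers = strides(dims)               # [60, 12, 6, 1]
offset = flat_index((2, 3, 1, 3), dims)   # 2*60 + 3*12 + 1*6 + 3 = 165
total = 4 * 5 * 2 * 6                     # 240 cells in the flat list
```

The largest valid index, (3,4,1,5), maps to 239, the last slot of the 240-element list.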
Georg
+10  A: 

Take a look at numpy

here's a code snippet for you

import numpy as npy

d = npy.zeros((len(x)+1, len(y)+1, len(x)+len(y)+3))
d[0][0][0] = 0 # although this is unnecessary since zeros initialises to zero
d[0][0][1] = npy.inf # the counterpart of Java's POSITIVE_INFINITY

I don't think you need to be implementing a scientific application to justify the use of numpy. It is faster and more flexible and you can store pretty much anything. Given that, I think it is probably better to try and justify not using it. There are legitimate reasons, but it adds a great deal and costs very little, so it deserves consideration.

P.S. Are your array lengths right? It looks like a pretty peculiar shaped matrix...

Simon
+4  A: 

Probably not relevant for you but if you are doing serious matrix work see numpy

Martin Beckett
+5  A: 

If you are OK using sparse arrays, you could use a dict to store your values. Python's dicts allow you to use tuples as keys, so you could assign to and access elements of the "sparse array" (which is really a dict here) like this:

d = {}
d[0,2,7] = 123 # assign 123 to x=0, y=2, z=7
v = d[0,2,7]
paprika
Nice trick to store a multidimensional array. However things would get very messy when trying to iterate over elements say of a column, or a row, or even to be able to get the 'dimensions'.
Indeed. If I were to use something like this, I'd of course encapsulate this into a class and introduce methods for read/write access. An instance variable would store the dimension of the array and the read/write methods would return a default value for unassigned elements (by catching KeyError).
paprika
+1: dictionary is considerably simpler. Iteration over dimensions is easy, not messy. xRange = set( [ k[0] for k in d.keys() ] ), yRange = set( [ k[1] for k in d.keys() ] )
S.Lott
@S.Lott: what I mean by messy is that iteration that you suggest would only work if you have filled all the elements of this multidimensional array, however if you have "gaps" this simply wouldn't work. anyways +1 to paprika for an original solution.
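
The encapsulation described in the comments could look roughly like this. It is a sketch, not paprika's actual code; the class name `SparseArray` and its attributes are assumptions made for illustration:

```python
class SparseArray(object):
    """n-d sparse array: a dict keyed by index tuples plus a stored shape."""

    def __init__(self, shape, default=0):
        self.shape = tuple(shape)
        self.default = default
        self._data = {}

    def _check(self, index):
        if len(index) != len(self.shape):
            raise IndexError("expected %d indices" % len(self.shape))
        for i, size in zip(index, self.shape):
            if not 0 <= i < size:
                raise IndexError("index %r out of range" % (index,))

    def __getitem__(self, index):
        self._check(index)
        try:
            return self._data[index]
        except KeyError:
            # Unassigned cells fall back to the default value
            return self.default

    def __setitem__(self, index, value):
        self._check(index)
        self._data[index] = value

d = SparseArray((3, 4, 8))
d[0, 2, 7] = 123
# Iterating over the stored shape works even when cells are unassigned
row = [d[0, 2, z] for z in range(d.shape[2])]
```

Because the shape is stored separately from the data, iteration does not depend on which cells happen to be filled, which addresses the "gaps" concern.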
+2  A: 

Here's a quick way to create a nested 3-dimensional list initialized with zeros:

# dim1, dim2, dim3 are the dimensions of the array
a = [[[0 for _ in range(dim3)] for _ in range(dim2)] for _ in range(dim1)]
a[0][0][0] = 1

This is a list of lists of lists, which is a bit more flexible than an array. You can do:

a[0][0] = [1,2,3,4]

to replace a whole row in the array, or even abuse it like this:

a[0] = "Ouch"
print(a[0][0])    # prints "O", since strings are indexable the same way as lists
print(a[0][0][0]) # also prints "O": indexing a one-character string returns that character again

but if you need performance, then I agree that numpy is the way to go.

Also, beware of:

a = [[[0] * 5] * 5] * 5

If you try a[0][0][0]=7 on the object above, you will see what's wrong with that.
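
To make the problem concrete, here is a small demonstration (variable names are illustrative). Because `*` repeats references rather than copying, every innermost row is the same list object:

```python
bad = [[[0] * 5] * 5] * 5
bad[0][0][0] = 7

# All 25 innermost "rows" are the same list object, so each shows the 7
changed = sum(1 for plane in bad for row in plane if row[0] == 7)
shared = bad[0][0] is bad[4][4]

# Nested comprehensions build independent lists instead
good = [[[0] * 5 for _ in range(5)] for _ in range(5)]
good[0][0][0] = 7
untouched = good[1][0][0]
```

Only the innermost `[0] * 5` is safe to repeat with `*`, since integers are immutable.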

Rafał Dowgird
A: 

I've just stepped into a similar need and coded this:

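def nDimensionsMatrix(dims, elem_count, ptr=None):
    # Use None rather than a mutable default, so repeated calls do not share one list
    if ptr is None:
        ptr = []
    if dims > 1:
        # The outermost level has elem_count[dims-1] entries
        for i in range(elem_count[dims-1]):
            empty = []
            ptr.append(empty)
            nDimensionsMatrix(dims-1, elem_count, empty)
    elif dims == 1:
        # Innermost level: elem_count[0] zeros
        ptr.extend([0 for i in range(elem_count[0])])
    return ptr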

matrix = nDimensionsMatrix(3, (2,2,2))

I'm not looking at speed, only functionality ;)

I want to create a matrix with N dimensions and initialize with 0 (a *elem_count* number of elements in each dimension).

Hope it helps someone

mrlinx
A: 

Easy, when using numpy:

from numpy import ones
b = ones((2,3,4)) # creates a 2x3x4 array containing all ones.

'ones' can be replaced with 'zeros'