I'm trying to generate random numbers from a Gaussian distribution. Python has the very useful random.gauss() function, but it only samples a one-dimensional random variable. How could I programmatically generate random numbers from this distribution in n dimensions?

For example, in two dimensions, the return value of this method is essentially distance from the mean, so I would still need (x,y) coordinates to determine an actual data point. I suppose I could generate two more random numbers, but I'm not sure how to set up the constraints.

I appreciate any insights. Thanks!

+4  A: 

NumPy has multidimensional equivalents to the functions in the random module.

The function you're looking for is numpy.random.normal.
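As a minimal sketch of how that call might look, assuming NumPy is installed: the `size` argument of numpy.random.normal accepts a shape tuple, so one call can fill an array of points. The names `n_points` and `n_dims` are made up for illustration. Note that every entry is drawn independently, so this covers only the case with no correlation between dimensions.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_points, n_dims = 1000, 3

# Each entry is an independent draw from a normal distribution with
# mean `loc` and standard deviation `scale`.
# Rows are points, columns are dimensions.
points = np.random.normal(loc=0.0, scale=1.0, size=(n_points, n_dims))

print(points.shape)  # (1000, 3)
```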

Andrew Walker
Perfect. Thank you!
Magsol
A: 

You need to decompose your multi-dimensional distribution into a composition of one-dimensional distributions. For example, if you want a point at a Gaussian-distributed distance from a given center and at a uniformly distributed angle around it, generate the polar coordinates of the offset: a Gaussian rho and a uniform theta (between 0 and 2 pi). If you want Cartesian coordinates, apply a coordinate transformation afterwards.
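The polar example above could be sketched roughly as follows, using only the standard library. The function name `polar_gaussian_point` and its parameters are hypothetical, and the zero-mean radial distance is an assumption. Keep in mind that this model is not the same as a bivariate normal distribution, whose radius is not Gaussian-distributed.

```python
import math
import random

def polar_gaussian_point(center, sigma):
    """Return (x, y) at a Gaussian-distributed distance from `center`
    and a uniformly distributed angle around it."""
    # Signed radial distance; a negative value simply mirrors the point
    # through the center, which the uniform angle makes harmless.
    rho = random.gauss(0.0, sigma)
    theta = random.uniform(0.0, 2.0 * math.pi)
    cx, cy = center
    # Polar-to-Cartesian coordinate transformation.
    return cx + rho * math.cos(theta), cy + rho * math.sin(theta)

x, y = polar_gaussian_point((1.0, 2.0), 0.5)
```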

Alex Martelli
That's, mathematically, exactly what I was shooting for, which I think could be accomplished (as Daniel Stutzbach mentioned) by assuming complete independence between each individual dimension.
Magsol
A: 

It sounds like you are asking for a Multivariate Normal Distribution. To generate a value from that distribution, you need to have a covariance matrix that spells out the relationship between x and y. How are your x and y related? If x and y are independent, you can just generate two values with random.gauss().
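Both cases can be sketched in a few lines, assuming NumPy is available for the correlated one. The helper `independent_gauss_point` and the example means, standard deviations, and covariance values are made up for illustration; numpy.random.multivariate_normal expects a symmetric, positive-semidefinite covariance matrix.

```python
import random
import numpy as np

def independent_gauss_point(means, sigmas):
    """One n-dimensional point with independent Gaussian coordinates."""
    return [random.gauss(mu, sigma) for mu, sigma in zip(means, sigmas)]

# Independent case: one random.gauss() call per dimension.
point = independent_gauss_point([0.0, 5.0], [1.0, 2.0])

# Correlated case: the covariance matrix spells out how x and y relate.
mean = [0.0, 5.0]
cov = [[1.0, 0.8],
       [0.8, 4.0]]  # illustrative values; off-diagonal 0 means independence
samples = np.random.multivariate_normal(mean, cov, size=500)

print(samples.shape)  # (500, 2)
```

Setting the off-diagonal entries of `cov` to 0 makes the second approach equivalent in distribution to the first.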

If you're not sure what your covariance matrix is, then you have a math problem that you need to solve before you can work on the software problem. If you provide more information about what you're trying to model, we might be able to help (and I see that Alex Martelli just posted some solutions for common models).

Daniel Stutzbach
My covariance matrix could itself be random, or it could simply be 0, since basically I don't care if there's independence or not. I'm simply trying to synthesize relatively naive data points to test a clustering algorithm I'm writing, in which case, a covariance of 0 may be easiest to test. I dismissed the idea of generating random normally-distributed numbers for each dimension out of a sense of oversimplification, but now that you bring it up, it seems like that would work just fine. Thanks!
Magsol