numpy

Using Numpy to find the average distance in a set of points

I have an array of points in a space of unknown dimension, such as: data=numpy.array( [[ 115, 241, 314], [ 153, 413, 144], [ 535, 2986, 41445]]) and I would like to find the average Euclidean distance between all points. Please note that I have over 20,000 points, so I would like to do this as efficiently as possible. Thanks. ...
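One possible approach, sketched below under the assumption that each row is one point: compute distances block by block so that the full N x N distance matrix never has to sit in memory at once. The function name `mean_pairwise_distance` and the `chunk` parameter are hypothetical, chosen only for this illustration.

```python
import numpy as np

def mean_pairwise_distance(points, chunk=256):
    """Average Euclidean distance over all distinct pairs of rows.

    `chunk` (a hypothetical tuning knob) bounds how many rows are
    compared against the full set at a time, to limit memory use.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    total = 0.0
    for start in range(0, n, chunk):
        block = points[start:start + chunk]
        # Shape (rows in block, n, dim): differences to every point.
        diff = block[:, None, :] - points[None, :, :]
        total += np.sqrt((diff ** 2).sum(axis=-1)).sum()
    # `total` counts every ordered pair (i, j) with i != j once; the
    # self-distances on the diagonal contribute zero.
    return total / (n * (n - 1))

data = np.array([[115, 241, 314], [153, 413, 144], [535, 2986, 41445]])
average = mean_pairwise_distance(data)
```

This does twice the necessary arithmetic (each pair is visited in both orders) in exchange for simple code; whether that trade-off is acceptable for 20,000 points depends on the available memory and time.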

How to load data from a file and store it using numpy

I have a file like this: 2 qid:1 1:0.32 2:0.50 3:0.78 4:0.02 10:0.90 5 qid:2 2:0.22 5:0.34 6:0.87 10:0.56 12:0.32 19:0.24 20:0.55 ... The structure is as follows: output={} rel=2 qid=1 features={} # the feature list "1:0.32 2:0.50 3:0.78 4:0.02 10:0.90" output.append([rel,qid,features]) ... How can I write my ...
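As a rough sketch of the parsing step only, assuming every line has the form "relevance qid:N index:value ..." exactly as in the sample: the helper name `parse_line` is hypothetical, and plain Python string handling is used here rather than a NumPy loader.

```python
def parse_line(line):
    """Split one line into [rel, qid, features] (hypothetical helper)."""
    tokens = line.split()
    rel = int(tokens[0])
    # Second token looks like "qid:1"; keep the part after the colon.
    qid = int(tokens[1].split(":", 1)[1])
    features = {}
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return [rel, qid, features]

sample = [
    "2 qid:1 1:0.32 2:0.50 3:0.78 4:0.02 10:0.90",
    "5 qid:2 2:0.22 5:0.34 6:0.87 10:0.56 12:0.32 19:0.24 20:0.55",
]
output = [parse_line(line) for line in sample]
```

The feature dictionaries are sparse (only the listed indices appear), so converting them to a dense NumPy array would need a decision about the total number of features, which the excerpt does not state.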

Convert list in tuple to numpy array?

I have a tuple of lists. One of these lists is a list of scores. I want to convert the list of scores to a numpy array to take advantage of the pre-built stats that scipy provides. In this case the tuple is called 'data'. In [12]: type data[2] -------> type(data[2]) Out[12]: <type 'list'> In [13]: type data[2][1] -------> type(data[2][1]...

weighted std in numpy?

Hi Folks, numpy.average() has a weights option, but numpy.std() does not. Do folks have suggestions for a workaround? Thanks! /YGA ...
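One common workaround, sketched here as an assumption about what is wanted: build the weighted variance from two calls to `numpy.average()` and take its square root. This gives the biased (population-style) weighted standard deviation; the function name `weighted_std` is hypothetical.

```python
import numpy as np

def weighted_std(values, weights):
    """Biased weighted standard deviation (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mean = np.average(values, weights=weights)
    # Weighted mean of squared deviations from the weighted mean.
    variance = np.average((values - mean) ** 2, weights=weights)
    return np.sqrt(variance)

result = weighted_std([0.0, 10.0], [3.0, 1.0])
```

With equal weights this reduces to `numpy.std()` with its default `ddof=0`. An unbiased variant needs a correction factor that depends on how the weights are interpreted, which is not covered by this sketch.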

How to make the angles in a matplotlib polar plot go clockwise with 0° at the top?

I am using matplotlib and numpy to make a polar plot. Here is some sample code: import numpy as N import matplotlib.pyplot as P angle = N.arange(0, 360, 10, dtype=float) * N.pi / 180.0 arbitrary_data = N.abs(N.sin(angle)) + 0.1 * (N.random.random_sample(size=angle.shape) - 0.5) P.clf() P.polar(angle, arbitrary_data) P.show() You wil...

vectorize is indeterminate

I'm trying to vectorize a simple function in numpy and getting inconsistent behavior. I expect my code to return 0 for values < 0.5 and the unchanged value otherwise. Strangely, different runs of the script from the command line yield varying results: sometimes it works correctly, and sometimes I get all 0's. It doesn't matter which ...

Removing duplicates (within a given tolerance) from a Numpy array of vectors

I have an Nx5 array containing N vectors of form 'id', 'x', 'y', 'z' and 'energy'. I need to remove duplicate points (i.e. where x, y, z all match) within a tolerance of say 0.1. Ideally I could create a function where I pass in the array, columns that need to match and a tolerance on the match. Following this thread on Scipy-user, I ca...

numpy.equal with string values

The numpy.equal function does not work if a list or array contains strings: >>> import numpy >>> index = numpy.equal([1,2,'a'],None) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: function not supported for these types, and can't coerce safely to supported types What is the easiest way to workaroun...

Unpacking tuples/arrays/lists as indices for Numpy Arrays

I would love to be able to do >>> A = numpy.array(((1,2),(3,4))) >>> idx = (0,0) >>> A[*idx] and get 1 however this is not valid syntax. Is there a way of doing this without explicitly writing out >>> A[idx[0], idx[1]] ? EDIT: Thanks for the replies. In my program I was indexing with a Numpy array rather than a tuple and gettin...
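For illustration, a tuple can be passed directly as the index, and an index held in a NumPy array can be converted with `tuple()` first. The variable names other than `A` and `idx` are made up for this sketch.

```python
import numpy as np

A = np.array(((1, 2), (3, 4)))

# A tuple used as an index is treated like A[0, 0].
idx = (0, 0)
value = A[idx]

# An index stored in an array selects whole rows if used directly,
# so convert it to a tuple to address a single element.
idx_array = np.array([1, 0])
value_from_array = A[tuple(idx_array)]
rows_from_array = A[idx_array]
```

Here `A[idx_array]` performs fancy indexing along the first axis and returns rows 1 and 0, which is a different result from the single element `A[1, 0]`.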

2d convolution using python and numpy

Hi I am trying to perform a 2d convolution in python using numpy I have a 2d array as follows with kernel H_r for the rows and H_c for the columns data = np.zeros((nr, nc), dtype=np.float32) #fill array with some data here then convolve for r in range(nr): data[r,:] = np.convolve(data[r,:], H_r, 'same') for c in range(nc): ...

Matplotlib installation problems

Hi, I need to install matplotlib on a remote Linux machine, where I am a normal user. I downloaded the source and ran python setup.py build but I get errors related to numpy, which is not installed, so I decided to install it first. I downloaded and compiled it with python setup.py build. My question now is, how do I tell to te...

Reordering matrix elements to reflect column and row clustering in naive python

Hello, I'm looking for a way to perform clustering separately on matrix rows and then on its columns, reorder the data in the matrix to reflect the clustering, and put it all together. The clustering problem is easily solvable, so is the dendrogram creation (for example in this blog or in "Programming collective intelligence"). Howev...

MemoryError when running Numpy Meshgrid

I have 8823 data points with x,y coordinates. I'm trying to follow the answer on how to get a scatter dataset to be represented as a heatmap but when I go through the X, Y = np.meshgrid(x, y) instruction with my data arrays I get MemoryError. I am new to numpy and matplotlib and am essentially trying to run this by adapting the examp...
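For scale: `np.meshgrid(x, y)` on two arrays of 8823 values builds two 8823 x 8823 arrays, roughly 78 million elements each. If the goal is a binned density heatmap (an assumption, since the excerpt is cut off), a sketch using `np.histogram2d` avoids the grid entirely. The random data and the bin count of 50 are placeholders.

```python
import numpy as np

# Placeholder data standing in for the 8823 scattered points.
rng = np.random.default_rng(0)
x = rng.random(8823)
y = rng.random(8823)

# Count points per cell of a 50 x 50 grid; no meshgrid is needed.
heatmap, xedges, yedges = np.histogram2d(x, y, bins=50)

# Extent in the form expected by matplotlib's imshow, if plotting.
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
```

The resulting `heatmap` array is only 50 x 50, regardless of how many points are binned.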

Numpy image - rotate matrix 270 degrees...

I've got a Numpy 2d array that represents a grey-scale image and I need to rotate it 270 degrees. Might be being a bit thick here but the two ways I can find to do this seem quite... circuitous: rotated = numpy.rot90(numpy.rot90(numpy.rot90(orignumpyarray))) rotated = numpy.fliplr(numpy.flipud(numpy.rot90(orignumpyarray))) I'm thinki...
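A shorter form of the same rotation, for comparison: `numpy.rot90` takes an optional second argument `k` giving the number of 90-degree turns. The small test array below is a stand-in for the image.

```python
import numpy as np

# Stand-in for the grey-scale image.
orignumpyarray = np.arange(6).reshape(2, 3)

# k=3 applies three counterclockwise quarter turns (270 degrees).
rotated = np.rot90(orignumpyarray, k=3)
```

Passing `k=-1` describes the same result as a single clockwise quarter turn.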

Numpy histogram of large arrays

I have a bunch of csv datasets, about 10Gb in size each. I'd like to generate histograms from their columns. But it seems like the only way to do this in numpy is to first load the entire column into a numpy array and then call numpy.histogram on that array. This consumes an unnecessary amount of memory. Does numpy support online binnin...
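One way to keep memory bounded, sketched under the assumption that the bin edges can be fixed in advance: call `numpy.histogram` on each chunk with the same edges and add up the counts. The name `chunked_histogram` is hypothetical, and how the chunks are read from the CSV files is left open.

```python
import numpy as np

def chunked_histogram(chunks, bin_edges):
    """Accumulate histogram counts over an iterable of 1-D chunks.

    Hypothetical helper: `bin_edges` must be the same for every chunk,
    so the value range has to be known (or chosen) beforehand.
    """
    bin_edges = np.asarray(bin_edges, dtype=float)
    counts = np.zeros(len(bin_edges) - 1, dtype=np.int64)
    for chunk in chunks:
        chunk_counts, _ = np.histogram(chunk, bins=bin_edges)
        counts += chunk_counts
    return counts

# Small demonstration with an in-memory column split into two pieces.
column = np.arange(100.0)
edges = np.linspace(0.0, 100.0, 11)
counts = chunked_histogram([column[:30], column[30:]], edges)
```

Values outside the chosen edges are silently dropped by `numpy.histogram`, so the edges should cover the full range of the data.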

Calculate Matrix Rank using scipy

I'd like to calculate the mathematical rank of a matrix using scipy. The most obvious function numpy.rank calculates the dimension of an array (i.e. scalars have dimension 0, vectors 1, matrices 2, etc...). I am aware that the numpy.linalg.lstsq function has this capability, but I was wondering if such a fundamental operation is built into ...
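In NumPy versions that provide it, `numpy.linalg.matrix_rank` estimates the rank from the singular values. A short sketch with made-up matrices:

```python
import numpy as np

# Second row is twice the first, so only one independent row.
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
full = np.eye(3)

rank_singular = np.linalg.matrix_rank(singular)
rank_full = np.linalg.matrix_rank(full)
```

Because the result depends on a tolerance applied to the singular values, nearly rank-deficient matrices can be sensitive to the `tol` argument.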

Euclidean Distances between points

I have an array of points in numpy: points = rand(dim, n_points) And I want to: (1) calculate the l2 norm (Euclidean distance) between a certain point and all other points; (2) calculate all pairwise distances, preferably using only numpy and no for loops. How can one do it? ...
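A loop-free sketch using broadcasting, assuming the layout from the question where each column of `points` is one point. The function names are hypothetical.

```python
import numpy as np

def distances_to_point(points, index):
    """Distances from column `index` to every column of `points`."""
    # points[:, [index]] keeps a (dim, 1) shape so it broadcasts.
    diff = points - points[:, [index]]
    return np.sqrt((diff ** 2).sum(axis=0))

def pairwise_distances(points):
    """(n_points, n_points) matrix of distances between columns."""
    # Shape (dim, n, n): every column minus every other column.
    diff = points[:, :, None] - points[:, None, :]
    return np.sqrt((diff ** 2).sum(axis=0))

# Three 2-D points stored as columns: (0, 0), (3, 4), (0, 1).
points = np.array([[0.0, 3.0, 0.0],
                   [0.0, 4.0, 1.0]])
to_first = distances_to_point(points, 0)
all_pairs = pairwise_distances(points)
```

The pairwise version allocates a `dim x n x n` intermediate array, so it suits moderate numbers of points; for large sets a chunked approach may be needed.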

undo or reverse argsort(), python

Given an array 'a', I would like to sort the array by columns with "sort(a, axis=0)", do some stuff to the array, and then undo the sort. By that I don't mean re-sort, but basically reverse how each element was moved. I assume argsort() is what I need, but it is not clear to me how to sort an array with the results of argsort() or more important...
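One way this can be done, sketched on a small made-up array: apply the `argsort()` result with `np.take_along_axis` (available in NumPy 1.15 and later), then take the `argsort()` of that result to get the inverse permutation.

```python
import numpy as np

a = np.array([[3, 1],
              [1, 2],
              [2, 0]])

# Sort each column and remember where every element came from.
order = np.argsort(a, axis=0)
sorted_a = np.take_along_axis(a, order, axis=0)

# ... work on sorted_a here ...

# The argsort of a permutation is its inverse permutation.
inverse = np.argsort(order, axis=0)
restored = np.take_along_axis(sorted_a, inverse, axis=0)
```

Each column gets its own permutation, which is why the indexing is done along the axis rather than with a single index vector.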

Stretch array (Numpy, Python)

I have a numpy array [1,2,3,4,5,6,7,8,9,10,11,12,13,14] and want to have an array structured like [[1,2,3,4], [2,3,4,5], [3,4,5,6], ..., [11,12,13,14]]. Sure this is possible by looping over the large array and adding arrays of length four to the new array, but I'm curious if there is some secret 'magic' Python method doing just this :)...
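A loop-free sketch using an index array built by broadcasting; the function name `sliding_windows` is hypothetical.

```python
import numpy as np

def sliding_windows(a, width):
    """Return all overlapping windows of `width` consecutive items."""
    a = np.asarray(a)
    # Column of start positions plus a row of offsets gives a 2-D
    # grid of indices, one row per window.
    starts = np.arange(len(a) - width + 1)[:, None]
    offsets = np.arange(width)[None, :]
    return a[starts + offsets]

a = np.arange(1, 15)
windows = sliding_windows(a, 4)
```

This copies the data into a new array. Newer NumPy releases (1.20 and later) also offer `numpy.lib.stride_tricks.sliding_window_view`, which returns a view instead.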

String comparison in Numpy

In the following example In [8]: import numpy as np In [9]: strings = np.array(['hello ', 'world '], dtype='|S10') In [10]: strings == 'hello' Out[10]: array([False, False], dtype=bool) The comparison fails because of the whitespace. Is there a Numpy built-in function that does the equivalent of In [12]: np.array([x.strip()==...
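The `numpy.char` module has element-wise string operations, including `strip`. A small sketch, using a plain string array rather than the `'|S10'` byte-string dtype from the example (with byte strings the comparison value would need to be bytes as well on Python 3):

```python
import numpy as np

strings = np.array(['hello ', 'world '])

# Strip whitespace from every element, then compare element-wise.
matches = np.char.strip(strings) == 'hello'
```

Here `matches` is a boolean array with one entry per string.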