numpy

List comprehension, map, and numpy.vectorize performance

I have a function foo(i) that takes an integer and takes a significant amount of time to execute. Will there be a significant performance difference between any of the following ways of initializing a: a = [foo(i) for i in xrange(100)] a = map(foo, range(100)) vfoo = numpy.vectorize(foo) a = vfoo(range(100)) (I don't care whether t...
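For reference, the three approaches in the excerpt can be written side by side as below. This is a minimal sketch for Python 3 (where `xrange` no longer exists and `map` returns a lazy iterator); the body of `foo` is a hypothetical stand-in for the slow function. Note that the NumPy documentation describes `numpy.vectorize` as a convenience that is essentially a for loop, so it is not expected to make a slow Python function faster.

```python
import numpy as np


def foo(i):
    # Hypothetical stand-in for the expensive function in the question.
    return i * i


# 1. List comprehension.
a_listcomp = [foo(i) for i in range(100)]

# 2. map(); wrapped in list() because map is lazy in Python 3.
a_map = list(map(foo, range(100)))

# 3. numpy.vectorize; returns a NumPy array rather than a list.
vfoo = np.vectorize(foo)
a_vec = vfoo(np.arange(100))
```

All three call `foo` once per element in Python, so when `foo` itself dominates the run time the choice between them is likely to matter little.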

Sorting a 2D numpy array by multiple axes

I have a 2D numpy array of shape (N,2) which is holding N points (x and y coordinates). For example: array([[3, 2], [6, 2], [3, 6], [3, 4], [5, 3]]) I'd like to sort it such that my points are ordered by x-coordinate, and then by y in cases where the x coordinate is the same. So the array above should look ...
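One way to get this ordering is `numpy.lexsort`, sketched below on the array from the excerpt. The variable names are illustrative. `lexsort` treats the last key it is given as the primary one, so the x column goes last.

```python
import numpy as np

points = np.array([[3, 2],
                   [6, 2],
                   [3, 6],
                   [3, 4],
                   [5, 3]])

# Keys are listed from least to most significant: y first, x last.
order = np.lexsort((points[:, 1], points[:, 0]))
sorted_points = points[order]
# sorted_points is [[3, 2], [3, 4], [3, 6], [5, 3], [6, 2]]
```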

draw csv file data as a heatmap using numpy and matplotlib

Hello all, I was able to load my csv file into a numpy array: data = np.genfromtxt('csv_file', dtype=None, delimiter=',') Now I would like to generate a heatmap. I have 19 categories from 11 samples, along these lines: COG station1 station2 station3 station4 COG0001 0.019393497 0.1831224...

Fastest Way to generate 1,000,000+ random numbers in python

I am currently writing an app in Python that needs to generate a large amount of random numbers, FAST. Currently I have a scheme going that uses numpy to generate all of the numbers in a giant batch (about 500,000 at a time). While this seems to be faster than Python's implementation, I still need it to go faster. Any ideas? I'm open to w...
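For context, a batch draw with NumPy's `Generator` API (available in NumPy 1.17 and later) might look like the sketch below. The seed and batch size are assumptions chosen for illustration, and no timing claim is made here; whether this is fast enough depends on the application.

```python
import numpy as np

# Assumed batch size and seed, chosen only for this example.
BATCH_SIZE = 1_000_000
rng = np.random.default_rng(seed=0)

# One call fills the whole batch with floats in the half-open interval [0, 1).
samples = rng.random(BATCH_SIZE)
```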

Bitwise Operations on Rows of lil_matrix

How can I quickly extract two rows of a scipy.sparse.lil_matrix and apply bitwise operations on them? I've tried: np.bitwise_and(A[1,:], A[2,:]) but NumPy seems to want an array type according to the documentation. ...

making binned boxplot in matplotlib with numpy and scipy in Python

I have a 2-d array containing pairs of values and I'd like to make a boxplot of the y-values by different bins of the x-values. I.e. if the array is: my_array = array([[1, 40.5], [4.5, 60], ...]) then I'd like to bin my_array[:, 0] and then for each of the bins, produce a boxplot of the corresponding my_array[:, 1] values that fall ...

Fastest way to generate delimited string from 1d numpy array

I have a program which needs to turn many large one-dimensional numpy arrays of floats into delimited strings. I am finding this operation quite slow relative to the mathematical operations in my program and am wondering if there is a way to speed it up. For example, consider the following loop, which takes 100,000 random numbers in a nu...
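A baseline for this conversion is sketched below. The helper name `to_delimited`, the delimiter and the format string are all assumptions, not taken from the question. Converting the array to a list of Python floats with `tolist()` before formatting avoids formatting NumPy scalars one by one, which may help, but it should be measured on real data.

```python
import numpy as np


def to_delimited(values, delimiter=",", fmt="%.6f"):
    """Join a 1-D float array into one delimited string (hypothetical helper)."""
    # tolist() yields plain Python floats, which the % operator formats directly.
    return delimiter.join(fmt % v for v in values.tolist())


arr = np.array([0.5, 1.25, 2.0])
text = to_delimited(arr)
# text is "0.500000,1.250000,2.000000"
```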

List of objects or parallel arrays of properties?

The question is, basically: what would be more preferable, both performance-wise and design-wise - to have a list of objects of a Python class or to have several lists of numerical properties? I am writing some sort of a scientific simulation which involves a rather large system of interacting particles. For simplicity, let's say we hav...

Reading numpy arrays outside of Python

In a recent question I asked about the fastest way to convert a large numpy array to a delimited string. My reason for asking was because I wanted to take that plain text string and transmit it (over HTTP for instance) to clients written in other programming languages. A delimited string of numbers is obviously something that any client ...

slicing arrays in numpy/scipy

I have an array like: a = array([[1,2,3],[3,4,5],[4,5,6]]) what's the most efficient way to slice out a 1x2 array out of this that has only the first two columns of "a"? I.e., array([[2,3],[4,5],[5,6]]) in this case. thanks. ...
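The excerpt asks for the first two columns but its example output shows the last two, so the sketch below covers both readings. In either case a basic slice returns a view onto the original data rather than a copy.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [3, 4, 5],
              [4, 5, 6]])

# First two columns of every row.
first_two = a[:, :2]

# Last two columns of every row; this matches the output shown in the question.
last_two = a[:, 1:]
```

If an independent array is needed, `a[:, 1:].copy()` makes one.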

Why is numpy 'slow' by itself?

Given the thread here, it seems that numpy is not ideal for ultra-fast calculation. Does anyone know what overhead we must be aware of when using numpy for numerical calculation? ...

numpy arange with multiple intervals

Hi, I have a numpy array which represents multiple x-intervals of a function: In [137]: x_foo Out[137]: array([211, 212, 213, 214, 215, 216, 217, 218, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950]) As you can see, x_foo contains two intervals: one from 211 to 218, and one from 940 to 950. These are intervals which I w...

How to make scipy.interpolate give an extrapolated result beyond the input range?

I'm trying to port a program which uses a hand-rolled interpolator (developed by a mathematician colleague) over to use the interpolators provided by scipy. I'd like to use or wrap the scipy interpolator so that its behavior is as close as possible to the old interpolator. A key difference between the two functions is that in our origina...

Incremental PCA

Hi, lately I've been looking into an implementation of an incremental PCA algorithm in Python. I couldn't find anything that would meet my needs, so I did some reading and implemented an algorithm I found in a paper. Here is the module's code - the relevant paper on which it is based is mentioned in the module's documentation. I w...

Enthought Python, Sage, or others (in Unix clusters)

I have access to a cluster of Unix machines, but they don't have the software I need (numpy, scipy, matplotlib, etc), so I have to install them by myself (I don't have root permissions, either, so commands like apt-get or yast don't work). In the worst case, I will have to compile them all from source. Is there any better way to proceed...

vectorized approach to binning with numpy/scipy in Python

I am binning a 2d array (x by y) in Python into the bins of its x value (given in "bins"), using np.digitize: elements_to_bins = digitize(vals, bins) where "vals" is a 2d array, i.e.: vals = array([[1, v1], [2, v2], ...]). elements_to_bins just says what bin each element falls into. What I then want to do is get a list whose len...
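The excerpt is cut off before it says exactly what output is wanted, so the sketch below assumes the goal is one collection of y-values per bin. The sample data and bin edges are made up, and only the x column is passed to `np.digitize`.

```python
import numpy as np

# Made-up sample data: column 0 is x, column 1 is the value to group.
vals = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [2.5, 25.0],
                 [4.0, 40.0]])
bins = np.array([0.0, 2.0, 3.0, 5.0])

# With the default right=False, index k means bins[k-1] <= x < bins[k].
elements_to_bins = np.digitize(vals[:, 0], bins)

# One boolean mask per bin; each entry holds the y-values falling in that bin.
grouped = [vals[elements_to_bins == k, 1] for k in range(1, len(bins))]
```

This still loops over the bins in Python, but each step is a vectorized mask, so the cost grows with the number of bins rather than the number of elements per iteration.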

Error Converting PIL B&W images to Numpy Arrays

I am getting weird errors when I try to convert a black and white PIL image to a numpy array. An example of the code I am working with is below. if image.mode != '1': image = image.convert('1') #convert to B&W data = np.array(image) #Have also tried np.asarray(image) n_lines = data.shape[0] #number of raster passes ...

Doing arithmetic with up to two decimal places in Python?

I have two floats in Python that I'd like to subtract, i.e. v1 = float(value1) v2 = float(value2) diff = v1 - v2 I want "diff" to be computed up to two decimal places, that is compute it using %.2f of v1 and %.2f of v2. How can I do this? I know how to print v1 and v2 up to two decimals, but not how to do arithmetic like that. The ...
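One reading of "compute it using %.2f of v1 and %.2f of v2" is to format each float to two decimals and then subtract the formatted values exactly. The sketch below does that with the standard `decimal` module; the function name is hypothetical, and whether a `Decimal` result is acceptable is an assumption, since the excerpt is truncated.

```python
from decimal import Decimal


def diff_two_places(value1, value2):
    """Subtract two floats after rounding each one with '%.2f' (hypothetical helper)."""
    # Formatting first reproduces exactly what '%.2f' would print for each value.
    d1 = Decimal("%.2f" % float(value1))
    d2 = Decimal("%.2f" % float(value2))
    # Decimal subtraction is exact, so no binary floating-point error creeps in.
    return d1 - d2


diff = diff_two_places(3.14159, 1.239)
# diff is Decimal("1.90")
```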

Mapping functions of 2D numpy arrays

I have a function foo that takes an NxM numpy array as an argument and returns a scalar value. I have an AxNxM numpy array data, over which I'd like to map foo to give me a resultant numpy array of length A. Currently, I'm doing this: result = numpy.array([foo(x) for x in data]) It works, but it seems like I'm not taking advantage of t...
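Whether the loop can be avoided depends on what `foo` does. If it can be expressed with NumPy reductions, those reductions usually accept an `axis` tuple and can be applied to the whole AxNxM array at once. The sketch below assumes a hypothetical `foo` (sum of squares) purely to illustrate the pattern; an arbitrary Python function would still need the loop.

```python
import numpy as np


def foo(x):
    # Hypothetical scalar-valued function of an N x M array.
    return (x ** 2).sum()


# Small made-up data set with A=2, N=3, M=4.
data = np.arange(24, dtype=float).reshape(2, 3, 4)

# The approach from the question: one Python-level call per N x M slice.
looped = np.array([foo(x) for x in data])

# The same reduction applied over the last two axes in a single call.
vectorized = (data ** 2).sum(axis=(1, 2))
```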

Python/Numpy: Divide array

Hi all, I have some data represented in a 1300x1341 matrix. I would like to split this matrix into several pieces (e.g. 9) so that I can loop over and process them. The data needs to stay ordered in the sense that x[0,1] stays below (or above if you like) x[0,0] and beside x[1,1]. Just as if you had imaged the data, you could draw 2 ver...
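Assuming the nine pieces are meant to form a 3x3 grid (the excerpt is cut off before it says so), `np.array_split` can be applied along each axis in turn. It tolerates sizes that do not divide evenly, which matters here because 1300 is not a multiple of 3. The placeholder data is made up.

```python
import numpy as np

# Placeholder data with the shape from the question.
x = np.arange(1300 * 1341).reshape(1300, 1341)

# Split into 3 row bands, then split each band into 3 column blocks.
blocks = [np.array_split(band, 3, axis=1)
          for band in np.array_split(x, 3, axis=0)]

# blocks[i][j] is the piece in row band i and column band j.
for row_of_blocks in blocks:
    for block in row_of_blocks:
        pass  # process each piece here

# np.block reassembles the grid, confirming that the ordering is preserved.
reassembled = np.block(blocks)
```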