The following code runs too slowly even though everything seems to be vectorized.
from numpy import *
from scipy.sparse import *
n = 100000
i = xrange(n)
j = xrange(n)
data = ones(n)
A = csr_matrix((data, (i, j)))
x = A[i, j]
The problem seems to be that the indexing operation is implemented as a Python function, and invoking A[i,...
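For the specific pattern in the snippet above, where the row and column index lists are identical, the requested entries are simply the main diagonal. The following is a minimal sketch under that assumption; it uses `csr_matrix.diagonal()` and NumPy index arrays instead of `xrange`, and it does not cover arbitrary (i, j) pairs:

```python
import numpy as np
from scipy.sparse import csr_matrix

n = 100000
idx = np.arange(n)          # same indices used for rows and columns
data = np.ones(n)
A = csr_matrix((data, (idx, idx)), shape=(n, n))

# When i == j element-wise, A[i, j] is just the main diagonal,
# which can be extracted without per-element fancy indexing.
x = A.diagonal()
```

For general index pairs a different approach would be needed, so treat this only as an illustration of avoiding the per-element lookup in the diagonal case.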
I am trying to export a list of text strings from Python to MATLAB using scipy.io. I would like to use scipy.io because my desired .mat file should include both numerical matrices (which I learned to do here) and text cell arrays.
I tried:
import scipy.io
my_list = ['abc', 'def', 'ghi']
scipy.io.savemat('test.mat', mdict={'my_list': my...
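One approach that may help here is to wrap the list in a NumPy array with `dtype=object`, which `scipy.io.savemat` writes as a MATLAB cell array. This is a sketch, not a confirmed fix for the truncated call above; the temporary output path is an assumption made for illustration:

```python
import os
import tempfile

import numpy as np
import scipy.io

my_list = ['abc', 'def', 'ghi']

# An object-dtype array is written by savemat as a MATLAB cell array.
cell = np.array(my_list, dtype=object)

# Hypothetical output location (a temp dir, used only for illustration).
path = os.path.join(tempfile.mkdtemp(), 'test.mat')
scipy.io.savemat(path, {'my_list': cell})

# Round trip: each cell comes back as a small string array.
loaded = scipy.io.loadmat(path)['my_list']
names = [str(c[0]) for c in loaded.ravel()]
```

Numerical matrices can be added to the same dictionary under other keys, so both kinds of data end up in one .mat file.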
I am trying to find a numerical package which will fit a natural spline which minimizes weighted least squares.
There is a package in scipy which does what I want for unnatural splines.
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate, randn
x = np.arange(0,5,1.0/6)
xs = np.arange(0,5,1.0/500)
y = np....
Hello,
I'm looking for a way to perform clustering separately on matrix rows and then on its columns, reorder the data in the matrix to reflect the clustering, and put it all together. The clustering problem is easily solvable, and so is the dendrogram creation (for example in this blog or in "Programming Collective Intelligence"). Howev...
Is there any form of short-time Fourier transform with corresponding inverse transform built into SciPy or NumPy or whatever?
There's the pyplot specgram function in matplotlib, which calls ax.specgram(), which calls mlab.specgram(), which calls _spectral_helper():
#The checks for if y is x are so that we can use the same function to
...
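Later SciPy releases ship `scipy.signal.stft` and `scipy.signal.istft`, which form a forward/inverse pair. The sketch below assumes a SciPy version that includes them; the sampling rate, test tone, and segment length are illustrative choices, not values from the question:

```python
import numpy as np
from scipy import signal

fs = 1000.0                        # assumed sampling rate in Hz
t = np.arange(2048) / fs
x = np.sin(2 * np.pi * 50.0 * t)   # illustrative 50 Hz test tone

# Forward STFT (default Hann window with 50% overlap).
f, seg_t, Zxx = signal.stft(x, fs=fs, nperseg=256)

# Inverse STFT with the same parameters recovers the signal.
_, x_rec = signal.istft(Zxx, fs=fs, nperseg=256)
```

The reconstruction can be checked with `np.allclose(x, x_rec[:x.size])`; the slice guards against padding added at the end of the signal.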
I have a bunch of CSV datasets, about 10 GB in size each. I'd like to generate histograms from their columns. But it seems like the only way to do this in numpy is to first load the entire column into a numpy array and then call numpy.histogram on that array. This consumes an unnecessary amount of memory.
Does numpy support online binnin...
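One way to bin incrementally is to fix the bin edges up front and add up the per-chunk counts from `numpy.histogram`. The sketch below assumes the value range is known in advance; `streaming_histogram` is a hypothetical helper name, and the random data and `array_split` stand in for chunked reads of a CSV column:

```python
import numpy as np

def streaming_histogram(chunks, edges):
    """Accumulate histogram counts over an iterable of 1-D chunks."""
    counts = np.zeros(len(edges) - 1, dtype=np.int64)
    for chunk in chunks:
        # Fixed edges make the per-chunk counts additive.
        counts += np.histogram(chunk, bins=edges)[0]
    return counts

rng = np.random.default_rng(0)
values = rng.normal(size=10000)        # stand-in for one CSV column
edges = np.linspace(-5.0, 5.0, 51)     # bin edges chosen up front
chunks = np.array_split(values, 20)    # stand-in for chunked file reads
counts = streaming_histogram(chunks, edges)
```

Only one chunk is held in memory at a time, at the cost of having to choose the edges before seeing all the data.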
I'd like to calculate the mathematical rank of a matrix using scipy. The most obvious function, numpy.rank, calculates the dimension of an array (i.e., scalars have dimension 0, vectors 1, matrices 2, etc.). I am aware that the numpy.linalg.lstsq function has this capability, but I was wondering if such a fundamental operation is built into ...
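Later NumPy releases provide `numpy.linalg.matrix_rank`, which estimates the rank from the singular values. A small sketch, assuming a NumPy version that includes it; the two matrices are illustrative:

```python
import numpy as np

full = np.eye(3)
deficient = np.array([[1.0, 2.0],
                      [2.0, 4.0]])   # second row is 2 * first row

rank_full = np.linalg.matrix_rank(full)            # 3
rank_deficient = np.linalg.matrix_rank(deficient)  # 1
```

Because the result depends on a tolerance applied to the singular values, nearly singular matrices may need an explicit `tol` argument.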
I have a numpy array [1,2,3,4,5,6,7,8,9,10,11,12,13,14] and want to have an array structured like [[1,2,3,4], [2,3,4,5], [3,4,5,6], ..., [11,12,13,14]].
Sure, this is possible by looping over the large array and adding arrays of length four to the new array, but I'm curious whether there is some secret 'magic' Python method doing just this :)...
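A loop-free way to build these overlapping windows is to index the array with a broadcast grid of start positions plus offsets. This is a sketch of one option among several; the names `width`, `starts`, and `offsets` are illustrative:

```python
import numpy as np

a = np.arange(1, 15)                 # [1, 2, ..., 14]
width = 4

# Column vector of window starts plus row vector of offsets
# broadcasts to an (11, 4) grid of indices into a.
starts = np.arange(a.size - width + 1)[:, None]
offsets = np.arange(width)
windows = a[starts + offsets]
```

The result is a copy, so it uses `width` times the memory of the original; a strided view would avoid that at the cost of more care.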
What's wrong with this snippet of code?
import numpy as np
from scipy import stats
d = np.arange(10.0)
cutoffs = [stats.scoreatpercentile(d, pct) for pct in range(0, 100, 20)]
f = lambda x: np.sum(x > cutoffs)
fv = np.vectorize(f)
# why don't these two lines output the same values?
[f(x) for x in d] # => [0, 1, 2, 2, 3, 3, 4, 4, 5, 5]...
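Independently of what np.vectorize does here, the same counts can be computed with plain broadcasting, which sidesteps the wrapper entirely. In this sketch `np.percentile` stands in for `stats.scoreatpercentile` (an assumption; both interpolate linearly by default):

```python
import numpy as np

d = np.arange(10.0)

# np.percentile is used here in place of stats.scoreatpercentile.
cutoffs = np.percentile(d, [0, 20, 40, 60, 80])

# Compare every element of d against every cutoff, then count per row.
counts = (d[:, None] > cutoffs).sum(axis=1)
# counts -> [0, 1, 2, 2, 3, 3, 4, 4, 5, 5]
```

This matches the list-comprehension result shown above.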
Hey, folks.
So, I'm doing some Kmeans classification using numpy arrays that are quite sparse-- lots and lots of zeroes. I figured that I'd use scipy's 'sparse' package to reduce the storage overhead, but I'm a little confused about how to create arrays, not matrices.
I've gone through this tutorial on how to create sparse matrices:
h...
1) I am using scipy's hcluster module, so the variable that I have control over is the threshold variable.
How do I know my performance per threshold? In Kmeans, for example, this performance would be the sum of the distances from all the points to their centroids. Of course, this has to be adjusted, since more clusters generally means less distance.
Is there an obse...
Dear all,
There is a nonzero() method for the csr_matrix of the scipy library; however, trying to use that function on CSR matrices results in an error. According to the manual, it should return a tuple with row and column arrays. Any ideas on this problem?
Best regards,
Umut
...
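If `nonzero()` misbehaves in a given SciPy version, the row and column indices can also be read from the COO form of the matrix. A minimal sketch, with an illustrative 3 x 3 matrix; the mask drops any explicitly stored zeros:

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[0, 1, 0],
                  [2, 0, 0],
                  [0, 0, 3]])
A = csr_matrix(dense)

# COO format stores one (row, col, value) triple per stored entry.
coo = A.tocoo()
keep = coo.data != 0                 # ignore explicitly stored zeros
rows, cols = coo.row[keep], coo.col[keep]
```

The pair `(rows, cols)` has the same meaning as the tuple the manual describes for `nonzero()`.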
I have an input file consisting entirely of floating-point numbers to 4 decimal places.
i.e. 13359 0.0000 0.0000 0.0001 0.0001 0.0002 0.0003 0.0007 ...
(the first is the id).
My class uses the loadVectorsFromFile method, which multiplies these numbers by 10000 and then applies int() to them. On top of that, I also loop through each ...
Has anyone tried compiling SciPy 0.7.1 on Windows using numpy-1.3.0 that was built with the pre-built ATLAS libraries (atlas3.6.0_WinNT_P4SSE2.zip) linked in the installation document?
I get the following linker error, and have no idea how to fix this issue.
$ python setup.py config --compiler=mingw32 build --compiler=mingw32 i...
I have a problem where depending on the result of a random coin flip, I have to sample a random starting position from a string. If the sampling of this random position is uniform over the string, I thought of two approaches to do it: one using multinomial from numpy.random, the other using the simple randint function of Python standard...
Suppose I have a Python list or a 1-D numpy array. Assume that it contains a contiguous stretch of non-zero elements. How can I find the start and end coordinates (i.e. indices) of that stretch of non-zeros? For example,
a = [0, 0, 0, 0, 1, 2, 3, 4]
nonzero_coords(a) should return [4, 7]. For:
b = ...
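For the case of a single contiguous stretch, as in the example with `a`, the first and last positions returned by `numpy.flatnonzero` give the answer directly. This sketch assumes exactly that case; returning `None` for an all-zero input is an illustrative choice, not something the question specifies:

```python
import numpy as np

def nonzero_coords(values):
    """Return [start, end] indices of the non-zero stretch, or None."""
    idx = np.flatnonzero(np.asarray(values))
    if idx.size == 0:
        return None                   # assumed behavior for all-zero input
    return [int(idx[0]), int(idx[-1])]

a = [0, 0, 0, 0, 1, 2, 3, 4]
result = nonzero_coords(a)            # [4, 7]
```

If the input can contain several separate stretches, this only reports the outermost bounds.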
Often, I am building an array by iterating through some data, e.g.:
my_array = []
for n in range(1000):
    # do operation, get value
    my_array.append(value)
# cast to array
my_array = array(my_array)
I find that I have to first build a list and then cast it (using "array") to an array. Is there a way around this? All these casting c...
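When the number of elements is known beforehand, two common alternatives are to preallocate the array and fill it in place, or to build it from a generator with `numpy.fromiter`. In this sketch `compute_value` is a hypothetical stand-in for the "do operation, get value" step:

```python
import numpy as np

def compute_value(n):
    # Hypothetical stand-in for "do operation, get value".
    return n * 0.5

count = 1000

# Option 1: preallocate, then fill in place (no intermediate list).
my_array = np.empty(count)
for n in range(count):
    my_array[n] = compute_value(n)

# Option 2: build directly from a generator.
from_iter = np.fromiter((compute_value(n) for n in range(count)),
                        dtype=float, count=count)
```

If the final length is not known in advance, appending to a list and converting once at the end is usually still a reasonable choice.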
I'm trying to vectorize a for loop that I have inside of a class method. The for loop has the following form: it iterates through a bunch of points and depending on whether a certain variable (called "self.condition_met" below) is true, calls a pair of functions on the point, and adds the result to a list. Each point here is an element i...
Hi together,
I have a scipy.sparse.dok_matrix (dimensions m x n), wanting to add a flat numpy-array with length m.
for col in xrange(n):
    dense_array = ...
    dok_matrix[:,col] = dense_array
However, this code raises an exception in dok_matrix.__setitem__ when it tries to delete a nonexistent key (del self[(i,j)]).
So, for now ...
I am fitting a Gaussian kernel density estimator to a variable that is the difference of two vectors, called "diff", as follows: gaussian_kde_covfact(diff, smoothing_param) -- where gaussian_kde_covfact is defined as:
class gaussian_kde_covfact(stats.gaussian_kde):
    def __init__(self, dataset, covfact = 'scotts'):
        self.covfac...
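Later SciPy releases let `stats.gaussian_kde` take the bandwidth factor directly through its `bw_method` argument, which may make a subclass of this kind unnecessary. The sketch below assumes such a SciPy version; the random `diff` data and the value of `smoothing_param` are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Stand-in for the difference of two vectors.
diff = rng.normal(size=500) - rng.normal(size=500)

smoothing_param = 0.5     # illustrative scalar bandwidth factor
kde = stats.gaussian_kde(diff, bw_method=smoothing_param)

# Evaluate the estimated density on a small grid.
density = kde(np.linspace(-4.0, 4.0, 9))
```

Passing a scalar sets the covariance factor to that value; the strings `'scott'` and `'silverman'` select the usual rules of thumb instead.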