I have 8823 data points with x, y coordinates. I'm trying to follow the answer on how to represent a scatter dataset as a heatmap, but when I run the

X, Y = np.meshgrid(x, y)

instruction with my data arrays I get a MemoryError. I am new to numpy and matplotlib and am essentially trying to get this running by adapting the examples I can find.

Here's how I built my arrays from a file that has them stored:

import numpy as np

# Read tab-separated x, y pairs, one pair per line
Xf = []
Yf = []
with open('XY_Output.txt', 'r') as XY_File:
    for line in XY_File:
        fields = line.split('\t')
        Xf.append(float(fields[0]))
        Yf.append(float(fields[1]))
x = np.array(Xf)
y = np.array(Yf)

Is there a problem with my arrays? This same code worked when put into this example but I'm not too sure.

Why am I getting this MemoryError and how can I fix this?

+3  A: 

Your call to meshgrid requires a lot of memory -- it produces two 8823*8823 floating point arrays. Each of them is about 0.6 GB.
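To check that arithmetic without actually allocating anything, here is a small sketch. It assumes the arrays are 64-bit floats (NumPy's default for data parsed with float()); the variable names are only for illustration.

```python
import numpy as np

n = 8823  # number of points in x and in y

# meshgrid(x, y) returns two arrays, each of shape (n, n)
bytes_per_array = n * n * np.dtype(np.float64).itemsize
total_bytes = 2 * bytes_per_array

print(f"one array:   {bytes_per_array / 1e9:.2f} GB")
print(f"both arrays: {total_bytes / 1e9:.2f} GB")
```

On a 32-bit Python build, or a machine with little free RAM, an allocation of that size can plausibly fail with a MemoryError.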

But your screen can't show (and your eye can't really process) that much information anyway, so you should probably think of a way to smooth your data to something more reasonable like 1024*1024 before you do this step.
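One possible way to do that reduction is to bin the points into a coarse 2D histogram instead of calling meshgrid on the raw coordinates. This is only a sketch, not necessarily what the linked answer does: the random x and y below are stand-ins for the arrays loaded from the file, and the choice of 64 bins is an arbitrary assumption.

```python
import numpy as np

# Stand-in data (assumption): replace with the real x and y arrays
rng = np.random.default_rng(0)
x = rng.normal(size=8823)
y = rng.normal(size=8823)

# Count how many points fall into each cell of a 64 x 64 grid.
# The result is a small (64, 64) array, not an 8823 x 8823 one.
heatmap, xedges, yedges = np.histogram2d(x, y, bins=64)

print(heatmap.shape)       # grid size
print(int(heatmap.sum()))  # every point lands in exactly one cell
```

The counts can then be displayed with matplotlib, for example plt.imshow(heatmap.T, origin='lower', extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]]). The transpose is needed because histogram2d puts x along the first axis while imshow draws the first axis vertically.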

Andrew Jaffe
Am I not calculating `8823 * 8823 * 8 bytes = 600 MB` or so correctly? In any event, it's realistic that this 1.2GB could push the limits of a normal machine.
Mike Graham
So that's what's happening! I knew I didn't want an 8823x8823 image. What I want is to take all those data points and reflect their occurrence rate in an image, converting a scatter plot that would have many overlapping dots into a heatmap that shows a higher incidence in some areas. Do you mind taking a look at http://stackoverflow.com/questions/2369492/generate-a-heatmap-in-matplotlib-using-a-scatter-data-set and replying there too if you know how I could achieve this? I'm going to mark this answer as accepted because it explains the problem in my question.
greye
@Mike Of course you are. It is early here...
Andrew Jaffe