I am writing a program to simulate the daily polling data that companies like Gallup or Rasmussen publish: www.gallup.com and www.rassmussenreports.com

I'm using a brute-force method, where the computer generates some random daily polling data and then calculates three-day averages to see if the average of the random data matches the pollsters' numbers. (Most companies' poll numbers are three-day averages.)

Currently, it works well for one iteration, but my goal is to have it produce the most common simulation that matches the average polling data. I could then change the code to run anywhere from 1 to 1000 iterations.

And this is my problem. At the end of the test I have an array in a single variable that looks something like this:

[40.1, 39.4, 56.7, 60.0, 20.0 ..... 19.0]

The program currently produces one array for each correct simulation. I can store each array in a single variable, but I then have to have a program that could generate 1 to 1000 variables depending on how many iterations I requested!?

How do I avoid this? I know there is an intelligent way of doing this that doesn't require the program to generate variables to store arrays depending on how many simulations I want.

Thanks, Andy


Code testing for McCain:

import random

mctest = []
x = 0
while x < 5:
    test = round(100*random.random())
    mctest.append(test)
    x = x + 1

# Average of the first three days
mctestavg = (mctest[0] + mctest[1] + mctest[2])/3

# mcavg is real data
if mctestavg == mcavg[2]:
    mcwork = mctest
  • How do I repeat without creating multiple mcwork vars?
+2  A: 

Are you talking about doing this?

>>> a = [ ['a', 'b'], ['c', 'd'] ]
>>> a[1]
['c', 'd']
>>> a[1][1]
'd'
Nick Stinemates
So it is just an array of arrays? Makes me feel stupid.
andy
Don't feel stupid. Python is Python ;)
Nick Stinemates
+1  A: 

Lists in Python can contain any type of object. If I understand the question correctly, will a list of lists do the job? Something like this (assuming you have a function generate_poll_data() which creates your data):

data = []

for _ in range(num_iterations):
    data.append(generate_poll_data())

Then, data[n] will be the list of data from the (n+1)th run.
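
A minimal sketch of the idea, assuming a hypothetical generate_poll_data() that returns five random daily values (the helper body, the five-day length, and the iteration count are illustrative, not from the question):

```python
import random

def generate_poll_data(days=5):
    # Hypothetical helper: one simulated run of daily poll percentages.
    return [random.randint(0, 100) for _ in range(days)]

num_iterations = 1000  # assumed; any count works

# One list holds every run, so no per-run variable is needed.
data = [generate_poll_data() for _ in range(num_iterations)]

first_run = data[0]            # the list produced by the first run
last_run = data[-1]            # the list produced by the final run
day_two_of_first = data[0][1]  # second day's value in the first run
```

Changing num_iterations changes how many inner lists there are; the number of variables in the program stays the same.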

dF
s/nth/(n+1)th/ run. Runs are counted from 1 (first run, second run, etc).
J.F. Sebastian
D'oh! Fixed, thanks!
dF
+1  A: 

Since you are thinking in variables, you might prefer a dictionary over a list of lists:

data = {}
data['a'] = [generate_poll_data()]
data['b'] = [generate_poll_data()]

etc.
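
As a sketch, the keys could also be generated in a loop rather than typed by hand. Here the keys are run numbers, and generate_poll_data() is a hypothetical stand-in for the real simulation step:

```python
import random

def generate_poll_data(days=5):
    # Hypothetical stand-in for the real simulation step.
    return [random.randint(0, 100) for _ in range(days)]

data = {}
for run in range(1, 4):  # three runs, keyed 1..3 for illustration
    data[run] = generate_poll_data()

# Look up a run by its key instead of by a variable name.
second_run = data[2]
```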

Daren Thomas
Thinking in variables? What else could I think in? Some background: I know a little PASCAL and am using this as a project to learn Python.
andy
I'd prefer a Dict of lists to a list of lists also.
Corey Goldberg
+1  A: 

I would strongly consider using NumPy to do this. You get efficient N-dimensional arrays that you can quickly and easily process.
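
A rough sketch of what that could look like, assuming NumPy is installed. The array shape (runs by days), the seed, and the target value are illustrative assumptions, not real poll data:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

num_runs, days = 1000, 5        # assumed sizes
target = 45                     # placeholder for the published 3-day average

# Each row is one simulated run; each column is one day (integers 0..100).
sims = rng.integers(0, 101, size=(num_runs, days))

# Average of the first three days for every run at once.
three_day_avg = sims[:, :3].mean(axis=1)

# Keep only the runs whose rounded average equals the target.
matches = sims[np.round(three_day_avg) == target]
```

All runs live in one two-dimensional array, and the matching runs are selected with a boolean mask instead of a loop.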

Vinay
+2  A: 

Would something like this work?

from random import randint    

mcworks = []

for n in range(NUM_ITERATIONS):
    mctest = [randint(0, 100) for i in range(5)]
    if sum(mctest[:3])/3 == mcavg[2]:  # mcavg is real data
        mcworks.append(mctest)

In the end, you are left with a list of valid mctest lists.

What I changed:

  • Used a list comprehension to build the data instead of a for loop
  • Used random.randint to get random integers
  • Used slices and sum to calculate the average of the first three items
  • (To answer your actual question :-) ) Put the results in a list mcworks, instead of creating a new variable for every iteration
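
Since the stated goal was to find the most common simulation among the matches, one possible follow-up (a sketch, not something the question or this answer prescribes) is to count the stored lists with collections.Counter. Lists are not hashable, so each run is converted to a tuple first. The mcworks contents here are made-up sample values:

```python
from collections import Counter

# Made-up matching runs, standing in for a populated mcworks list.
mcworks = [
    [40, 45, 50, 10, 90],
    [44, 45, 46, 70, 30],
    [40, 45, 50, 10, 90],
]

# Tuples are hashable, so they can be counted.
counts = Counter(tuple(run) for run in mcworks)

# most_common(1) returns a list with one (item, count) pair.
most_common_run, times_seen = counts.most_common(1)[0]
print(list(most_common_run), times_seen)
```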
dF
Thanks! Much simpler than what I wrote
andy