ansaurus

Question

Python/Biomolecular Physics- Trying to code a simple stochastic simulation of a system exhibiting conditional behavior!

Answer 1

A:

I don't know the Gillespie algorithm, but I assume that you have checked that the program converges to the correct equilibrium. Therefore I interpret you questions as

"Here is a working physics program, how can I make it more pythonic"

It would probably be more pythonic to do something like the following

R = [ kAB * A * C,  kCD * C, kAB * A * C]
dt = [(-math.log(1-random.random())/x,i) for i,x in enumerate(R) if x > 0]
if dt:
     interval,transition = min(dt)
else:
     transition = None

If you want to use python in physics, then I suggest that you learn numpy. Because numpy is faster for many problems. So here are some untested parts of a numpy solution. Add the following to the header of you program

from numpy import log, array, isinf, zeros
from numpy.random import rand

Then you can replace the inside TransitionData with something like the following

R = array([ kAB * A * C,  kCD * C, kAB * A * C])
dt = -log(1-rand(3))/R
transition = dt.argmin()
interval = dt[transition]
if isinf(interval):
    transition = None

I don't know if it would be more pythonic to raise a StopIteration exception instead of returning None, but that is a detail.

You should also store your concentrations in a single data structure. If you use numpy, then I suggest that you use an array. Similarly yoy can use an array dABCD to store the changes in the concentration (You can probably come up with better variable names). Add the following code outside your loop

ABCD = array([A,B,C,D])

dABCD = zeros(3,4)
dABCD[0,0] = -1#A decreases when transition == 0
dABCD[0,1] = 1 #B increases when transition == 0
dABCD[1,2] = -1#C decreases when transition == 1
dABCD[1,3] = 1 #D increases when transition == 1
.....      etc

Now you can replace you main loop with something like the following

while t <= tEnd:
       print t, '\t', transition, '\t', ABCD
       transition, interval = transitionData(ABCD, kAB, kCD, kDC)
       if transition != None:
            t += interval
           ABCD += dABCD[transition,:]
       else:
           break;
else:
       print "Warning: Stopping due to timeout. The system did not equilibrate"

There is probably more to do. As an example dABCD should probably be a sparse array, but I hope that these ideas can be a start.

nielsle 2010-06-17 10:30:56

nielsle, you are the man. I really appreciate the input. I'm in the process of rewriting w/ numpy and matplotlib. Please check back tomorrow, as I'm sure I'll have some questions- they should be concise and answerable, scout's honor. Justin Peel, my thoughts on the monte carlo bit have been appended as an additional answer below (please comment, criticism is welcome!) Also, the 3rd 'answer' is some info on what the equations mean in the context of biophysics. Might be a little boring for most of you guys.

DocDubya 2010-06-17 23:44:33

Answer 2

A:

*Edit* I originally explained this wrong!!!! The following is correct though- Justin, this program uses a clever criteria to ‘weight’ each event. The RAB, RCD, and RDC values are all given a true/false parameter by multiplying kAB, kCD, and kDC by C or D, which in this case can be either one or zero. A zero value for D, and thus RDC would prevent dt[2] from being drawn in the

for n in range(len(dt)): if dt[n] > 0.0 and dt[n] < interval:

statement. Furthermore, the following-

    if transition == 1:
        C -= 1
        D += 1
    if transition == 2:
        C += 1
        D -= 1

dictates that when the event C->D occurs (transition 1), the next event necessarily must be D->C (transition 2), since of the three values in dt[], only dt[1] is nonzero and thus meets the aforementioned criteria. So, how are we weighting the likelihood that transition 0 or transition 1 occur? It's a little tricky, but is inherent in the following lines:

interval = 1e36
transition = -1
for n in range(len(dt)):  
    if dt[n] > 0.0 and dt[n] < interval:            
        interval = dt[n]            
        transition = n            
return transition, interval

"for n in range (len(dt)):" returns all values of the list dt[]. The next line specifies the criteria that must be met, namely that each value has to be greater than 0 and less than interval. For transition 0, interval is 1e36 (which is supposed to be representative of infinity). The rub is that interval is then set to transition 0, so for the second value in dt[], transition 1, the criteria states that it must be smaller than the dt for transition 0 in order to occur... or in other words that it must have happened faster to have happened at all, which agrees with chemical logic. My biggest concern is that the accumulated t values mandated by the "t += interval" line might not be entirely fair... because since t1 firing is independent of t0 firing, t0 firing and taking say, .1 sec, shouldn't exclude t1 from using the same .1 sec to fire... but I'm working on a fix for this... suggestions welcome! This is a verbose print out from the script, including a firing of transition 1 and 2:

Time Transition A B C D

dt0= 0.0350693547214
dt1= 2.26710773787
interval= 1e+36
dt= 0.0350693547214
transition= 0

0.0 0 100 0 1 0

dt0= 0.000339596342313
dt1= 0.21083283004
interval= 1e+36
dt= 0.000339596342313 
transition= 0

0.0350693547214 0 99 1 1 0

dt0= 0.0510125874767
dt1= 1.26127048627
interval= 1e+36 
dt= 0.0510125874767
transition= 0

0.0354089510637 0 98 2 1 0

dt0= 0.0809691957218
dt1= 0.593246425076
interval= 1e+36
dt= 0.0809691957218
transition= 0

0.0864215385404 0 97 3 1 0

dt0= 0.00205040633531
dt1= 1.70623338677
interval= 1e+36
dt= 0.00205040633531
transition= 0

0.167390734262 0 96 4 1 0

dt0= 0.106140534256
dt1= 0.0915160378053
interval= 1e+36
dt= 0.106140534256
transition= 0
interval= 0.106140534256
dt= 0.0915160378053
transition= 1

0.169441140598 1 95 5 1 0

dt2= 0.0482892532952
interval= 1e+36
dt= 0.0482892532952
transition= 2

0.260957178403 2 95 5 0 1

dt0= 0.112545351421
dt1= 1.84936696832
interval= 1e+36
dt= 0.112545351421
transition= 0

0.309246431698 0 95 5 1 0

Justin, I'm not sure what you mean by dt[2] being less than 1e36 making it 'stay' on transition 2? That doesn't happen because of the

if transition == 2:
            C += 1
            D -= 1

statement. Anyone know of a more direct way to accomplish this

Haha, let the flaming begin- you guys are awesome though-I really appreciate the feedback! Stackoverflow is soooooo legit.

DocDubya 2010-06-17 23:37:55

Answer 3

A:

Info on the math behind simple stochastic simulation of chemical rxns:

Typically, processes like this are simulated as discrete events with each event occurring with probability 'P' given a specific rate constant 'k' and a number of possible events 'n' in the interval of time 'dt': P=1-e**(-k*dt*n). Here we are neglecting the actual time of each event (~0) and focusing instead on the interval of time in which the event occurs. Anyone familiar with N choose K problems/bernouli trials will appreciate the presence of 1/e e.g. when N=K and N->oo, the probability of not choosing a specific element from N approaches 1/e. Hence, in a stochastic chemical reaction (first order), the probability that a molecule will not undergo reaction (not be chosen) is some power of 1/e... that power dependent on the time interval and rate constant as well as the number of molecules and rate constant in question. Conversely, 1-(1/e)^xyz gives the probability that any specific molecule will react (be chosen).

In terms of simulation, it would be logical to divide our total time interval into ever smaller intervals and use a random number generator to predict whether an event happened in a given time interval- e.g. if we divided the dt for a single even into 10 smaller intervals, a number between 0 and 0.1 would indicate an event occurred, while a number between .1 and 1.0 would indicate it did not. There is however uncertainty as to exactly when the event occurred- so we must make our intervals smaller- this quickly becomes a loosing battle as uncertainty persists with this method.

The solution to this problem is to take the natural log (‘ln’ here, log() by default in py) of both sides of the above equation and solving for dt, which gives dt= (-ln(1-P))/(k*n). The probability P is then randomly generated, giving a definitive dt for each event.

DocDubya 2010-06-17 23:46:33

ansaurus

tags:

views:

answers:

Python/Biomolecular Physics- Trying to code a simple stochastic simulation of a system exhibiting conditional behavior!

related questions