When dealing with Dirichlet Processes, according to [Teh, 2007], a DP is defined by a base probability distribution H and a scale factor "alpha".

According to the Stick Breaking Construction, the random draws G from a DP:

G~DP(alpha,H)

are given by:

G=sum(pi_k*delta_theta_k) over k from 1 to infinity

pi_k are the stick-breaking weights: successive pieces broken off a stick of unit length, where the proportion broken off at each step is a draw from a Beta distribution

delta_theta_k is a point mass centered at "theta_k" (theta_k are random draws from the base distribution H)
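
For reference, here is the construction as it is usually written. The break proportions beta_k and their Beta(1, alpha) distribution are the standard form of the stick-breaking construction; they are stated here as an assumption, since the summary above does not spell them out.

```latex
\begin{align*}
\beta_k &\sim \mathrm{Beta}(1, \alpha), & \theta_k &\sim H, \\
\pi_k &= \beta_k \prod_{l=1}^{k-1} (1 - \beta_l), & G &= \sum_{k=1}^{\infty} \pi_k \, \delta_{\theta_k}.
\end{align*}
```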

I have a fairly clear understanding of all the variables, but I do not know what they mean by "point mass". Is it the probability density of that draw, or is it something else?

It would be great if you could point me in any direction; even just a reference would be amazing.

Thanks

+1  A: 

There are several references on Google for this topic:

http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=Dirichlet#sclient=psy&hl=en&q=Dirichlet+Processes&aq=f&aqi=g1g-m4&aql=&oq=&gs_rfai=&pbx=1&fp=f4e2c44985951506

I checked out:

http://en.wikipedia.org/wiki/Dirichlet_process

which has some nice examples and a simple explanation of its parts.

There is also:

http://www.cs.cmu.edu/~kbe/dp_tutorial.pdf

which shows some great visual demonstrations of Dirichlet Processes.

Michael Eakins
Thanks, but I have already read and googled all of that; mine is a rather specific question.
Leon palafox
+1  A: 

G is a random probability distribution over some domain, let's call it BigTheta. The Dirichlet process itself is the probability distribution over such distributions, and G is a single draw from it.

Each theta_k is a draw from the base distribution H over BigTheta, so it is some element of BigTheta.

Each delta_theta_k is a probability distribution over BigTheta that puts probability 1 on theta_k and 0 on everything else: for any subset A of BigTheta, delta_theta_k(A) = 1 if theta_k is in A and 0 otherwise. This is what they call the 'point mass' distribution, because all the mass of the distribution sits on a single point of the domain. It is not the probability density of the draw.

G is then the weighted sum of these point masses: G(A) = sum (pi_k * delta_theta_k(A)), so G assigns probability pi_k to the point theta_k. If each theta parameterises some distribution f(theta), as in a mixture model, then G can also be read as a distribution over those distributions.
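
To make the point mass concrete, here is a minimal sketch of a truncated stick-breaking draw using only the Python standard library. The function names, the truncation level, the standard normal base distribution, and the Beta(1, alpha) break proportions are all assumptions chosen for illustration; the true construction has infinitely many atoms.

```python
import random


def stick_breaking(alpha, base_sampler, truncation, rng):
    """Truncated stick-breaking draw: returns (weights pi_k, atoms theta_k)."""
    weights, atoms = [], []
    remaining = 1.0  # length of the stick not yet broken off
    for _ in range(truncation):
        beta_k = rng.betavariate(1.0, alpha)  # assumed Beta(1, alpha) proportion
        weights.append(beta_k * remaining)    # pi_k: the piece broken off now
        atoms.append(base_sampler(rng))       # theta_k drawn from the base H
        remaining *= 1.0 - beta_k
    return weights, atoms


def point_mass(theta_k, region):
    """delta_theta_k(region): 1 if theta_k lies in [lo, hi), otherwise 0."""
    lo, hi = region
    return 1.0 if lo <= theta_k < hi else 0.0


def G(weights, atoms, region):
    """G(region) = sum over k of pi_k * delta_theta_k(region)."""
    return sum(w * point_mass(t, region) for w, t in zip(weights, atoms))


rng = random.Random(0)
# Hypothetical choices: alpha = 2, base distribution H = standard normal.
weights, atoms = stick_breaking(2.0, lambda r: r.gauss(0.0, 1.0), 200, rng)

# Mass of the whole real line (close to 1 when the truncation is long enough).
print(G(weights, atoms, (float("-inf"), float("inf"))))
# Mass G assigns to the non-negative half line; this differs from draw to draw.
print(G(weights, atoms, (0.0, float("inf"))))
```

Each point mass only answers "is theta_k in this region?", and the weight pi_k says how much probability G places on that point.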

I hope that helps. I think you generally have the right idea; it's just that the notation can get a little complicated (and SO isn't the best for this kind of notation). Whenever you encounter a symbol, it generally helps to think about what type of object it is, i.e. what it is defined over.

StompChicken
That was a really good answer, thanks. Sorry for not giving the full reference.
Leon palafox
No problem, good luck trying to understand Dirichlet processes - they sure confuse the hell out of me :)
StompChicken
By the way, asking these kinds of questions (i.e. discussions on NLP/Bayesian stats) at http://metaoptimize.com/qa/ might get better results.
StompChicken