I'd like to create a plot showing how concentrated resource creation is among the users of a web application. The plot would have % of resources on the y axis and % (percentile?) of users on the x axis. This feels like a cumulative distribution, but my experiments with the empirical CDF in the stats package aren't getting me what I want: that gives me the % of resources on the y axis, but the x axis is a scale from 1 to the number of users.

What I've done is follow the example plot(ecdf(user_counts)), where user_counts is a vector of resources created per user.

Does anyone know a better way to tackle this?

A: 

Sounds like you want a cumulative sum.

You could try plot(0:100/100, cumsum(sort(user_counts))/sum(user_counts))

Does that help?

Martin
Yes. While the array lengths in your solution weren't the same, I built on it to come up with plot(cumsum(rep(1,length(p)))/length(p), cumsum(sort(p))/sum(p)), which appears to build a chart of the type I want. Thanks!
jkebinger
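
For anyone following along, the approach in the comment above can be sketched as below. This is only an illustration: the sample data and the names pct_users and pct_resources are made up here, not from the original post. Both vectors are built from the same input, so their lengths always match.

```r
# Hypothetical example data: resources created per user (not from the original post)
user_counts <- c(1, 1, 2, 3, 5, 8, 40)

n <- length(user_counts)

# Cumulative share of users: 1/n, 2/n, ..., 1
# (equivalent to cumsum(rep(1, n)) / n)
pct_users <- seq_len(n) / n

# Cumulative share of resources, counting users from least to most active
pct_resources <- cumsum(sort(user_counts)) / sum(user_counts)

plot(pct_users, pct_resources, type = "l",
     xlab = "Cumulative share of users",
     ylab = "Cumulative share of resources")
abline(0, 1, lty = 2)  # reference line: every user creates the same amount
```

Sorting with sort(user_counts, decreasing = TRUE) instead would read as "the top x% of users created y% of resources", which may be closer to what is wanted in some cases.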
A: 

Try Lorenz curves. The "ineq" R package is a good start.

camcam2
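
A minimal sketch of the Lorenz curve suggestion, assuming the ineq package is installed (install.packages("ineq")). The sample data is hypothetical. Lc() computes the curve and Gini() gives a single-number summary of the concentration; the guard simply skips the example if the package is not available.

```r
# Hypothetical example data: resources created per user (not from the original post)
user_counts <- c(1, 1, 2, 3, 5, 8, 40)

if (requireNamespace("ineq", quietly = TRUE)) {
  # Lorenz curve: lc$p is the cumulative share of users,
  # lc$L is the cumulative share of resources
  lc <- ineq::Lc(user_counts)
  plot(lc, xlab = "Cumulative share of users",
       ylab = "Cumulative share of resources")

  # Gini coefficient: 0 means perfectly even, values near 1 mean highly concentrated
  gini <- ineq::Gini(user_counts)
  print(gini)
}
```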