I'm not sure if this is quite the right place, but it seems like a decent place to ask.
My current job involves manual analysis of large data sets (at several levels, each more refined and done by increasingly experienced analysts). About a year ago, I started developing some utilities to track analyst performance by comparing results a...
Hi,
I am looking for an easy and inexpensive solution for working with a large number of records that come from sensors and are saved in a MySQL database.
I would like to run statistical calculations and other heavy computations on these records.
The tool I am looking for will be used by researchers or engineers who are expert in math and stati...
I can't find the type of problem I have and I was wondering if someone knew the type of statistics it involves. I'm not sure it's even a type that can be optimized.
I'd like to optimize three variables, or more precisely the combination of two. The first is a Likert scale average; the other is the frequency of that item being rated on that...
Hey,
I have a sampling problem: for an analysis I have to calculate a revenue growth share. I found an article about sampling with negative numbers very helpful, since I have to work with a sum of revenues that is negative. The article suggests making all the numbers absolute, but unfortunately this does not solve my problem. Since I am...
Hi,
I'm a statistician by trade and I'd like some recommendations on how to set up a website that can collect data into a database. For personal use, I use Google Forms to collect data, and everything gets populated into a spreadsheet. However, this may not be appropriate in a more professional setting, especially when we have multipl...
Let's say I'm playing 10 different games. For each game, I know the probability of winning, the probability of tying, and the probability of losing (each game has different probabilities).
From these values, I can calculate the probability of winning X games, the probability of losing X games, and the probability of tying X games (for ...
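For reference, the "probability of winning X games" part can be sketched as below. This is only a minimal sketch and it assumes the games are independent of each other, which the question does not state. The function name count_distribution and the example probabilities are made up for illustration.

```python
def count_distribution(probs):
    """Return dist where dist[k] = P(exactly k of the events happen).

    Assumes the events are independent; probs[i] is the probability
    of the outcome of interest (win, tie, or loss) in game i.
    """
    dist = [1.0]  # before any game: zero successes with certainty
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this game is not a success
            nxt[k + 1] += mass * p      # this game is a success
        dist = nxt
    return dist


# Hypothetical win probabilities for three games.
win_probs = [0.5, 0.25, 0.8]
wins = count_distribution(win_probs)
for k, prob in enumerate(wins):
    print(f"P(win exactly {k} games) = {prob:.4f}")
```

Passing the tie or loss probabilities instead of the win probabilities gives the corresponding distribution for ties or losses.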
Is there a way that this can be improved, or done more simply?
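means.by <- function(data, INDEX) {
  # Column means of `data` within each group defined by INDEX
  b <- by(data, INDEX, function(d) apply(d, 2, mean))
  # Stack the per-group mean vectors into a groups-by-columns matrix
  return(structure(
    t(matrix(unlist(b), nrow = length(b[[1]]))),
    dimnames = list(names(b), col.names = names(b[[1]]))
  ))
}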
The idea is the same as a SAS MEANS BY statement. The function 'me...
I have Postfix on my server,
and the server sends about 5K emails daily.
I need some statistics about these emails in a web interface (web tool),
for example how many of them went to each domain (500 to @yahoo, 242 to @gmail, and so on),
and some other statistics.
I need something other than Postfix log-watch.
Thanks
...
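The per-domain count could be sketched roughly as follows. This is not a web tool, only the counting step, and it assumes the numbers are taken from the Postfix mail log, where delivery lines carry the recipient as to=<user@domain> and a status=sent field. The function name domain_counts and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Postfix delivery log lines carry the recipient as to=<user@domain>.
RECIPIENT = re.compile(r"to=<[^>@\s]+@([^>\s]+)>")


def domain_counts(log_lines):
    """Count successfully sent messages per recipient domain."""
    counts = Counter()
    for line in log_lines:
        match = RECIPIENT.search(line)
        # Only count deliveries that the log marks as sent.
        if match and "status=sent" in line:
            counts[match.group(1).lower()] += 1
    return counts


# Made-up sample lines, shaped like Postfix smtp delivery entries.
sample = [
    "postfix/smtp[101]: A1: to=<bob@yahoo.com>, relay=x, status=sent (250 ok)",
    "postfix/smtp[102]: A2: to=<amy@gmail.com>, relay=y, status=sent (250 ok)",
    "postfix/smtp[103]: A3: to=<joe@Yahoo.com>, relay=x, status=sent (250 ok)",
    "postfix/smtp[104]: A4: to=<sue@gmail.com>, relay=y, status=deferred (timeout)",
]
for domain, count in domain_counts(sample).most_common():
    print(domain, count)
```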
How to write a query suitable for generating an age pyramid like this:
I have a table with a DATE field containing each person's birthday and a BOOL field containing the gender (male = 0, female = 1). Either field can be NULL.
I can't seem to work out how to handle the birthdays and put them into groups of 10 years.
EDIT:
Ideally the X axis...
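The grouping into 10-year bins can be sketched outside SQL as below, just to show the arithmetic: compute the age in whole years, then integer-divide by 10. It is a sketch only. Skipping rows where either field is NULL is an assumption, and the names age_on, age_pyramid and the sample rows are made up for illustration.

```python
from collections import Counter
from datetime import date


def age_on(born, today):
    """Whole years between `born` and `today`."""
    before_birthday = (today.month, today.day) < (born.month, born.day)
    return today.year - born.year - before_birthday


def age_pyramid(people, today):
    """Count people per (decade bin, gender).

    `people` is an iterable of (birthday, gender) pairs; rows with a
    missing (None) field are skipped, which is an assumption.
    """
    counts = Counter()
    for born, gender in people:
        if born is None or gender is None:
            continue
        decade = age_on(born, today) // 10 * 10  # 0, 10, 20, ...
        counts[(decade, gender)] += 1
    return counts


# Made-up rows: (birthday, gender) with male = 0, female = 1.
rows = [
    (date(1985, 6, 1), 0),
    (date(1990, 1, 15), 1),
    (None, 1),
    (date(1972, 12, 31), None),
]
pyramid = age_pyramid(rows, today=date(2010, 10, 6))
for (decade, gender), count in sorted(pyramid.items()):
    print(f"{decade}-{decade + 9}", gender, count)
```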
I've got a list of timestamps (in ticks), and from this list I'd like to create another one that represents the delta time between entries.
Let's just say, for example, that my master timetable looks like this:
10
20
30
50
60
70
What I want back is this:
10
10
20
10
10
What I'm trying to accomplish here is detect that #3 in the ...
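A minimal sketch of the delta list, using the numbers from the example above. The function name deltas is made up for illustration.

```python
def deltas(ticks):
    """Return the differences between consecutive entries of `ticks`."""
    # Pair each entry with its successor and subtract.
    return [later - earlier for earlier, later in zip(ticks, ticks[1:])]


master = [10, 20, 30, 50, 60, 70]
print(deltas(master))
```

A list with fewer than two entries gives an empty result, since there is no pair to subtract.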
I have been unable to find this function in any of the standard packages, so I wrote the one below. Before throwing it toward the Cheeseshop, however, does anyone know of an already published version? Alternatively, please suggest any improvements. Thanks.
def fivenum(v):
"""Returns Tukey's five number summary (minimum, lower-hinge...
I have a MongoDB collection which has a created_at stored in each document. These are stored as a MongoDB date object e.g.
{ "_id" : "4cacda7eed607e095201df00", "created_at" : "Wed Oct 06 2010 21:22:23 GMT+0100 (BST)", text: "something" }
{ "_id" : "4cacdf31ed607e0952031b70", "created_at" : "Wed Oct 06 2010 21:23:42 GMT+0100 (BST)",...
I have an interesting conceptual problem, and I'm wondering if anyone can help me quantify it. Basically, I'm playing a set of games... and for each game I know the probability that I will win, the probability that I will tie, and the probability that I will lose (each game will have different probabilities).
At a high level, what I wa...
My actual problem is a bit more general than this, but here is a specific example. In basketball, you calculate free throw percentage as:
Free-Throw Percentage (FT%) = Free-Throws Made (FTM) / Free-Throws Attempted (FTA)
I have two teams, and for each team I have the mean and variance of the team's FTM and FTA, so I can model each as ...
I am looking for a Ruby gem or library that does logarithmic regression (curve fitting to a logarithmic equation). I've tried statsample (http://ruby-statsample.rubyforge.org/), but it doesn't seem to have what I'm looking for. Anybody have any suggestions?
...
Possible Duplicate:
Converting a Uniform Distribution to a Normal Distribution
Hello.
I'd like to know of any algorithm implemented in C which can take a random value between 0 and 1, the mean and standard deviation and then return a normally distributed result.
I have too little brainpower to figure this out for myself righ...
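One way to do this is the inverse CDF method: feed the uniform value into the inverse of the normal cumulative distribution function. The sketch below is in Python rather than C, purely to illustrate the idea, and relies on statistics.NormalDist from the standard library (Python 3.8 or later). The function name uniform_to_normal is made up for illustration.

```python
from statistics import NormalDist


def uniform_to_normal(u, mean, stddev):
    """Map a uniform value u in (0, 1) to a normally distributed value.

    Uses the inverse CDF (quantile function) of the normal distribution.
    The endpoints 0 and 1 are rejected because they map to -inf and +inf.
    """
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    return NormalDist(mean, stddev).inv_cdf(u)


# The median of the uniform input maps to the mean of the normal output.
print(uniform_to_normal(0.5, mean=10.0, stddev=2.0))
```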
Hello,
One of my applications is an engine that executes some complex calculations. These calculations may take several hours. I want to track the activity of this engine over time.
If you are using Hudson CI server, there is such a feature in Administration > Usages statistics option. Here is an example:
In my application, I alread...
I just can't find real, up-to-date info on which programming and scripting languages are most used today,
and in which environments, for example web, desktop, mobile.
Where can I find such info?
...
I'm trying to calculate the absolute deviation of a vector online, that is, as each item in the vector is received, without using the entire vector. The absolute deviation is the sum of the absolute difference between each item in a vector and the mean:
I know that the variance of a vector can be calculated in such a manner. Va...
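For context, the online variance calculation mentioned here is commonly done with Welford's algorithm. The sketch below shows that variance update only, not the absolute deviation being asked about. The class name RunningVariance is made up for illustration, and it returns the population variance (dividing by n), which is an assumption about the variant wanted.

```python
class RunningVariance:
    """Welford's online algorithm for the mean and variance."""

    def __init__(self):
        self.n = 0       # number of items seen so far
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # running sum of squared deviations from the mean

    def push(self, x):
        """Fold one new item into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the old deviation times the new deviation.
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Population variance of the items seen so far (0.0 if none)."""
        return self.m2 / self.n if self.n else 0.0


stats = RunningVariance()
for item in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.push(item)
print(stats.mean, stats.variance())
```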
use strict;
use warnings;
use Statistics::Descriptive;
use 5.012;
my @data = ( -2, 7, 7, 4, 18, -5 );
my $stat = Statistics::Descriptive::Full->new();
$stat->add_data(@data);
say ($stat->percentile(100) // "undef"); # return 18. OK.
say ($stat->percentile(0) // "undef"); # return undef instead of "-inf". see doc below
Statistics::Desc...