I've been working on generating Perlin noise for a map generator of mine. The problem I've run into is that the random noise is not distributed normally, and is more likely a normal distribution of kinds.
Given two integers X and Y, and a seed value, I do the following:
Use MurmurHash2 to generate a random number in (-1, 1). This is unifo...
Hello Everybody !
I want to use R to perform a stepwise linear regression using p-values as the selection criterion, e.g. at each step dropping the variable with the highest (i.e. the most insignificant) p-value, and stopping when all values are significant as defined by some threshold alpha.
I am totally aware that I should use the AIC (e.g. co...
I've got a gaming-oriented website with 200+ users. The site has a large database tracking user plays, and one of the motivations for continued participation is the extensive statistics and rankings (S&R) with which the site provides the user.
As the list of S&Rs tracked has grown, some of the more intricate calculations have been moved...
I'm doing this only for learning purposes. I've no intentions of reversing the methods of IMDB.
I asked myself: if I owned IMDB or a similar website, how would I compute the movie rating?
All I can think of is a weighted average (which is nothing but the arithmetic mean).
For the movie data provided below, the computation would be
(38591*10 + 27994*9...
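A minimal sketch of the vote-weighted mean described above, in Python. The function name `weighted_mean_rating` and the vote counts are made up for illustration; they are not the figures from the question, and this is not a claim about how IMDB actually computes ratings.

```python
def weighted_mean_rating(votes_by_score):
    """Return the vote-weighted mean of a {score: vote_count} mapping."""
    total_votes = sum(votes_by_score.values())
    if total_votes == 0:
        raise ValueError("no votes recorded")
    weighted_sum = sum(score * count for score, count in votes_by_score.items())
    return weighted_sum / total_votes


# Hypothetical vote counts (assumption), keyed by the score given.
example_votes = {10: 30, 9: 20, 8: 50}
rating = weighted_mean_rating(example_votes)
```

Each score is multiplied by the number of people who gave it, and the total is divided by the number of votes, which is the same as the arithmetic mean over all individual votes.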
Hi,
I'm trying to calculate how good my measurements are in machine learning!
Let's say that I have five choices, and that the errors are 4, 2, 0.002, 3, 6. Naturally, I will pick the third one as the hit, but I would like to say the following:
I'm X% certain that the hit is the third pick
I'm Y% certain that the hit is the first (last) pick
Of course, X>>Y but I ...
I have this feature_list that contains several possible values, say "A", "B", "C" etc. And there is time in time_list.
So I will have a loop where I will want to go through each of these different values and put it in a formula.
Something like for(i in ...) and then my_feature <- feature_list[i] and my_time <- time_list[i]
then I put ...
What is the most efficient way to collect and report performance statistics from an application?
If I have an application that uses a series of network APIs, and I want to report statistics at runtime, e.g.
Method doA() was called 3 times and consumed on avg 500ms
Method doB() was called 5 times and consumed on avg 1200ms et...
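One possible way to collect this kind of per-method call count and average duration is a timing decorator that accumulates into a shared table. The sketch below is in Python and everything in it (`timed`, `report`, `do_a`) is a hypothetical name chosen for illustration, not part of the application in the question.

```python
import time
from collections import defaultdict
from functools import wraps

# Maps function name -> [call_count, total_seconds].
_call_stats = defaultdict(lambda: [0, 0.0])


def timed(func):
    """Record how often func is called and how long the calls take in total."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = _call_stats[func.__name__]
            entry[0] += 1
            entry[1] += time.perf_counter() - start
    return wrapper


def report():
    """Return {name: (call_count, average_milliseconds)}."""
    return {
        name: (count, total / count * 1000.0)
        for name, (count, total) in _call_stats.items()
    }


@timed
def do_a():
    # Stand-in for a network call (assumption).
    time.sleep(0.001)


for _ in range(3):
    do_a()
```

Only a counter and a running total are kept per method, so the overhead per call stays small and the average is computed on demand in `report()`.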
I run a website for music students that allows them to stream a variety of content from a number of sources.
Our primary 'customers' are really the librarians at the institutions who subscribe to our service, so it's important to them to see the actual usage of the service, but they want more information than simple web analytics are ab...
Hi, I recently finished my Mac OS X application, and I'm struggling to monitor the statistics of my app. I'm wondering if there's a service like Google Analytics for application distribution? It would be great if they provided hosting too.
thanks
...
I think I'm getting a scoping error when using transformBy(), part of the doBy package for R. Here is a simple example of the problem:
> library(doBy)
>
> test.data = data.frame(
+ herp = c(1,2,3,4,5),
+ derp = c(2,3,1,3,5)
+ )
>
> transformData = function(data){
+
+ five = 5
+
+ transformBy(
+ ~ herp,
+ data=data,
+ sum=he...
I am at the final stages of my website, and currently I need to find a suitable statistics application/tool.
I have looked into webalizer, but it seems outdated.
Also, I have looked into Google analytics, but I am afraid that if I implement it, my website will go slow. It is already pretty heavy with database material being displayed w...
I'm looking for something that I guess is rather sophisticated and might not exist publicly, but hopefully it does.
I basically have a database with lots of items which all have values (y) that correspond to other values (x). Eg. one of these items might look like:
x | 1 | 2 | 3 | 4 | 5
y | 12 | 14 | 16 | 8 | 6
This is just a rando...
Under the user generated posts on my site, I have an Amazon-like rating system:
Was this review helpful to you: Yes | No
If there are votes, I display the results above that line like so:
5 of 8 people found this reply helpful.
I would like to sort the posts based upon these rankings. If you were ranking from most helpful to ...
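One commonly suggested approach for ranking by "helpful" votes is to sort on the lower bound of the Wilson score interval rather than on the raw ratio, so that a post with 1 of 1 votes does not outrank one with 90 of 100. The Python sketch below assumes a 95% confidence level (`z = 1.96`); the function name and the sample posts are hypothetical.

```python
import math


def wilson_lower_bound(helpful, total, z=1.96):
    """Lower bound of the Wilson score interval for helpful/total votes."""
    if total == 0:
        return 0.0
    p = helpful / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * total)) / total)
    return (centre - margin) / denom


# Hypothetical posts as (name, helpful_votes, total_votes).
posts = [("a", 5, 8), ("b", 1, 1), ("c", 90, 100)]
ranked = sorted(posts, key=lambda post: wilson_lower_bound(post[1], post[2]), reverse=True)
ranked_names = [name for name, _, _ in ranked]
```

Posts with few votes get a wide interval and therefore a low lower bound, which pushes them below posts with a similar ratio but more evidence.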
Hello,
I would like to know the current status of the statistical modules on CPAN. Does anyone know of a recent review, or could anyone comment on their likes/dislikes with those modules?
I have used the classical ones: Statistics::Descriptive, Statistics::Distributions, and some others contained in Bundle::Math::Statistics
Some of the ...
A statistical accumulator allows one to perform incremental calculations. For instance, to compute the arithmetic mean of a stream of numbers given at arbitrary times, one could make an object which keeps track of the current number of items given, n, and their sum, sum. When one requests the mean, the object simply returns sum/n.
An ...
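The mean accumulator described above can be sketched in a few lines of Python. The class name `MeanAccumulator` and its method names are assumptions made for illustration; the fields `n` and `sum` follow the description.

```python
class MeanAccumulator:
    """Incrementally tracks the arithmetic mean of a stream of numbers."""

    def __init__(self):
        self.n = 0      # number of items seen so far
        self.sum = 0.0  # running sum of those items

    def add(self, value):
        """Feed one more value into the accumulator."""
        self.n += 1
        self.sum += value

    def mean(self):
        """Return sum/n for the values seen so far."""
        if self.n == 0:
            raise ValueError("mean of an empty accumulator is undefined")
        return self.sum / self.n


acc = MeanAccumulator()
for x in (2, 4, 9):
    acc.add(x)
current_mean = acc.mean()
```

The object stores only two numbers regardless of how many values have been fed in, which is the point of the accumulator idea.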
Hi,
I crawled some blogs for my project and extracted a few features, like length of the document, in links, out links. Each of these blogs talks about some specific subject and there can be numerous articles on each subject, and I need to decide at most one or two important blogs for each subject. How can I assign weights to these feat...
Hi
I'm about to generate some statistics based on the values of a MySQL table. I would like to generate some numbers for each month of the year and for each day of the month.
I could of course do all this manually but that doesn't seem like a good approach :)
So, does anybody have some ideas on how I can generate these statistics?
OBS. I would...
Hi,
How do you decide on a test statistic, and its likely distribution, while developing a test for random number testing? How do you calculate and decide on the formula for calculating a p-value for the test statistic's distribution?
TIA
...
I'm examining some biological data which is basically a long list (a few million values) of integers, each saying how well this position in the genome is covered. Here is a graphical example for a data set:
I would like to look for "valleys" in this data, that is, regions which are significantly lower than their surrounding environmen...
I have been using MATLAB for my work, but I have started learning Python lately. I employ statistical analysis, more precisely geostatistics, in my work. I wanted to ask, from your perspectives, which of the two languages is better for statistical analysis? What are the pros and cons, other than accessibility, of each?
...