statistics

Calculating a moving average in F#

I'm still working on grokking the F# thing: trying to work out how to 'think' in F# rather than just translating from other languages I know. I've recently been thinking about the cases where you don't have a 1:1 map between before and after, cases where List.map falls down. One example of this is moving averages, where typically you w...
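To illustrate why a moving average is not a 1:1 map, here is a minimal sketch in Python rather than F#. It assumes a simple (unweighted) average over full windows only, so n inputs give n - window + 1 outputs; the function name and the choice to skip partial windows are assumptions for illustration. In F# itself, Seq.windowed is the usual starting point for the same idea.

```python
from collections import deque
from typing import Iterable, Iterator


def moving_average(values: Iterable[float], window: int) -> Iterator[float]:
    """Yield the mean of each full run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque()   # the values currently inside the window
    total = 0.0     # running sum of buf, updated incrementally
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            total -= buf.popleft()   # drop the value that left the window
        if len(buf) == window:
            yield total / window


print(list(moving_average([1, 2, 3, 4, 5], 3)))
```

Five inputs with a window of 3 give three averages, which is exactly the shape mismatch that a plain map cannot express.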

Graduate Level Degree for Simulation/Statistics/Prediction?

I am wondering if anyone has any insight into this. I am thinking of going to grad school to get some computer science related degree. I have always been intrigued by people who are working on problems using statistical packages or simulation to solve problems. What would I study to get a good breadth of knowledge of these things? Do the...

Calculate exact result of complex throw of two D30

Okay, this has bugged me for several years now. If you sucked at statistics and higher math at school, turn away now. Too late. Okay. Take a deep breath. Here are the rules. Take two thirty-sided dice (yes, they do exist) and roll them simultaneously. Add the two numbers. If both dice show <= 5 or >= 26, throw again and add the result to...

Does sp_updatestats cause tables to be inaccessible in SQL Server 2005?

Does updating statistics cause tables to be inaccessible? In other words, can you run this procedure without downtime? Specifically for SQL Server 2005 ...

Algorithm for sampling without replacement?

I am trying to test the likelihood that a particular clustering of data has occurred by chance. A robust way to do this is Monte Carlo simulation, in which the associations between data and groups are randomly reassigned a large number of times (e.g. 10,000), and a metric of clustering is used to compare the actual data with the simulat...
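One common way to sample without replacement is a partial Fisher-Yates shuffle, sketched below. This is an illustration under assumptions, not necessarily what the question ended up using: the function name and the optional `rng` parameter are made up here, and Python's standard library already offers `random.sample` for the same job.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_without_replacement(
    items: Sequence[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return k distinct positions' worth of items, each subset equally likely."""
    if not 0 <= k <= len(items):
        raise ValueError("k must be between 0 and len(items)")
    rng = rng or random.Random()
    pool = list(items)   # copy so the caller's data is left untouched
    n = len(pool)
    for i in range(k):
        # Pick one of the not-yet-chosen slots i..n-1 and swap it into slot i.
        j = rng.randrange(i, n)
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:k]


print(sample_without_replacement(range(10), 3, random.Random(0)))
```

With k equal to the number of items this yields a full random permutation, which is the reassignment step a Monte Carlo permutation test would repeat many times.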

Statistical Analysis of Server Logs - Correctness of Extrapolation

We had an ISP failure for about 10 minutes one day, which unfortunately occurred during a hosted exam that was being written from multiple locations. Unfortunately, this resulted in the loss of postback data for candidates' current page in progress. I can reconstruct the flow of events from the server log. However, of 317 candidates, ...

Java Statistics Package? (Markov Chains and advanced distributions)

Hi guys, I'm having trouble searching for a decent Java library that provides Markov chains and other advanced distributions (as in, statistics). I've found http://sourceforge.net/projects/hydra-mcmc/ on SourceForge, and it looks somewhat usable, but does anyone know of / use a more up-to-date package? (Haven't really had a trove thr...

Open source or free financial analysis programs/libraries

I'm looking for something with functions similar to MATLAB’s financial and financial derivatives toolboxes, but I don’t have the cash to spend on MATLAB. I would appreciate any info on free or open source libraries or programs that will let me easily calculate interest rates, risk, etc. ...

Where do I get stats about Internet usage?

Where do I get stats about Internet usage? Number of Internet users, domains most visited, emails sent each day, etc. ...

Statistics engine for Java EE Web Application

We are working on a Java EE Web Application, and the people from marketing need some really detailed stats for our site. Something similar to Google Analytics, gathering the user's information, and their navigation through the site (where they come from, what they click, where they go, etc.). Depending on a third party service like Anal...

Filter out deviating records with SQL

We have a set of data for which we need the average of a column. A select avg(x) from y does the trick. However, we need a more accurate figure. I figured that there must be a way of filtering out records that have either too-high or too-low values (spikes) so that we can exclude them when calculating the average. ...
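One simple way to define "too high or too low" is distance from the mean measured in standard deviations. The sketch below shows that idea in Python rather than SQL; the cutoff k = 2, the function name, and the use of the population standard deviation are all assumptions for illustration, and other definitions of a spike (percentiles, median-based measures) may suit the data better.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def drop_spikes(values: Sequence[float], k: float = 2.0) -> List[float]:
    """Keep only values within k standard deviations of the overall mean."""
    mu = mean(values)
    sigma = pstdev(values)   # population standard deviation of all values
    if sigma == 0:
        return list(values)  # all values identical, nothing to drop
    return [v for v in values if abs(v - mu) <= k * sigma]


readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 500]
kept = drop_spikes(readings)
print(mean(readings), mean(kept))
```

Note that a large spike inflates both the mean and the standard deviation used for the cutoff, so a single pass like this can miss smaller outliers sitting next to a huge one.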

As a programmer, how many hours a day on average do you spend at a computer?

I am wondering how many programmers here are spending more than 10 hrs (on average) in front of a computer, regardless of what they do while they are on the computer. No programmer/gamer doesn't like using their computer at home. But I love it and my average is 11 hrs/day. Hopefully I am not alone. ...

Tool for analyzing log4net logs

Is there a tool that can be used to analyze log4net logs? In particular, I would like to extract two method calls by thread ID and analyze the duration between the two, to create some statistics of call duration. Plus, this needs to work over multiple (100x10Mb) files. I suppose grep would also do it. ...

How many C# vs. VB developers are out there?

As a company it's important to choose which .NET language to go with. Many have chosen C# but are there any actual numbers out there to support going with C# over VB.NET? How many C# developers are out there vs. VB devs? I know that a good developer will be able to work with any language but the choice of language might dissuade a perso...

What applications can I pass data to as it's generated and have them compute statistics on it?

The basic requirement is to pass some command type and execution time (possibly other data as well, but that's the basic data we're concerned with at the moment) from C# code to something (either managed code or something that can take data periodically from the command line) and have it perform some statistical analysis on it: avg time for each command type...

How to decode Google gclids

Now, I realise the initial response to this is likely to be "you can't" or "use analytics", but I'll continue in the hope that someone has more insight than that. Google AdWords with "autotagging" appends a "gclid" (presumably "google click id") to the link that sends you to the advertised site. It appears in the web log since it's a query ...

Reliable way to see process-specific perf statistics on an IIS6 app pool

In perfmon in Windows Server 2003, there are counter objects to get per-process processor time and memory working set statistics. The only problem is that in an environment with multiple application pools, there is no way to reliably identify the correct worker process. In perfmon, they are all called "w3wp", and if there is more than on...

Computing a 95% confidence interval on the sum of n i.i.d. exponential random variables.

Let's in fact generalize to a c-confidence interval. Let the common rate parameter be a. (Note that the mean of an exponential distribution with rate parameter a is 1/a.) First find the cdf of the sum of n such i.i.d. random variables. Use that to compute a c-confidence interval on the sum. Note that the maximum likelihood estimate (MLE...
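For reference, the sum of n i.i.d. exponentials with rate a follows an Erlang (gamma with integer shape) distribution, whose cdf is 1 - sum over k = 0..n-1 of exp(-a x) (a x)^k / k!. The sketch below evaluates that cdf and inverts it by bisection. It assumes the rate a is known and that an equal-tailed interval is wanted; the function names are hypothetical, and this is one possible reading of the question rather than its accepted answer.

```python
import math
from typing import Tuple


def erlang_cdf(x: float, n: int, a: float) -> float:
    """P(S <= x) where S is the sum of n i.i.d. Exponential(rate=a) variables."""
    if x <= 0:
        return 0.0
    term = math.exp(-a * x)   # k = 0 term of the series
    tail = term
    for k in range(1, n):
        term *= a * x / k     # turns the (k-1) term into the k term
        tail += term
    return 1.0 - tail


def erlang_quantile(p: float, n: int, a: float) -> float:
    """Smallest x with cdf(x) >= p, found by bisection (requires 0 < p < 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    lo, hi = 0.0, n / a       # start the bracket at the mean n/a
    while erlang_cdf(hi, n, a) < p:
        hi *= 2.0             # grow until the bracket contains the quantile
    for _ in range(200):      # fixed number of halvings, ample for doubles
        mid = (lo + hi) / 2.0
        if erlang_cdf(mid, n, a) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def equal_tailed_interval(c: float, n: int, a: float) -> Tuple[float, float]:
    """Interval holding probability c, with (1 - c)/2 cut from each tail."""
    alpha = 1.0 - c
    return erlang_quantile(alpha / 2, n, a), erlang_quantile(1 - alpha / 2, n, a)


print(equal_tailed_interval(0.95, 10, 2.0))
```

For very large a * x the leading exp(-a x) factor underflows to zero, so a production version would want a log-space or library implementation of the gamma cdf.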

Web Usage Statistics

What is a good site to check current web usage statistics, particularly Java version, OS, and browser? I have been trying to figure out Google Zeitgeist because it supposedly has this information, but I can't find it. ...

Fitting polynomials to data

Is there a way, given a set of values (x,f(x)), to find the polynomial of a given degree that best fits the data? I know polynomial interpolation, which is for finding a polynomial of degree n given n+1 data points, but here there are a large number of values and we want to find a low-degree polynomial (find best linear fit, best quadr...
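The usual technique for this is a least-squares fit. Below is a minimal pure-Python sketch that builds the normal equations for a polynomial of the requested degree and solves them by Gaussian elimination; the function name is an assumption, and for real work a library routine such as numpy.polyfit is the more robust choice, since normal equations become ill-conditioned as the degree grows.

```python
from typing import List, Sequence


def polyfit_least_squares(
    xs: Sequence[float], ys: Sequence[float], degree: int
) -> List[float]:
    """Return c[0..degree] minimizing sum_i (ys[i] - sum_j c[j] * xs[i]**j)**2."""
    m = degree + 1
    if len(xs) != len(ys) or len(xs) < m:
        raise ValueError("need matching xs/ys and at least degree + 1 points")
    # Augmented normal equations: row r is sum_j c[j] * S(r + j) = sum_i y_i * x_i**r,
    # where S(p) = sum_i x_i**p.
    aug = [
        [sum(x ** (r + j) for x in xs) for j in range(m)]
        + [sum(y * x ** r for x, y in zip(xs, ys))]
        for r in range(m)
    ]
    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for j in range(col, m + 1):
                aug[r][j] -= factor * aug[col][j]
    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        rest = sum(aug[r][j] * coeffs[j] for j in range(r + 1, m))
        coeffs[r] = (aug[r][m] - rest) / aug[r][r]
    return coeffs


xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x * x for x in xs]
print(polyfit_least_squares(xs, ys, 2))
```

Coefficients come back lowest power first, so a degree-1 call gives the intercept and slope of the best linear fit.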