statistics

Algorithm to "smooth out" data values for visualization

I'm reading some data for countries around the world and am playing with Google's visualization gadgets, in particular the map visualizations. The problem is that the US always comes out way in front: while most countries have values between 1 and 50, the US consistently has a value of 2000+. This means that in the visualization, it's hard ...
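
One common way to tame an outlier like this is to plot the values on a logarithmic scale instead of a linear one. The sketch below is only an illustration, assuming every value is strictly positive; the function name `log_scale`, the 0 to 100 output range, and the sample figures are hypothetical and are not taken from the question.

```python
import math


def log_scale(values, out_min=0.0, out_max=100.0):
    """Map strictly positive values onto [out_min, out_max] on a log10 scale."""
    logs = {key: math.log10(v) for key, v in values.items()}
    lo, hi = min(logs.values()), max(logs.values())
    if hi == lo:
        # All values are identical, so there is nothing to spread out.
        return {key: out_min for key in logs}
    span = hi - lo
    return {
        key: out_min + (out_max - out_min) * (lg - lo) / span
        for key, lg in logs.items()
    }


# Hypothetical sample data, shaped like the situation described above.
sample = {"US": 2000, "DE": 50, "FR": 20, "NZ": 1}
scaled = log_scale(sample)
for country, value in sorted(scaled.items(), key=lambda kv: kv[1]):
    print(f"{country}: {value:.1f}")
```

On a linear 0 to 100 scale a value of 50 would sit at about 2.5 next to 2000; on the log scale it lands near the middle of the range, so the smaller countries stay distinguishable.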

Occurrence prediction

I'd like to know what method is best suited for predicting event occurrences. For example, given a set of data from 5 years of malaria infection occurrences and several other factors that affect the occurrences, I'd like to predict the next five years for malaria infection occurrences. What I thought of doing was to derive a kind of occ...

What's a good data model for cross-tabulation?

I'm implementing a cross-tabulation library in Python as a programming exercise for my new job, and I've got an implementation of the requirements that works but is inelegant and redundant. I'd like a better model for it, something that allows a nice, clean movement of data between the base model, stored as tabular data in flat files, a...

How do I implement a real-time *financial* statistics engine from SQL Server data for dashboard display?

We currently use Excel automation to calculate time series statistics and store the results in our SQL Server 2008 database for easy display/sorting/etc. later. I'm currently redesigning the home screen of our app to present the most important information (as identified by the team using the app) in dashboard form. I'd like the display...

How to generate a pseudo-random positive definite matrix with constraints on the off-diagonal elements?

The user wants to impose a unique, non-trivial, upper/lower bound on the correlation between every pair of variables in a var/covar matrix. For example: I want a variance matrix in which all variables have 0.9 > |rho(x_i,x_j)| > 0.6, rho(x_i,x_j) being the correlation between variables x_i and x_j. Thanks. Ok, something of a quick&di...

Best way to store "views" of a topic

I use this code to update the views of a topic: UPDATE topics SET views = views + 1 WHERE id = $id The problem is that users like to spam F5 to get ridiculous numbers of views. What should I do to count only unique hits? Make a new table where I store the IP? I don't want to store it in cookies; it's too easy to clear your cookies. ...
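
The "new table where I store the IP" idea can be sketched as below. This is a minimal illustration using Python's built-in `sqlite3` module with an in-memory database, not the asker's actual stack; the `topic_views` table and the `record_view` helper are hypothetical names. A composite primary key on (topic, IP) lets the database reject repeat hits, and the counter is bumped only when a new row was actually inserted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE topics (
        id    INTEGER PRIMARY KEY,
        views INTEGER NOT NULL DEFAULT 0
    );
    -- Hypothetical helper table: one row per (topic, IP) pair.
    CREATE TABLE topic_views (
        topic_id INTEGER NOT NULL,
        ip       TEXT    NOT NULL,
        PRIMARY KEY (topic_id, ip)
    );
    INSERT INTO topics (id) VALUES (1);
    """
)


def record_view(conn, topic_id, ip):
    """Count a view only the first time this IP is seen for this topic."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO topic_views (topic_id, ip) VALUES (?, ?)",
        (topic_id, ip),
    )
    is_new = cur.rowcount == 1  # 0 when the pair already existed
    if is_new:
        conn.execute(
            "UPDATE topics SET views = views + 1 WHERE id = ?", (topic_id,)
        )
    conn.commit()
    return is_new


first = record_view(conn, 1, "203.0.113.7")    # new visitor
repeat = record_view(conn, 1, "203.0.113.7")   # F5 spam, ignored
other = record_view(conn, 1, "203.0.113.8")    # different visitor
views = conn.execute("SELECT views FROM topics WHERE id = 1").fetchone()[0]
```

Keying on the IP alone undercounts visitors behind a shared address, so treat this as a starting point rather than an exact unique-visitor count.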

Pinch Media - Core Location Optional

Will using PinchMedia and including Core Location frameworks make it unusable on the iPod Touch which doesn't have GPS? If so, is there a way to minimize this dependency since my actual application doesn't care? It would be nice to see, but basically I'm trying to use it to provide feedback from users on things they'd like to see improve...

Adding line separator to Gridview

Hello, I’m relatively new to ASP.NET and SQL, so what I’m asking may be a simple question for some, but not for me. What I have is a Grid View that I’m trying to populate with softball hitting statistics. In it I’ve stacked yearly statistics on top of career totals at the very bottom of it. I’ve accomplished this by doing a...

Is there any generic statistical board software?

I was thinking about monitoring the evolution of a single value (nope, this is not the SO rep :-p), and I'd like to have some nice histograms about it. My needs are simple: daily / weekly / monthly / yearly evolution histograms; daily / weekly / monthly / yearly calculation for max, min and average value. Ideally the product should b...

Where to find browser statistics?

I know about the w3schools statistics, but they differ a lot from the ones on my company's site. ...

"On-line" (iterator) algorithms for estimating statistical median, mode, skewness, kurtosis?

Is there an algorithm to estimate the median, mode, skewness, and/or kurtosis of a set of values, but that does NOT require storing all the values in memory at once? I'd like to calculate the basic statistics: mean: arithmetic average variance: average of squared deviations from the mean standard deviation: square root of the varianc...
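
For the moment-based statistics (mean, variance, skewness, kurtosis), one well-known single-pass approach is to update running central moments as each value arrives, in the style of Welford's algorithm extended to the third and fourth moments. The sketch below is only an illustration; the class name `RunningMoments` is hypothetical, it reports population (not sample) statistics, and it does not cover median or mode, which generally need an approximation scheme instead.

```python
import math


class RunningMoments:
    """Single-pass mean, variance, skewness, and excess kurtosis.

    Only five numbers are kept in memory, regardless of how many
    values have been pushed.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of (x - mean)^2
        self.m3 = 0.0  # running sum of (x - mean)^3
        self.m4 = 0.0  # running sum of (x - mean)^4

    def push(self, x):
        n1 = self.n
        self.n += 1
        n = self.n
        delta = x - self.mean
        delta_n = delta / n
        delta_n2 = delta_n * delta_n
        term1 = delta * delta_n * n1
        self.mean += delta_n
        # Higher moments must be updated before the lower ones they use.
        self.m4 += (term1 * delta_n2 * (n * n - 3 * n + 3)
                    + 6 * delta_n2 * self.m2 - 4 * delta_n * self.m3)
        self.m3 += term1 * delta_n * (n - 2) - 3 * delta_n * self.m2
        self.m2 += term1

    def variance(self):
        """Population variance (divides by n)."""
        return self.m2 / self.n

    def std_dev(self):
        return math.sqrt(self.variance())

    def skewness(self):
        return math.sqrt(self.n) * self.m3 / self.m2 ** 1.5

    def kurtosis(self):
        """Excess kurtosis (0 for a normal distribution)."""
        return self.n * self.m4 / (self.m2 * self.m2) - 3.0


acc = RunningMoments()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    acc.push(value)  # values could just as well come from an iterator
print(acc.mean, acc.variance(), acc.skewness(), acc.kurtosis())
```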

KenKen puzzle addends: REDUX A (corrected) non-recursive algorithm.

This question relates to those parts of the KenKen Latin Square puzzles which ask you to find all possible combinations of ncells numbers with values x such that 1 <= x <= maxval and x(1) + ... + x(ncells) = targetsum. Having tested several of the more promising answers, I'm going to award the answer-prize to Lennart Regebro, because: ...
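
For reference, the problem as stated can be solved by brute force without recursion. The sketch below is not the algorithm from the question or from any of its answers; it simply enumerates every non-decreasing tuple with `itertools.combinations_with_replacement` and keeps those with the right sum. The function name `addends` is hypothetical, and repeated values are assumed to be allowed within a combination.

```python
from itertools import combinations_with_replacement


def addends(ncells, maxval, targetsum):
    """All non-decreasing tuples of ncells values in 1..maxval summing to targetsum."""
    return [
        combo
        for combo in combinations_with_replacement(range(1, maxval + 1), ncells)
        if sum(combo) == targetsum
    ]


print(addends(2, 6, 7))  # pairs from 1..6 that add up to 7
print(addends(3, 4, 6))  # triples from 1..4 that add up to 6
```

This visits every candidate tuple, so it is fine for KenKen-sized cages but scales poorly; a smarter algorithm would prune partial sums that can no longer reach the target.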

most readable programming language to simulate 10,000 chutes and ladders game plays?

I'm wondering what language would be most suitable to simulate the game Chutes and Ladders (Snakes and Ladders in some countries). I'm looking to collect basic stats, like average and standard deviation of game length (in turns), probability of winning based on turn order (who plays first, second, etc.), and anything else of interest you...

Probability of observing sequence of 7 of the same (heads or tails) in 100 coin flipping trials?

Inspired by a Radiolab podcast: what ways are there to compute the probability of observing 7 heads (or 7 tails) in a row when flipping a coin 100 times? ...
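
One exact way to compute this is dynamic programming over the length of the current run. The sketch below assumes a fair coin and reads the question as "at least one run of 7 or more identical outcomes, heads or tails"; the function name `prob_run_of_same` is hypothetical. It tracks the probability of each run length that has not yet reached the target, and whatever probability mass is left over at the end is the chance that no such run occurred.

```python
def prob_run_of_same(n_flips=100, run_len=7):
    """P(at least one run of run_len identical outcomes in n_flips fair flips)."""
    if n_flips < run_len:
        return 0.0
    if run_len <= 1:
        return 1.0
    # state[k] = P(current run has length k + 1 and no run of run_len so far)
    state = [0.0] * (run_len - 1)
    state[0] = 1.0  # after the first flip the run length is always 1
    for _ in range(n_flips - 1):
        new = [0.0] * (run_len - 1)
        new[0] = 0.5 * sum(state)        # flip differs: run restarts at 1
        for k in range(1, run_len - 1):
            new[k] = 0.5 * state[k - 1]  # flip matches: run grows by 1
        # Mass in the last state that matches again reaches run_len
        # and is dropped from the table.
        state = new
    return 1.0 - sum(state)


p = prob_run_of_same(100, 7)
print(f"{p:.4f}")
```

The table has only `run_len - 1` entries, so the cost is proportional to `n_flips * run_len`, with no need to enumerate the 2^100 possible sequences.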

SQL Server 2005 Query Statistics

Where can I find some in-depth information on tuning statistics in SQL Server 2005? I need to really delve into what statistics are being used in a number of different queries, how they are interacting with indexes, how/when/where to use custom statistics (over and above what the database tuning advisor recommends), when/how to update ...

Static Variables in R

I have a function in R that I call multiple times. I want to keep track of the number of times that I've called it and use that to make decisions on what to do inside of the function. Here's what I have right now: f = function( x ) { count <<- count + 1 return( mean(x) ) } count = 1 numbers = rnorm( n = 100, mean = 0, sd = 1 ) fo...

Statistic solution

Hi, I need a solution for the following problem: Let's say I'm running a "branch" website, where some shops can publish their data, like addresses, opening hours etc. I want every shop to have a login where it can look up some statistical data, such as unique visitors for example... My question now: Can you think of the easiest way...

Classifying English words into rare and common

Hi all, I'm trying to devise a method that will be able to classify a given number of English words into 2 sets - "rare" and "common" - the reference being to how much they are used in the language. The number of words I would like to classify is bounded - currently at around 10,000, and includes everything from articles, to proper noun...

Tracking visitor stats with Ruby on Rails

Are there any visitor statistics solutions for Ruby on Rails? I'm talking something like Google Analytics, but without passing data through a third party. I'd like to track such parameters as visitor count, visit depth, bounce rate, referer (by host or by GET parameter), etc. ...