statistics

Cake Comparison Algorithm

This is literally about comparing cakes. My friend is having a cupcake party with the goal of determining the best cupcakery in Manhattan. Actually, it's much more ambitious than that. Read on. There are 27 bakeries, and 19 people attending (with maybe one or two no-shows). There will be 4 cupcakes from each bakery, if possible incl...

Getting a grasp of how many people use my software

We have a very small, specialized user base. No community. My boss wants to find out who is using it. And his approach is to simply make a hidden connection, maybe an auto update function, enabled by default WITHOUT notification when there is no update ... I don't really like the idea and try to come up with something different. There i...

Step-by-Step How-to on Mediation Analysis in R

I'd like to know if anybody can provide a step-by-step how-to on running mediation analysis with Keele, Tingley, Yamamoto and Imai's mediation package. I think there are two approaches to this - the classic Baron and Kenny (1986) and the new one by Preacher, Rucker and Hayes (2007) - I'd like to know how to do both approaches in R ...

Compute statistical significance with Excel

Hello, I have 2 columns and multiple rows of data in Excel. Each column represents an algorithm, and the values in the rows are the results of these algorithms with different parameters. I want to run a statistical significance test on these two algorithms with Excel. Can anyone suggest a function? As a result, it will be nice to state somet...

How to set alpha in R?

I have this example from the coin package of R: library(coin) library(multcomp) ### Length of YOY Gizzard Shad from Kokosing Lake, Ohio, ### sampled in Summer 1984, Hollander & Wolfe (1999), Table 6.3, page 200 YOY <- data.frame(length = c(46, 28, 46, 37, 32, 41, 42, 45, 38, 44, 42, 60, 32, 42, ...

How do you implement Velicer's MAP criterion in R?

I'm looking at the psych package and the VSS tutorial, do I simply replace VSS with MAP? Like this: MAP(x, n = 8, rotate = "varimax", diagonal = FALSE, fm = "pa", n.obs=NULL,plot=TRUE,title="Very Simple Structure",...) or is there another way to do this? I'm doing factor analysis right now and I'm using the elbow method on a scree pl...

Overall Title for Plotting Window

If I create a plotting window in R with m rows and n columns, how can I give the "overall" graphic a main title? For example, I might have three scatterplots showing the relationship between GPA and SAT score for 3 different schools. How could I give one master title to all three plots, such as, "SAT score vs. GPA for 3 schools in CA"? ...

Plot time data in R to various resolutions (to the minute, to the hour, to the second, etc.)

I have some data in CSV like: "Timestamp", "Count" "2009-07-20 16:30:45", 10 "2009-07-20 16:30:45", 15 "2009-07-20 16:30:46", 8 "2009-07-20 16:30:46", 6 "2009-07-20 16:30:46", 8 "2009-07-20 16:30:47", 20 I can read it into R using read.csv. I'd like to plot: Number of entries per second, so: "2009-07-20 16:30:45", 2 "2009-07-20 16:...
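The per-second count described above can be sketched by truncating each timestamp to the wanted resolution and counting rows per bucket. This is a minimal sketch in Python rather than R, using only the six rows quoted in the question; the names `RESOLUTIONS` and `entries_per_bucket` are hypothetical, and the minute and hour buckets are an assumption based on the question title, since the excerpt is cut off.

```python
from collections import Counter
from datetime import datetime

# The six sample rows quoted in the question: (timestamp, count).
rows = [
    ("2009-07-20 16:30:45", 10),
    ("2009-07-20 16:30:45", 15),
    ("2009-07-20 16:30:46", 8),
    ("2009-07-20 16:30:46", 6),
    ("2009-07-20 16:30:46", 8),
    ("2009-07-20 16:30:47", 20),
]

# Hypothetical mapping from a resolution name to the format that
# truncates a timestamp to that resolution.
RESOLUTIONS = {
    "second": "%Y-%m-%d %H:%M:%S",
    "minute": "%Y-%m-%d %H:%M",
    "hour": "%Y-%m-%d %H",
}


def entries_per_bucket(rows, resolution):
    """Count how many rows fall into each time bucket."""
    bucket_format = RESOLUTIONS[resolution]
    counts = Counter()
    for timestamp, _count in rows:
        parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        counts[parsed.strftime(bucket_format)] += 1
    # Sort by bucket label so the result is in chronological order.
    return dict(sorted(counts.items()))


per_second = entries_per_bucket(rows, "second")
per_minute = entries_per_bucket(rows, "minute")
```

The same bucketing idea carries over to R by truncating the parsed timestamps before tabulating them; the resulting counts can then be passed to any plotting function.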

Algorithm to Match Time Dependent (1D) Signals

Hi, I was wondering if someone could point me to an algorithm/technique that is used to compare time dependent signals. Ideally, this hypothetical algorithm would take in 2 signals as inputs and return a number that would be the percentage similarity between the signals (0 being that the 2 signals are statistically unrelated and 1 being...

Manager game: How to calculate market values?

Hello! Usually players in a soccer manager game have market values. The managers sell their players in accordance with these market values. They think: "Oh, the player is worth 3,000,000 so I'll try to sell him for 3,500,000". All players have three basic qualities: strength value (1-99) maximal strength they can ever attain (1-99) mo...

How can I display charts with an ASP.NET control on a web page?

I want to display a chart on my web page for which I have the source table (for X & Y axis values) in SQL Server. ...

Identifying accurate matches from SQL Server Full Text Search

Hello I am using SQL Server 2008 Full Text Search, and joining to the FreeTextTable to determine ranking of results. How do I determine whether the result set is giving an accurate match or not? For example, for one search I may get these results: Manufacturer | Rank =================== LG U300 ------- 102 LG C1100 ------ 54 LG GT50...

git find fat commit

Is it possible to get info about how much space is wasted by the changes in every commit, so I can find commits which added big files or a lot of files? This is all to try to reduce git repo size (rebasing and maybe filtering commits) ...

What's the most widespread monitoring protocol/library?

Hi, I need to expose certain monitoring statistics from my application and I'm wondering what the most widespread framework or protocol is for doing this? ...

What is a "good" R value when comparing 2 signals using cross correlation?

I apologize for being a bit verbose in advance: if you want to skip all the background mumbo jumbo you can see my question down below. This is pretty much a follow up to a question I previously posted on how to compare two 1D (time dependent) signals. One of the answers I got was to use the cross-correlation function (xcorr in MATLAB), ...
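To make the quantity being asked about concrete, here is a minimal sketch of a normalized cross-correlation coefficient at a given lag, written in plain Python. It is not a reproduction of MATLAB's xcorr: the function names `normalized_xcorr` and `best_lag`, the lag sign convention, and the choice to mean-center each overlapping window are all assumptions made for illustration.

```python
import math


def normalized_xcorr(x, y, lag):
    """Pearson-style correlation between the overlapping parts of x and y.

    Assumed convention: a negative lag compares x[i] with y[i - lag],
    i.e. it tests whether y is a delayed copy of x.
    """
    if lag >= 0:
        a, b = x[lag:], y[:len(y) - lag]
    else:
        a, b = x[:len(x) + lag], y[-lag:]
    n = min(len(a), len(b))
    if n < 2:
        return 0.0
    a, b = a[:n], b[:n]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # a constant window carries no correlation information
    return sum(p * q for p, q in zip(da, db)) / denom


def best_lag(x, y, max_lag):
    """Return the lag in [-max_lag, max_lag] with the highest coefficient."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda k: normalized_xcorr(x, y, k))


# Illustrative data: y is x delayed by two samples.
x = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0]
y = [0, 0, 0, 0, 1, 2, 3, 2, 1, 0]
lag = best_lag(x, y, 4)
peak = normalized_xcorr(x, y, lag)
```

Under this normalization the coefficient always lies between -1 and 1, so a peak near 1 means the two signals have the same shape at that lag up to scale and offset. What counts as "good" below that still depends on the noise level and the application.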

Simple way to calculate median with MySQL

What's the simplest (and hopefully not too slow) way to calculate the median with MySQL? I've used AVG(x) for finding the mean, but I'm having a hard time finding a simple way of calculating the median. For now, I'm returning all the rows to PHP, doing a sort, and then picking the middle row, but surely there must be some simple way of d...
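The client-side approach described above (fetch all rows, sort, pick the middle) can be sketched as follows. This is an illustration in Python rather than PHP; the function name `median` and the choice to average the two middle values for an even row count are assumptions, since the question does not say how ties should be handled.

```python
def median(values):
    """Median by sorting and picking the middle element.

    For an even number of values this averages the two middle ones,
    which is an assumption; the question does not specify this case.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_case = median([3, 1, 2])
even_case = median([4, 1, 3, 2])
```

Sorting costs O(n log n) and requires transferring every row to the client, which is the overhead the question hopes to avoid by computing the median inside the database.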

Creating an endless Iterator with a given distribution

Given a java.util.Collection what is the easiest way to create an endless java.util.Iterator which returns those elements such that they show up according to a given distribution (org.apache.commons.math.distribution)? ...
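The idea of an endless iterator that draws elements according to a distribution can be sketched with a generator. This is a minimal sketch in Python rather than Java, and it swaps the org.apache.commons.math distribution object for an explicit list of weights; the name `endless_weighted` and the `seed` parameter are hypothetical.

```python
import itertools
import random


def endless_weighted(items, weights, seed=None):
    """Yield items forever, each drawn independently with the given weights."""
    items = list(items)
    if len(items) != len(weights):
        raise ValueError("items and weights must have the same length")
    rng = random.Random(seed)  # private generator so a seed is reproducible
    while True:
        yield rng.choices(items, weights=weights, k=1)[0]


# The generator never terminates, so take a finite slice to inspect it.
stream = endless_weighted(["a", "b", "c"], [0.7, 0.2, 0.1], seed=0)
sample = list(itertools.islice(stream, 1000))
```

Because the generator has no end, callers should bound it with something like `itertools.islice` or break out of the loop explicitly.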

Formulas in user-defined functions in R

Formulas are a very useful feature of R's statistical and graphical functions. Like everyone, I am a user of these functions. However, I have never written a function that takes a formula object as an argument. I was wondering if someone could help me, by either linking to a readable introduction to this side of R programming, or by givi...

best way to statistically detect anomalies in data

Hi, our webapp collects a huge amount of data about user actions, network business, database load, etc. All data is stored in warehouses and we have quite a lot of interesting views on this data. If something odd happens, chances are it shows up somewhere in the data. However, to manually detect if something out of the ordinary ...

Algorithm that takes 2 'similar' matrices and 'aligns' one to another

First of all, the title is very bad, due to my lack of a concise vocabulary. I'll try to describe what I'm doing and then ask my question again. Background Info Let's say I have 2 matrices of size n x m, where n is the number of experimental observation vectors, each of length m (the time series over which the observations were collect...