statistics

Oracle AUTOTRACE alternative

I need to collect statistics for my long SQL scripts. The script files are generated by a Java application and executed by a fast third-party DB driver. Because of this I can't use AUTOTRACE, since it's not SQL*Plus. But for performance reasons I need statistics for every statement in the script. Can you advise on approaches or best practices? I...

python statistical analysis

Given 15 players - 2 Goalkeepers, 5 defenders, 5 midfielders and 3 strikers, and the fact that each has a value and a score, I want to calculate the highest scoring team for the money I have. Each team must consist of 1 GK then a formation e.g. 4:4:2, 4:3:3 etc. I started with sample data such as this player role points cost I then...
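With only 15 players the search space is tiny, so one possible approach is plain brute force over every squad that fits the formation. The sketch below is an illustration only: the tuple layout `(name, role, points, cost)`, the role labels, and the function name `best_team` are assumptions, not the asker's data or code.

```python
from itertools import combinations, product

def best_team(players, formation, budget):
    """Brute-force the highest-scoring affordable team.

    players:   list of (name, role, points, cost) tuples (hypothetical layout)
    formation: dict mapping role -> number of players required for that role
    budget:    maximum total cost
    Returns (points, cost, team) or None if no team fits the budget.
    """
    by_role = {role: [p for p in players if p[1] == role] for role in formation}
    # One iterator of k-player combinations per role.
    choices = [combinations(by_role[role], k) for role, k in formation.items()]
    best = None
    for parts in product(*choices):
        team = [p for part in parts for p in part]
        cost = sum(p[3] for p in team)
        if cost > budget:
            continue
        points = sum(p[2] for p in team)
        if best is None or points > best[0]:
            best = (points, cost, team)
    return best
```

A 4:4:2 formation would be passed as `{"GK": 1, "DEF": 4, "MID": 4, "FWD": 2}`; with the squad sizes in the question that is only 2 * 5 * 5 * 3 = 150 candidate teams per formation.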

inter-rater reliability

Which inter-rater reliability methods are appropriate for ordinal or interval data? ...

example algorithm for generating random value in dataset with normal distribution?

I'm trying to generate some random numbers with simple non-uniform probability to mimic lifelike data for testing purposes. I'm looking for a function that accepts mu and sigma as parameters and returns x where the probability of x being within certain ranges follows a standard bell curve, or thereabouts. It needn't be super precise or ev...
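A minimal sketch of such a function in Python, assuming the standard library is acceptable. `random.gauss` already takes mu and sigma; the second function shows the Box-Muller transform in case the generator has to be built from uniform values. The function names are made up for illustration.

```python
import math
import random

def sample_normal(mu, sigma, rng=random):
    """One draw from a normal distribution, delegating to the standard library."""
    return rng.gauss(mu, sigma)

def sample_normal_box_muller(mu, sigma, rng=random):
    """One draw from a normal distribution built from two uniform values."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log() never sees 0
    u2 = rng.random()
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mu + sigma * z
```

Passing a seeded `random.Random` instance as `rng` makes the test data reproducible.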

T-SQL: A better sliding distribution function/query

I need a T-SQL ranking approach similar to the one provided by NTILE(), except that the members of each tile would be on a sliding distribution so that higher ranking tiles have fewer members. For example CREATE TABLE #Rank_Table( id int identity(1,1) not null, hits bigint not null default 0, PERCENTILE smallint null ) --Slant the dist...

Is there any free multi purpose development server?

There are so many tools out there. You can do so many things around development that it is a full-time job on its own. So why not integrate these features / tools into one powerful server application? Is there a server which integrates (some of) these features: static code analysis, automated builds (e.g. through Maven), continuous integratio...

Algorithm to determine most popular article last week, month and year?

I'm working on a project where I need to sort a list of user-submitted articles by their popularity (last week, last month and last year). I've been mulling this over for a while, but I'm not a great statistician, so I figured I could maybe get some input here. Here are the variables available: Time [date] the article was originally publ...

Migrating from Stata to Python

Some coworkers who have been struggling with Stata 11 are asking for my help to try to automate their laborious work. They mainly use 3 commands in Stata: tsset (sets a time series analysis) as in: tsset year_column, yearly varsoc (Obtain lag-order selection statistics for VARs) as in: varsoc column_a column_b vec (vecto...

selecting a random element of the power set.

For a problem that I'm working on right now, I would like a reasonably uniform random choice from the powerset of a given set. Unfortunately this runs right into statistics which is something that I've not studied at all (something that I need to correct now that I'm getting into real programming) so I wanted to run my solution past some...
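One common way to get a uniform choice from the power set, sketched here under the assumption that the set fits in memory: include each element independently with probability 1/2. Every one of the 2^n subsets then has probability 2^-n. The function name is hypothetical and this is not the asker's solution, which is cut off above.

```python
import random

def random_subset(items, rng=random):
    """Pick a uniformly random element of the power set of `items`.

    Each element is kept with probability 1/2, independently of the others,
    so every subset (including the empty set) is equally likely.
    """
    return [x for x in items if rng.random() < 0.5]
```

Note that this is uniform over subsets, not over subset sizes: mid-sized subsets are far more numerous, so they come up more often.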

Calculating Pearson correlation and significance in Python

I am looking for a function that takes as input two lists, and returns the Pearson correlation, and the significance of the correlation. I am using Python. Thank you very much. Ariel ...
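If SciPy is available, `scipy.stats.pearsonr(x, y)` returns the correlation coefficient together with a two-sided p-value. As a dependency-free illustration, the sketch below computes r directly and estimates significance with a permutation test; that is a swapped-in technique, not the t-distribution based p-value SciPy reports. The function names and the default `n_perm` are assumptions.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    # Raises ZeroDivisionError if either sequence is constant.
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided p-value estimate: share of shuffles with |r| at least as large."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```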

Probability Code Problem

When I switch out isWinDefault with isWinConfidence I get drastically different results. I felt they should be the same. Is there a bug in my code or a property of statistics that I've misunderstood? This test is meant to simulate flipping a single coin once vs. flipping a coin many times. The question is: if P(x) is 70%, then should p(x) * 100 be...

How to get PHP page load time statistics?

Recently we've been having problems with our LAMP setup and we started to see the number of MySQL database connections spike up every now and then. We suspect that some MySQL operation is taking longer than usual and Apache has started to build up a backlog of connections to deal with incoming requests. The question is, is there a way to per...

Generating samples from the logistic distribution

I am working on some statistical code and exploring different ways of creating samples from random distributions, starting from a random number generator that generates uniform floating point values from 0 to 1. I know it is possible to generate approximate samples from a normal distribution by adding together a sufficiently large numbe...
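For the logistic distribution the CDF, F(x) = 1 / (1 + exp(-(x - mu) / s)), can be inverted in closed form, so inverse transform sampling works directly on a uniform value u: x = mu + s * ln(u / (1 - u)). A minimal sketch, with the function name and the parameter names `mu` (location) and `s` (scale) chosen for illustration:

```python
import math
import random

def sample_logistic(mu=0.0, s=1.0, rng=random):
    """One draw from a logistic distribution via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0, which would give log(0)
        u = rng.random()
    return mu + s * math.log(u / (1.0 - u))
```

The samples should have mean mu and variance s^2 * pi^2 / 3, which gives a quick sanity check on a large batch.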

algorithm to pick set of winners using different weights

Hello, I'm attempting to design an algorithm that does the following. Input: I've a set of keys (total n) that are mapped to set of properties. The properties contain the weight for each property and the value for the property. Output: Identify a set of keys that are qualified (total k) based on the set of properties and their ...

R: Two dimensional non-parametric regression

What packages and functions in R can perform a two-dimensional non-additive local regression/smooth? For example, consider: b<-seq(-6*pi,6*pi,length=100); xy<-expand.grid(b,b); x=xy[[1]]; y=xy[[2]]; z= sin(x)+cos(y) + 2*sin(x)*cos(y); contour(b,b,matrix(z,100,100)) What functions could estimate this? ...

Calculating Percentiles on the fly

I'm programming in Java. Every 100 ms my program gets a new number. It has a cache which contains the history of the last n = 180 numbers. When I get a new number x I want to calculate how many numbers there are in the cache which are smaller than x. Afterwards I want to delete the oldest number in the cache. Every 100 ms I want to rep...
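The steps described above (count the smaller values, add the new one, evict the oldest) could be sketched as follows. The question is about Java; this illustration uses Python, and the class and method names are made up. It keeps the arrival order in a deque and a second, sorted copy for binary search, which is one possible design and plenty for n = 180.

```python
from bisect import bisect_left, insort
from collections import deque

class SlidingRank:
    """Fixed-size history that reports how many stored values are below x."""

    def __init__(self, size=180):
        self.size = size
        self.window = deque()    # values in arrival order
        self.sorted_values = []  # the same values, kept sorted

    def push(self, x):
        """Return the count of cached values strictly smaller than x,
        then store x and drop the oldest value once the cache is full."""
        smaller = bisect_left(self.sorted_values, x)
        self.window.append(x)
        insort(self.sorted_values, x)
        if len(self.window) > self.size:
            oldest = self.window.popleft()
            del self.sorted_values[bisect_left(self.sorted_values, oldest)]
        return smaller
```

Dividing the returned count by the current cache size gives the percentile rank of each new value.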

Recommendations for PHP & MySQL Statistics Reporting for a WebApp

Background: I have 'inherited' a PHP webapp in my small company and, after years of nagging, have finally gotten the go-ahead to throw away the spaghetti code and start again. We want to log every action that is made in the system, for example: user X viewed item Y; user X updated item Y; new item Y on city Z; and later provide graphs on diffe...

How to view Execution Plans for Oracle Database in Java

With SQL*Plus for Oracle Database, I can call SET autotrace on and then see the execution plan, statistics, etc. The problem is that I want access to information about the execution plan and statistics in my Java program. I have typically done something like this to execute a SQL statement: Connection connection = //INITIALIZE HERE; St...

Statistics - Daily Mode from MySQL table with numbers and date

I need some help writing a MySQL query that grabs the mode (most frequently occurring number) for each day. The data set looks something like this:
datetime             number
2010-01-01 00:35:00  500
2010-01-01 01:15:10  500
2010-01-02 01:25:10  1500
2010-01-02 01:35:10  50
2010-01-03 12:35:50  100
2010-01-05 05:25:10  2500
(etc) ...
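The question asks for a MySQL query; as a language-neutral illustration of the grouping logic, here is a small Python sketch that groups rows by calendar day and counts each number. The function name and the choice to return every tied value are assumptions, since the excerpt does not say how ties should be handled.

```python
from collections import Counter, defaultdict
from datetime import datetime

def daily_modes(rows):
    """Map each day (YYYY-MM-DD) to its most frequent number(s).

    rows: iterable of (datetime_string, number) pairs in
    'YYYY-MM-DD HH:MM:SS' format. Ties return all tied numbers, sorted.
    """
    per_day = defaultdict(Counter)
    for stamp, number in rows:
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").date()
        per_day[day][number] += 1
    result = {}
    for day, counts in per_day.items():
        top = max(counts.values())
        result[day.isoformat()] = sorted(n for n, c in counts.items() if c == top)
    return result
```

In SQL the same idea is a count per date and number, followed by picking the highest count within each date.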

Statistical analysis for PHP

I am trying to write a reporting system for a project I'm working on that analyzes people's pre- and post-test scores and does some analysis on them. I have searched to the best of my ability, but cannot find a PHP class / set of functions that seems to do the analysis I need. Currently, I am dumping the dataset to a CSV, and one of my co...