Here is the real world issue that we are solving. We have some rather large data sets that need to be aggregated and summarized in real time with a number of filters and formulas applied to them. It works fine to apply these to each record in real time when the data set is less than 50,000 records but as we approach 100,000 and then 100+...
I am using Octave and I would like to use the anderson_darling_test from the Octave forge Statistics package to test if two vectors of data are drawn from the same statistical distribution. Furthermore, the reference distribution is unlikely to be "normal". This reference distribution will be the known distribution and taken from the hel...
I need to evaluate the effectiveness of algorithms which predict the probability of something occurring.
My current approach is to use "root mean squared error", i.e., the square root of the mean of the squared errors, where the error is 1.0 - prediction if the event occurred, or prediction if the event did not occur.
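The error measure described above can be sketched as follows. This is a minimal illustration, not the asker's code: the function name `rmse` and its two list arguments are assumptions made for the example. The mean of the squared errors (before taking the square root) is the quantity usually called the Brier score.

```python
from math import sqrt

def rmse(predictions, outcomes):
    """Root mean squared error of probability forecasts.

    predictions: predicted probabilities in [0, 1] (hypothetical input shape)
    outcomes:    booleans, True if the event occurred
    """
    if len(predictions) != len(outcomes) or not predictions:
        raise ValueError("need two non-empty sequences of equal length")
    # Error is (1 - p) when the event occurred, p when it did not.
    squared = [((1.0 - p) if occurred else p) ** 2
               for p, occurred in zip(predictions, outcomes)]
    return sqrt(sum(squared) / len(squared))
```

A forecaster who always predicts 0.5 scores 0.5 under this measure, and a perfect forecaster scores 0.0.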
The algorithms have n...
I have a few hundred network devices that check in to our server every 10 minutes. Each device has an embedded clock, counting the seconds and reporting elapsed seconds on every check in to the server.
So, sample data set looks like
CheckinTime Runtime
2010-01-01 02:15:00.000 101500
2010-01-01 02:25:00.000 102100
2010-...
How do news outlets like Google News automatically classify and rank documents about emerging topics, like "Obama's 2011 budget"?
I've got a pile of articles tagged with baseball data like player names and relevance to the article (thanks, OpenCalais), and would love to create a Google News-style interface that ranks and displays new po...
I'm working on a project now that's rather unlike anything I've done before. I have two tests with binary results that will be administered to the same sample, which is drawn from a clustered population (i.e., some subjects will be from the same family). I'd like to compare proportions of positive test results, but the clustering makes...
I have data points that represent a logarithmic function.
Is there an approach where I can just estimate the function that describes this data using R?
Thanks.
...
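One common reading of "estimate the function" is a least-squares fit of y = a + b*ln(x). The question asks about R, where `lm(y ~ log(x))` fits this form; the sketch below shows the same calculation in Python so the arithmetic is visible. The function name `fit_log` and the model form are assumptions, since the excerpt does not show the data.

```python
from math import log

def fit_log(xs, ys):
    """Least-squares fit of y = a + b*ln(x); returns (a, b).

    Assumes every x is positive. Hypothetical helper for illustration.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    ts = [log(x) for x in xs]          # linearise: regress y on t = ln(x)
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    b = sxy / sxx                      # slope on the log scale
    a = mean_y - b * mean_t            # intercept
    return a, b
```

If the data follow a different form (for example y = a*ln(b*x)), the model would need to be adjusted accordingly.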
Hi,
I update indexes with a full scan weekly. So when I run:
SELECT name AS index_name,
STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated
FROM sys.indexes
Ref: link text
I expect it to show me that all indexes were updated this weekend. But there are several records which look like:
index_name StatsUpdated
clust 2005-10-14 01:36:2...
I previously asked this question which was useful in plotting a function. I want to try and plot twenty functions on the same axes to illustrate how a function varies between two ranges. I have successfully done this using individually specified functions, but I wanted to do this using a loop.
What I have attempted doing is:
## add gg...
I'm having some memory problems with an application, but it's a bit difficult to figure out exactly where it is. I have two sets of data:
Pageviews
The page that was requested
The time said page was requested
Memory use
The amount of memory being used
The time this memory use was recorded
I'd like to see exactly which pageviews...
I'm doing a bit more statistical analysis on some things lately, and I'm curious if there are any programming languages that are particularly good for this purpose. I know about R, but I'd kind of prefer something a bit more general-purpose (or is R pretty general-purpose?).
What suggestions do you guys have? Are there any languages o...
I'm sure this is an issue anyone who uses Stata for publications or reports has run into: how do you conveniently export your output to something that can be parsed by a scripting language or Excel?
There are a few ADO files that do this for specific commands (try findit tabout or findit outreg2). But what about exporting the output of ...
I want to display some data with errors in an HTML page, for example:
(value, error) -> string
(123, 12) -> (12 +- 1) x 10^1
(4234.3, 2) -> (4234 +- 2)
(0.02312, 0.003) -> (23 +- 3) x 10^-3
I've produced this:
from math import log10
def format_value_error(value, error):
    E = int(log10(abs(error)))
    val = float(value) / 10**E
...
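For comparison, here is a minimal sketch that reproduces the three example strings listed above. It is not a completion of the truncated snippet: the name `format_with_error` and the rounding rule (scale both numbers by the power of ten of the error's leading digit, then round to integers) are assumptions inferred from the examples. One difference from the snippet above is that it uses `floor` rather than `int`, because `int` truncates toward zero and gives -2 instead of -3 for an error of 0.003.

```python
from math import floor, log10

def format_with_error(value, error):
    """Format (value, error) as '(v +- e) x 10^n' (hypothetical helper)."""
    if error <= 0:
        raise ValueError("error must be positive")
    exponent = floor(log10(error))   # power of ten of the error's leading digit
    scale = 10.0 ** exponent
    val = round(value / scale)       # both numbers become integers at that scale
    err = round(error / scale)
    body = f"({val} +- {err})"
    # Omit the exponent when it is zero, as in the (4234.3, 2) example.
    return body if exponent == 0 else f"{body} x 10^{exponent}"

print(format_with_error(123, 12))         # (12 +- 1) x 10^1
print(format_with_error(4234.3, 2))       # (4234 +- 2)
print(format_with_error(0.02312, 0.003))  # (23 +- 3) x 10^-3
```

The sketch does not handle an error that rounds up to the next power of ten (for example 9.7 becoming 10); that case would need an extra normalisation step.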
Possible Duplicate:
Function to Calculate Median in Sql Server
I have a table containing two fields (there are more, but they are not relevant). The fields are Price and Quantity. I want to compute several statistics for this table, among them the median price adjusted for quantity.
Today I have a basic-slow-not so good looking funct...
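One way to read "median price adjusted for quantity" is a weighted median: the price at which the running total of quantity, taken in price order, first reaches half of the overall quantity. The sketch below illustrates that reading in Python rather than SQL; the function name and the `(price, quantity)` tuple layout are assumptions for the example.

```python
def weighted_median(rows):
    """Lower weighted median of price, weighted by quantity.

    rows: iterable of (price, quantity) pairs (hypothetical layout).
    Rows with non-positive quantity are ignored.
    """
    ordered = sorted((p, q) for p, q in rows if q > 0)
    total = sum(q for _, q in ordered)
    if total == 0:
        raise ValueError("no rows with positive quantity")
    running = 0
    for price, qty in ordered:
        running += qty
        # First price at which cumulative quantity reaches half the total.
        if 2 * running >= total:
            return price
```

When the halfway point falls exactly between two prices, this returns the lower one instead of averaging the two, which is a design choice rather than the only valid definition.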
I have two dendrograms which I wish to compare to each other in order to find out how "similar" they are. But I don't know of any method to do so (let alone a code to implement it, say, in R).
Any leads?
Thanks,
Tal
...
I find myself needing to process network traffic captured with tcpdump. Reading the traffic is not hard, but what gets a bit tricky is spotting where there are "spikes" in the traffic. I'm mostly concerned with TCP SYN packets and what I want to do is find days where there's a sudden rise in the traffic for a given destination port. Ther...
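A simple baseline for spotting a "sudden rise" is to compare each day's count with the mean and standard deviation of the preceding days. The sketch below assumes the capture has already been reduced to a list of daily SYN counts for one destination port; the function name, the 7-day window, the threshold of 3 standard deviations, and the minimum deviation of 1.0 are all illustrative assumptions, not values from the question.

```python
from statistics import mean, stdev

def find_spikes(daily_counts, window=7, threshold=3.0):
    """Return indices of days whose count exceeds the trailing
    mean by more than `threshold` standard deviations.

    daily_counts: per-day packet counts for one port (hypothetical input).
    window must be at least 2 so a standard deviation can be computed.
    """
    spikes = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu = mean(history)
        # Floor the deviation at 1.0 so a perfectly flat history
        # does not flag every tiny fluctuation.
        sigma = max(stdev(history), 1.0)
        if daily_counts[i] > mu + threshold * sigma:
            spikes.append(i)
    return spikes
```

A spike inflates the trailing statistics for the following days, so a more robust variant might use the median and median absolute deviation instead.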
I'm doing an ongoing survey, every quarter. We get people to sign up (where they give extensive demographic info).
Then we get them to answer six short questions, each with 5 possible values: much worse, worse, same, better, much better.
Of course, over time we will not get the same participants; some will drop out and some new ones will si...
I want to use Fluent NHibernate to model a Markov chain. It's basically a set of different states with transition probabilities between the states.
I want to map the transition probabilities into MarkovState.TransitionProbabilities as a Dictionary. I want to use the NEXT state as key (using either MarkovState or int as key), so that I c...
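The shape of the data structure described above, a dictionary on each state keyed by the next state, can be sketched independently of the ORM. The Python below only illustrates that shape and does not address the Fluent NHibernate mapping itself; the attribute and method names are hypothetical, and the integer-id key is one of the two options mentioned in the question.

```python
import random

class MarkovState:
    """A state holding its outgoing transition probabilities."""

    def __init__(self, state_id, name):
        self.state_id = state_id
        self.name = name
        # Keyed by the NEXT state's id; value is the transition probability.
        self.transition_probabilities = {}

    def probability_to(self, next_state):
        """Look up the probability of moving to next_state (0.0 if absent)."""
        return self.transition_probabilities.get(next_state.state_id, 0.0)

    def next_state_id(self, rng=random):
        """Draw the id of the next state according to the probabilities."""
        ids = list(self.transition_probabilities)
        weights = [self.transition_probabilities[i] for i in ids]
        return rng.choices(ids, weights=weights, k=1)[0]
```

Keying by id rather than by the state object keeps the dictionary hashable without defining equality on the state class.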
Hiya,
I'm building some custom content types to capture customer data on a website. Admins will enter the data, users will be able to view it, but I also need to be able to bolt on some statistics and infographics to the data.
The problem I have is that I can't see any simple way of doing this within Drupal. Are there modules which ca...
We would like to drop support for our application on stock Windows XP and XP SP1 and thus require SP2 or higher.
I tried finding some statistics about market share of the various service packs of Windows but failed. Do you have such links? Do you still support XP before SP2?
...