I'm processing statistics on home approvals for a given month. I'd like to be able to show trends, that is, which areas have seen a large relative increase or decrease since the previous month(s).
My first naive approach was to just calculate the percentage change between two months, but that has problems when the data...
I'm looking for something that can spit out statistics on Visual Studio 2008 web projects, both Forms and MVC.
Things like:
Number of pages per project
Number of user controls
Number of classes
Number of methods
Number of images/CSS
File creation dates
If the information exists already in Visual Studio, I can't find it. I've also tr...
Does anybody know where one might find the relative levels of use of different enterprise platforms? E.g., the percentage of J2EE vs. Spring vs. .NET vs. the various other (sometimes more obscure) platforms.
I have seen lots of comparisons of Java vs. C# and so on, but I am not interested in the Desktop or Web side, I am talki...
Hello.
I have some data in a dataframe calvarbyruno.1 with variables Nominal and PAR that represent the Peak Area Ratio (PAR) found from analysis of a set of standards using a particular analytical technique, and two lm models of that data (linear and quadratic) for the relationship PAR ~ Nominal. I'm trying to use the predict.lm funct...
I'm trying to diagnose a slow stored procedure (see this question) and I've noticed that for my auto-generated stats (the ones with names like _WA_Sys_0000000A_0D0FEE32) I can't view the detailed histogram. If I click on the "Details" tab I just get the message:
No statistics information available.
If I click on the details tab for a...
I have two sets of statistics generated from processing. The data from the processing can be a large amount of results so I would rather not have to store all of the data to recalculate the additional data later on.
Say I have two sets of statistics that describe two different sessions of runs over a process.
Each set contains
Stati...
I have a problem with the log y-axis of a plot. How can I make the labels on my log y-axis appear as regular Arabic numerals (1000, 10000, 100000) rather than as 1e+03, 1e+04, 1e+05, etc.?
Thanks.
...
Previously I have enjoyed TortoiseSvn's ability to generate simple commit stats for a given SVN repository. I wonder what is available in Git, and I am particularly interested in:
Number of commits per user
Number of lines changed per user
Activity over time (for instance, aggregated weekly changes)
Any ideas?
...
I'm looking for an elegant way to change multiple vectors' datatypes in R.
I'm working with an educational dataset: 426 students' answers to eight multiple choice questions (1 = correct, 0 = incorrect), plus a column indicating which instructor (1, 2, or 3) taught their course.
As it stands, my data is sitting pretty in data.df, like th...
Hi everyone,
In our logfiles we store response times for the requests. What's the most efficient way to calculate the median response time, the "75/90/95% of requests were served in less than N time" numbers etc? (I guess a variation of my question is: What's the best way to calculate the median and standard deviation of a bunch stre...
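As a baseline for the question above, here is a minimal sketch of exact percentiles using the nearest-rank method on fully sorted data. It is not claimed to be the most efficient approach, and it does not stream; the response times are made-up values and the `percentile` helper is a hypothetical name.

```python
import math

def percentile(sorted_values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the observations are less than or equal to it."""
    if not sorted_values:
        raise ValueError("no data")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    rank = math.ceil(p * len(sorted_values) / 100)  # 1-based rank
    return sorted_values[rank - 1]

# Made-up response times in milliseconds (assumption, not from any real log).
response_times_ms = [120, 80, 95, 300, 110, 105, 90, 2000, 100, 115]
ordered = sorted(response_times_ms)

summary = {p: percentile(ordered, p) for p in (50, 75, 90, 95)}
print(summary)  # {50: 105, 75: 120, 90: 300, 95: 2000}
```

Note that with an even number of observations the nearest-rank median is the lower of the two middle values, not their average. Sorting costs O(n log n) and needs all values in memory, so this only works as a reference point for large logs.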
Consider a sales department that sets a sales goal for each day. The total goal isn't important, but the overage or underage is. For example, if Monday of week 1 has a goal of 50 and we sell 60, that day gets a score of +10. On Tuesday, our goal is 48 and we sell 46 for a score of -2. At the end of the week, we score the week like this:
...
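The daily scores in the example above are just sales minus goal. A small sketch using only the two days given (the weekly scoring rule is not shown in the excerpt, so it is left out; `day_score` is a hypothetical helper name):

```python
# (day, goal, sold) for the two days described in the question.
daily = [
    ("Monday", 50, 60),
    ("Tuesday", 48, 46),
]

def day_score(goal, sold):
    """Overage (positive) or underage (negative) for a single day."""
    return sold - goal

scores = {day: day_score(goal, sold) for day, goal, sold in daily}
print(scores)  # {'Monday': 10, 'Tuesday': -2}
```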
I have a dataframe, and I want to produce a table of summary statistics including number of valid numeric values, mean and sd by group for each of three columns. I can't seem to find any function to count the number of numeric values in R. I can use length() which tells me how many values there are, and I can use colSums(is.na(x)) to c...
My current dataset data.df comes from about 420 students who took an 8-question survey under one of 3 instructors. escore is my outcome variable of interest.
'data.frame': 426 obs. of 10 variables:
$ ques01: int 1 1 1 1 1 1 0 0 0 1 ...
$ ques02: int 0 0 1 1 1 1 1 1 1 1 ...
$ ques03: int 0 0 1 1 0 0 1 1 0 1 ...
...
I need a random number generator that picks numbers over a specified range with a programmable mean.
For example, I need to pick numbers between 2 and 14 and I need the average of the random numbers to be 5.
I use random number generators a lot. Usually I just need a uniform distribution.
I don't even know what to call this type of di...
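One possible way to get a bounded range with a chosen mean is a Beta distribution rescaled to the range. The sketch below assumes real-valued draws are acceptable (rounding to integers would shift the mean slightly); `bounded_with_mean` and its `concentration` parameter are hypothetical names, and the shape of the distribution is an arbitrary choice, not something the question specifies.

```python
import random

def bounded_with_mean(lo, hi, mean, concentration=4.0, rng=random):
    """Draw one value in [lo, hi] whose expected value is `mean`.

    A Beta(a, b) variate lies in [0, 1] and has mean a / (a + b), so we pick
    a and b to hit the desired fraction of the range, then rescale.
    `concentration` (= a + b) is a tuning knob: larger values cluster the
    draws more tightly around the mean.
    """
    if not lo < mean < hi:
        raise ValueError("mean must lie strictly between lo and hi")
    frac = (mean - lo) / (hi - lo)
    a = frac * concentration
    b = (1.0 - frac) * concentration
    return lo + (hi - lo) * rng.betavariate(a, b)

# The example from the question: range 2..14, average 5.
rng = random.Random(0)  # seeded only so the run is repeatable
samples = [bounded_with_mean(2, 14, 5, rng=rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(round(sample_mean, 2))
```

Many other distributions on a bounded interval can share the same mean, so whether this one fits depends on what shape is wanted.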
I have a site with millions of URLs. Each time a URL is clicked, a database row corresponding to that URL is updated indicating the timestamp of that click. I would like to estimate the number of clicks per hour each URL receives, certainly using additional columns but without inserting a distinct row for every click. Some...
Let's say I have a Stata dataset that has two variables: type and price. The type value for each observation is a number between 1 and 10.
I want to add a third value that is the average price of all variables of that type. So, for example, if the first observation had a type of 3 and a price of 10, then I'd like to add a third valu...
Hi, I have a requirement to calculate the Moving Range of a load of data (at least I think this is what it is called) in SQL Server. This would be easy if I could use arrays, but I understand this is not possible in MS SQL, so I wonder if anyone has a suggestion.
To give you an idea of what I need:
Let's say I have the following in a sql...
I am looking for a library that does advanced math, statistics, statistical distributions, etc.
Currently I am looking for something that does binomial and Poisson distributions.
...
Hi.
Has anybody seen studies of the ratio of maintenance programming to new development?
Thanks.
...
Update
Just for future reference, I'm going to list all of the statistics that I'm aware of that can be maintained in a rolling collection, recalculated as an O(1) operation on every addition/removal (this is really how I should've worded the question from the beginning):
Obvious
Count
Sum
Mean
Max*
Min*
Median**
Less Obvious
Var...
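For the first three "Obvious" entries (Count, Sum, Mean), a rolling collection with O(1) updates can be sketched as below. This is only an illustration under my own assumptions: `RollingStats` and its method names are hypothetical, and the starred entries (Max, Min, Median) are not covered because they need extra bookkeeping.

```python
from collections import deque

class RollingStats:
    """Keeps count, sum and mean current in O(1) per addition or removal."""

    def __init__(self):
        self._items = deque()  # oldest item on the left
        self._sum = 0.0

    def add(self, x):
        self._items.append(x)
        self._sum += x

    def remove_oldest(self):
        x = self._items.popleft()
        self._sum -= x
        return x

    @property
    def count(self):
        return len(self._items)

    @property
    def sum(self):
        return self._sum

    @property
    def mean(self):
        if not self._items:
            raise ValueError("mean of an empty collection is undefined")
        return self._sum / len(self._items)

stats = RollingStats()
for value in (1, 2, 3, 4):
    stats.add(value)
stats.remove_oldest()  # drops 1, leaving 2, 3, 4
print(stats.count, stats.sum, stats.mean)  # 3 9.0 3.0
```

With floating-point data, repeatedly adding and subtracting can accumulate rounding error in the running sum, so long-lived collections may want an occasional full recomputation.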