I'd like an estimate of the supply of professional software developers globally and, wherever possible, regionally.
Although it's an odd question, I hope it sheds some light on the global availability of software development services or, at the very least, shows just how much of a commodity we are.
Edit: by "profes...
Hi,
I want to do sparse, high-dimensional (a few thousand features) least squares regression with a few hundred thousand examples. I'm happy to use non-fancy optimisation; stochastic gradient descent is fine.
Does anyone know of any existing software for doing this, so I don't have to write my own?
Kind regards.
...
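For reference, a minimal sketch of what plain stochastic gradient descent for sparse least squares could look like. It is written in pure Python and assumes each example is stored as a dict of its non-zero features; the names `sgd_least_squares`, `lr` and `epochs` are hypothetical and not taken from any particular package.

```python
import random

def sgd_least_squares(examples, n_features, lr=0.01, epochs=50, seed=0):
    """Fit weights w that minimise squared error using plain SGD.

    examples: list of (features, target) pairs, where features is a dict
    {feature_index: value} holding only the non-zero entries.
    """
    rng = random.Random(seed)
    w = [0.0] * n_features
    order = list(range(len(examples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a fresh random order
        for i in order:
            x, y = examples[i]
            pred = sum(w[j] * v for j, v in x.items())
            err = pred - y
            # Gradient of 0.5 * err**2 touches only the non-zero features.
            for j, v in x.items():
                w[j] -= lr * err * v
    return w

# Toy data generated from y = 2*x0 - 3*x2 (feature 1 has no effect).
data = [
    ({0: 1.0}, 2.0),
    ({2: 1.0}, -3.0),
    ({0: 1.0, 2: 1.0}, -1.0),
    ({1: 1.0}, 0.0),
]
weights = sgd_least_squares(data, n_features=3, lr=0.1, epochs=200)
```

Each update costs time proportional to the number of non-zero features in one example, which is the main reason SGD suits sparse data. The learning rate here is picked by hand for the toy data and would need tuning on a real problem.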
The Oracle view V$OSSTAT holds a few operating system statistics, including:
IDLE_TICKS: Number of hundredths of a second that a processor has been idle, totalled over all processors
BUSY_TICKS: Number of hundredths of a second that a processor has been busy executing user or kernel code, totalled over all processors
The documentation I've ...
I want to be able to introduce new 'tag lines' into a database that are shown 'randomly' to users. (These tag lines are shown as an introduction as animated text.)
Based on the number of sales that result from those tag lines, I'd like the good ones to trickle to the top, while still showing the others less frequently.
I could come up with ...
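One possible approach, sketched here under assumptions: pick each tag line with probability proportional to a smoothed conversion rate. The function name `pick_tagline`, the `(times_shown, sales)` bookkeeping and the "+1 / +2" smoothing are illustrative choices, not something stated in the question.

```python
import random

def pick_tagline(stats, rng=random):
    """Pick one tag line, favouring those with more sales per showing.

    stats: dict {tagline: (times_shown, sales)}.
    The weight is the smoothed rate (sales + 1) / (times_shown + 2), so a
    brand-new tag line starts at 0.5 and no weight ever reaches zero.
    """
    taglines = list(stats)
    weights = [(sales + 1) / (shown + 2) for shown, sales in stats.values()]
    return rng.choices(taglines, weights=weights, k=1)[0]

# Hypothetical counts for three tag lines.
stats = {
    "good": (100, 30),
    "poor": (100, 2),
    "new": (0, 0),
}
rng = random.Random(1)
picks = [pick_tagline(stats, rng) for _ in range(2000)]
```

With this weighting, well-performing tag lines are shown more often, weak ones still appear occasionally, and new ones get generous exposure until enough data accumulates.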
I want to check in Transact-SQL whether a specific column in a table has statistics and, if so, retrieve them all.
...
Using any tools you would expect to find on a *nix system (in fact, if you want, MS-DOS is fine too), what is the easiest/fastest way to calculate the mean of a set of numbers, assuming you have them one per line in a stream or file?
...
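As one option among many, and assuming Python counts as a tool available on the system, a small sketch that averages numbers given one per line. The function name `mean_of_lines` is hypothetical; blank lines are skipped as a convenience.

```python
import io

def mean_of_lines(lines):
    """Return the arithmetic mean of the numbers found one per line."""
    total = 0.0
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        total += float(line)
        count += 1
    if count == 0:
        raise ValueError("no numbers in input")
    return total / count

# Any iterable of lines works; an in-memory stream stands in for a file here.
sample = io.StringIO("3\n4.5\n\n10\n")
result = mean_of_lines(sample)
```

To use it on a real stream, pass `sys.stdin` or an open file object instead of the `io.StringIO` sample. Because it keeps only a running total and a count, memory use does not grow with the input.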
TF-IDF (term frequency - inverse document frequency) is a staple of information retrieval. It's not a proper model, though, and it seems to break down when new terms are introduced into the corpus. How do people handle it when queries or new documents contain new terms, especially high-frequency ones? Under traditional cosine match...
Hello,
The expected probability of randomly selecting a given element from a set of n elements is P = 1.0/n.
Suppose I estimate P using an unbiased method sufficiently many times. What type of distribution does the estimate of P follow? It is clear that P is not normally distributed, since it cannot be negative. Thus, may I correctly assume that P is gamma distributed? ...
I have a MySQL database table with a couple thousand rows. The table is set up like so:
id | text
The id column is an auto-incrementing integer, and the text column is a 200-character varchar.
Say I have the following rows:
3 | I think I'll have duck tonight
4 | Maybe the chicken will be alright
5 | I have a pet duck now, awesome!
...
How do I take an efficient simple random sample in SQL? The database in question is running MySQL; my table is at least 200,000 rows, and I want a simple random sample of about 10,000.
The "obvious" answer is to:
SELECT * FROM table ORDER BY RAND() LIMIT 10000
For large tables, that's too slow: it calls RAND() for every row (which al...
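One way to avoid sorting the whole table by a random key is to draw the sample of ids on the client and then fetch only those rows. The sketch below assumes the table has an integer primary key `id`; it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs on its own, and the table name `items` and the function name `simple_random_sample` are hypothetical.

```python
import random
import sqlite3

# In-memory stand-in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO items (payload) VALUES (?)",
    [(f"row {i}",) for i in range(2000)],
)

def simple_random_sample(conn, k, seed=None, batch_size=500):
    """Return k rows chosen uniformly at random without replacement."""
    # Reading only the key column is much cheaper than reading full rows.
    ids = [row[0] for row in conn.execute("SELECT id FROM items")]
    chosen = random.Random(seed).sample(ids, k)
    rows = []
    # Fetch in batches so each IN (...) list stays reasonably short.
    for start in range(0, len(chosen), batch_size):
        batch = chosen[start:start + batch_size]
        marks = ",".join("?" * len(batch))
        query = f"SELECT id, payload FROM items WHERE id IN ({marks})"
        rows.extend(conn.execute(query, batch))
    return rows

sample = simple_random_sample(conn, 100, seed=42)
```

This still scans the key column once, so whether it beats the query above on a given MySQL table is something to measure rather than assume. With a MySQL driver the placeholder syntax would typically differ from SQLite's `?`.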
I need to conduct a survey of 3 questions.
The first question will be Yes/No. The second will be multiple choice, allowing more than one answer to be selected, plus an "other" box where you can fill in your own answer.
And the last will be a textarea in which they can enter general comments/suggestions.
I would lov...
I'm looking for an R package which can be used to train a Dirichlet prior from counts data. I'm asking for a colleague who's using R, and don't use it myself, so I'm not too sure how to look for packages. It's a bit hard to search for, because "R" is such a nonspecific search string. There doesn't seem to be anything on CRAN, but ar...
I have a program that I'm porting from one language to another. I'm doing this with a translation program that I'm developing myself. The relevant result of this is that I expect that there are a number of bugs in my system that I am going to need to find and fix. Each bug is likely to manifest in many places and fixing it will fix the b...
Is there a way to run a one-liner in SAS, or do I have to create a file? I'm looking for something like the -e flag in Perl.
...
I have a game in which you can score from -40 to +40 on each match.
Users are allowed to play any number of matches.
I want to calculate a total score that implicitly takes into account the number of matches played.
Calculating only the average is not fair.
For example, if Peter plays four games and gets 40 points on each match, he wil...
When you use the POISSON function in Excel (or in OpenOffice Calc), it takes two arguments:
an integer
an 'average' number
and returns a float.
In Python (I tried RandomArray and NumPy) it returns an array of random Poisson numbers.
What I really want is the percentage that this event will occur (it is a constant number and the array...
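The probability itself, as opposed to random draws, follows directly from the Poisson formula P(X = k) = e^(-mean) * mean^k / k!. A minimal sketch using only the standard library; the function names `poisson_pmf` and `poisson_cdf` are hypothetical, and the cumulative variant is included on the assumption that it may also be of interest.

```python
import math

def poisson_pmf(k, mean):
    """Probability of observing exactly k events when the average is mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def poisson_cdf(k, mean):
    """Probability of observing at most k events."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

# Example: chance of exactly 2 events when 2 are expected on average.
p_exactly_two = poisson_pmf(2, 2.0)
p_at_most_two = poisson_cdf(2, 2.0)
```

Multiply the result by 100 if a percentage is wanted. For very large `k` the direct formula can overflow, in which case working with logarithms (for example via `math.lgamma`) is the usual workaround.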
I have an existing web app that allows users to "rate" items based on their difficulty (0 through 15). Currently, I'm simply taking the average of the users' ratings and presenting that average straight from MySQL. However, it's becoming clear to me (and my users) that weighting the numbers would be more appropriate.
Oddly enough, a...
For large n (see below for how to determine what's large enough), the central limit theorem makes it safe to treat the distribution of the sample mean as normal (Gaussian), but I'd like a procedure that gives a confidence interval for any n. The way to do that is to use a Student's t-distribution with n-1 degrees of freedom.
So the quest...
If you have any experiences with them, what are your thoughts on them as well?
...
I'm looking for basic software for statistical analysis. Most important is simple and intuitive use, getting started "right out of the box". At least basic operations should be interactive. Free would be a bonus :)
The purpose is analysis of data dumps and logs of various processes.
Importing a comma/tab separated file
sorting and...