questions about statistics | ansaurus

statistics

Need help creating a loop in R. Have many similarly named variables I have to apply the same functions to. Thanks!

I am using a dataset w/variables that have very similar names. I have to apply the same functions to 13 variables at a time and I'm trying to shorten the code, instead of doing each variable individually. q01a.F=factor(q01a) q01b.F=factor(q01b) q01c.F=factor(q01c) q01d.F=factor(q01d) q01e.F=factor(q01e) q01f.F=factor(q01f) q01g.F=factor...

code-statistics

Automating R Script

I have written an R script that pulls some data from a database, performs several operations on it and posts the output to a new database. I would like this script to run every day at a specific time but I cannot find any way to do this effectively. Can anyone recommend a resource I could look at to solve this issue? I am running thi...

R question. Using lapply on a data.frame and creating new variables w/ output.

Hello, I have 13 quantitative variables in a data.frame (called 'UNCA'). The variables are named q01_a, q01_b, ...q01_m. I want to create 13 new variables that have the same values but are coded as a factor. I would like to name these 13 new variables q01_a.F, q01_b.F, ...q01_m.F. Any help would be greatly appreciated! ...
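One possible approach to the question above, as a minimal sketch. The real `UNCA` data frame is not shown, so the stand-in data below is invented for illustration; only the column names q01_a through q01_m and the `.F` suffix come from the question.

```r
# Stand-in data: the real UNCA data frame is not shown in the question,
# so these 13 columns are made up for illustration.
set.seed(1)
vars <- paste0("q01_", letters[1:13])   # q01_a ... q01_m
UNCA <- as.data.frame(
  setNames(replicate(13, sample(1:5, 10, replace = TRUE), simplify = FALSE), vars)
)

# Convert each column to a factor and store it under the same name plus ".F".
UNCA[paste0(vars, ".F")] <- lapply(UNCA[vars], factor)

# Inspect one original column next to its factor copy.
str(UNCA[c("q01_a", "q01_a.F")])
```

Assigning the list returned by `lapply` to a vector of new column names adds all 13 factor columns in one step, which avoids writing one line per variable.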

Optimal two variable linear regression calculation

Problem: I am looking to apply the y = mx + b equation (where m is SLOPE, b is INTERCEPT) to a data set, which is retrieved as shown in the SQL code. The values from the (MySQL) query are: SLOPE = 0.0276653965651912 INTERCEPT = -57.2338357550468 SQL Code: SELECT ((sum(t.YEAR) * sum(t.AMOUNT)) - (count(1) * sum(t.YEAR * t.AMOUNT))) / ...
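For reference, a small sketch of the same sums-based slope and intercept calculation, written in R and cross-checked against `lm`. The `year` and `amount` vectors are made-up stand-ins for the YEAR and AMOUNT columns, and the formula is the standard closed-form least-squares estimate, which is assumed (not confirmed) to be what the truncated query computes.

```r
# Made-up data standing in for the YEAR and AMOUNT columns of the query.
year   <- c(1990, 1995, 2000, 2005, 2010)
amount <- c(-2.1, -2.0, -1.9, -1.7, -1.6)
n <- length(year)

# Closed-form least-squares estimates for y = m*x + b.
slope <- (n * sum(year * amount) - sum(year) * sum(amount)) /
         (n * sum(year^2) - sum(year)^2)
intercept <- mean(amount) - slope * mean(year)

# Cross-check against R's built-in linear model fit.
fit <- lm(amount ~ year)
agree <- isTRUE(all.equal(unname(coef(fit)), c(intercept, slope)))
print(agree)
```

Multiplying numerator and denominator by -1 gives the sign arrangement that appears at the start of the SQL expression, so the two forms are algebraically equivalent.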

linear-regression

Suggest a good book for Quantitative Methods & R Programming

Hi folks, Please suggest a good book for beginners in Quantitative Methods/Techniques. In addition, a good book for beginners in the R programming language, as used in Quantitative Methods. I have a few questions about this: Do I have to learn other subjects like Probability, Statistics, etc. before learning Quantitative Methods? ...

What percent of web sites use JavaScript?

I'm wondering just how pervasive JavaScript is. This article states that 73% of websites they tested rely on JavaScript for important functionality, but it seems to me that the number must be larger. Have any surveys been done on this topic? Maybe a better way to phrase this question is - are there any sites that don't use JavaScript? ...

xts problem with dynlm

Hello, I am trying to use xts as much as possible in my time series work as it seems to be the suggested way of doing things. However, I am getting a strange error. CPI.NSA and INT are xts objects. library(dynlm) CPI.NSA.x <- CPI.NSA[dr1] INT.x <- INT[dr1] CPI.NSA.z <- as.zoo(CPI.NSA.x) INT.z <- as.zoo(INT.x) > dynlm(CPI.NSA.z ~ I...

What tools & media are available that can help me understand statistics better?

Hello, I'm a graduate student very much interested in Machine Learning, Pattern Recognition & Artificial Intelligence. These subjects are just applications of the great mathematical subject of Statistics. As an undergraduate, I've done courses in Probability and Statistics. We've used it in many other subjects. But sadly it is still very abs...

computer-science

How can I neatly clean my R workspace while preserving certain objects?

Suppose I'm messing about with some data by binding vectors together, as I'm wont to do on a lazy sunday afternoon. x <- rnorm(25, mean = 65, sd = 10) y <- rnorm(25, mean = 75, sd = 7) z <- 1:25 dd <- data.frame(mscore = x, vscore = y, caseid = z) I've now got my new dataframe dd, which is wonderful. But there's also ...

Any C++ library for Johansen co-integration test?

Any ideas? They will be highly appreciated. ...

Naive Bayesian classification (spam filtering) - Doubt in one calculation? Which one is right? Plz clarify

Hi guys, I am implementing a Naive Bayesian classifier for spam filtering. I have a doubt about one calculation. Please clarify what I should do. Here is my question. In this method, you have to calculate P(S|W) -> Probability that Message is spam given word W occurs in it. P(W|S) -> Probability that word W occurs in a spam message. P(W...

Given a document, select a relevant snippet.

When I ask a question here, the tool tips for the questions returned by the auto search give the first little bit of the question, but a decent percentage of them don't give any text that is any more useful for understanding the question than the title. Does anyone have an idea about how to make a filter to trim out useless bits of a que...

natural-language

text-processing

incremental way of counting quantiles for large set of data

I need to count the quantiles for a large set of data. Let's assume we can get the data only in portions (i.e. one row of a large matrix). To count the Q3 quantile one needs to get all the portions of the data and store them somewhere, then sort them and count the quantile: List<double> allData = new List<double>(); foreach(var ro...

numerical-methods

Dimension Reduction in Categorical Data with missing values

I have a regression model in which the dependent variable is continuous but ninety percent of the independent variables are categorical (both ordered and unordered) and around thirty percent of the records have missing values (to make matters worse they are missing randomly without any pattern, that is, more than forty-five percent of the ...

Using ARIMA to model and forecast stock prices using user-friendly stats program

Hi people, Can anyone please offer some insight into this for me? I'm coming from a functional magnetic resonance imaging research background where I analyzed a lot of time series data, and I'd like to analyze the time series of stock prices (or returns) by: 1) modeling a successful stock in a particular market sector and then cross-co...

Question with R. Element wise multiplication, addition, and division with 2 data.frames with varying amounts of missing data in a given row.

I have various data.frames with columns of the same length where I am trying to multiply 2 rows together element-wise and then sum this up. For example, below are two vectors I would like to perform this operation with. > a.1[186,] q01_a q01_b q01_c q01_d q01_e q01_f q01_g q01_h q01_i q01_j q01_k q01_l q01_m 3 3 3 3 ...

How Should I Generate Trade Statistics For CouchDB/Rails3 Application?

My Problem: I am trying to develop a web application for currency traders. The application allows traders to enter or upload information about their trades and I want to calculate a wide variety of statistics based on what the user entered. Now, normally I would use a relational database for this, but I have two requirements that...

R question. Create new data set that meets all of 4 conditions.

Hello, I would like to create a new dataset where the following four conditions are all met. rowSums(is.na(UNCA[,11:23]))<12 rowSums(is.na(UNCA[,27:39]))<12 rowSums(is.na(UNCA[,40:52]))<12 rowSums(is.na(UNCA[,53:65]))<12 Thanks! ...
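A minimal sketch of one way to combine the four conditions. The data frame below is a made-up stand-in with 65 columns so that the column ranges from the question exist; `blocks`, `ok`, and `UNCA.complete` are hypothetical names introduced here.

```r
# Stand-in data: 65 columns so that the index ranges from the question exist.
set.seed(42)
UNCA <- as.data.frame(
  matrix(sample(c(1:5, NA), 20 * 65, replace = TRUE), nrow = 20)
)
UNCA[1, 11:23] <- NA   # row 1 now fails the first condition

# One logical column per block: TRUE when fewer than 12 values are missing.
blocks <- list(11:23, 27:39, 40:52, 53:65)
ok <- sapply(blocks, function(cols) rowSums(is.na(UNCA[, cols])) < 12)

# Keep only the rows where all four conditions hold.
UNCA.complete <- UNCA[rowSums(ok) == length(blocks), ]
```

Joining the four original expressions with `&` inside the row index gives the same result; the list of column blocks just keeps the ranges in one place.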

Need an R package for piecewise linear regression?

Is anybody aware of a package for "piecewise linear regression"? ...

R Question. Numeric variable vs. Non-numeric and "names" function

> scores=cbind(UNCA.score, A.score, B.score, U.m.A, U.m.B) > names(scores)=c('UNCA.scores', 'A.scores', 'B.scores','UNCA.minus.A', 'UNCA.minus.B') > names(scores) [1] "UNCA.scores" "A.scores" "B.scores" "UNCA.minus.A" "UNCA.minus.B" > summary(UNCA.scores) X6.69230769230769 Min. : 4.154 1st Qu.: 7.333 Medi...
