regression

Include error terms in a linear regression model with R

I was wondering if there is a way to include error terms for a linear regression model such as r = lm(y ~ x1+x2)? ...

How to separate linear regression plots in R?

For a linear model with 2 variables, r = lm(y ~ x1+x2), when I run plot(r) I get a bunch of plots such as residuals vs fitted values and so on, but I can only look at one of them at a time. Isn't there a way to separate them? ...

scipy linregress function erroneous standard error return?

Hello, I have a weird situation where scipy.stats.linregress seems to be returning an incorrect standard error: >>> from scipy import stats >>> x = [5.05, 6.75, 3.21, 2.66] >>> y = [1.65, 26.5, -5.93, 7.96] >>> gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) >>> gradient 5.3935773611970186 >>> intercept -16.28112...
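For context on the excerpt above: to my knowledge, the standard error that scipy.stats.linregress reports is the standard error of the estimated slope, not the residual standard error of the fit, which is an easy source of confusion. The following is a minimal pure-Python sketch that computes that quantity by hand from the x and y values in the question, so it can be compared with the library's output. The helper name simple_ols is hypothetical and not part of SciPy.

```python
import math

def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, slope_std_err).
    Hypothetical helper for illustration; not part of SciPy.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares; the residual variance uses n - 2 degrees of freedom
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    slope_std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, intercept, slope_std_err

# Data copied from the question
x = [5.05, 6.75, 3.21, 2.66]
y = [1.65, 26.5, -5.93, 7.96]
slope, intercept, slope_std_err = simple_ols(x, y)
print(slope, intercept, slope_std_err)
```

The slope and intercept printed here should agree with the gradient and intercept shown in the question; the third value is the slope's standard error under the usual OLS assumptions.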

MySQL multivariable linear regression

I am trying to do a multivariable (9 variables) linear regression on data in my MySQL 5.0 database (the result value field only has 2 possible values, 1 and 0). I've done some searching and found I can use: mysql> SELECT -> @n := COUNT(score) AS N, -> @meanX := AVG(age) AS "X mean", -> @sumX := SUM(age) AS "X sum", -> @s...

Conditional Logistic Regression - detailed examples

Hi, does anyone have online resources or book references with detailed tutorials/examples on setting up conditional logistic regression? (Preferably in R, Matlab or Python) ...

Regression in Matlab assuming Student's t Distributed Error Terms

I see that it is possible to use regress/regstats for OLS, and I found an online implementation of L1-Regression (Laplace), but I can't quite seem to figure out how to implement t distributed error terms. I have tried maximizing the log-likelihood of the residuals, but don't seem to be coming up with the right answer. classdef student ...
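As a point of comparison for the log-likelihood approach described above, here is a minimal sketch of the log-likelihood of residuals under a scaled Student's t distribution. It is written in Python rather than Matlab, and the function name student_t_loglik and its parameters nu (degrees of freedom) and sigma (scale) are assumptions made for illustration. One term that is easy to drop is -log(sigma), and without it the maximizer tends to give the wrong scale. The optimization over the regression coefficients, nu and sigma is not shown.

```python
import math

def student_t_loglik(residuals, nu, sigma):
    """Log-likelihood of residuals under a scaled Student's t distribution.

    nu is the degrees of freedom and sigma the scale; both must be positive.
    Residuals would typically be y - X @ beta for candidate coefficients beta.
    """
    # Normalizing constant of the density, including the -log(sigma) scale term
    const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi) - math.log(sigma))
    return sum(
        const - (nu + 1) / 2 * math.log1p((r / sigma) ** 2 / nu)
        for r in residuals
    )

# Sanity check: with nu = 1 and sigma = 1 this is the standard Cauchy
# distribution, whose density at zero is 1/pi.
print(student_t_loglik([0.0], nu=1, sigma=1), -math.log(math.pi))
```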

Automation testing tool for Regression testing of desktop application

Hi, I am working on a desktop application which uses Infragistics grids. We need to automate the regression tests for it. QTP alone does not support this; we would need to buy a new plug-in, which my company is not very interested in. Is there any open source tool for automating regression testing of desktop applications? Appli...

Is there an equivalent of "anova" (for lm) for an rpart object?

When using R's rpart function, I can easily fit a model with it. for example: # Classification Tree with rpart library(rpart) # grow tree fit <- rpart(Kyphosis ~ Age + Number + Start, method="class", data=kyphosis) printcp(fit) # display the results plotcp(fit) summary(fit) # detailed summary of splits # plot tree plot(fit, ...

Regressing panel data in SAS.

Hey guys, thanks to your help I successfully managed all my databases! I am now looking at a panel data set on which I have to run a regression. Since I only started my PhD this semester together with the econometrics courses, I am still new to many statistical applications and regression methods. I want to do a simple regression as in Y = x1 x2 x3...

regressions with many nested categorical covariates

I have a few hundred thousand measurements where the dependent variable is a probability, and would like to use logistic regression. However, the covariates I have are all categorical, and worse, are all nested. By this I mean that if a certain measurement has "city - Phoenix" then obviously it is certain to have "state - Arizona" and "c...

Multi-variate regression using NumPy in Python?

Is it possible to perform multi-variate regression in Python using NumPy? The documentation here suggests that it is, but I cannot find any more details on the topic. ...
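A minimal sketch of one way to do this, assuming "multi-variate" here means several predictors and a single response: build a design matrix with a column of ones and solve the least-squares problem with numpy.linalg.lstsq. The data below is synthetic and noise-free, made up purely for illustration.

```python
import numpy as np

# Synthetic, noise-free data (made up for illustration): y = 1 + 2*x1 - 3*x2
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
y = 1.0 + 2.0 * x1 - 3.0 * x2

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares solution of X @ coef ~= y
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # intercept first, then one coefficient per predictor
```

This gives coefficients only; standard errors and p-values would have to be computed separately or taken from a statistics package.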

What is the difference between Multiple R-squared and Adjusted R-squared in a single-variate least squares regression?

Could someone explain to the statistically naive what the difference between Multiple R-squared and Adjusted R-squared is? I am doing a single-variate regression analysis as follows: v.lm <- lm(epm ~ n_days, data=v) print(summary(v.lm)) Results: Call: lm(formula = epm ~ n_days, data = v) Residuals: Min 1Q Median 3Q...
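For reference, what R's summary output labels "Multiple R-squared" is the ordinary R², while the adjusted version penalizes the number of predictors: adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), with n observations and p predictors (not counting the intercept). A small sketch of that formula follows; the function name and the example numbers are made up for illustration and are not taken from the question's output.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    n_predictors counts the predictors only, not the intercept.
    """
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Made-up numbers: R^2 = 0.5 from 11 observations and 1 predictor
print(adjusted_r_squared(0.5, n_obs=11, n_predictors=1))
```

With a single predictor and many observations the two values are close; the gap grows as predictors are added relative to the sample size.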

What are the definitions of the different regression bugs in regression testing for software?

Hi, there are 3 kinds of regression bugs that come up when regression testing software: "local", "unmasked" and "remote". Does anyone know the definition of each? Thanks ...

What libraries exist for Regression or Ordinary-Least-Squares on Amazon EC2?

What libraries or services exist to run logistic or linear regressions in a distributed fashion on cloud providers like EC2? Alternatively, what Ordinary Least Squares regression libraries exist for cloud distribution? The data set is far larger than can fit in memory - so the library must handle this constraint. ...

Regression Testing and Deployment Strategy

I'd like some advice on a deployment strategy. If a development team creates an extensive framework, and many (20-30) applications consume it, and the business would like application updates at least every 30 days, what is the best deployment strategy? The reason I ask is that there seems to be a lot of waste (and risk) in using an agil...

What's the correct terminology for something that isn't quite classification nor regression?

Let's say that I have a problem that is basically classification. That is, given some input and a number of possible output classes, find the correct class for the given input. Neural networks and decision trees are some of the algorithms that may be used to solve such problems. These algorithms typically only emit a single result however:...

Screening (multi)collinearity in a regression model

I hope that this one is not going to be "ask-and-answer" question... here goes: (multi)collinearity refers to extremely high correlations between predictors in the regression model. How to cure them... well, sometimes you don't need to "cure" collinearity, since it doesn't affect regression model itself, but interpretation of an effect o...
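One common screening tool for the situation described above is the variance inflation factor (VIF). The sketch below relies on the identity that the VIFs are the diagonal of the inverse of the predictors' correlation matrix. The function name and the data are made up for illustration; a frequently quoted rule of thumb treats values above roughly 5 to 10 as a warning sign, but that threshold is a convention, not a hard rule.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are observations).

    Uses the identity that the VIFs are the diagonal of the inverse
    of the predictors' correlation matrix.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Made-up predictors: x3 is nearly a copy of x1, x2 is only weakly related
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
x3 = x1 + np.array([0.1, -0.2, 0.05, 0.1, -0.1, 0.2, -0.05, -0.1])

vif = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(vif)  # large values for x1 and x3 flag their near-collinearity
```

If the predictors are exactly collinear the correlation matrix is singular and the inverse fails, which is itself a diagnosis.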

Large-scale regression in R with a sparse feature matrix

I'd like to do large-scale regression (linear/logistic) in R with many (e.g. 100k) features, where each example is relatively sparse in the feature space, e.g. ~1k non-zero features per example. It seems like the SparseM package slm should do this, but I'm having difficulty converting from the sparseMatrix format to a slm-friendly for...

PHP Estimation Function

I am trying to calculate value $x in a number series based on an array of numbers (as $numbers). Ex: $numbers = array(1=>1000,2=>600,3=>500,4=>450,5=>425,6=>405,7=>400,8=>396); function estimateNumber($x) { // function to estimate number $x in $numbers data set } What would be the most statistically accurate method? ...

Logistic Regression in R (SAS-like output)

I have a problem at hand which I'd think is fairly common amongst groups where R is being adopted for analytics in place of SAS. Users would like to obtain results for logistic regression in R that they have become accustomed to in SAS. Towards this end, I was able to propose the Design package in R which contains many functions to extra...