r

In R, how to distribute data into different groups

I have groups like 1-10, 10-20, 20-30, 30-40, and I have data like "1,23,24,11,33,22,5,6,7,8,3,2". How can I find out how many values are in each group ...
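One possible approach, sketched here with base R's cut() and table(). Because the stated groups share their endpoints (10, 20, 30), this sketch assumes right-closed intervals, so a value of exactly 10 would fall into the first group; the group labels are illustrative.

```r
# Data from the question
x <- c(1, 23, 24, 11, 33, 22, 5, 6, 7, 8, 3, 2)

# Assumption: intervals are (0,10], (10,20], (20,30], (30,40]
breaks <- c(0, 10, 20, 30, 40)
groups <- cut(x, breaks = breaks,
              labels = c("1-10", "11-20", "21-30", "31-40"))

# Count how many values fall into each group
counts <- table(groups)
print(counts)
```

Passing right = FALSE to cut() would close the intervals on the left instead, if that matches the intended grouping better.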

Suggest a good book for Quantitative Methods & R Programming

Hi folks, please suggest a good book for beginners in Quantitative Methods/Techniques and, in addition, a good book for beginners in the R programming language as used in Quantitative Methods. I also have a few questions about this: Do I have to learn other subjects like Probability, Statistics, etc. before learning Quantitative Methods? ...

How to get the second sub element of every element in a list in R

I know I've come across this problem before, but I'm having a bit of a mental block at the moment, and as I can't find it on SO, I'll post it here so I can find it next time. I have a dataframe that contains a field representing an ID label. This label has two parts, an alpha prefix and a numeric suffix. I want to split it apart and ...
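A minimal sketch of pulling the second sub-element out of every list element with sapply() and the `[` function. The ID format is not shown in the excerpt, so the labels and the "_" separator below are hypothetical.

```r
# Hypothetical ID labels: alpha prefix, "_" separator, numeric suffix
ids <- c("abc_12", "de_345", "f_6")

# strsplit() returns a list with one character vector per label
parts <- strsplit(ids, "_", fixed = TRUE)

# `[` applied with index 2 extracts the second sub-element of each list element
suffix <- sapply(parts, `[`, 2)
prefix <- sapply(parts, `[`, 1)

suffix_num <- as.integer(suffix)
```

vapply(parts, `[`, character(1), 2) does the same thing while checking that each result is a single string.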

Panel data with binary dependent variable in R

Is it possible to do regressions in R using a panel data set with a binary dependent variable? I am familiar with using glm for logit and probit and plm for panel data, but am not sure how to combine the two. Are there any existing code examples? Thank you. EDIT It would also be helpful if I could figure out how to extract the matrix ...

R - Specifying colClasses in the read.csv

Hi, I am trying to specify the colClasses options in the read.csv function in R. In my data, the first column "time" is basically a character vector while the rest of the columns are numeric. data<-read.csv("test.csv" , comment.char="" , colClasses=c(time="character","numeric") , strip.white=FALSE) In the above command, I would want ...
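One way this could be done is to pass an unnamed colClasses vector with one entry per column: "character" first, then "numeric" repeated for the remaining columns. The file contents below are made up for illustration, and the column count is read from the file itself rather than assumed.

```r
# Hypothetical stand-in for test.csv: a "time" column followed by numeric columns
tmp <- tempfile(fileext = ".csv")
writeLines(c("time,a,b", "00:01,1.5,2", "00:02,3,4.25"), tmp)

# Read one row just to learn how many columns the file has
n_cols <- length(read.csv(tmp, nrows = 1))

# First column as character, every remaining column as numeric
dat <- read.csv(tmp,
                comment.char = "",
                colClasses = c("character", rep("numeric", n_cols - 1)),
                strip.white = FALSE)

print(sapply(dat, class))
```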

Margin adjustments when using ggplot's geom_tile()

From the documentation for ggplot2's geom_tile() function, we have the following simple plot: > # Generate data > pp <- function (n,r=4) { + x <- seq(-r*pi, r*pi, len=n) + df <- expand.grid(x=x, y=x) + df$r <- sqrt(df$x^2 + df$y^2) + df$z <- cos(df$r^2)*exp(-df$r/6) + df + } > p <- ggplot(pp(20), aes(x=x,y=y)) > > p + g...

Can I use rattle on 64-bit R?

Trying to install rattle on a Windows Server 2008 R2 64-bit machine, using 64-bit R version 2.11, I got the following message: install.packages("rattle", dependencies=TRUE) Warning: dependencies ‘RGtk2’, ‘rggobi’, ‘RSvgDevice’, ‘Biobase’, ‘multicore’, ‘marray’, ‘affy’, ‘snowFT’, ‘Rmpi’, ‘rpvm’ are not available When I tried to install one o...

R is plotting labels off the page.

Hi. I'm running the following: png(filename="figure.png", width=900, bg="white") barplot(c(1.1, 0.8, 0.7), horiz=TRUE, border="blue", axes=FALSE, col="darkblue") axis(2, at=1:3, lab=c("elephant", "hippo", "snorkel"), las=1, cex.axis=1.3) dev.off() and the labels on the left are appearing off the page. I can't seem to figure out how t...

Concatenate Row and Column names from Data.Frame

Is there a way to concatenate the row and column names from an existing data.frame into a new data frame? For example, I have column names of (A, B, C) and row names of (1, 2, 3) and I would like to combine these into a 3x3 matrix [A1, B1, C1; A2, B2, C2; A3, B3, C3]. Thanks for your help ...
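A sketch of one way to build such a matrix with outer(), which calls a function on every combination of row name and column name. The data frame below is a made-up example with the names from the question.

```r
# Hypothetical data frame with columns A, B, C and default row names 1, 2, 3
df <- data.frame(A = 1:3, B = 4:6, C = 7:9)

# For every (row name, column name) pair, paste the column name before the row name
m <- outer(rownames(df), colnames(df), function(r, cn) paste0(cn, r))

print(m)
```

The result is a character matrix; as.data.frame(m) would convert it if a data frame is needed.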

Aggregate over several variables in R

Dear overflowers, I have a rather large dataset in a long format where I need to count the number of instances of the ID due to two different variables, A & B. E.g. The same person can be represented in multiple rows due to either A or B. What I need to do is to count the number of instances of ID which is not too hard, but also count t...

Parallel processing in R 2.11 Windows 64-bit using SNOW not quite working

I'm running R 2.11 64-bit on a WinXP64 machine with 8 processors. With R 2.10.1 the following code spawned 6 R processes for parallel processing: require(foreach) require(doSNOW) cl = makeCluster(6, type='SOCK') registerDoSNOW(cl) bl2 = foreach(i=icount(length(unqmrno))) %dopar% { (Some code here) } stopCluster(cl) When I run the...

xts problem with dynlm

Hello, I am trying to use xts as much as possible in my time series work as it seems to be the suggested way of doing things. However, I am getting a strange error. CPI.NSA and INT are xts objects. library(dynlm) CPI.NSA.x <- CPI.NSA[dr1] INT.x <- INT[dr1] CPI.NSA.z <- as.zoo(CPI.NSA.x) INT.z <- as.zoo(INT.x) > dynlm(CPI.NSA.z ~ I...

.NET coupled with MATLAB or R?

I'm writing a program in .NET that will need to utilize the statistical and data analysis functions of R or MATLAB. I have used R but am now contemplating moving to MATLAB since it has a .Net compiler while R can only interface via COM objects. Can anyone recommend going either way? I know MATLAB is infinitely more expensive than R (sinc...

Interactive Charts for web application

We are working on a web-based application (implemented in Java) on commodity prices, and one part of it is interactive charting. I provide a simplified example here. We have a table in a MySQL database where we have information on commodity prices in US states and counties. One aspect of the application is to create interactive plots based ...

What's the best way to annotate this ggplot2 plot? [R]

Here's a plot: library(ggplot2) ggplot(mtcars, aes(x = factor(cyl), y = hp, group = factor(am), color = factor(am))) + stat_smooth(fun.data = "mean_cl_boot", geom = "pointrange") + stat_smooth(fun.data = "mean_cl_boot", geom = "line") + geom_hline(yintercept = 130, color = "red") + annotate("text", label = "130 hp", x = ...

Repeat elements of vector in R

Hi, I'm trying to repeat the elements of vector a, b number of times. That is, a="abc" should be "aabbcc" if b = 2. Why doesn't either of the following code examples work? sapply(a, function (x) rep(x,b)) and from the plyr package, aaply(a, function (x) rep(x,b)) I know I'm missing something very obvious ... ...
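A possible explanation and sketch: "abc" is a character vector of length one, so rep() sees a single element and repeats the whole string. Assuming the goal is to repeat each character, the string can be split first and rep() used with its each argument.

```r
a <- "abc"
b <- 2

# Split the single string into its individual characters
chars <- strsplit(a, "")[[1]]

# Repeat each character b times, then glue the pieces back together
res <- paste(rep(chars, each = b), collapse = "")
print(res)

# If a is already a vector of separate elements, rep() alone is enough
v <- c("a", "b", "c")
res_v <- rep(v, each = b)
```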

Select rows with largest value of variable within a group in R

a.2<-sample(1:10,100,replace=T) b.2<-sample(1:100,100,replace=T) a.3<-data.frame(a.2,b.2) r<-sapply(split(a.3,a.2),function(x) which.max(x$b.2)) a.3[r,] returns the list index, not the index for the entire data.frame. I'm trying to return the largest value of b.2 for each subgroup of a.2. How can I do this efficiently? ...
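One way to sidestep the index problem is to subset inside each group, so that which.max() is applied to the same piece it indexes, and then bind the per-group rows back together. This is a sketch rather than a benchmarked answer; set.seed() is added only to make the example reproducible.

```r
set.seed(1)  # assumption: added for reproducibility, not in the question
a.2 <- sample(1:10, 100, replace = TRUE)
b.2 <- sample(1:100, 100, replace = TRUE)
a.3 <- data.frame(a.2, b.2)

# Within each group, keep the row holding the largest b.2, then stack the rows
top_rows <- do.call(rbind,
                    lapply(split(a.3, a.3$a.2),
                           function(x) x[which.max(x$b.2), ]))

# If only the maxima are needed (not the whole rows), tapply() is enough
group_max <- tapply(a.3$b.2, a.3$a.2, max)
```

which.max() returns the first maximum, so ties within a group yield a single row.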

How can I neatly clean my R workspace while preserving certain objects?

Suppose I'm messing about with some data by binding vectors together, as I'm wont to do on a lazy Sunday afternoon. x <- rnorm(25, mean = 65, sd = 10) y <- rnorm(25, mean = 75, sd = 7) z <- 1:25 dd <- data.frame(mscore = x, vscore = y, caseid = z) I've now got my new dataframe dd, which is wonderful. But there's also ...
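A common idiom for this is rm(list = setdiff(ls(), keep)), which removes everything except the named objects. The sketch below runs it inside local() so that it does not touch the reader's real workspace; the keep variable is an illustrative name, not part of the question.

```r
remaining <- local({
  x <- rnorm(25, mean = 65, sd = 10)
  y <- rnorm(25, mean = 75, sd = 7)
  z <- 1:25
  dd <- data.frame(mscore = x, vscore = y, caseid = z)

  # Names of the objects to preserve (hypothetical helper variable)
  keep <- "dd"

  # Remove every object in this environment that is not listed in keep
  # (this also removes the helper variable keep itself)
  rm(list = setdiff(ls(), keep))

  ls()
})
print(remaining)
```

Typed at the top level, the same rm(list = setdiff(ls(), "dd")) line would act on the global workspace.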

How to mutate rows of data frame - replacing one value with another

I'm having trouble with what I think is a basic R task. Here's my sample dataframe named 'b':

Winner  Color   Size
Tom     Yellow  Med
Jerry   Yellow  Lar
Jane    Blue    Med

where items in the Winner column are factors. I'm trying to change "Tom" in the dataframe to "Tom LLC" and I can't get it done. Here's what I tried: Simple way: b$winner[b$win...
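A sketch of one way to do this when the column is a factor: rename the level rather than assigning into the values, since assigning a string that is not an existing level produces NA with a warning. The data frame is rebuilt here from the sample shown in the question.

```r
# Sample data frame from the question, with Winner stored as a factor
b <- data.frame(Winner = factor(c("Tom", "Jerry", "Jane")),
                Color  = c("Yellow", "Yellow", "Blue"),
                Size   = c("Med", "Lar", "Med"))

# Rename the level "Tom"; every row carrying that level changes with it
levels(b$Winner)[levels(b$Winner) == "Tom"] <- "Tom LLC"

print(b)
```

An alternative would be to convert the column with as.character(), replace the value, and convert back with factor().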

trying to append a list, but something breaks

I'm trying to create an empty list which will have as many elements as there are num.of.walkers. I then try to append, to each created element, a new sub-list (length of new sub-list corresponds to a value in a). When I fiddle around in R everything goes smoothly: list.of.dist[[1]] <- vector("list", a[1]) list.of.dist[[2]] <- vector("list...
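The excerpt cuts off before showing what breaks, so the following is only a sketch of the construction it describes: preallocate the outer list, then fill each element with an empty sub-list whose length comes from a. The values of a are hypothetical.

```r
# Hypothetical sub-list lengths, one per walker
a <- c(3, 1, 2)
num.of.walkers <- length(a)

# Preallocate the outer list with one (NULL) slot per walker
list.of.dist <- vector("list", num.of.walkers)

# Fill each slot with an empty sub-list of the requested length
for (i in seq_len(num.of.walkers)) {
  list.of.dist[[i]] <- vector("list", a[i])
}

# Equivalent one-liner without an explicit loop
list.of.dist2 <- lapply(a, function(n) vector("list", n))
```

Using seq_len() rather than 1:num.of.walkers keeps the loop from running when there are zero walkers.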