r

Resources for learning SAS if you are already familiar with R

I would like to learn some SAS because I am interested in a few industries that tend to use it exclusively. However, I don't want to get stuck with a resource that assumes I know nothing about statistical programming. Is there a good guide for programmers with statistics experience in R? Thanks, Steven ...

csv file with multiple time-series

Hi I've imported a csv file with lots of columns and sections of data. v <- read.csv2("200109.csv", header=TRUE, sep=",", skip="6", na.strings=c("")) The layout of the file is something like this: Dataset1 time, data, ..... 0 0 0 <NA> 0 0 Dataset2 time, data, ..... 00:00 0 0 <NA> 0 0 (The headers o...

Does R have quote-like operators like Perl's qw()?

Anyone know if R has quote-like operators like Perl's qw() for generating character vectors? ...

Writing a GUI for the BRCAPRO Cancer Gene Risk Calculation Engine

I think this is a completely unique question on Stack Overflow. First some background: I've been asked to write a new GUI on top of a calculation engine called BRCAPRO (brack-a-pro). BRCAPRO implements a Mendelian computational model based on a piece of software called BayesMendel. BRCAPRO calculations are used by doctors and surgeons...

How can I sort the X axis in a Barplot in R?

Hi, I have binned data that looks like this: (8.048,18.05] (-21.95,-11.95] (-31.95,-21.95] (18.05,28.05] (-41.95,-31.95] 81 76 18 18 12 (-132,-122] (-122,-112] (-112,-102] (-162,-152] (-102,-91.95] 6 6 6 ...

Learning Applied Statistics with a focus on R

I know MIT and Stanford have placed many videos online of their courses. Does anybody know of a course (with videos available online) of Applied Statistics? I've been playing with R and the tool (from a technical side) is pretty straightforward. However, I'm quite clueless when it comes to the statistical side (regressions, recursive p...

What is the best practice for handling time in R?

I am working with a survey dataset. It has two string vectors, start and finish, indicating the time of day when the interview was started and finished, respectively. They are character strings that look like: "9:24 am", "12:35 pm", and so forth. I am trying to calculate the duration of the interview based on these two. What is the...

What is the best way to avoid passing a data frame around?

I have 12 data frames to work with. They are similar and I have to do the same processing to each one, so I wrote a function that takes a data frame, processes it, and then returns a data frame. This works. But I am afraid that I am passing around a very big structure. I may be making temporary copies (am I?) This can't be efficient. Wha...

What is the Y function?

A friend of mine asked me if I understood the Y function. I didn't even know what it was. ? Y did not get me anywhere. What is it? ...

How can I remove an element from a list?

I have a list and I want to remove a single element from it. How can I do this? I've tried looking up what I think the obvious names for this function would be in the reference manual and I haven't found anything appropriate. ...

Cumulative Plot with Given X-Axis in R

Dear all, I have data that looks like this, and I want to plot the cumulative value of dat1 against the x-axis, together with dat2. #x-axis dat1 dat2 -10 0.0140149 0.0140146 -9 0.00890835 0.00891768 -8 0.00672276 0.00672488 -7 0.00876399 0.00879401 -6 0.00806...

Column Stores: Comparing Column Based Databases

I've really been struggling to make SQL Server into something that, quite frankly, it will never be. I need a database engine for my analytical work. The DB needs to be fast and does NOT need all the logging and other overhead found in typical databases (SQL Server, Oracle, DB2, etc.) Yesterday I listened to Michael Stonebraker speak a...

Library/tool for drawing ternary/triangle plots

I need to draw ternary/triangle plots representing mole fractions (x, y, z) of various substances/mixtures (x + y + z = 1). Each plot represents iso-valued substances, e.g. substances which have the same melting point. The plots need to be drawn on the same triangle with different colors/symbols and it would be nice if I could also conne...

How expensive is it to compute the eigenvalues of a matrix?

How expensive is it to compute the eigenvalues of a matrix? What is the complexity of the best algorithms? How long might it take in practice if I have a 1000x1000 matrix? I assume it helps if the matrix is sparse? Are there any cases where the eigenvalue computation would not terminate? In R, I can compute the eigenvalues as in t...

How do I color edges or draw rects correctly in an R dendrogram?

I generated this dendrogram using R's hclust(), as.dendrogram() and plot.dendrogram() functions. I used the dendrapply() function and a local function to color leaves, which is working fine. I have results from a statistical test that indicate if a set of nodes (e.g. the cluster of "+v_stat5a_01" and "+v_stat5b_01" in the lower...

R Random Forests Variable Importance

I am trying to use the random forests package for classification in R. The Variable Importance Measures listed are: -mean raw importance score of variable x for class 0 -mean raw importance score of variable x for class 1 -MeanDecreaseAccuracy -MeanDecreaseGini Now I know what these "mean" as in I know their definitions. What I wa...

Plots without titles/labels in R

In R, is there any way to produce plots which have no title and which use the space the title would otherwise have taken up? In plot(), main, sub, xlab, and ylab all default to NULL, but this just leaves blank space where they would have been; ditto for setting them to ''. It would be nice if not including them meant that the entire plo...

Finding row index containing maximum value using R

Given the following matrix, let's assume I want to find the maximum value in column two. [1,2,3; 7,8,9; 4,5,6] I know max(matrix[,2]) will return 8. How can I return the row index, in this case row two? ...

Calculating moving average in R

I'm trying to use R to calculate the moving average over a series of values in a matrix. The normal R mailing list search hasn't been very helpful though. What function in R or that is available as a package will allow me to calculate moving averages? Thanks ...

Suppressing "null device" output with R in batch mode

I have a number of bash scripts which invoke R scripts for plotting things. Something like: #!/bin/bash R --vanilla --slave <<RSCRIPT cat("Plotting $1 to $2\n") input <- read.table("$1") png("$2") plot(as.numeric(input[1,])) dev.off() RSCRIPT The problem is that despite --slave, the call to dev.off() prints the message null device ...