r

Controlling the number of panels in a lattice plot with R

How do I limit the number of panels shown on a single page using lattice? I am graphing the results of a regression for multiple states and putting 50 of these on a single page makes them unreadable. I would like to limit the output to 4 wide and as many tall as needed. Here's my lattice code: xyplot(Predicted_value + Actual_value ~ x...

Plotting two vectors of data on a ggplot2 scatter plot using R

I've been experimenting with both ggplot2 and lattice to graph panels of data. I'm having a little trouble wrapping my mind around the ggplot2 model. In particular, how do I plot a scatter plot with two sets of data on each panel? In lattice I could do this: xyplot(Predicted_value + Actual_value ~ x_value | State_CD, data=dd) and tha...

Reselling an R project

I started working on a project for data analysis. It is becoming quite powerful and I am interested in making it available for sale. However, one of the many components of the analysis process includes the use of R. R has a GPL License (not LGPL). What are the best options for making the system available for sale? I have considered ...

Why doesn't "+" operate on characters in R?

Call me lazy, but I just hate typing things like paste("a","b",sep='') all the time. I know that "(t)his is R. There is no if, only how." (library(fortunes); fortune(109)). So, my follow-up question is: Is it possible to easily change this behavior? ...

Most used data wrangling methods/commands in R

This morning I was reading an interview with Bradford Cross of FlightCaster over on the DataWrangling blog. Bradford mentions in the interview that one of his biggest challenges has been the wrangling of their data. Even though I am an economist I find that I spend more than 70% of my time doing reformatting, ETL, cleaning, etc. I spend ...

Using COM in R language

I am trying to get the rcom package for R working. It seems to have installed ok: > install.packages("rcom"); --- Please select a CRAN mirror for use in this session --- trying URL 'http://mira.sunsite.utk.edu/CRAN/bin/windows/contrib/2.9/rcom_2.2-1.zip' Content type 'application/zip' length 204632 bytes (199 Kb) opened URL downloaded...

What does e.g. %+% do in R?

This is a very basic question, but apparently Google is not very good at searching for strings like "%+%". So my question is: what is "%+%" and similar, and when is it used? I guess it's a kind of merge? EDIT: OK, I believe my question is answered. %X% is a binary operator of some kind. So now I think I will google around for knowledge about ...

How do I make a matrix from a list of vectors in R?

Goal: from a list of vectors of equal length, create a matrix where each vector becomes a row. Example: > a <- list() > for (i in 1:10) a[[i]] <- c(i,1:5) > a [[1]] [1] 1 1 2 3 4 5 [[2]] [1] 2 1 2 3 4 5 [[3]] [1] 3 1 2 3 4 5 [[4]] [1] 4 1 2 3 4 5 [[5]] [1] 5 1 2 3 4 5 [[6]] [1] 6 1 2 3 4 5 [[7]] [1] 7 1 2 3 4 5 [[8]] [1] 8 1 2 3...

Speed of R Programming Language

R is a scripting language, but is it fast? It seems like a lot of the packages used by R are actually compiled from C. Does anyone have an example of how using languages like C or Java instead of R caused a noticeable increase in speed? Is there a fair benchmark somewhere? Is there a C (or any other compiled) library that has many of...

Rotating and spacing axis labels in ggplot2

I have a plot where the x-axis is a factor whose labels are long. While probably not an ideal visualization, for now I'd like to simply rotate these labels to be vertical. I've figured this part out with the code below, but as you can see, the labels aren't totally visible. data(diamonds) diamonds$cut <- paste("Super Dee-Duper",as.cha...

Renaming large IDs in R

Suppose I have a data.frame with N rows. The id column has 10 unique values; all those values are integers greater than 1e7. I would like to rename them to be numbered 1 through 10 and save these new IDs as a column in my data.frame. Additionally, I would like to easily determine 1) id given id.new and 2) id.new given id. For example...

Why can't R's ifelse statements return vectors?

I've found R's ifelse statements to be pretty handy from time to time. For example: > ifelse(TRUE,1,2) [1] 1 > ifelse(FALSE,1,2) [1] 2 But I'm somewhat confused by the following behavior. > ifelse(TRUE,c(1,2),c(3,4)) [1] 1 > ifelse(FALSE,c(1,2),c(3,4)) [1] 3 Is this 1) a bug or 2) a design choice that's above my paygrade? ...

How to make topographic map from sparse sampling data?

I need to make a topographic map of a terrain for which I have only fairly sparse samples of (x, y, altitude) data. Obviously I can't make a completely accurate map, but I would like one that is in some sense "smooth". I need to quantify "smoothness" (probably the reciprocal of the average of the square of the surface curvature) and I wan...

Maintaining an input / output log in R

Is there an easy way to have R record all input and output from your R session to disk while you are working with R interactively? In R.app on Mac OS X I can do a 'File->Save...', but it isn't much help in recovering the commands I had entered when R crashes. I have tried using sink(...,split=T) but it doesn't seem to do exactly what I...

R package installation question.

I have two questions. 1) How do I locate the default Rprofile which is running? I have not set up an Rprofile yet, so I am not sure where it is running from. 2) I am trying to install a few packages using the command (after doing a sudo in the main terminal) install.packages("RODBC","/home/rama/R/i486-pc-linux-gnu-library/2.9")...

Getting foreach() and ggplot2 to get along

I have a set of survey data, and I'd like to generate plots of a particular variable, grouped by the respondent's country. The code I have written to generate the plots so far is: countries <- isplit(drones, drones$v3) foreach(country = countries) %dopar% { png(file = paste(output.exp, "/Histogram of Job Satisfaction in ", country$ke...

How to avoid a loop in R: selecting items from a list

I could solve this using loops, but I am trying to think in vectors so my code will be more R-esque. I have a list of names. The format is firstname_lastname. I want to get out of this list a separate list with only the first names. I can't seem to get my mind around how to do this. Here's some example data: t <- c("bob_smith","mary_jane...

Tricks to manage the available memory in an R session?

What tricks do people use to manage the available memory of an interactive R session? I use the functions below [based on postings by Petr Pikal and David Hinds to the r-help list in 2004] to list (and/or sort) the largest objects and to occasionally rm() some of them. But by far the most effective solution was ... to run under 64-bit ...

Finding a curve to match data

I'm looking for a non-linear curve fitting routine (probably most likely to be found in R or Python, but I'm open to other languages) which would take x,y data and fit a curve to it. I should be able to specify as a string the type of expression I want to fit. Examples: "A+B*x+C*x*x" "(A+B*x+C*x*x)/(D*x+E*x*x)" "sin(A+B*x)*exp(C+D*x)+...

How to Superimpose Multiple Density Curves Into One Plot in R

I have data that looks like this, and I intend to draw multiple density curves in one plot, where each curve corresponds to a unique ID. I tried to use the "sm" package with this code, but without success. library(sm) dat <- read.table("mydat.txt"); plotfn <- ("~/Desktop/flowgram_superimposed.pdf"); pdf(plotfn); sm.density.compare...