r

How to control the dimension / size of a plot with ggplot2

Dear all, I am using ggplot2 (or rather qplot) to generate a report with Sweave. Now I need some help adjusting the size of the plot. I use the following Sweave code to include it. \begin{figure}[htbp] \begin{center} <<fig=true,echo=false>>= print(mygraph) @ \caption{MyCaption} \end{center} \end{figure} If I add a wi...

In R, What is the difference between df["x"] and df$x

Where can I find information on the differences between calling on a column within a data.frame via: df <- data.frame(x=1:20,y=letters[1:20],z=20:1) df$x df["x"] They both return the "same" results, but not necessarily in the same format. Another thing that I've noticed is that df$x returns a list. Whereas df["x"] returns a data.fram...
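A minimal sketch of the difference, using the data.frame from the question and base R only. In this example `df$x` comes back as a plain integer vector, while `df["x"]` stays a one-column data.frame; `df[["x"]]` is shown as a third form for comparison.

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

by_dollar  <- df$x       # extracts the column itself
by_bracket <- df["x"]    # keeps the data.frame wrapper (one column, 20 rows)
by_double  <- df[["x"]]  # extracts the column, like $

print(class(by_dollar))
print(class(by_bracket))
print(identical(by_dollar, by_double))
```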

Does R follow BEDMAS - strictly?

Perhaps this is a ridiculous question. Has anyone ever experienced a situation where R does not follow BEDMAS (Brackets, Exponents, Division, Multiplication, Addition, Subtraction) ...
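A few expressions that are worth checking when this question comes up. They are a sketch of R's documented operator precedence (see `?Syntax`), not an exhaustive list; the variable names are only for illustration.

```r
neg_sq     <- -2^2        # ^ binds tighter than unary minus: -(2^2)
tower      <- 2^3^2       # ^ is right-associative: 2^(3^2)
left_right <- 6 / 2 * 3   # / and * share a level and run left to right
seq_scaled <- 1:3 * 2     # : binds tighter than *, so (1:3) * 2
mod_first  <- 2 * 5 %% 3  # %% binds tighter than *, so 2 * (5 %% 3)

print(c(neg_sq, tower, left_right, mod_first))
print(seq_scaled)
```

The cases most likely to surprise are the unary minus and the `:` and `%%` operators, which BEDMAS does not mention; explicit brackets remove the ambiguity.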

R: ddply() Possible to reuse generated columns?

I have a script where I'm using ddply, as in the following example: ddply(df, .(col), function(x) data.frame( col1=some_function(x$y), col2=some_other_function(x$y) ) ) Within ddply, is it possible to reuse col1 without calling the entire function again? For example: ddply(df, .(col), function(x) data.frame( col1=some_function(x$y)...

Moving columns within a data.frame() without retyping

Is there a method for moving a column from one position in a data.frame to another without typing an entirely new data.frame()? For example: a <- b <- c <- d <- e <- f <- g <- 1:100 df <- data.frame(a,b,c,d,e,f,g) Now let's say I wanted "g" in front of "a". I could retype it as df <- data.frame(g,a,b,c,d,e,f) But is there not...
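One possible approach, sketched with base R: index the data.frame by a reordered vector of its own column names, so nothing has to be retyped. `setdiff()` is used here to mean "every other column, in the existing order".

```r
a <- b <- c <- d <- e <- f <- g <- 1:100
df <- data.frame(a, b, c, d, e, f, g)

# Put "g" first, followed by all remaining columns in their current order.
df <- df[c("g", setdiff(names(df), "g"))]

print(names(df))
```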

Creating multiple subsets all in one data.frame (possibly with ddply)

I have a large data.frame, and I'd like to be able to reduce it by using a quantile subset by one of the variables. For example: x <- c(1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10) df <- data.frame(x,rnorm(100)) df2 <- subset(df, df$x == 1) df3 <- subset(df2, df2[2] > quantile(df2$rnorm.100.,0.8)) What I would like to end up wi...

How to use ggplot2 graphics inside minipage with Sweave?

Here's my code that's supposed to display two graphics next to each other, but fails to do so. In fact the Sweave part is not interpreted. \begin{figure}[h] \begin{center} \begin{minipage}[t]{.485\linewidth} % <<fig=true,echo=false>>= print(graph2) @ \newline{\color{red}{\caption{\label{idx}Graph one}}} \end{minipage} \hspace...

R heights graph - relative to birth date

I have a group of people, with a space-separated text file for each person. In these files, the right value indicates the height of that person in cm and the left value indicates the date in %d/%m/%Y format: 09/05/1992 0 17/03/1993 50 02/08/1994 65.5 03/12/1995 72 A height of 0 marks the birth date of the person. This R script draws ...

Overlay multiple stat_function calls in ggplot2

I have two data.frames, one containing raw data and the other containing modelling coefficients that I have derived from the raw data. More detail: The first data.frame "raw" contains "Time" (0s to 900s) and "OD" for many Variants and four runs. The second data.frame "coef" contains one row per Variant/run combination, with the individ...

Iterating over the big matrix containing 3000 rows and calculate the correlation.

Hello, I am trying to loop over a matrix, compute the correlation coefficient for each pair of rows, and print out the correlation matrix. ID A B C D E F G H I Row01 0.08 0.47 0.94 0.33 0.08 0.93 0.72 0.51 0.55 Row02 0.37 0.87 0.72 0.96 0.20 0.55 0.35 0.73 0.44 Row03 0.19 0.71 0.52 0.73 0.03 0.18 0.13 0.13 0.30 Row04 0.08 0.77 0.89 0.12 0...
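A sketch of a loop-free way to get row-by-row correlations, assuming the data fit in memory as a numeric matrix. `cor()` correlates columns, so transposing first yields one coefficient per pair of rows. Only the three complete rows from the excerpt are used; the object names `m` and `row_cor` are made up for this example.

```r
m <- rbind(
  Row01 = c(0.08, 0.47, 0.94, 0.33, 0.08, 0.93, 0.72, 0.51, 0.55),
  Row02 = c(0.37, 0.87, 0.72, 0.96, 0.20, 0.55, 0.35, 0.73, 0.44),
  Row03 = c(0.19, 0.71, 0.52, 0.73, 0.03, 0.18, 0.13, 0.13, 0.30)
)
colnames(m) <- LETTERS[1:9]

# cor() works column-wise, so transpose to correlate rows with each other.
row_cor <- cor(t(m))

print(round(row_cor, 3))
```

For 3000 rows this gives a 3000 x 3000 matrix in one call, which can then be written out with `write.table()` if a file is needed.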

how to rescale two-dimensional plot for printing

Here's my code: x <- rnorm(1000) y <- rnorm(1000) plot(x,y) I just created two standard normal vectors of size 1,000 and plotted them in the xy-plane. When I look at the GUI, the scatter looks "spherical" and the axes are scaled equivalently. Cool. But when I print the image, the x-axis elongates, and so the scatter no longer is sph...
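One option to try, sketched with base graphics: the `asp` argument of `plot()` fixes the ratio of y units to x units, so the scatter keeps its shape whatever the page dimensions are. The PDF device, the temporary file, and the 6 x 6 inch size are assumptions made for this example.

```r
set.seed(1)  # only so the example is reproducible
x <- rnorm(1000)
y <- rnorm(1000)

out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 6, height = 6)  # assumed page size, in inches
plot(x, y, asp = 1)                   # one x unit is drawn as long as one y unit
dev.off()

print(file.exists(out_file))
```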

Iterating over the big matrix containing 3000 rows and calculate the correlation.-Follow-up!

Thanks Nico! Almost got there after I corrected small bugs. Here I attach my script: datamatrix=read.table("ref.txt", sep="\t", header=T, row.names=1) correl <- NULL for (i in 1:nrow(datamatrix)) { correl <- apply(datamatrix, 1, function(x) {cor(t(datamatrix[, i]))}) write.table(correl, paste(row.names(datamatrix)[i], ".txt", sep...

Learning R. Where does one Start?

I've been using R for a little over a year now and it's been a successful venture. But all too often, I find that there is something that I can't figure out for lack of knowing how to find it or an example of it. Stack Overflow, could you recommend a pathway for learning R in a manner that provides one with a toolset at their disposal t...

How to adjust line size in geom_line without obtaining another (useless) legend?

Dear all, I'd like to adjust the size of my lines (both of them), because I feel they're too skinny. The following code does so, but creates a legend for size, which is useless since size has no variable that can be mapped to it. qplot(date,value,data=graph1,geom="line",colour=variable,xlab="",ylab="",size=1) + scale_y_continuous(lim...

Searching R help for "for" and "repeat" loop(s) help file

I'm trying to load the page that describes these 'functions'. However, the R console on Windows seems to hate me; it just returns + ?for ?repeat ...
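The `+` prompt appears because `for` and `repeat` are reserved words, so the parser waits for the rest of a loop. A minimal sketch of the usual workaround is to quote the topic. The interactive forms are left as comments so the block does not open a pager; the variable name `topic` is only illustrative.

```r
# At the console, quote reserved words so they are read as help topics:
#   ?"for"
#   ?`repeat`
#   help("repeat")

# The same lookup, captured in a variable instead of being displayed:
topic <- help("for")
print(class(topic))
```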

Adding labels to data with ddply while subsetting.

Let's say I have a data.frame like: x <- c(1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10) df <- data.frame(x=x,y=rnorm(100)) and I want to label values that are sorted (descending) in the 80th percentile for each value of x (1:10). I can get the quantiles and order the data, without issue like this: df <- ddply(df, .(x), subset, ...

How to break a large CSV data file into individual data files using R?

I have a CSV file whose first row contains the variable names and whose remaining rows contain the data. What's a good way to break it up into files each containing just one variable in R? Is this solution going to be robust? E.g. what if the input file is 100G in size? The input file looks like var1,var2,var3 1,2,hello 2,5,...
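A minimal sketch for a file that fits in memory, using base R only. The sample rows, the temporary paths, and the one-file-per-variable naming scheme are assumptions made for illustration. A 100G input would not load this way and would need to be processed in chunks or outside R.

```r
# Toy input standing in for the real file (contents are made up).
in_file <- tempfile(fileext = ".csv")
writeLines(c("var1,var2,var3", "1,2,hello", "2,5,world"), in_file)

dat <- read.csv(in_file, stringsAsFactors = FALSE)

out_dir <- tempfile("split_")
dir.create(out_dir)

# Write each column to its own CSV, named after the variable.
for (v in names(dat)) {
  write.csv(dat[v], file.path(out_dir, paste0(v, ".csv")), row.names = FALSE)
}

print(list.files(out_dir))
```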

How do I set up rpy2?

Hi, I just downloaded rpy2 and Python 2.6. When I try to run some example code I found on the internet, I get this error. Can anyone explain why this is happening and how I can fix it? Thanks. import rpy2.robjects as RO Traceback (most recent call last): File "<pyshell#0>", line 1, in <module> import rpy2.robjects as RO File "C...

Debian/Ubuntu r-base-*, r-cran-*, revolution-r packages: porting to ArchLinux

Recently I've migrated to Arch Linux, after ~4 years of being loyal to Ubuntu. Everything works like a charm, it's noticeably faster than Ubuntu, and IMHO it's easier to customise, but when it comes to support for R, well, Ubuntu takes the medal. I'm not willing to do another distro-shuffle and switch back to Ubuntu, while Debian is just "to...

Generate new time-lagged variable in R

I'm creating a time-series object with new variables using the transform() function in R and cannot find the proper function to calculate the difference in variable C between today and yesterday. This is what I've got so far: O H L C Typical Range 2010-07-23 1092.17 1103.73 1087.88 1102.66 1098.090 ...
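One way this could look with a plain data.frame, sketched in base R: `diff()` returns today's value minus yesterday's, and padding with a leading `NA` keeps the result aligned with the rows. Only the last close (1102.66) comes from the excerpt; the two earlier rows and the column name `dC` are invented for the example, and a real time-series class may need its own lag method.

```r
prices <- data.frame(
  Date = as.Date(c("2010-07-21", "2010-07-22", "2010-07-23")),
  C    = c(1090.00, 1095.50, 1102.66)  # first two values are made up
)

# Day-over-day change in C; the first row has no previous day, hence NA.
prices <- transform(prices, dC = c(NA, diff(C)))

print(prices)
```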