r

How can I alter the appearance of nodes in igraph?

I would like to lay out graphs (trees) with two types of nodes: boxes and circles. Is this possible with igraph, and what would a minimal example look like? ...

How can I declare a thousand separator in read.csv?

The dataset I want to read in contains numbers with and without a comma as thousand separator: "Sudan", "15,276,000", "14,098,000", "13,509,000" "Chad", 209000, 196000, 190000 and I am looking for a way to read this data in. Any hint appreciated! ...
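One possible approach, sketched on a toy version of the data above: read every column as character, then strip the commas before converting to numeric. The object names `txt` and `raw` are made up for illustration, and the toy input deliberately has no spaces after the separators.

```r
# Toy input: some numbers are quoted with thousand separators, some are plain.
txt <- '"Sudan","15,276,000","14,098,000","13,509,000"
"Chad",209000,196000,190000'

# Read everything as character so the commas survive parsing.
raw <- read.csv(text = txt, header = FALSE, colClasses = "character")

# Drop the thousand separators and convert every column except the first.
raw[-1] <- lapply(raw[-1], function(col) as.numeric(gsub(",", "", col)))

str(raw)
```

For a real file, `text = txt` would be replaced by the file path; the rest should carry over unchanged as long as the first column is the only non-numeric one.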

Can't draw Histogram, 'x' must be numeric

I have a data file with this format: Weight Industry Type 251,787 Kellogg h 253,9601 Kellogg a 256,0758 Kellogg h .... I read the data and try to draw a histogram with these commands: ce= read.table("file.txt", header= T) we = ce[,1] in = ce[,2] ty = ce[,3] hist(we) But I get this error: Error en hist.default(we) : '...
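A likely cause, assuming the commas in the Weight column are decimal marks: `read.table` expects a period by default, so the column does not come in as numeric. A minimal sketch using the three rows shown above (`txt` is a stand-in for the file contents):

```r
# Stand-in for file.txt, copied from the rows in the question.
txt <- "Weight Industry Type
251,787 Kellogg h
253,9601 Kellogg a
256,0758 Kellogg h"

# dec = "," tells read.table that the comma is the decimal mark.
ce <- read.table(text = txt, header = TRUE, dec = ",")

we <- ce$Weight
is.numeric(we)

# Compute the histogram; plot(h) or a plain hist(we) draws it.
h <- hist(we, plot = FALSE)
```

Separately, `in` is a reserved word in R, so `in = ce[,2]` cannot be used as a variable name; something like `ind` would be needed instead.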

Combining 2 columns into 1 column many times in a very large dataset in R

The clumsy solutions I am working on are not going to be very fast if I can get them to work and the true dataset is ~1500 X 45000 so they need to be fast. I am definitely at a loss for 1) at this point although have some code for 2) and 3). Here is a toy example o...

Using R in Processing through rJava/JRI?

Hi guys, Is it possible to run R in Processing through rJava/JRI? If I deployed a Processing app on the web, would the client need R on their system? I'm looking to create an interactive information dashboard that I can deploy on the web. It seems that Processing is probably my best bet for the interactive/web part of things. Unfortun...

R interview questions..

Hi, What are some common R interview questions? I'm not sure what the must-knows are for someone who claims to have working knowledge of R so I'd like to test myself. Also, if you were an interviewer and looking for an R person, what would you ask? Thanks.. -k ...

Subset a data.frame by list and apply function on each part, by rows

This may seem like a typical plyr problem, but I have something different in mind. Here's the function that I want to optimize (skip the for loop). # dummy data set.seed(1985) lst <- list(a=1:10, b=11:15, c=16:20) m <- matrix(round(runif(200, 1, 7)), 10) m <- as.data.frame(m) dfsub <- function(dt, lst, fun) { # check whether dt is `...

Insert line breaks in long string -- word wrap

Here is a function I wrote to break a long string into lines not longer than a given length strBreakInLines <- function(s, breakAt=90, prepend="") { words <- unlist(strsplit(s, " ")) if (length(words)<2) return(s) wordLen <- unlist(Map(nchar, words)) lineLen <- wordLen[1] res <- words[1] lineBreak <- paste("\n", prepend, sep...
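For comparison, base R already ships `strwrap()`, which breaks text at word boundaries. This is only a sketch of that alternative, not a drop-in replacement for `strBreakInLines`; the sample string and the width of 40 are arbitrary.

```r
# Arbitrary sample text, long enough to need several lines.
s <- paste(rep("lorem ipsum dolor sit amet", 8), collapse = " ")

# strwrap() returns one element per output line.
lines <- strwrap(s, width = 40)

# Join the lines with newlines to get a single wrapped string.
wrapped <- paste(lines, collapse = "\n")
cat(wrapped, "\n")
```

Note that `strwrap()` also collapses runs of whitespace, so it may not suit input where spacing must be preserved.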

How (and why) do you use contrasts (in R) ?

I am sorry for asking such a basic question, but I can't seem to put my head around this or find a satisfying answer. I checked ?contrasts and ?C - both lead to "Chapter 2 of Statistical Models in S", which is not readily available to me. In what cases do you create contrasts (in R) in your analysis? How is it done and what is it us...
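As a starting point for the "how" part, here is a small sketch of inspecting and changing the coding of a factor. The factor `f` and its levels are invented for illustration, and the comment about the default assumes `options("contrasts")` has not been changed.

```r
# A made-up three-level factor.
f <- factor(c("low", "mid", "high", "mid"), levels = c("low", "mid", "high"))

# Show the coding matrix currently attached to f
# (treatment coding by default for unordered factors).
contrasts(f)

# Switch to sum-to-zero coding and look again.
contrasts(f) <- contr.sum(3)
contrasts(f)
```

The coding matrix decides what each coefficient of a model such as `lm(y ~ f)` is compared against, which is usually the reason to change it.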

How to merge two data.frames together in R, referencing a lookup table.

I am trying to merge two data.frames together, based on a common column name in each of them called "series_id". Here is my merge statement: merge(test_growth_series_LUT, test_growth_series, by = intersect(series_id, series_id)) The error I'm getting is "Error in as.vector(y) : object 'series_id' not found" The help gives this de...
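The error suggests that R is looking for an object named `series_id`, because `intersect(series_id, series_id)` is evaluated before `merge` sees it. The `by` argument takes column names as character strings. A minimal sketch with two invented data frames (`lut` and `growth` stand in for the real tables):

```r
# Invented stand-ins for the lookup table and the data table.
lut <- data.frame(series_id = c(1, 2, 3),
                  name = c("a", "b", "c"))
growth <- data.frame(series_id = c(1, 1, 2, 3),
                     value = c(0.1, 0.2, 0.3, 0.4))

# Pass the shared column name as a string.
merged <- merge(lut, growth, by = "series_id")
merged
```

If the only shared column is `series_id`, leaving `by` out should give the same result, since the default is `intersect(names(x), names(y))`.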

Fixed effects regression in R (with a very large number of dummy variables)

Is there an easy way to do a fixed-effects regression in R when the number of dummy variables leads to a model matrix that exceeds the R maximum vector length? E.g., > m <- lm(log(bid) ~ after + I(after*score) + id, data = data) Error in model.matrix.default(mt, mf, contrasts) : cannot allocate vector of length 905986769 where id is...
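One common workaround is the within transformation: subtract the group mean from the response and from each regressor, then fit without the dummies. The sketch below uses simulated data with invented names (`d`, `demean`), not the bid data from the question, and compares the slope against the dummy-variable fit on a problem small enough for both.

```r
set.seed(42)

# Simulated panel: 50 groups, 10 observations each, true slope 2.
n_id <- 50
d <- data.frame(id = factor(rep(seq_len(n_id), each = 10)))
d$x <- rnorm(nrow(d))
alpha <- rnorm(n_id)                      # group-specific intercepts
d$y <- 2 * d$x + alpha[d$id] + rnorm(nrow(d), sd = 0.1)

# Subtract the group mean from a variable.
demean <- function(v, g) v - ave(v, g)

# Within estimator: no dummy columns in the model matrix.
fit_within <- lm(demean(y, id) ~ demean(x, id) - 1, data = d)

# Dummy-variable fit for comparison (feasible only because n_id is small).
fit_dummy <- lm(y ~ x + id, data = d)

coef(fit_within)
coef(fit_dummy)["x"]
```

The slopes should agree, but the standard errors from `fit_within` do not account for the estimated group means, so they would need a degrees-of-freedom correction before being reported.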

In R, how do I set an S4 class based on another object's class

I need to create an object of type ShortReadQ from Bioconductor's ShortRead library. ShortReadQ 'signature(sread = "DNAStringSet", quality = "QualityScore", id = "BStringSet")' The quality slot needs to be an object inheriting from QualityScore, which I can easily determine from another ShortReadQ object that I wish to em...

How to add a title to a ggplot when the title is a variable name?

At the end of a ggplot, this works fine: + opts(title = expression("Chart chart_title...")) But this does not: chart_title = "foo" + opts(title = expression(chart_title)) nor this: chart_title = "foo" + opts(title = chart_title) How can I add a title to a ggplot when the title is a variable name? ...

R: Applying nlminb to subsets of data (by index or label) and store what the program returns as a new data frame

I was wondering if anyone could kindly help me on this seemingly easy task. I'm using nlminb to conduct optimization and compute some statistics by index. Here's an example from nlminb help. > x <- rnbinom(100, mu = 10, size = 10) > hdev <- function(par) { + -sum(dnbinom(x, mu = par[1], size = par[2], log = TRUE)) + } > nlminb(c(9,...
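A base R sketch of one way to do this: split the data by the index, run `nlminb` on each piece, and bind the results into a data frame. The data frame `dat`, its columns, and the helper `fit_one` are assumptions made for illustration; the objective follows the `hdev` example above but takes the data as an explicit argument.

```r
set.seed(1)

# Invented data: three groups of negative binomial draws.
dat <- data.frame(idx = rep(c("a", "b", "c"), each = 100),
                  x = rnbinom(300, mu = 10, size = 10))

# Negative log-likelihood, with the data passed in explicitly.
hdev <- function(par, x) {
  -sum(dnbinom(x, mu = par[1], size = par[2], log = TRUE))
}

# Fit one group and return the pieces of interest as a one-row data frame.
fit_one <- function(x) {
  fit <- nlminb(c(9, 12), hdev, x = x, lower = 0.001)
  data.frame(mu = fit$par[1], size = fit$par[2],
             objective = fit$objective, convergence = fit$convergence)
}

# One fit per index value, stacked into a single data frame.
res <- do.call(rbind, lapply(split(dat$x, dat$idx), fit_one))
res$idx <- rownames(res)
res
```

Which elements of the `nlminb` result to keep is a matter of choice; `fit_one` can return any one-row data frame.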

How to add a condition to the geom_point size?

I am trying to add a condition to geom_point size and I've pasted my example below. I would like geom_point size to be 2 when n_in_stat is 4 or more, and size = 5 when n_in_stat is less than 4. I tried putting an ifelse statement inside the geom_point, but this failed. Perhaps I can't include logical operators here and I have to crea...

ggplot2 geom_area overlapping instead of stacking

I'm trying to generate a stacked area plot, but instead, ggplot makes overlapping areas. I've tried other examples that seem analogous to me, but they work and mine doesn't. > cx date type visitors 1 2009-11-23 A 2 2 2010-01-07 A 4 3 2010-01-09 A 6 4 2010-02-07 A 8 5 2009-12-02 B...

How to change current Plot Window Size (in R)

For example. Assume I do: dev.new(width=5, height=4) plot(1:20) And now I wish to do plot(1:40) But I want a bigger window for it. I would guess that the way to do it would be (assuming I don't want to open a new window) to do plot(1:40, width=10, height=4) Which of course doesn't work. The only solution I see to it would be t...

Melting a cast data frame gives incorrect output (Hadley Wickham's reshape package in R)

I've encountered a strange behaviour in cast/melt from Hadley Wickham's reshape package. If I cast a data frame, and then try to melt it, the melt comes out wrong. Manually unsetting the "df.melt" class from the cast dataframe lets it be melted properly. Does anyone know if this is intended behaviour, and if so, what is the use case whe...

How to get row index number in R?

Suppose I have a list or data frame in R, and I would like to get the row index, how do I do that? ...
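Depending on what "row index" means here, a couple of base R idioms may help. The data frame `df` is invented for illustration.

```r
# Invented example data.
df <- data.frame(x = c(5, 12, 7, 20), y = c("a", "b", "c", "d"))

# Index of every row.
all_rows <- seq_len(nrow(df))     # 1 2 3 4

# Indices of the rows that satisfy a condition.
big_rows <- which(df$x > 6)       # 2 3 4

all_rows
big_rows
```

`which()` works the same way on any logical vector, so the condition can combine several columns.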

modify lm or loess function to use it within ggplot2's geom_smooth

I need to modify the lm (or eventually loess) function so I can use it in ggplot2's geom_smooth (or stat_smooth). For example, this is how stat_smooth is used normally: > qplot(data=diamonds, carat, price, facets=~clarity) + stat_smooth(method='lm') I would like to define a custom 'lm2' function to use as the value for the 'method' para...