r

In R, how to count TRUE values in a logical vector

In R, what is the most efficient/idiomatic way to count the number of TRUE values in a logical vector? I can think of two ways:

    > z<-sample(c(TRUE,FALSE),1000,rep=TRUE)
    > sum(z)
    [1] 498
    > table(z)["TRUE"]
    TRUE
     498

Which do you prefer? Is there anything even better ...
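A minimal sketch of the two approaches from the question, plus `NA` handling. The fixed seed and the names `n_sum`, `n_tab`, and `z_na` are assumptions added for illustration only.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible
z <- sample(c(TRUE, FALSE), 1000, replace = TRUE)

# sum() coerces TRUE to 1 and FALSE to 0, so it counts the TRUE values directly.
n_sum <- sum(z)

# table() builds a full frequency table first; note that indexing by "TRUE"
# yields NA when the vector contains no TRUE at all.
n_tab <- unname(table(z)["TRUE"])

# With missing values, sum() returns NA unless na.rm = TRUE is given.
z_na <- c(TRUE, NA, FALSE, TRUE)
n_na <- sum(z_na, na.rm = TRUE)
```

`sum(z)` is usually the shorter choice for a plain count, while `table(z)` is handy when the FALSE count is wanted as well.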

Extract a regular expression match in R version 2.10

Hi, I'm trying to extract a number from a string: I want to apply something like [0-9]+ to the string "aaaa12xxxx" and get "12". I thought it would be something like:

    > grep("[0-9]+","aaa12xxx", value=TRUE)
    [1] "aaa12xxx"

And then I figured...

    > sub("[0-9]+", "\\1", "aaa12xxxx")
    [1] "aaa12xxx"

But I got some form of response doin...
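A hedged sketch of three base R ways to pull the first run of digits out of a string. It avoids newer helper functions on the assumption that they may not be available in an R version as old as 2.10; the variable names are illustrative.

```r
x <- "aaaa12xxxx"

# 1. sub() with a capture group: "\\1" only refers to something when the
#    pattern contains parentheses. The pattern must match the whole string
#    so that everything outside the group is discarded.
num_sub <- sub("[^0-9]*([0-9]+).*", "\\1", x)

# 2. regexpr() reports the start and length of the first match;
#    substring() then cuts that piece out.
m <- regexpr("[0-9]+", x)
num_substr <- substring(x, m, m + attr(m, "match.length") - 1)

# 3. Delete every non-digit (only appropriate when the string holds
#    a single run of digits).
num_gsub <- gsub("[^0-9]", "", x)
```

If the string contains no digits, `sub()` returns the input unchanged, so the result may need a separate check.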

Library/package development - message when loading

Hello all, is there any way to display a message when a user loads library(myCustomLibrary)? Upon loading, I want to display a message that tells the user how to run all the test functions. Kind regards, Yannick ...
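One common pattern is an `.onAttach()` hook that calls `packageStartupMessage()`. The sketch below defines such a hook and simulates an attach by calling it directly; in a real package the function would live in the package source (for example `R/zzz.R`). The function name `run_all_tests()` mentioned in the message is hypothetical.

```r
# Hook that R calls when the package is attached via library().
.onAttach <- function(libname, pkgname) {
  packageStartupMessage(
    "Welcome to ", pkgname,
    ". Run run_all_tests() to execute the test functions."  # hypothetical name
  )
}

# Simulate attaching: call the hook by hand and capture the message text.
startup_text <- tryCatch(
  .onAttach(libname = NULL, pkgname = "myCustomLibrary"),
  message = function(m) conditionMessage(m)
)
```

Using `packageStartupMessage()` rather than `cat()` lets users silence the text with `suppressPackageStartupMessages()`.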

Execution Efficiency vs Programmer Efficiency in R

The classic and brilliant Programming Perl reference book has a section in which the authors provide a list of advice for how to write Perl that is maximally computationally efficient, followed by a list of advice for how to write Perl that is maximally programmer efficient, followed by more advice for maintainer efficient, porter effici...

Ways to read only select columns from a file into R? (A happy medium between `read.table` and `scan`?)

I have some very big delimited data files and I want to process only certain columns in R without taking the time and memory to create a data.frame for the whole file. The only options I know of are read.table which is very wasteful when I only want a couple of columns or scan which seems too low level for what I want. Is there a bette...
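One possible middle ground is the `colClasses` argument of `read.table()`: a column whose class is given as `"NULL"` is skipped while reading. The sketch below writes a small temporary file so that it is self-contained; the file contents and column names are invented for illustration.

```r
# Build a small tab-delimited file to read from (illustrative data only).
tmp <- tempfile(fileext = ".tsv")
write.table(
  data.frame(id = 1:3, name = c("a", "b", "c"), score = c(0.5, 1.5, 2.5)),
  tmp, sep = "\t", row.names = FALSE, quote = FALSE
)

# "NULL" (as a string) tells read.table() to skip that column entirely;
# the other entries give the classes of the columns that are kept.
wanted <- read.table(
  tmp, header = TRUE, sep = "\t",
  colClasses = c("integer", "NULL", "numeric")
)
```

Here `wanted` holds only the `id` and `score` columns. Declaring the classes up front also spares `read.table()` from guessing column types.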

converting R code snippet to use the Matrix package?

Hello, I am not sure there are any R users out there, but just in case: I am a novice at R and was kindly "handed down" the following R code snippet: Beta <- exp(as.matrix(read.table('beta.transpose'))) WordFreq <- read.table('freq-matrix') WordProbs <- WordFreq$V1 / sum(WordFreq) infile <- file('freq-matrix') outfile <- file('doc_to...

Information Dashboards in R with ggplot2

I'm looking to create a static dashboard viewable in a web browser. And I'd like to create something like what Stephen Few does in his book Information Dashboard Design. (see example at bottom) Ggplot2: Shouldn't be any issue producing the graphs below, right? Dashboard Layout: Is grid suitable? Or should I lay things out in html/css? ...

What programming languages are good for statistics?

I'm doing a bit more statistical analysis on some things lately, and I'm curious if there are any programming languages that are particularly good for this purpose. I know about R, but I'd kind of prefer something a bit more general-purpose (or is R pretty general-purpose?). What suggestions do you guys have? Are there any languages o...

Merge several data.frames into one data.frame with a loop

I am trying to merge several data.frames into one data.frame. Since I have a whole list of files I am trying to do it with a loop structure. So far the loop approach works fine. However, it looks pretty inefficient and I am wondering if there is a faster and easier approach. Here is the scenario: I have a directory with several .csv fi...
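A sketch of a loop-free alternative, assuming the .csv files all share the same columns and should be stacked row-wise (the excerpt is cut off, so this is a guess at the scenario). The directory and file contents below are made up so the example runs on its own.

```r
# Create a throwaway directory with two small .csv files (illustrative data).
csv_dir <- file.path(tempdir(), "merge_demo")  # hypothetical directory
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(csv_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(csv_dir, "b.csv"), row.names = FALSE)

# Read every .csv into a list of data.frames, then bind them in one call.
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read.csv)
combined <- do.call(rbind, tables)
```

If the files instead need to be joined on a shared key, `Reduce(function(x, y) merge(x, y, by = "id"), tables)` follows the same list-based pattern.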

How do you compare the "similarity" between two dendrograms (in R)?

I have two dendrograms which I wish to compare to each other in order to find out how "similar" they are. But I don't know of any method to do so (let alone code to implement it, say, in R). Any leads? Thanks, Tal ...

Using ggplot, how to have the x-axis of time series plots set up automatically?

Is there a way of plotting a univariate time series of class "ts" using ggplot that sets up the time axis automatically? I want something similar to plot.ts() of base graphics. Also it seems to me that the coarsest time granularity is a day. Is that right? In my work I have to work with monthly and quarterly data and assigning each obse...

Identifying unique terms from list of character vectors

I have a list of character vectors in R that represents sets of co-occurring words. From this, I would like to extract a character vector capturing all the words that appear in the list of character vectors. I think I know how to efficiently go from a character vector of words to a unique character vector of the words that appeared. What I...
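A minimal sketch of one way to do this in base R: flatten the list with `unlist()` and then deduplicate with `unique()`. The example list `word_sets` is invented for illustration.

```r
# A list of character vectors, each holding words that occur together.
word_sets <- list(
  c("apple", "banana"),
  c("banana", "cherry"),
  c("apple", "date")
)

# unlist() flattens the list into one character vector;
# unique() keeps the first occurrence of each word.
all_words <- unique(unlist(word_sets))
```

The result keeps the words in order of first appearance; wrap it in `sort()` if alphabetical order is preferred.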

Generating dendrograms from genealogy data in R

Is there any way to generate a dendrogram where each level of the graph represents a generation and only sons of the same father are connected at each level? I'm attempting to use R's hclust and plot functions to generate a dendrogram of father-son lineage. The desired result is a dendrogram where each generation of sons is placed on t...

Can R read from a file through an ssh connection?

R can read files on a web server using convenient syntax such as data <- read.delim("http://remoteserver.com/file.dat") I wonder if there is a way to do something similar with a file on an ssh server with passwordless-ssh already in place? ...
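One approach that may work is to let `ssh` run `cat` on the remote side and read its output through a `pipe()` connection. This is a sketch under assumptions: the helper names, the user name, and the remote path are hypothetical, and the actual read is left commented out because it needs a reachable host with passwordless ssh.

```r
# Build the shell command; shQuote() protects paths that contain spaces.
ssh_cat_command <- function(host, path) {
  paste("ssh", host, "cat", shQuote(path, type = "sh"))
}

# Read a delimited file over ssh by passing a pipe connection to read.delim().
read_delim_over_ssh <- function(host, path, ...) {
  read.delim(pipe(ssh_cat_command(host, path)), ...)
}

cmd <- ssh_cat_command("user@remoteserver.com", "/data/file.dat")

# Requires network access and key-based login, so it is not run here:
# data <- read_delim_over_ssh("user@remoteserver.com", "/data/file.dat")
```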

In R, what is the difference between these two?

    0.9 == 1-0.1 >>> TRUE
    0.9 == 1.1-0.2 >>> FALSE

...
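The difference comes from binary floating-point rounding: 1.1 and 0.2 cannot be represented exactly, and their difference lands on a neighbouring double rather than on the one used for 0.9. A small sketch of a tolerance-based comparison with `all.equal()` follows; the variable names are illustrative.

```r
a <- 0.9
b <- 1.1 - 0.2

# Exact comparison fails because the two doubles differ in the last bits.
exact_equal <- a == b

# all.equal() compares within a small tolerance; wrap it in isTRUE()
# because it returns a descriptive string rather than FALSE on mismatch.
near_equal <- isTRUE(all.equal(a, b))

# The actual gap is on the order of machine precision.
tiny_gap <- abs(a - b)
```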

higher level functions in R - is there an official compose operator or curry function?

I can create a compose operator in R: `%c%` = function(x,y)function(...)x(y(...)) To be used like this: > numericNull = is.null %c% numeric > numericNull(myVec) [2] TRUE FALSE but I would like to know if there is an official set of functions to do this kind of thing and other operations such as currying in R. Largely this is ...
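Leaving the question of an official operator open, base R's `Reduce()` is enough to build composition of any number of functions, and `do.call()` gives a simple form of partial application. This is a sketch only; `compose_all` and `partial` are hypothetical helper names, not functions shipped with R.

```r
# Two-function composition, as in the question: (f %c% g)(x) is f(g(x)).
`%c%` <- function(f, g) function(...) f(g(...))

# Compose any number of single-argument functions, applied right to left.
compose_all <- function(...) {
  fs <- list(...)
  function(x) Reduce(function(acc, f) f(acc), rev(fs), x)
}

# Partial application: fix some arguments now, supply the rest later.
partial <- function(f, ...) {
  fixed <- list(...)
  function(...) do.call(f, c(fixed, list(...)))
}

sqrt_abs <- sqrt %c% abs
inc_after_double <- compose_all(function(x) x + 1, function(x) x * 2)
dash_a <- partial(paste, "a", sep = "-")
```

For example, `sqrt_abs(-16)` evaluates `sqrt(abs(-16))`, and `dash_a("b")` calls `paste("a", "b", sep = "-")`.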

Merging two Data Frames using Fuzzy/Approximate String Matching in R

DESCRIPTION I have two datasets with information that I need to merge. The only common fields that I have are strings that do not perfectly match and a numerical field that can be substantially different. The only way to explain the problem is to show you the data. Here are a.csv and b.csv. I am trying to merge B to A. There are three...

[R] how to do a data.table merge operation

I've been digging through the documentation for the data.table package (a replacement for data.frame that's much more efficient for certain operations), including Josh Reich's presentation on SQL and data.table at the NYC R Meetup (pdf), but can't figure this totally trivial operation out. > x <- DT(a=1:3, b=2:4, key='a') > x a b [...

Does R have an assert statement as in Python?

A statement that checks if something is true and, if not, prints a given error message and exits ...
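Base R's `stopifnot()` is the closest built-in equivalent, and an `if` with `stop()` covers the case of a custom message. A minimal sketch follows; `safe_sqrt` and `assert_msg` are hypothetical names used only for illustration.

```r
# stopifnot() raises an error naming the first condition that is not TRUE.
safe_sqrt <- function(x) {
  stopifnot(is.numeric(x), all(x >= 0))
  sqrt(x)
}

# A small helper for assertions with a caller-supplied message.
assert_msg <- function(cond, msg) {
  if (!isTRUE(cond)) stop(msg, call. = FALSE)
  invisible(TRUE)
}

ok_value <- safe_sqrt(9)

# Capture the errors instead of halting, to show what gets signalled.
err_builtin <- tryCatch(safe_sqrt(-1), error = function(e) e)
err_custom <- tryCatch(assert_msg(1 > 2, "one is not greater than two"),
                       error = function(e) conditionMessage(e))
```

Unlike Python's `assert`, these checks are ordinary function calls and are not stripped out by any optimisation flag.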

How can I plot multiple functions in R?

Using ggplot, is there a way of graphing several functions on the same plot? I want to use parameters from a text file as arguments for my functions and overlay these on the same plot. I understand this but I do not know how to add the visualized function together if I loop through. ...