data.frame

How to sort a dataframe by column(s) in R

I want to sort a dataframe by multiple columns in R. For example, with the data frame below I would like to sort by column z (descending) then by column b (ascending): dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"), levels = c("Low", "Med", "Hi"), ordered = TRUE), x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9), ...

On Data Frame: Writing to File and Naming Binded Vector in R

I have data that looks like this, and my code below simply computes some values and binds the output vector to the original data frame. options(width=200) args<-commandArgs(trailingOnly=FALSE) dat <- read.table("http://dpaste.com/89376/plain/",fill=T); problist <- c(); for (lmer in 1:10) { meanl <- lmer; stdevl <- (0.17*sqrt(l...

How to make an R function return multiple columns and append them to a data frame?

Starting with this data frame myDF = structure(list(Value = c(-2, -1, 0, 1, 2)), .Names = "Value", row.names = c(NA, 5L), class = "data.frame") Suppose I want to run this function on every row of myDF$Value getNumberInfo <- function(x) { if(x %% 2 ==0) evenness = "Even" else evenness="Odd" if(x > 0) positivity = "Positive" else posi...

Reshaping data frame in R

I'm running into difficulties reshaping a large dataframe. And I've been relatively fortunate in avoiding reshaping problems in the past, which also means I'm terrible at it. My current dataframe looks something like this: unique_id seq response detailed.name treatment a N1 123.23 descr. of N1 T1 a ...

Working with Data.frames in R (Using SAS code to describe what I want)

I've been mostly working in SAS of late, but not wanting to lose what familiarity with R I have, I'd like to replicate something basic I've done. You'll forgive me if my SAS code isn't perfect, I'm doing this from memory since I don't have SAS at home. In SAS I have a dataset that roughly is like the following example (. is equivalent o...

How to find top n% of records in a column of a dataframe using R

Hi there, I have a dataset showing the exchange rate of the Australian Dollar versus the US dollar once a day over a period of about 20 years. I have the data in a data frame, with the first column being the date, and the second column being the exchange rate. Here's a sample from the data: >data V1 V2 1 12/12/19...

rbind dataframes in a list of lists

I have a list of lists that looks like this: x[[state]][[year]]. Each element of this is a data frame, and accessing them individually is not a problem. However, I'd like to rbind data frames across multiple lists. More specifically, I'd like to have as output as many dataframes as I have years, that is rbind all the state data frames ...

Filtering a data frame in R

Hi, let's suppose that I have a data frame like expr_value cell_type 1 5.345618 bj fibroblast 2 5.195871 bj fibroblast 3 5.247274 bj fibroblast 4 5.929771 hesc 5 5.873096 hesc 6 5.665857 hesc 7 6.791656 hips 8 7.133673 hips 9 7.574058 hips 10 7.2080...

Quickly reading very large tables as dataframes in R

Hello, I have very large tables that I would like to load as data frames in R. read.table() has a lot of convenient features, but it seems like there is a lot of logic in the implementation that would slow things down. In my case, I am assuming I know the types of the columns ahead of time, the table does not contain any column heade...

count of entries in data frame in R

I'm looking to get a count for the following data frame: > Santa Believe Age Gender Presents Behaviour 1 FALSE 9 male 25 naughty 2 TRUE 5 male 20 nice 3 TRUE 4 female 30 nice 4 TRUE 4 male 34 naughty of the number of children who believe. What command would I use to...
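One way to get that count, sketched under the assumption that Believe is a logical column: the data frame below just rebuilds the four rows shown in the excerpt, and the column types are a guess.

```r
# Rebuild the four rows shown in the excerpt (column types are an assumption).
Santa <- data.frame(
  Believe   = c(FALSE, TRUE, TRUE, TRUE),
  Age       = c(9, 5, 4, 4),
  Gender    = c("male", "male", "female", "male"),
  Presents  = c(25, 20, 30, 34),
  Behaviour = c("naughty", "nice", "nice", "naughty")
)

# TRUE counts as 1 when summed, so this is the number of children who believe.
n_believers <- sum(Santa$Believe)

# table() reports the count for both values at once.
belief_counts <- table(Santa$Believe)
```

If Believe were stored as text rather than as a logical, comparing first (for example sum(Santa$Believe == "TRUE")) would be needed instead.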

Optimising R function that adds a new column to a data.frame

I have a function that is currently written in a functional style, and I want to speed it up and perhaps solve the problem more in the spirit of R. I have a data.frame and want to add a column in which every entry depends on two rows. At the moment it looks like the following: faultFinging <- function(hear...

How to transform XML data into a data.frame?

I'm trying to learn R's XML package. I'm trying to create a data.frame from books.xml sample xml data file. Here's what I get: library(XML) books <- "http://www.w3schools.com/XQuery/books.xml" doc <- xmlTreeParse(books, useInternalNodes = TRUE) doc xpathApply(doc, "//book", function(x) do.call(paste, as.list(xmlValue(x)))) xpathSApply(d...

Generating interaction variables in R dataframes

Is there a way - other than a for loop - to generate new variables in an R dataframe, which will be all the possible 2-way interactions between the existing ones? i.e. supposing a dataframe with three numeric variables V1, V2, V3, I would like to generate the following new variables: Inter.V1V2 (= V1 * V2) Inter.V1V3 (= V1 * V3) Inter....
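A possible loop-free approach is to let combn() enumerate the column pairs and build one product per pair. This is only a sketch: the data frame df and its values are made up for illustration, and the Inter. naming follows the pattern in the question.

```r
# Hypothetical data frame with three numeric variables, as in the question.
df <- data.frame(V1 = c(1, 2, 3), V2 = c(4, 5, 6), V3 = c(7, 8, 9))

# combn() enumerates every pair of column names; each pair yields one product.
pairs <- combn(names(df), 2, simplify = FALSE)
inter <- lapply(pairs, function(p) df[[p[1]]] * df[[p[2]]])

# Name each product after the two columns it came from, e.g. Inter.V1V2.
names(inter) <- vapply(pairs, function(p) paste0("Inter.", p[1], p[2]), character(1))

# Append the new columns to the original data frame.
df <- cbind(df, as.data.frame(inter))
```

The same pattern extends to more columns without changes, since combn() generates all pairs from whatever names(df) contains.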

R: Manipulating a data frame with contents from a different data frame

Say I have a data frame with the contents: Trial Person Time 1 John 1.2 2 John 1.3 3 John 1.1 1 Bill 2.3 2 Bill 2.5 3 Bill 2.7 and another data frame with the contents: Person Offset John 0.5 Bill 1.0 and I want to modify the original frame based on the appropriate value from the second. I co...

R: Performing binary function to a column in a data frame.

Say I have a data frame with the contents: Trial Person 1 John 2 John 3 John 4 John 1 Bill 2 Bill 3 Bill 4 Bill and I want to transform this to Trial Person Day 1 John 1 2 John 1 3 John 2 4 John 2 1 Bill 1 2 Bill 1 3 Bill 2 4 Bill 2 I can ver...

Vector vs. Data frame in R

What is the difference between a vector and a data frame in R? Under what circumstances should vectors be converted to data frames? ...
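A small illustration may help frame the question: an atomic vector holds values of a single type in one dimension, while a data frame is a list of equal-length columns that may each have a different type. The names v and df and their values are invented for this sketch.

```r
# An atomic vector: one dimension, one type.
v <- c(1.5, 2.5, 3.5)

# A data frame: several equal-length columns, each with its own type.
df <- data.frame(id = 1:3, value = v, label = c("a", "b", "c"))

# Inspect the two structures.
v_is_vector  <- is.vector(v)       # TRUE
df_is_frame  <- is.data.frame(df)  # TRUE
column_types <- sapply(df, class)  # one class per column
frame_shape  <- dim(df)            # rows and columns
```

A vector is usually enough for a single variable; a data frame becomes useful once several variables describe the same observations row by row.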

Best way to store variable-length data in an R data.frame?

I have some mixed-type data that I would like to store in an R data structure of some sort. Each data point has a set of fixed attributes which may be 1-d numeric, factors, or characters, and also a set of variable length data. For example: id phrase num_tokens token_lengths 1 "hello world" 2 ...

How to reference columns of a data.frame within a data.frame?

I have a data.frame called series_to_plot.df which I created by combining a number of other data.frames together (shown below). I now want to pull out just the .mm column from each of these, so I can plot them. So I want to pull out the 3rd column of each data.frame (e.g. p3c3.mm, p3c4.mm etc...), but I can't see how to do this for al...

Subset a data.frame by list and apply function on each part, by rows

This may seem like a typical plyr problem, but I have something different in mind. Here's the function that I want to optimize (skip the for loop). # dummy data set.seed(1985) lst <- list(a=1:10, b=11:15, c=16:20) m <- matrix(round(runif(200, 1, 7)), 10) m <- as.data.frame(m) dfsub <- function(dt, lst, fun) { # check whether dt is `...

How to merge two data.frames together in R, referencing a lookup table.

I am trying to merge two data.frames together, based on a common column name in each of them called "series_id". Here is my merge statement: merge(test_growth_series_LUT, test_growth_series, by = intersect(series_id, series_id)) The error I'm getting is "Error in as.vector(y) : object 'series_id' not found" The help gives this de...
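The error arises because by expects column names as character strings, whereas a bare series_id is evaluated as an object in the calling environment, where it does not exist. A minimal sketch of the likely fix follows; the two data frames reuse the names from the question, but their contents and the label and value columns are invented here.

```r
# Hypothetical stand-ins for the two data frames in the question.
test_growth_series_LUT <- data.frame(
  series_id = c(1, 2, 3),
  label     = c("a", "b", "c")
)
test_growth_series <- data.frame(
  series_id = c(1, 1, 2),
  value     = c(0.1, 0.2, 0.3)
)

# Pass the shared column name as a string rather than as a bare symbol.
merged <- merge(test_growth_series_LUT, test_growth_series, by = "series_id")
```

When the shared column is the only name the two data frames have in common, by can also be omitted, because merge() defaults to the intersection of the column names.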