data.frame

converting uneven hierarchical list to a data frame

I don't think this has been asked yet, but is there a way to combine information of a list with multiple levels and uneven structure into a data frame of "long" format? Specifically: library(XML) library(plyr) xml.inning <- "http://gd2.mlb.com/components/game/mlb/year_2009/month_05/day_02/gid_2009_05_02_chamlb_texmlb_1/inning/inning_5....

data.frame rows to a list

I have a data.frame which I would like to convert to a list by rows, meaning each row would correspond to its own list elements. In other words, I would like a list that is as long as the data.frame has rows. So far, I've tackled this problem in the following manner, but I was wondering if there's a better way to approach this. xy.df <...

Getting rid of the "hidden column" in data frames of R

I have a data frame that I am outputting to MS Word. Let's say I'm trying to output the data frame mtcars, then using the R2wd package, I get: #install.packages("R2wd") require(R2wd) wdGet() wdTable(mtcars) '>ncol(mtcars) 11 However, a count shows that there are actually 12 columns. R doesn't include the model of car. I have a...

Wrap text in dataframe in R- or in output cell to Word

I am working in R, trying to export a dataframe to MS Word. I am using R2wd and would like a dataframe to export to MSWORD, and wrap a long string of text within a cell. Is that even possible? Bare minimum at least pass a command from R to set the height of each row to fit the contents of the cell... I don't see any demos or document...

Create a new column in data.frame using conditions of each row

I have an R data frame: > tab1 pat t conc 1 P1 0 788 2 P1 5 720 3 P1 10 655 4 P2 0 644 5 P2 5 589 6 P2 10 544 I am trying to create a new column for conc as a percentage of conc at t=0 for each patient. As well as many other things, I have tried: tab1$conct0 <- tab1$conc / tab1$conc[tab1$t == 0 & tab1$pat == tab1...
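One possible way to express each concentration as a percentage of the same patient's value at t = 0 is sketched below. It rebuilds tab1 from the values shown in the excerpt and assumes every patient has exactly one row with t == 0; the helper name baseline is hypothetical.

```r
# Data copied from the excerpt above.
tab1 <- data.frame(
  pat  = rep(c("P1", "P2"), each = 3),
  t    = rep(c(0, 5, 10), times = 2),
  conc = c(788, 720, 655, 644, 589, 544)
)

# Baseline concentration per patient (assumes exactly one t == 0 row each).
baseline <- tab1$conc[tab1$t == 0]
names(baseline) <- as.character(tab1$pat[tab1$t == 0])

# Look up each row's baseline by patient and express conc as a percentage.
tab1$conct0 <- 100 * tab1$conc / unname(baseline[as.character(tab1$pat)])
```

Looking the baseline up by patient name avoids the length mismatch that a direct comparison between two subsets of the same column can produce.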

R: Count number of entries in a row based on external criteria

Dear R-wizards, I have the following data frame: Date1 Date2 Date3 Date4 Date5 1 25 April 2005 10 May 2006 28 March 2007 14 November 2007 1 April 2008 2 25 April 2005 10 May 2006 28 March 2007 14 November 2007 1 April 2008 3 29 January 2008...

Redefine Data Frame in R

Hello. I have a data frame database$VAR which has values of 0's and 1's. How can I redefine the data frame so that the 1's are removed? Thanks! ...

basic R question on manipulating dataframes

I have a data frame with several columns. Rows have names. I want to calculate some value for each row (col1/col2) and create a new data frame with the original row names. If I just do something like data$col1/data$col2 I get a vector with the results but lose the row names. I know it's very basic but I'm quite new to R. ...
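A minimal sketch of one way to keep the row names, assuming a small made-up data frame with the col1 and col2 columns named in the question (the values, the row names and the output column name ratio are hypothetical):

```r
# Hypothetical example data with named rows.
data <- data.frame(
  col1 = c(10, 20, 30),
  col2 = c(2, 5, 10),
  row.names = c("a", "b", "c")
)

# Compute the per-row value and pass the original row names on explicitly.
result <- data.frame(
  ratio = data$col1 / data$col2,
  row.names = rownames(data)
)
```

The division itself returns a plain vector, so the row names have to be supplied again when the new data frame is built.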

Creating an R dataframe row-by-row

I would like to construct a dataframe row-by-row in R. I've done some searching, and all I came up with is the suggestion to create an empty list, keep a list index scalar, then each time add to the list a single-row dataframe and advance the list index by one. Finally, do.call(rbind,) on the list. While this works, it seems very cumber...

How to attach a simple data.frame to a spatialpolygondataframe in R?

Hi, I have (again) a problem with combining data frames in R. But this time, one is a spatial.polygon.data.frame(SPDF) and the other one is a usual data.frame (DF). The SPDF has around 1000 rows, the DF only 400. Both have a common column, QDGC. Now, I tried oo <- merge(SPDF,DF, by="QDGC", all=T) but this only results in a normal data...

Applying a function on each row of a data frame in R

I would like to apply some function on each row of a dataframe in R. The function can return a single-row dataframe or nothing (I guess 'return ()' returns nothing?). I would like to apply this function on each of the rows of a given dataframe, and get the resulting dataframe (which is possibly shorter, i.e. has fewer rows, than the orig...

Adding a column to a dataframe in R

I have the following dataframe (df) start end 1 14379 32094 2 151884 174367 3 438422 449382 4 618123 621256 5 698271 714321 6 973394 975857 7 980508 982372 8 994539 994661 9 1055151 1058824 . . . . . . . . . And a long vector with numeric values (vec). I would like to add to eac...

Filtering a dataframe in R

I have the following dataframe (df) start end 1 14379 32094 2 151884 174367 3 438422 449382 4 618123 621256 5 698271 714321 6 973394 975857 7 980508 982372 8 994539 994661 9 1055151 1058824 . . . . . . . . . And a long boolean vector with boolean values (vec). I would like to fi...

How to handle empty dataframes in R?

I noticed that sometimes I get errors in my R scripts when I forget to check whether the dataframe I'm working on is actually empty (has zero rows). For example, when I used apply like this apply(X=DF,MARGIN=1,FUN=function(row) !any(vec[ row[["start"]]:row[["end"]] ])) and DF happened to be empty, I got an error about the subscripts. ...

Can one use 'subset' on data from .csv files of known structure but varying details?

I'm trying to work on data from .csv files of known general format but varying group and measure names. I can get a data.frame using: mydata=read.csv(file.choose(),header=T) mydata GroupNames Measure1 Measure2 Measure3 etc 1 group1 value1 value1 2 group1 value2 value2 3 group2 value3 ...

In R, how can I take a subset of columns of a data frame and then eliminate duplicate rows?

Imagine I have a data frame with data like this: A | B | C ---+---+--- 1 | 2 | a 1 | 2 | b 5 | 5 | a 5 | 5 | b I want to take only columns A and B, and I want to remove any rows that have become duplicates as a result of eliminating all other columns (that is, column C). So my desired result for the table above would be: A | B -...
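One way this could be done in base R is to subset the columns first and then call unique() on the result. The sketch below rebuilds the four-row table from the excerpt; the variable names df and result are hypothetical.

```r
# Data copied from the table in the excerpt above.
df <- data.frame(
  A = c(1, 1, 5, 5),
  B = c(2, 2, 5, 5),
  C = c("a", "b", "a", "b")
)

# Keep only columns A and B, then drop rows that are now duplicates.
result <- unique(df[, c("A", "B")])
```

unique() keeps the first occurrence of each distinct row, so the surviving rows keep their original row names.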

Mapping result of psycopg2 into dataframe for R with RPY2

Hello guys, with psycopg2, I get the result of a query in this form: [(15002325, 24, 20, 1393, -67333094L, 38, 4, 493.48763257822799, 493.63348372593703), (15002339, 76, 20, 1393, -67333094L, 91, 3, 499.95845909922201, 499.970048093743), (15002431, 24, 20, 1394, -67333094L, 38, 4, 493.493464900383, 493.63348372593703), (150024...

How can I use assign to change variables within a dataframe in R?

I tried to do something like this: x <- data.frame(1:20) attach(x) assign("x2",1:20,pos="x") However, x$x2 gives me NULL. With x2 I get what I want but it is not part of the data.frame. Attaching x2 to x manually would work in this simple case but not in the more complex one I need. I try to assign in a loop where I loop over...

Getting the mean value for every Id in a data frame

Imagine I have a data frame with 2 columns Id Value 12 13 32 3 6022 11 9142 231 12 23 119 312 ... and I want to get the mean value for each "Id". Do you know of any fast way of doing this? ...
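A minimal sketch of a per-Id mean using base R's aggregate(), built on the six rows visible in the excerpt (the real data is longer, and the names df and means are hypothetical):

```r
# The six rows shown in the excerpt above.
df <- data.frame(
  Id    = c(12, 32, 6022, 9142, 12, 119),
  Value = c(13, 3, 11, 231, 23, 312)
)

# One row per Id, holding the mean of Value for that Id.
means <- aggregate(Value ~ Id, data = df, FUN = mean)
```

For a plain named vector instead of a data frame, tapply(df$Value, df$Id, mean) is an alternative; which one is faster on large data would need to be measured.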

Removing a particular category from a data frame in R

Hello nice people, I have a single column in a data frame in R that looks something like this: blue green blue yellow black blue green How do I remove all the rows that indicate blue? Please keep in mind that I don't want a NULL value represented in that row: I want the entire row removed. Thank you :) ...
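One way to drop the matching rows entirely is logical subsetting, sketched below. The column name colour is an assumption, since the question does not name the column.

```r
# Values copied from the excerpt; the column name "colour" is hypothetical.
df <- data.frame(
  colour = c("blue", "green", "blue", "yellow", "black", "blue", "green"),
  stringsAsFactors = FALSE
)

# Keep only rows whose value is not "blue"; drop = FALSE keeps the result
# a data frame even though there is a single column.
filtered <- df[df$colour != "blue", , drop = FALSE]
```

If the column is a factor, the unused "blue" level remains after subsetting; droplevels(filtered) removes it.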