r

How to scrape "table like" data from stackexchange homepage? (in R)

Hello all, I wish to scrape the home page of one of the new stackexchange websites: http://webapps.stackexchange.com/ (just once, and for only several pages, nothing that should bother the servers). If I had wanted it from stackoverflow, I know there is a database dump, but for the new stackexchange, they don't exist yet. Here is wha...

rm(list=ls()) doesn't completely clear the workspace

This is a very minor issue, but I would like to understand exactly what is going on here. Say I do the following: library(RMySQL) con <- dbConnect(MySQL(), host="some.server.us-east-1.rds.amazonaws.com",user="aUser", password="password", dbname="mydb") values1 <- dbGetQuery(con,"select x,y from table1") attach(values1) At this point...

What's the R equivalent of SQL's LIKE 'description%' statement?

Not sure how else to ask this but, I want to search for a term within several string elements. Here's what my code looks like (but wrong): inplay = vector(length=nrow(des)) for (ii in 1:nrow(des)) { if (des[ii] = 'In play%') inplay[ii] = 1 else inplay[ii] = 0 } des is a vector that stores strings such as "Swinging Strike", "In pla...
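One common way to express a prefix match such as SQL's LIKE 'In play%' in R is a regular expression anchored at the start of the string. The following is a minimal sketch, assuming des is a character vector; the sample values are invented for illustration and are not from the question.

```r
# Hypothetical sample data; the real `des` comes from the asker's data set.
des <- c("Swinging Strike", "In play, out(s)", "Ball", "In play, run(s)")

# grepl() is vectorised, so no explicit loop is needed. The "^" anchors
# the match to the start of each string, like LIKE 'In play%' in SQL.
inplay <- as.integer(grepl("^In play", des))

print(inplay)
```

In R 3.3.0 and later, startsWith(des, "In play") performs the same test without a regular expression.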

Relative Time Series

I am looking for a standardized method for arranging data in relative time. Applications include accounting data such as FY1,FY2,etc... and economic data such as the term structure of interest rates using the 1 year, 2 year, 3 year, etc... I would like to be able to compare a set of time series data that is current and several historic ...

Error reading an R file when submitting an R job to Condor

Hi everyone, I have an R job that is submitted to Condor. The submitted R file (one.R) reads another R file (two.R). When I submit the job to Condor it fails, because one.R cannot read two.R. The error in the text file is: Error in file(file, "r...

How to plot two histograms together in R?

I am using R and I have two data frames: carrots and cucumbers. Each data frame has a single numeric column which lists the lengths of all measured carrots (total: 100k carrots) and cucumbers (total: 50k cucumbers). I wish to plot two histograms, carrot lengths and cucumber lengths, on the same plot. They overlap, so I guess I also need...
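One possible approach in base graphics is to compute both histograms on shared breaks and draw the second over the first with semi-transparent colours. This is only a sketch: the column name length, the simulated values, and the colours are assumptions, since the question does not show the actual data.

```r
# Simulated stand-ins for the asker's data frames (smaller than the
# 100k / 50k rows in the question); the column name `length` is assumed.
set.seed(1)
carrots   <- data.frame(length = rnorm(1000, mean = 6, sd = 2))
cucumbers <- data.frame(length = rnorm(500,  mean = 7, sd = 2.5))

# Shared breaks so that the bars of both histograms line up.
breaks <- pretty(range(c(carrots$length, cucumbers$length)), n = 30)
h_carrots   <- hist(carrots$length,   breaks = breaks, plot = FALSE)
h_cucumbers <- hist(cucumbers$length, breaks = breaks, plot = FALSE)

# The fourth argument of rgb() is alpha, so overlapping bars stay visible.
ylim <- c(0, max(h_carrots$counts, h_cucumbers$counts))
plot(h_carrots, col = rgb(1, 0.5, 0, 0.5), ylim = ylim,
     main = "Carrot and cucumber lengths", xlab = "Length")
plot(h_cucumbers, col = rgb(0, 0.6, 0, 0.5), add = TRUE)
```

Because the two groups differ in size, plotting densities (freq = FALSE in plot()) instead of counts may make the shapes easier to compare.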

How do I convert a list with non-unique rownames to a (nested) list with unique rownames?

I have a long list e2i, which "maps" rownames to values, and has duplicate rownames: > head(e2i) $`679594` [1] "IPR019956" $`679594` [1] "IPR019954" $`679594` [1] "IPR019955" $`679594` [1] "IPR000626" $`682397` [1] "IPR019956" $`682397` [1] "IPR019954" I need to convert it into a list with unique rownames, where each named elemen...
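A minimal sketch of one way to group the values by name uses split() on the unlisted values. It assumes the desired result is a list with one element per unique name, each holding a character vector of all values for that name; the excerpt is cut off before it states the exact target shape.

```r
# Rebuild the six entries shown by head(e2i) in the question.
ids  <- c("679594", "679594", "679594", "679594", "682397", "682397")
vals <- c("IPR019956", "IPR019954", "IPR019955", "IPR000626",
          "IPR019956", "IPR019954")
e2i <- setNames(as.list(vals), ids)

# Drop the duplicate names from the values, then group them by name.
grouped <- split(unlist(e2i, use.names = FALSE), names(e2i))

str(grouped)
```

This assumes every element of e2i holds exactly one value, so that unlist() returns one value per name.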

Easier way to plot the cumulative frequency distribution in ggplot?

I'm looking for an easier way to draw the cumulative distribution line in ggplot. I have some data whose histogram I can immediately display with qplot (mydata, binwidth=1); I found a way to do it at http://www.r-tutor.com/elementary-statistics/quantitative-data/cumulative-frequency-graph but it involves several steps and when explor...

Numpy for R user?

Hi, long-time R and Python user here. I use R for my daily data analysis and Python for tasks heavier on text processing and shell scripting. I am working with increasingly large data sets, which usually arrive as binary or text files. What I normally do is apply statistical/machine learning algorith...

Using Multicore in R for a pentium 4 HT machine

I am using a Pentium 4 HT machine at the office for running R. Some of my code requires the plyr package, and I usually have to wait 6-7 minutes for the script to finish, while my processor is only half utilized. I have heard of using the multicore package in R to make better use of a multicore processor; is my case suitable f...

How to get a BATCH file executed from a web page?

Hi all, I have a one-line batch file that I want to call from a web page via a button. What is the best way to achieve this? The batch file is as follows: c:\R\bin\Rscript.exe "c:\Users\user\Desktop\Shares.R" Or is it possible to call the R script straight from the web page and skip the batch file altogether? Can this be done? I...

How to list a portion of objects in R?

I want to list all objects in R whose names start with something, for example the character "A". I only know how to use ls(); is there a way to do this? Thanks! ...
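ls() accepts a pattern argument that takes a regular expression, so anchoring the pattern with "^" restricts the listing to names with a given prefix. A minimal sketch follows; the object names and the scratch environment env are hypothetical.

```r
# Hypothetical objects, created in a scratch environment so the
# example does not touch the global workspace.
env <- new.env()
assign("Alpha", 1, envir = env)
assign("Apple", 2, envir = env)
assign("beta",  3, envir = env)

# "^A" matches only names that begin with a capital "A".
a_objects <- ls(envir = env, pattern = "^A")

print(a_objects)
```

At the interactive prompt, ls(pattern = "^A") does the same for the current workspace.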

Create a new column in data.frame using conditions of each row

I have an R data frame: > tab1 pat t conc 1 P1 0 788 2 P1 5 720 3 P1 10 655 4 P2 0 644 5 P2 5 589 6 P2 10 544 I am trying to create a new column for conc as a percentage of conc at t=0 for each patient. As well as many other things, I have tried: tab1$conct0 <- tab1$conc / tab1$conc[tab1$t == 0 & tab1$pat == tab1...
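One way to get each patient's baseline is to pull out the t == 0 rows and look them up per row with match(). This sketch uses the data shown in the question and assumes every patient has exactly one t == 0 row.

```r
# The data frame from the question.
tab1 <- data.frame(
  pat  = c("P1", "P1", "P1", "P2", "P2", "P2"),
  t    = c(0, 5, 10, 0, 5, 10),
  conc = c(788, 720, 655, 644, 589, 544)
)

# Rows holding each patient's baseline (t == 0) concentration.
base_rows <- tab1[tab1$t == 0, ]

# For every row, look up the baseline of that row's patient.
baseline <- base_rows$conc[match(tab1$pat, base_rows$pat)]

# Concentration as a percentage of the patient's t = 0 value.
tab1$conct0 <- 100 * tab1$conc / baseline

print(tab1)
```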

R: Date format when writing a zoo object to a file?

Hello, I've been playing for a while with the zoo package. I can read files using the format="%Y-%m-%d %H:%M" option, but how can I use this option when writing the results back to disk? I mean, the default format seems to be "%m/%d/%Y %H:%M" and I need it to be "%Y-%m-%d %H:%M". Where can I change it? Cheers ...

different behavior when using different number of multicoring workers

I am playing around a bit with my program (trying to multicore a few parts) and I've noticed the "CPU history" looks a bit different depending on how many workers I start. 2-4 workers seems to produce a "stable" workflow, however pegging 5-8 workers produces erratic behavior (from zero to max, see pictures). I should point out that all run...

facet_grid problem : input string 1 is invalid in this locale?

Dear all, I am trying to create facet grid with the following code p <- ggplot(melted,aes(factor(country))) + geom_bar() +opts(axis.text.x = theme_text(angle = 90,hjust = 1)) p + facet_grid(. ~ provider) but I always get the following error: Error in sub("^[^:]+: ([^\n]+)\n[0-9]+:(.*)$", "\1\2", expr) : input string 1 is in...

How to barplot frequencies with ggplot2?

Dear all, probably I just need a break :). I have a melted dataset containing a column "value" which represents an absolute number that varies with every row of the dataset. I want to display this number in a bar chart by country. p <- ggplot(melted,aes(factor(country),y=as.numeric(value))) + geom_bar() +opts(axis.text.x = theme_text(...

l_ply: how to pass the list's name attribute into the function?

Say I have an R list like this: > summary(data.list) Length Class Mode aug9104AP 18 data.frame list Aug17-10_acon_7pt_dil_series_01 18 data.frame list Aug17-10_Picro_7pt_dil_series_01 18 data.frame list Aug17-10_PTZ_7pt_dil_series_01 18 data.frame list Aug17...
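A common workaround, sketched here in base R rather than with l_ply, is to iterate over the names of the list and look each element up inside the function. The small data.list and the labelling function below are hypothetical stand-ins for the asker's list of data frames.

```r
# Hypothetical stand-in for the asker's named list of data frames.
data.list <- list(
  first  = data.frame(x = 1:3),
  second = data.frame(x = 4:5)
)

# Iterate over the names, so the function sees both the name (nm)
# and the matching element (data.list[[nm]]).
labels <- vapply(names(data.list), function(nm) {
  df <- data.list[[nm]]
  paste0(nm, ": ", nrow(df), " rows")
}, character(1))

print(labels)
```

The same idea, iterating over names(data.list) instead of the elements, should carry over to l_ply.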

HT index RSI value at end of each month in an xts time series object

First I read-in a csv and create an xts object. require(quantmod) sugar <- as.xts(read.zoo("SUGAR.CSV", sep=",", format ="%m/%d/%Y", header=TRUE)) Then I create a new series of RSI values using TTR (loads with quantmod) sugarRSI <- RSI(sugar) Now I'd like to get a new series that only includes the value of the last day of each mo...

Reorder one data.frame using two columns from another data.frame in R

I have two data.frames in R, one of which has two columns and the other of which has three columns, and where two columns are common between the two frames. The frames have the same number of rows. An example of the frames, a and b, is provided below. What I need to do is reorder the rows of b using the order of rows in a. Note that in ...
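One possible approach is to build a combined key from the two shared columns in each frame and use match() to find, for each row of a, the corresponding row of b. The frames and column names (id, grp, value) below are hypothetical, since the question's example is not shown. The sketch also assumes each key pair occurs exactly once in both frames and that no key contains the "|" separator.

```r
# Hypothetical frames: `a` has the two shared columns, `b` has the
# same two columns (in a different row order) plus a third one.
a <- data.frame(id = c("x", "y", "z"), grp = c(2, 1, 3))
b <- data.frame(grp = c(1, 3, 2), id = c("y", "z", "x"),
                value = c(10, 30, 20))

# Combine the two shared columns into a single key per row.
key_a <- paste(a$id, a$grp, sep = "|")
key_b <- paste(b$id, b$grp, sep = "|")

# match() gives, for each row of a, the position of the same key in b.
b_reordered <- b[match(key_a, key_b), ]
rownames(b_reordered) <- NULL

print(b_reordered)
```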