I'm working in an R script that uses a long SQL string, and I would like to keep the query relatively free of other markup so as to allow copying and pasting between editors and applications. I'd also like the ability to split the query across lines for better readability.
In the RODBC documentation, the paste function is used to build...
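One possible approach, offered as a sketch rather than the RODBC-recommended way: an R string literal may span several lines, so the query can stay as one block of plain SQL that pastes cleanly into other editors. The table and column names below are invented for illustration.

```r
# Hypothetical query; R string literals may contain line breaks,
# so no paste() calls or per-line quotes are needed.
query <- "
SELECT customer_id, SUM(amount) AS total
FROM billing
WHERE bill_year = 2008
GROUP BY customer_id
"

# Optional: collapse all runs of whitespace (including newlines) into single
# spaces. Note this would also alter whitespace inside SQL string literals.
query_one_line <- trimws(gsub("[[:space:]]+", " ", query))
```

The resulting string can then be passed to whichever query function is in use.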
I have a large data frame (14552 rows by 15 columns) containing billing data from 2001 to 2007. I have used sqlFetch to get the 2008 data. To append the 2008 data to the data of the preceding 7 years, one would do as follows:
alltime <- rbind(alltime, all2008)
Unfortunately that generates
Warning message:
In [<-.factor(*tmp*, ri,...
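The warning above is cut off, so the cause is an assumption here: if it is the usual "invalid factor level" warning, one workaround is to convert factor columns to character before binding and re-create the factors afterwards. The data frames and the helper `factors_to_character` below are hypothetical stand-ins.

```r
# Toy stand-ins for the two data frames (column names are made up)
alltime <- data.frame(customer = factor(c("a", "b")), amount = c(10, 20))
all2008 <- data.frame(customer = factor(c("c", "a")), amount = c(5, 7))

# Hypothetical helper: turn every factor column into a character column
factors_to_character <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  df[is_fac] <- lapply(df[is_fac], as.character)
  df
}

# Bind on character columns, then rebuild the factor with the combined levels
alltime <- rbind(factors_to_character(alltime), factors_to_character(all2008))
alltime$customer <- factor(alltime$customer)
```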
I would like to save a whole bunch of relatively large data frames while minimizing the space that the files take up. When opening the files, I need to be able to control what names they are given in the workspace.
Basically I'm looking for the semantics of dput and dget but with binary files.
Example:
n<-10000
for(i in 1:100){
...
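A minimal sketch of one option, assuming base R's `saveRDS()` and `readRDS()` fit the need: they store a single object in a compressed binary file without its name, so the name is chosen when the file is read. The toy data frame and the temporary path are illustrative only.

```r
# Toy data frame standing in for one of the large data frames
n <- 10000
df <- data.frame(id = seq_len(n), value = rnorm(n))

# saveRDS() writes one object (not its name) to a compressed binary file
path <- tempfile(fileext = ".rds")
saveRDS(df, path, compress = "xz")

# readRDS() returns the object, so the caller decides what to call it
restored <- readRDS(path)
```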
Hi All,
I ran JAGS with runjags in R and I got a giant list back (named results for this example). Whenever I access results$density, two lattice plots (one for each parameter) pop up in the default quartz device. I need to combine these with par(mfrow=c(2, 1)) or with a similar approach, and send them to the pdf device. Nothing I tried...
I have a list of hclust objects resulting from slight variations in one variable (for calculating the distance matrix).
Now I would like to make a consensus tree from this list.
Is there a generic package to do this? I am hacking my way through
some code from maanova and it seems to work - but it's ugly and it
needs a lot of hacking s...
I've been using F# for a while now to model algorithms before coding them in C++, and afterwards to check the results of the C++ code, including against real-world recorded data.
For the modeling side of things, it's very handy, but for the 'data mashup' kind of stuff, pulling in data from CSV and other sources, generating ...
I'm using ggplot2 to create panels of histograms, and I'd like to be able to add a vertical line at the mean of each group. But geom_vline() uses the same intercept for each panel (i.e. the global mean):
require("ggplot2")
# setup some sample data
N <- 1000
cat1 <- sample(c("a","b","c"), N, replace=TRUE)
cat2 <- sample(c("x","y","z"), N, ...
I've got a data frame with 2 character columns. I'd like to find the rows in which one column contains the other; however, grepl is behaving strangely. Any ideas?
> ( df <- data.frame(letter=c('a','b'),food = c('apple','pear','bun','beets')) )
letter food
1 a apple
2 b pear
3 a bun
4 b beets
> grepl(df$letter,df$food)...
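For what it's worth, `grepl()` only uses the first element of a pattern vector, which would explain the odd result. A sketch of a row-by-row comparison with `mapply()`, assuming plain substring matching (`fixed = TRUE`) is what is wanted:

```r
df <- data.frame(letter = c("a", "b"),
                 food = c("apple", "pear", "bun", "beets"),
                 stringsAsFactors = FALSE)

# Call grepl(pattern, x) once per row, pairing each letter with its food
df$match <- mapply(grepl, df$letter, df$food,
                   MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)

# Rows where food contains letter
df[df$match, ]
```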
Hello,
After installing RPy2 from
http://rpy.sourceforge.net/rpy2.html
I'm trying to use it in Python 2.6 IDLE but I'm getting this error:
>>> import rpy2.robjects as robjects
>>> robjects.r['pi']
<RVector - Python:0x0121D8F0 / R:0x022A1760>
What am I doing wrong?
...
I have a list of lists that looks like this: x[[state]][[year]]. Each element of this is a data frame, and accessing them individually is not a problem.
However, I'd like to rbind data frames across multiple lists. More specifically, I'd like to have as output as many dataframes as I have years, that is rbind all the state data frames ...
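A rough sketch of one way to do this, assuming every state has a data frame for every year and that the years are list names. The toy list, `make_df`, and the state and year labels are all made up.

```r
# Toy nested list: x[[state]][[year]] is a data frame
make_df <- function(state, year) data.frame(state = state, year = year, value = 1)
states <- c("CA", "NY")
years <- c("2001", "2002")
x <- lapply(setNames(states, states), function(s)
  lapply(setNames(years, years), function(y) make_df(s, y)))

# For each year, pull that year's data frame out of every state and stack them
by_year <- lapply(setNames(years, years), function(y)
  do.call(rbind, lapply(x, `[[`, y)))
```

`by_year[["2001"]]` then holds the 2001 rows for all states.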
Hi,
I'm trying to compute the median vector of a data set s with columns A1 and B1.
The median vector is the median for each observation across both columns.
I tried this and it didn't work:
median(s[c("A1","B1")])
Is there another way to do it?
...
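One likely fix, sketched on made-up data: `median()` expects a vector rather than a data frame, so `apply()` over the rows can compute one median per observation.

```r
# Toy data set with the two columns from the question
s <- data.frame(A1 = c(1, 4, 7), B1 = c(3, 2, 9))

# MARGIN = 1 applies median() to each row of the selected columns
row_median <- apply(s[, c("A1", "B1")], 1, median)
```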
Suppose we have the contents of tables x and y in two data frames in R. What is the suggested way to perform an operation like the following in SQL:
Select x.X1, x.X2, y.X3
into z
from x inner join y on x.X1 = y.X1
I tried the following in R. Is there a better way?
Thank you
x<-data.frame(cbind('X1'=c(5,9,7,6,4,8,3,1,10,2),'X2'=c(5,...
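As a sketch, base R's `merge()` performs an inner join on the shared key by default, so it may be all that is needed here. The small data frames below are invented, not the ones from the question.

```r
x <- data.frame(X1 = c(1, 2, 3, 4), X2 = c(10, 20, 30, 40))
y <- data.frame(X1 = c(2, 3, 5), X3 = c("b", "c", "e"))

# Inner join on X1, then keep the three selected columns
z <- merge(x, y, by = "X1")[, c("X1", "X2", "X3")]
```

Setting `all.x`, `all.y`, or `all` to `TRUE` would give left, right, or full outer joins instead.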
Is there a way in R to build a new dataset consisting of a given set of vectors -- median1, median2, median3, median4 -- which are median vectors from a previous dataset s?
median1 = apply(s[,c("A1","B1","C1","D1","E1","F1","G1","H1","I1")],1,median)
median2 = apply(s[,c("A2","B2","C2","D2","E2","F2","G2","H2","I2")],1,median)
median3 ...
Is there a variant of lag somewhere that keeps NAs in position? I want to compute returns of price data where data could be missing.
Col 1 is the price data
Col 2 is the lag of price
Col 3 shows p - lag(p) - the return from 99 to 104 is effectively missed, so the path length of the computed returns will differ from the true.
Col 4 shows...
I'm trying to add a title at the top of a page of scatterplots. However, whenever I use the title command, it doesn't add the title at the top of the page and instead overwrites my plots. Is there a way to fix this?
plot(median, pch = ".")
title(main = "Scatterplot of the median vectors ",line = 0,font=2)
...
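A possible approach, assuming the page is a multi-panel layout set up with `par(mfrow = ...)`: reserve an outer margin with `oma` and write the title there with `mtext(..., outer = TRUE)`. The random data and the PDF path are placeholders.

```r
# Draw to a temporary PDF so the sketch runs without an interactive device
out <- tempfile(fileext = ".pdf")
pdf(out)

# Reserve 3 lines of outer margin at the top of the page for a shared title
par(mfrow = c(2, 2), oma = c(0, 0, 3, 0))
for (i in 1:4) plot(rnorm(100), pch = ".")

# outer = TRUE puts the text in the outer margin, not over the current panel
mtext("Scatterplot of the median vectors", side = 3, outer = TRUE,
      line = 1, font = 2)
dev.off()
```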
How does one go about drawing a hyperplane (given its equation) in 3D in R?
(i.e. the 3D equivalent of "abline")
Thanks in advance,
...
Let's say I have two columns of data. The first contains categories such as "First", "Second", "Third", etc. The second has numbers which represent the number of times I saw "First".
For example:
Category Frequency
First 10
First 15
First 5
Second 2
Third 14
Third 20
Second 3
I want ...
Hi, I regularly make figures (the exploratory data analysis type) in R. I also program in Python and was wondering if there are features or concepts in matplotlib that would be worth learning. For instance, I am quite happy with R - but its image() function will produce large files with pixelated output, whereas Matlab's equivalent figur...
When programming in Stata I often find myself using the loop index in the programming. For example, I'll loop over a list of the variables nominalprice and realprice:
local list = "nominalprice realprice"
foreach i of local list {
summarize `i'
twoway (scatter `i' time)
graph export "C:\TimePlot-`i'.png"
}
This will plot the t...
I have a data set of comic book unit sales by volume (e.g. Naruto v10) that I need to reduce to sales by series (so all Naruto volume unit sales would be added together into a single observation). I have a variable "series" that identifies the series of each observation. The equivalent code in Stata would be:
by series, sort:replace u...
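The Stata line above is cut off, so this is only a guess at the intent: summing units within each series can be sketched in base R with `aggregate()`. The column name `units` and the sample rows are assumptions.

```r
# Made-up sales data: one row per volume
sales <- data.frame(series = c("Naruto", "Naruto", "Bleach"),
                    volume = c(9, 10, 1),
                    units  = c(100, 150, 80))

# One row per series, with units summed across its volumes
by_series <- aggregate(units ~ series, data = sales, FUN = sum)
```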