Is there a simple way to make a nice plot of the following data in R, without using many commands?

 Region1 Region2
2007 17 55
2008 26 43
2009 53 70
2010 96 58

I do know how to plot the data, but it uses too many commands and parameters, and the result still looks absolutely terrible (see here):

> test <- read.table("/tmp/data.txt")
> png(filename="/tmp/test.png", height=1000, width=750, bg="white", res=300)
> plot(test$Region1, type="b", col="blue", ylim=c(0,100), lwd=3)
> lines(test$Region2, type="b", col="red", lwd=3)
> dev.off()

It took me a while to figure out all the commands, and I still have to get the x axis labels (2007, 2008, ...), using the axis command (but how do I access the test x axis labels?), etc.

In Keynote (or Powerpoint) I can just give it the same table (transposed) and it produces a nice graph from it (see here).

My question is really: Is there a higher-level command that draws such typical data nicely? Also, how can I separate the drawing logic (draw 2 lines from that specific data, etc.) from the layout (use specific colors and line types for the graph, etc.)? Ideally, I'd hope there were different libraries for different layouts of the graph, e.g. called NiceKeynoteLayout, which I just could use like this (or similar):

> d <- read.table("/tmp/data.txt")
> png <- png(filename="/tmp/test.png", height=1000, width=750)
> myLayout <- loadPredefinedLayout("NiceKeynoteLayout")
> coolplot(d, layout=myLayout, out=png)
A: 

You may want to read help(par), which is a very useful source of information for customizing standard R graphs. It allows you to

  • have tighter margins, e.g. par(mar=c(3,3,1,1))
  • change font sizes, e.g. par(cex=0.7) or one of the more specific cex alternatives (cex.axis, cex.lab, cex.main)
  • set colors or linetypes
  • ...

all of which comes close to the loadPredefinedLayout() functionality you desire.
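One way to get something like a reusable layout in base graphics is to wrap the par() settings in a function and call it after opening the device (par() settings apply per device). This is only a sketch under that assumption: nice_layout is a hypothetical helper name, not an existing R function, and the values are illustrative.

```r
# Hypothetical helper: bundles par() settings into a reusable "layout".
nice_layout <- function() {
  par(mar = c(3, 4, 1, 1),  # margins: bottom, left, top, right
      cex = 0.7,            # scale text and symbols down
      las = 1,              # horizontal axis tick labels
      bty = "l")            # draw only the left and bottom box edges
}

# Same numbers as in the question, with the years as row names.
test <- data.frame(Region1 = c(17, 26, 53, 96),
                   Region2 = c(55, 43, 70, 58),
                   row.names = 2007:2010)

out_file <- file.path(tempdir(), "test.png")
png(filename = out_file, height = 1000, width = 750, res = 150)
nice_layout()  # must be called after the device is opened
plot(test$Region1, type = "b", col = "blue", ylim = c(0, 100),
     lwd = 3, xlab = "", ylab = "Values")
lines(test$Region2, type = "b", col = "red", lwd = 3)
dev.off()
```

The drawing calls stay the same whichever layout function is used, so swapping in a different set of par() values does not touch the plotting code.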

Lastly, for the axes you are better off either using a time-aware class like zoo, or explicitly giving the x-axis values as in the example below:

R> data <- data.frame(Year=seq(as.Date("2007-01-01"),
                   as.Date("2010-01-01"), by="year"),
                 Region1=c(17,26,53,96), Region2=c(55,43,70,58))
R> data
        Year Region1 Region2
1 2007-01-01      17      55
2 2008-01-01      26      43
3 2009-01-01      53      70
4 2010-01-01      96      58
R> par(mar=c(3,4,1,1)) 
R> plot(data$Year, data$Region1, type='l', col='blue', ylab="Values")
R> lines(data$Year, data$Region2, col='red')
R> 
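If the data stay in the original read.table() layout, the years end up as row names, so rownames(test) is what an axis() call needs. A short sketch, assuming the same four rows as in the question; matplot() is used here only as a convenient way to draw both columns in one call:

```r
# The header has one field fewer than the data rows, so read.table()
# treats the first column as row names.
test <- read.table(text = "
     Region1 Region2
2007 17 55
2008 26 43
2009 53 70
2010 96 58", header = TRUE)

years <- rownames(test)   # "2007" "2008" "2009" "2010"

# Draw both columns at once, suppressing the default x axis.
matplot(as.matrix(test), type = "b", pch = 1, lty = 1, lwd = 3,
        col = c("blue", "red"), ylim = c(0, 100),
        xaxt = "n", xlab = "", ylab = "Values")

# Put the years at positions 1..4 on the x axis.
axis(1, at = seq_along(years), labels = years)
legend("topleft", legend = colnames(test), col = c("blue", "red"),
       lty = 1, lwd = 3, bty = "n")
```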
Dirk Eddelbuettel
+2  A: 

Yes, and in my biased opinion, you're best off using the ggplot2 package for creating graphics. Here's how you might do so with your data (thanks to Dirk for providing a sample dataset):

data <- data.frame(Year=seq(as.Date("2007-01-01"), 
                   as.Date("2010-01-01"), by="year"), 
                 Region1=c(17,26,53,96), Region2=c(55,43,70,58))

library(ggplot2)
library(reshape2)  # provides melt(); current ggplot2 does not attach it

# Convert data to a form optimised for visualisation, not
# data entry
data2 <- melt(data, id.vars = "Year",
              measure.vars = c("Region1", "Region2"))

# Define the visualisation you want
ggplot(data2, aes(x = Year, y = value, colour = variable)) + 
  geom_line()
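On separating the drawing logic from the layout: in ggplot2 a theme is an ordinary object that can be stored and added to any plot. A minimal sketch, assuming a reasonably recent ggplot2; keynote_like is a made-up name and the settings are only an illustration, not a reproduction of the Keynote look. The long-format data frame is built by hand here so the example does not depend on melt().

```r
library(ggplot2)

# Long-format data, equivalent to the melted data frame.
data2 <- data.frame(
  Year     = rep(as.Date(c("2007-01-01", "2008-01-01",
                           "2009-01-01", "2010-01-01")), times = 2),
  variable = rep(c("Region1", "Region2"), each = 4),
  value    = c(17, 26, 53, 96, 55, 43, 70, 58)
)

# Layout: a reusable theme object (hypothetical name).
keynote_like <- theme_bw() +
  theme(legend.position = "bottom",
        legend.title = element_blank(),
        panel.grid.minor = element_blank())

# Drawing logic: what is plotted, independent of how it looks.
p <- ggplot(data2, aes(x = Year, y = value, colour = variable)) +
  geom_line() +
  geom_point()

# Combine the two; any other theme object could be added instead.
p + keynote_like
```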
hadley
Thanks for the tip, I installed it. While it simplifies the commands somewhat and has some cool features, it still seems hard to make the graph look as nice as the Keynote figure I linked above ... I guess I'll have to keep playing around with it.
Actually, I think with a harder look into ggplot2 you'll be able to create anything you need. I use it for publication quality reporting as well.
Brandon Bertelsen
If your emphasis is on prettiness, why not just stick to Keynote? R graphs are always going to be more work to make beautiful than Keynote graphs. The advantage of R (and ggplot) is the speed at which you can try out many different possibilities to find the graph that best reveals the interesting and important features of your data.
hadley
A: 

A (in my opinion) slightly improved version of the graphic suggested by Hadley. I think now it is pretty much like the original graphic you tried to replicate (even better, actually, with direct labels).

After converting the data as suggested by Hadley,

# linewidth needs ggplot2 >= 3.4.0; use size = 1 on older versions
p <- ggplot(data2, aes(Year, value, group = variable,
     colour = variable)) + geom_line(linewidth = 1) +
     geom_point() + theme(legend.position = "none")
# Label each line directly at its last point (2010)
p + geom_text(data = data2[data2$Year == as.Date("2010-01-01"), ],
     aes(label = variable), hjust = 1.2, vjust = 1)
Manoel Galdino