Thursday, June 28, 2012

Automate Visualizations Using R and ggplot2

R is a pretty nifty tool for doing all kinds of statistical analysis on your data. ggplot2 is an R package developed by Hadley Wickham for creating visualizations. I had the honor of seeing Hadley today at a meetup in San Francisco. He was going over the basics of ggplot2 when it struck me: "wouldn't it be cool to automate visualization creation, so you can create visualizations without ever opening up R?" I did some research and found that it is possible.

Basically, you can put your work in an R script and then call that script from the command line. Below is the code to store a graph in a PNG file using ggplot2. You need to have R and the ggplot2 package installed for this. I'm using the standard "diamonds" dataset that comes with ggplot2.

The R console can be started from the command line by simply typing R. To quit the console, type q().



Now create a script file test.r with the following contents:

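# Load ggplot2, which also provides the diamonds dataset
library(ggplot2)
# Frequency polygon of diamond prices, scaled to density, in bins of 250
mygraph <- qplot(price, ..density.., data = diamonds, binwidth = 250, geom = "freqpoly")
# Save the plot as a PNG; the file extension picks the format
ggsave(filename = "/Users/yasha/Desktop/delme.png", plot = mygraph)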

After creating the script file, run it with R CMD BATCH on the command line:

R CMD BATCH /Users/yasha/Desktop/test.r 

This creates delme.png on the Desktop. 
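
If you run this step unattended (from cron, say), it helps to check that the image really showed up. Here is a rough shell sketch of that idea, not something from Hadley's talk. The function name run_plot_script and the R_BIN variable are made up for illustration; the only R feature it relies on is that R CMD BATCH accepts an optional second argument naming the file that receives R's console output.

```shell
#!/bin/sh
# Hypothetical wrapper around the R CMD BATCH step (names are illustrative).
# R_BIN is an assumption of this sketch: it lets you point at a specific R binary.
R_BIN="${R_BIN:-R}"

# $1 = R script, $2 = log file for R's console output, $3 = image the script should create
run_plot_script() {
    script="$1"
    log="$2"
    image="$3"

    # Bail out early when R is not installed or not on the PATH.
    if ! command -v "$R_BIN" >/dev/null 2>&1; then
        echo "R not found: $R_BIN" >&2
        return 127
    fi

    # Run the script non-interactively, sending console output to the log file.
    if ! "$R_BIN" CMD BATCH "$script" "$log"; then
        echo "R reported an error; see $log" >&2
        return 1
    fi

    # The script is expected to have saved the plot by now.
    if [ ! -f "$image" ]; then
        echo "R finished but $image is missing; see $log" >&2
        return 2
    fi

    echo "wrote $image"
}
```

With the files from this post you would call it as run_plot_script /Users/yasha/Desktop/test.r /Users/yasha/Desktop/test.Rout /Users/yasha/Desktop/delme.png (the .Rout name is just my choice of log file).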

You can later embed this file in a document to create a report, email it, or whatever. Neat stuff! This process lets you drive R from outside R. You could probably automate creating the script itself using Ruby, shell, Java, or whatever you like to use. Then run the script and enjoy the sweet visualizations.
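
To make the "automate the creation of the script" idea concrete, here is a minimal shell sketch. The helper make_plot_script, the temporary working directory, and the carat bin width of 0.1 are all my own placeholders; the R lines it writes follow the same qplot and ggsave pattern as test.r above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): writes an R script that
# plots one numeric column of the diamonds dataset as a frequency polygon.
#   $1 = column name, $2 = bin width, $3 = output PNG path, $4 = R script path
make_plot_script() {
    cat > "$4" <<EOF
library(ggplot2)
mygraph <- qplot($1, ..density.., data = diamonds, binwidth = $2, geom = "freqpoly")
ggsave(filename = "$3", plot = mygraph)
EOF
}

# Placeholder working directory; swap in wherever you keep your reports.
workdir=$(mktemp -d)

# One "column:binwidth" pair per plot.
for spec in price:250 carat:0.1; do
    column=${spec%%:*}
    binwidth=${spec##*:}
    make_plot_script "$column" "$binwidth" "$workdir/$column.png" "$workdir/$column.r"
    echo "generated $workdir/$column.r"
done
```

Each generated file can then be run with R CMD BATCH, exactly like test.r.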

Hadley Wickham has a great set of introductory slides on ggplot2. Check them out at https://dl.dropbox.com/u/41902/ggplot2-intro.pdf
