Thursday, July 5, 2012

Using R as a Reporting Engine for MySQL using ggplot2 and knitr

R has a package called ggplot2 which can be used with knitr to generate HTML reports from a MySQL database.

You'll require the knitr, RMySQL, markdown packages. You can get them by typing the following in R Studio

install.packages("knitr")
install.packages("RMySQL")
install.packages("markdown")

The HTML will be created by the knit2html command in the knitr package. knit2html converts a R Markdown file to HTML. R Markdown is extension of markdown that allows you to write embedded R code in your markdown file. Consider this sample R markdown file DataReport.Rmd that uses RMySQL to get report data from a MySQL database, stores it in a R data frame, uses ggplot2 to plot the graph and finally converts to HTML.

DataReport.Rmd
 ![alt text](http://i279.photobucket.com/albums/kk147/SOCTONE/R_logo.jpg)  
 Title  
 ========================================================  
 ### Top 10 Countries  
 ```{r echo=FALSE, include=FALSE}  
 library(RMySQL)  
 library(ggplot2)  
 con2 <- dbConnect(MySQL(), user="user", password="pwd",dbname="schemaname", host="host")  
 counts_query <- dbSendQuery(con2, "select country, count(*) count from table group by country limit 10")  
 counts_data = fetch(counts_query,-1)  
 ```  
 ```{r echo=FALSE}  
 ggplot(counts_data, aes(country, count, fill=country)) + geom_bar() + coord_flip() + opts(legend.position="none")  
 ```  

To generate the HTML from the R Studio console run
 knit2html("DataReport.Rmd")  


You can drive the whole process of a shell script so that reports can be run on demand or periodically via crontab. See my post on how to drive R scripts off shell. You can later copy the resulting html file to a webserver so that it can be hosted and shared on the web/intranet.

No comments:

Post a Comment