ggplot2 and the grammar of graphics

September 22, 2009

I really enjoyed Hadley Wickham’s talk for the Bay Area UseR group last week. I’ve been getting into ggplot2 lately, but it was Hadley’s example of plotting housing sales data for the San Francisco area that really made it click for me. When you first start using ggplot2, the syntax can seem a little, well, arcane: rather than using separate commands to build up your plot as with traditional R graphics, you string elements together with the ‘+’ operator. The example Hadley showed, plotting time-series data, layering on a line, and then paneling the data by city, made sense of the “grammar of graphics” concept for me.
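To make that concrete, here is a minimal sketch of the kind of plot described above. It is not Hadley’s actual example: the `sales` data frame, its column names, and the prices are all made up for illustration. Only the pattern (data, then points, then a line, then panels by city, joined with ‘+’) is the point.

```r
library(ggplot2)

# Made-up monthly prices for three cities (illustrative values only,
# not the housing data from the talk)
set.seed(1)
months <- seq(as.Date("2008-01-01"), by = "month", length.out = 12)
cities <- c("San Francisco", "Oakland", "San Jose")
sales <- data.frame(
  month = rep(months, times = length(cities)),
  city  = rep(cities, each = length(months)),
  price = 500 + cumsum(rnorm(36, sd = 10))
)

p <- ggplot(sales, aes(x = month, y = price)) +  # data and aesthetic mappings
  geom_point() +                                 # plot the time-series points
  geom_line() +                                  # layer a line on top
  facet_wrap(~ city)                             # one panel per city
print(p)
```

Each ‘+’ adds one piece to the plot description, so dropping `facet_wrap(~ city)` would put all three cities in a single panel without touching the rest of the code.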

It looks like the folks at Biogeeks have had a similar epiphany recently:

Say you want to make 12 dose-response plots of various compounds tested in various cell lines. With basic R this would require writing a for-loop and fiddling around a lot with axis and plot labeling and the par() function to make them fit on one page. With basic R you would have to be extremely careful to make the code general and reusable for next time, when you have different compounds and different cell lines. Enter ggplot2 and the grammar of graphics. ggplot2 is a package for implementing the grammar of graphics, which allows you to write extremely succinct, natural-language-like code that produces stunning visualizations.

Here are six lines of ggplot2 code, and the graph they create:

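library(ggplot2)

# screening_data is the dose-response data frame from the Biogeeks post
p <- qplot(Concentration, Percent.of.control,
           data = screening_data,
           geom = c("point", "smooth"), colour = Response.type) +  # points plus a smoothed curve
  scale_x_log10() +                       # log-scale concentration axis
  facet_grid(Compound ~ Cell.line) +      # compounds in rows, cell lines in columns
  coord_cartesian(ylim = c(-10, 110))     # zoom the y axis without dropping data
print(p)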

[Figure: Ggplot2_ex1, the grid of dose-response plots produced by the code above]

So how does the grammar of graphics help here? I liked the way the Biogeeks summed it up: “If you compare the code and the plot you will realize that the code contains about the words that you would use if you were told to briefly describe the plot using English.” Indeed!
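The correspondence is perhaps easier to see when the qplot() call above is spelled out with ggplot() and one layer per line. The sketch below is my own translation, not code from the Biogeeks post, and because the original `screening_data` isn’t available it simulates a stand-in with the same column names. The compound names, cell-line names, response types, and values are all invented. (Note also that qplot() has been deprecated in more recent ggplot2 releases, so the ggplot() form is probably the safer one to learn.)

```r
library(ggplot2)

# Hypothetical stand-in for screening_data: 3 compounds x 4 cell lines
# (12 panels), 2 response types, 12 concentrations. Simulated, not real data.
set.seed(42)
screening_data <- expand.grid(
  Concentration = 10^seq(-3, 1, length.out = 12),
  Response.type = c("Viability", "Proliferation"),
  Compound      = paste("Compound", 1:3),
  Cell.line     = paste("Line", LETTERS[1:4])
)
# Simple sigmoid-shaped dose-response curve plus noise
screening_data$Percent.of.control <-
  100 / (1 + screening_data$Concentration / 0.1) +
  rnorm(nrow(screening_data), sd = 5)

# "Plot percent of control against concentration, coloured by response type..."
p <- ggplot(screening_data,
            aes(x = Concentration, y = Percent.of.control,
                colour = Response.type)) +
  geom_point() +                          # "...as points..."
  geom_smooth() +                         # "...with a smoothed curve..."
  scale_x_log10() +                       # "...on a log10 x axis..."
  facet_grid(Compound ~ Cell.line) +      # "...one panel per compound and cell line..."
  coord_cartesian(ylim = c(-10, 110))     # "...showing y from -10 to 110."
print(p)
```

Read top to bottom, the layers come out as roughly the sentence you would use to describe the figure, which is the Biogeeks’ point.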

Biogeeks: Power plotting with ggplot2
