How to use a Google Spreadsheet as data in R

September 28, 2009

One of the great strengths of R is that it promotes reproducible research: because R is an open-source system, you can send a script file to a colleague with confidence that they’ll get the same results using R on their own system. Provided they have the same data, that is. If that’s your goal, you can always send along a data file, but that adds some complications: you have to take care with the filenames in your script, and dealing with data that changes regularly can be annoying.

Google Docs offers a solution. As long as you have a Google account, you can store your data as a Google Spreadsheet and then create a special URL for that spreadsheet that R can use as a CSV file source. The process is a bit complex, but it only needs to be done once, and then your data are freely available to anyone who wants to access them via R.

First, change the permissions for your spreadsheet on the main Google Docs page (the one that lists all of your Google Documents: spreadsheets, documents, presentations, etc.) and configure your spreadsheet so that it can be viewed by everyone. (I’m not sure if this is strictly necessary for the export, but it does give your collaborators the ability to view the data directly as a spreadsheet.)

  • Select your spreadsheet in the Google Docs page by marking the checkbox next to its name.
  • Click the “Share” menu in the toolbar, and choose “See who has access…”
  • Click the “People With Access” tab.
  • Next to “Sign-in is required to view this item,” click Change, and select “Let people view without signing in.”

Now anyone can view your spreadsheet in Google Docs using the link given in the “Share > Get the link to share…” menu. But we want a link for the CSV export version of the spreadsheet, not the spreadsheet itself. Here’s how to get that.

  • Open your spreadsheet in Google Docs.
  • Click the blue Share button (in the upper-right corner of the spreadsheet) and choose “Publish as a Web Page”.
  • For “Sheets to Publish” choose “All Sheets,” and check the box “Automatically republish when changes are made” if you want to dynamically update the data for R when you edit it.
  • Click “Start Publishing”. This will activate the options in the box “Get a link to the published data”, below.
  • Change the export type from “Web Page” to “CSV (comma-separated values)”.
  • Change “All sheets” to “Sheet1” (or select the sheet you want to export).
  • Change “All Cells” to the specific range you want to export, beginning with the header row. Use Excel-style notation, like “A1:C6” for the first 3 columns and the first 6 rows.
  • Click “Republish now.”

The Publish box should look something like this.

[Image: Spreadsheet export]

You can now use the URL in the bottommost box above directly with read.csv in R:

> read.csv("http://spreadsheets.google.com/pub?key=tCA0HtNtIlmhW-GLzFLLbZg&single=true&gid=0&range=A1%3AC6&output=csv")

  x   y  z
1 1 0.3 10
2 2 0.5 14
3 3 1.1 12
4 4 0.1  1
5 5 1.9  0

Better yet, if I ever change the data in the Google spreadsheet, the command above will always retrieve the updated data (provided I chose the “Automatically republish” option, above).
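Because the data behind the URL can change between runs, a script that depends on it may want to check that the columns it needs are still there. The following is a minimal sketch of that idea, not part of the original workflow: `read_shared_data` and its `expected_cols` argument are hypothetical names introduced here, and a temporary file stands in for the published URL so the example runs offline.

```r
# Hypothetical helper (not from the original post): read the published CSV
# and fail early if the columns the script relies on are missing.
read_shared_data <- function(src, expected_cols = c("x", "y", "z")) {
  # src can be anything read.csv() accepts: a URL, a file path, or a connection
  dat <- read.csv(src, stringsAsFactors = FALSE)
  missing_cols <- setdiff(expected_cols, names(dat))
  if (length(missing_cols) > 0) {
    stop("Missing expected column(s): ", paste(missing_cols, collapse = ", "))
  }
  dat
}

# Offline demonstration: a temporary file with the example data
# stands in for the published spreadsheet URL.
demo_file <- tempfile(fileext = ".csv")
writeLines(c("x,y,z", "1,0.3,10", "2,0.5,14", "3,1.1,12", "4,0.1,1", "5,1.9,0"),
           demo_file)
dat <- read_shared_data(demo_file)
```

In a real script, `src` would be the published CSV URL passed to `read.csv` above, so every run picks up the current contents of the spreadsheet.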

I believe you can do something similar with the RGoogleDocs package (currently in beta and not yet on CRAN), but this process requires no additional packages for you or the recipient of your data. 
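Staying within base R, the export URL can also be assembled from its parts, which may be handy when a script needs a different sheet or cell range. The sketch below rests on assumptions: the meaning of each query parameter is inferred from the example URL in this post rather than from Google documentation, and `sheet_csv_url` is a hypothetical helper name.

```r
# Hypothetical helper: rebuild a CSV export URL of the form shown in this post.
# Parameter meanings are inferred from the example URL (an assumption):
#   key   - the spreadsheet's published key
#   gid   - presumably identifies the sheet (0 in the example)
#   range - cell range in Excel-style notation, e.g. "A1:C6"
sheet_csv_url <- function(key, gid, range) {
  paste0(
    "http://spreadsheets.google.com/pub?key=", key,
    "&single=true&gid=", gid,
    # reserved = TRUE percent-encodes the colon, giving "A1%3AC6"
    "&range=", URLencode(range, reserved = TRUE),
    "&output=csv"
  )
}

csv_url <- sheet_csv_url("tCA0HtNtIlmhW-GLzFLLbZg", gid = 0, range = "A1:C6")
print(csv_url)
```

The resulting string can be passed straight to `read.csv`, just like the URL copied from the Publish box.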
