Geocoding Data in a Google Docs Spreadsheet

A very common need is to geocode data in a Google Spreadsheet (for example, in creating TimeMaps with the Timeliner project). There are several options here:

By hand – use a Geocoding service (see the course on geocoding) and then copy and paste by hand. Use the ImportXML (or ImportCSV) formulae to grab data from a geocoding service – great but with limitations on the number of rows you can code at one time (~50). Use a Google App Script – the most powerful but requires installation of an App Script in your spreadsheet. In this tutorial I’m going to cover the latter two automated options and specifically focus on option 2.

Using Formulas All of the following is illustrated live in this google spreadsheet.

We start with a formula like the following:

=ImportXML("http://open.mapquestapi.com/nominatim/v1/search?format=xml&q=London", "//place[1]/@lat")

This formula uses the ImportXML function to look up XML data from the Mapquest Nominatim geocoding service (see the previous tutorial for more about geocoding services). The first argument to ImportXML is the URL to fetch (in this case the results from querying the geocoding service) and the second part is an XPath expression to select data from that returned XML. In this case, the XPath looks up the first place object in the results: place[1] and then gets the lat (latitude) attribute. To understand this more clearly, here’s the XML returned by that XML query:

http://i.imgur.com/9ZCchXY.png

In reality we want both latitude and longitude, so let’s change it to:

=ImportXML("http://open.mapquestapi.com/nominatim/v1/search?format=xml&q=London", "//place[1]/@lat | //place[1]/@lon")

This uses an “or” || expression in XPath and the result will now be an array of results that Google Docs will put in 2 cells (one below another). You can see this in Column C of the example spreadsheet.

What happens if we wanted the data in just one cell, with the two values separated by commas, for example? We could use the JOIN function:

=JOIN(",", ImportXML("http://open.mapquestapi.com/nominatim/v1/search?format=xml&q=London", "//place[1]/@lat | //place[1]/@lon"))

Lastly, we’d like to geocode based on a place name in an another cell in the spreadsheet. To do this we just need to add the place name to our API request to MapQuest’s Nominatim service using the CONCATENATE function (this example assures the value is in cell A2):

=ImportXML(CONCATENATE("http://open.mapquestapi.com/nominatim/v1/search?format=xml&q=", A2), "//place[1]/@lat")

=JOIN(",", ImportXML(CONCATENATE("http://open.mapquestapi.com/nominatim/v1/search?format=xml&q=",A2), "//place[1]/@lat | //place[1]/@lon"))

App Script

If you want an even more powerful approach you can use a Google App Script. In particular, Development Seed’s MapBox team have prepared a great ready-made Google AppScript that will do geocoding for you.

Find the script plus instructions online.

Any questions? Got stuck? Ask School of Data!

Last updated on Sep 02, 2013.

  • http://ausserhofer.net/ Julian Ausserhofer

    I think the formula with the ImporXML function does not work any more. Could it be that the Mapquest API has changed? Just a guess from a novice though…

    • Michael Bauer

      You’re right the API used doesn’t seem to exist anymore…

    • Michael Bauer

      Hi the API just moved – I updated the links in the recipe – it should now work again.

  • Anna Daniel

    works for me, but Google spreadsheets only handles a max of 50 XML formulas per spreadsheet. So what I’ve been doing is 50, then copy>paste special as values>the next 50. A bit tedious but far better than the manual process (and yes I’ve done the manual process before for 3000+ locations….. ugh).

  • harry_wood

    This is a very neat trick. I also came across this just now: https://github.com/mapbox/geo-googledocs/ which tackles the same problem but with a script add-on for google spreadsheets. So more complicated perhaps, but maybe easier once you’re set up.