Data roundup, June 19

June 19, 2013 in Data Roundup

We’re rounding up data news from the web each week. If you have a data news tip, send it to us at [email protected].

Photo credit: Mike S


The G8 Open Data Charter was unveiled this past Tuesday at the 39th G8 Summit. The charter reaffirms the G8 countries’ commitment to open data and sets an “open by default” policy for government data. The Telegraph discusses the significance of the charter.

There is, however, still much to be done. The Open Knowledge Foundation launched a preview of the 2013 Open Data Census in time for the G8 Summit, and this preview suggests that “G8 countries still have a long way to go in releasing essential information as open data”.

Also released in time for the G8 Summit is a pilot of the open company data index from OpenCorporates, supported by the World Bank. The index highlights global corporate participation in the move towards transparency in data.

The Open Knowledge Foundation has announced the launch of the Panton Fellowships, awards valued at £8,000 per annum which will reward scientists who actively promote scientific open data. Applications are now being accepted.

The Open Data Institute has announced the beta of Open Data Certificates, a website which allows data publishers to self-report and certify their data’s adherence to openness standards.

The State of New York has published a provisional open data handbook—on GitHub. The handbook is “a general guide for government entities participating in OPEN-NY”. Comments from the public are invited.

Got a spatiotemporal dataset with several billion data points? Want to visualize it interactively in a web browser? Nanocubes are a new, fast data structure that can be used to do exactly that. Nanocubes use so little memory “that you can run a nanocube in a modern-day laptop”. Check out a demo of nanocubes applied to some two hundred million tweets.

There is now a Go library for spatial data operations, and it’s called gogeos. Fans of Google’s programming language can now take advantage of a wide range of powerful spatial data manipulations, as detailed in the announcement blog post.

Learn how a small team of journalists can tackle a big data-journalistic project with David Bauer’s account of how TagesWoche investigated migrant remittances. Bauer explains how teams were structured, data was found, and visualizations were constructed.

Hive is a “data warehouse system” that facilitates the analysis of large datasets. Learn how to process social science data with Hive in a new blog post by John Beieler that shows you how to query a 40+ gigabyte dataset in a matter of minutes.

Become a Knight-Mozilla OpenNews Fellow! Applications for the fellowships, which fund a journalistically inclined programmer for ten months of newsroom participation, will be open until August 17.


Treezilla aims to map every tree in Britain. It is a citizen science platform in the form of an app and a crowdsourced database of British trees, already tens of thousands of trees strong, contributing to awareness of trees and their central ecological importance.

Detention Logs publishes “data, documents and investigations that reveal new perspectives on conditions and events inside immigration detention” in Australia. Called “one of the largest data journalism projects in Australian history” (source), the project aims “to arm the public” with the facts necessary for informed public policy on asylum seekers and immigration.

Another project presents many perspectives on the lives of Parisians (income, politics, sex…), all from the reference point of the Métro stations that border their daily lives. It is among the newest, and perhaps richest, transit-oriented infographic takes on urban life; compare the New Yorker on inequality and NYC’s subway.

Social network analysis has now been applied to Homer. The social network of characters in the Odyssey has been analyzed by PJ Miranda and colleagues, who conclude that “this social network bears remarkable similarities to Facebook, Twitter and the like”.

How do cats spend their time? As a cat owner, I know very well how pressing this question is. The BBC, in collaboration with the Royal Veterinary College, has investigated, presenting a day in the life of nine cats in the form of a dynamic map.

Responding to the opening of the trial of Andre Cornet’s alleged killers on Monday, Arnaud Wéry has mapped 15 years of crime stories in the Huy-Waremme region and blogged about how he did it.

How will climate change affect flora and fauna in Spain? The World Wildlife Fund (WWF) has created an interactive application that lets you compare the projections of two climate change models on a map of Spain, showing how the habitats of plant and animal species will change in the years to come.


No new data source this week is more exciting than the International Consortium of Investigative Journalists’ release of the Offshore Leaks Database, which covers over 100,000 offshore entities based in tax havens. The database is “part of a cache of 2.5 million leaked offshore files ICIJ analyzed with 112 journalists in 58 countries” (source). Learn how it was built on the ICIJ blog, and read more about why it matters.

UK Cabinet Office Minister Francis Maude has announced “new commitments on open data that will give citizens detailed information on the operations of charities and companies”. Data held by the UK Charity Commission is slated to be made freely available by March of next year.

In response to the G8’s open data charter, Canada has launched a new data portal. The usefulness of this new portal is likely to be compromised by the serious budget cuts Statistics Canada has suffered under the Harper government.
