You are browsing the archive for John Murtagh.

Data Roundup – 5th June

- June 5, 2013 in Data Roundup

We’re rounding up data news from the web each week. If you have a data news tip, send it to us at [email protected].

Tools, Courses, and Events

Booking is now open for the UK Data Service one-day workshop on using large-scale survey data for research, to be held in Manchester on 24 June 2013. The workshop is aimed at those with little or no experience in the secondary analysis of survey data available from the UK Data Service. It will introduce attendees to the skills required to find and access survey data and to carry out basic secondary analyses.

European Open Data Week features 3 conferences and 14 workshops, from Tuesday 25 June to Friday 28 June 2013 in Marseille, France, looking specifically at harmonising open data policies.

On 28th May the 7th Open Data Ireland Meetup took place in Dublin. Its theme was “Give Us Our Health Data”, and it was attended by around 40 people. More details are available from the event blog at,36932,en.aspx.

Open Nepal Week is running in Kathmandu from June 2 to June 6. It is a partnership-driven series of events spread over five days that aims to raise awareness about open data in Nepal and to devise mechanisms to help citizens access such data. Check out the website at

The Policy RECommendations for Open Access to Research Data in Europe (RECODE) project has newly launched; the work plan and deliverables are available from here. #RECODE

A succinct and useful presentation from Victoria Stodden at Columbia University, entitled “Why Public Access to Data is So Important (and why getting the policy right is even more so)”, is available on their website.

On June 18 there is a webinar from the Open Government Partnership (OGP) entitled “Strengthening the Demand for and use of open data initiatives”, running from 10:00 to 11:00 AM EST (14:00 to 15:00 GMT).

Community technologists have launched new open government tools for the Oakland, California area; see more details

A useful blog post by Siri Anderson advises those who have data sets on how to publish them: “3 Guidelines for Publishing Your First Open Data Sets”.

The National Day of Civic Hacking is a national event that took place June 1-2, 2013, in cities across the United States. Civic Hackers: The Neighborland API is a resource for local ideas and actions:

Lastly, on research data: the first workshop on ‘Linking and Contextualizing Publications and Datasets’ is on September 26 in Malta, held in conjunction with the 3rd International Conference on Theory and Practice of Digital Libraries (TPDL).

Data Stories

The story of how detailed real-time data was released by the UK’s rail infrastructure owner (PDF). #opendata

The BBC has reported a story on the stunning visualizations of flight paths across the globe produced by GIS (and in their spare time, no less).

The Centre for Sustainable Energy’s most popular news story of the past 12 months is ‘Energy Company Obligation data in a usable format’.

The World Bank Development Data Group (DECDG) and the aid data organization Development Gateway have unearthed data that addresses whether 29 developing countries are meeting their education goals; their progress is visualized here:

An important article published on April 4 reasserts that the use of research data in journal articles leads to an “open data citation advantage”. You can read the pre-print on their website.

Jess Denham, an Interactive Journalism MA student at City University London, has interviewed David Ottewell, Head of the Data Journalism Unit at the Trinity Mirror (Regionals) group of UK Newspapers.

Jonathan Stray has written a blog post on a two-day data journalism workshop he gave in Taiwan, which asks “How does a country get to open data? What Taiwan can teach us about the evolution of access”. He writes: “Assumptions about government openness vary from country to country. Here are a few lessons a cross-national perspective can bring to the open data movement.”

“Fell in love with data” is an interesting blog post by Enrico Bertini, Assistant Professor at NY-Poly (with equally interesting comments), on data visualisation success stories, which are often in short supply. Read it on their website.

And the New York Times in its Technophoria blog has a piece about the struggle to gain access to your own data which is stored (and monetized) by commercial companies like telecoms and utilities. It also details who is making this data available to consumers. “If My Data Is an Open Book, Why Can’t I Read It?” is available on their website.

Data Sources

On Friday May 31 Germany released the first results of its 2011 census, the first in 24 years and the first since east and west were joined together again. See the announcement on their website.

In Canada the Government of Alberta has joined the open data movement by launching its Open Data Portal. There are already more than 280 data sets on the portal — found at and you can watch a TV News item on it here.

In Australia the Government of New South Wales (NSW) has drafted an Open Data Policy, which is open for public comment and is part of the NSW Government ICT Strategy supporting government transparency, accountability and efficiency. They have also re-launched their Open Data platform using CKAN 2.0 at

The Registry of Research Data Repositories has launched. It allows the easy identification of appropriate research data repositories, both for data producers and users, and covers research data repositories from all academic disciplines. Information icons display the principal attributes of a repository, allowing users to identify the functionalities and qualities of a data repository. These attributes can be used for multi-faceted searches, for instance to find a repository for geoscience data using a Creative Commons licence. By April 2013, 338 research data repositories were indexed in the registry; 171 of these are described by a comprehensive vocabulary, which was developed by involving the data repository community.

The EC Open Data Portal went online just before Christmas 2012. It is designed to be the open data hub for European Institutions, beginning with data from the European Commission. There’s a recent video lecture introducing the hub from Malte Beyer-Katzenberger entitled “Towards a European open data infrastructure”, which is a guide through the portal and the policy behind it.

The U.S. Government’s new CKAN open data catalog has just launched… and African governments are now opening open data portals too. See Kenya’s at and Ghana’s at

A database of worldwide private company registries, called the Open Database of the Corporate World, has been launched; it currently holds information on 54,196,924 companies. The database uses the Google Refine reconciliation service and allows access to the information as JSON or XML.

There are some new datasets using the history of UK websites via the UK Web Archive. They have also made a few example tools available, showing how the open data might be used, and these are hosted in their GitHub repository.


Data Roundup May 21

- May 22, 2013 in Data Roundup

Data Roundups contain news on new apps, libraries, and other tools for working with data; newly announced conferences, workshops, hackathons, MOOCs, etc. If you have a data news tip, send it to us at [email protected]. This week’s Data Roundup is brought to you by John Murtagh.

Tools, Courses, and Events

The Big Data in Biomedicine conference kicks off at Stanford today. The event, which will be held at the School of Medicine’s Li Ka Shing Center for Learning and Knowledge, is bringing together leading figures from academia, industry, government and philanthropic foundations to discuss the burgeoning opportunities for mining the vast amounts of biomedical data housed in public databases. Here’s a look at the schedule. The event will be webcast via the Big Data in Biomedicine website.

The next OKF Open Data Meetup, part of the wider community across the world, is in Amsterdam on May 30th. Spread the word & join!

The April workshop Open Data on the Web, held in London, now has a list of all received papers online. The main topics of the workshop were discoverability; transformation (to other formats); combinations of data from different models (e.g. linked data and CSV); quality assessment and self-description; and extracting human-readable “stories” from data.

The new Open Data Tourism Hack at home, part of the Open Cities project, is offering prizes for the best app and best use of Open Data to help European cities find new ways to manage the big challenges and improve their tourism.

The LinkedUp Project has 39,500 EUR in prize money for the LinkedUp Challenge: three consecutive competitions looking for interesting and innovative tools and applications that analyse and/or integrate open web data for educational purposes.

There is now an Open Data Stack Exchange site in public beta. It’s a question and answer site for developers and researchers interested in open data and is built and run by users as part of the Stack Exchange network of Q&A sites. Its users are working together to build a library of detailed answers to every question about open data.

Data Stories

Our @iainh_z wrote a guest post on the state of #opendata in Medicine for the Open Knowledge Foundation (@OKFN) 

The @americangut project is a collective of research scientists interested in exploring global gut microbiota. It is billed as the world’s largest open-source, community-driven effort to characterize the microbial diversity of the global gut (they’re asking for your poo). It has embraced open science: open data, open software and analysis on GitHub.

On May 22, 2012 at the University of North Texas, a group of technologists and librarians, scholars and researchers, university administrators, and other stakeholders gathered to discuss and articulate best practices and emerging trends in research data management. This Denton Declaration bridges the converging interests of these stakeholders and promotes collaboration, transparency, and accountability across organizational and disciplinary boundaries.

The Guardian’s Data Blog has analysed the key points and recommendations of the Shakespeare review, which looked at open government data in the UK. They also ask whether his open data agenda is shrewd but stuck, and @guardian has a live discussion on the #OpenData agenda in practice. They’ve also done a piece on Rohan Silva, “…the man who turned David Cameron onto open data”.

@chris_whong has visualised NYC’s subway turnstiles for @NYCEDC and reveals the hard work (finger clicking) that goes into it. Also from New York, @NYStateCIO has just released open data sets that reveal the ‘Wineries, Breweries, and Distilleries Map’ for the state of New York.

Data Sources

Via IBM Smarter Cities (@IBMSmartCities): a comprehensive new mapping tool from LSE Cities, an urban studies project of the London School of Economics, makes the varying fortunes of Europe’s urban areas clear. Using Oxford Economics’ European Cities and Regional Forecasts database, the Metromonitor measures the employment and economic growth of 150 of the continent’s largest metro areas against metrics like national growth, population size, and urban typology.

The World Bank has launched a much-improved version of its Open Data Catalog. All essential information is now available in a one-page list, sorted by name, popularity, or date. You can access bulk downloads, APIs or query tools from the same page with a single click. And you can see all the available metadata without having to visit separate pages on various sites.

The Government of South Australia has made its data freely available by releasing more than 100 datasets from 16 organisations, covering a range of topics including transportation, environment, education, crime and even baby names. Also from Down Under the Australian Archaeological Association has compiled a freely available set of Microsoft® Excel® databases listing radiocarbon, luminescence and uranium series ages from archaeological sites across Australia.

Academics at Stanford University have made a free 437-page book, “Mining of Massive Datasets”, available for download as a PDF (hardcopy published by Cambridge University Press).

A new service (ODIN) allows researchers to add their research datasets, and other content with DataCite DOIs, including all figshare content, to their ORCID profile by integrating with the DataCite Metadata Store. (ORCID provides a persistent digital identifier that distinguishes researchers from each other.)

Pan-European datasets have been released via the ESPON Database Portal, which supplies different users (researchers, policy makers and stakeholders at regional and local level) with data, indicators and tools that can be used for European territorial development and cohesion policy formulation, application and monitoring at different geographical levels. The data included in the ESPON Database comes mainly from European institutions such as EUROSTAT and EEA, and from all ESPON projects.
