Data roundup, March 20
We’re rounding up data news from the web each week. If you have a data news tip, send it to us at [email protected].
TOOLS, COURSES, AND EVENTS
The DataKind DC DataDive with the World Bank took place from March 15 to 17, bringing together socially conscious data explorers to “mashup the most cutting-edge data from the World Bank, UN, and other sources to improve poverty monitoring and root out corruption”. Francis Gagnon has posted reflections on the event and its lessons (twice!), and Prasanna Lal Das has done the same.
The collaborative writing of the Iberoamerican Data Journalism Handbook has begun. The handbook will, as Miguel Paz explains, explore the state of data journalism in Latin America and work to push its frontiers and expand its networks of communication.
JSNetworkX is a JavaScript port of a popular Python library for “the creation, manipulation, and study of the structure, dynamics, and function of complex networks”. It can be used together with D3.js to create complex graph visualizations, as examples of the library’s functionality illustrate.
Many machine learning tasks can be usefully conceptualized as classification problems. Learn how to do probabilistic classification by building a spam detector in Python.
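To give a flavor of what probabilistic classification looks like, here is a minimal sketch of a naive Bayes spam filter using only the Python standard library. It is not the code from the linked tutorial: the class name, the whitespace tokenizer, the add-one smoothing, and the four toy training messages are all assumptions made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split is enough for a toy example.
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Toy naive Bayes classifier with two labels: "spam" and "ham"."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing  # Laplace (add-one) smoothing by default
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}
        self.vocabulary = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocabulary.update(tokens)

    def _log_score(self, tokens, label):
        # log P(label) + sum of log P(word | label).
        # Needs at least one training message per label.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.smoothing * len(self.vocabulary)
        for token in tokens:
            if token not in self.vocabulary:
                continue  # ignore words never seen in training
            count = self.word_counts[label][token]
            score += math.log((count + self.smoothing) / denom)
        return score

    def spam_probability(self, text):
        tokens = tokenize(text)
        log_spam = self._log_score(tokens, "spam")
        log_ham = self._log_score(tokens, "ham")
        # Subtract the max before exponentiating to avoid underflow.
        top = max(log_spam, log_ham)
        p_spam = math.exp(log_spam - top)
        p_ham = math.exp(log_ham - top)
        return p_spam / (p_spam + p_ham)


# Made-up training data, purely illustrative.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win cash prize now", "spam")
spam_filter.train("cheap pills win now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch on monday", "ham")

for message in ["win a cash prize", "monday meeting"]:
    print(message, "->", round(spam_filter.spam_probability(message), 3))
```

The classifier returns a probability rather than a hard yes/no, so the cutoff for flagging a message is yours to choose. A real detector would need far more training data and a better tokenizer.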
Yes, Python is a great language for almost any purpose, including data science. Data Community DC has written a great introduction to data science with Python.
Neural networks are back in vogue. Learn about neural networks for machine learning through a series of lectures by the University of Toronto’s Geoffrey Hinton.
The TrafficCOM is a piece of hardware that promises to make urban vehicular data collection easier. It is an Arduino-based device that collects data on car or bicycle traffic, which you can then make openly available on the TrafficCOM data portal.
DATA STORIES
The votes are in for the 21st International Infographics Awards, which awarded 21 gold medals, 52 silver medals, and 74 bronze medals. The winners are listed in a PDF on the Malofiej website.
The 2012 Feltron Annual Report has been released. Nicholas Felton’s annual reports “weave numerous measurements into a tapestry of graphs, maps and statistics that reflect the year’s activities”.
What makes a news app successful? More to the point: how can engagement with a news interactive be measured? Brian Abelson sketches answers to these questions in a thoughtful blog post.
The OpenGenderTracking project showed its first results a couple of weeks ago. The Open Gender Tracker is a Knight Foundation-supported project to build software to analyze the gender balance of news content. A new Q&A at the Source discusses the project’s origins and goals.
Code, data, and revealing interactive webapps have been posted as “Internet Census 2012”. They result from an Internet-scale port scan, described as “wildly unethical and almost certainly illegal” (source), that used a botnet of hundreds of thousands of compromised devices.
“Are you interested in data visualization?”, asks Andy Kirk. If so, then you are invited to participate in his seven-question dataviz census, “the first attempt to capture the size, shape and composition of the data visualisation field today”.
The beta of RechtesLand.de has been released. Rechtes Land is an interactive atlas of the far right and neo-Nazi movements in Germany. It tracks their associations, murderous activities, and projects.
Epic urban visualization project ViziCities, which aims to create “an interactive 3D city visualisation platform […] at the intersection of data, art and play”, has been underway for a month. Rob Hawkes outlines the progress the project has made in that time.
DATA SOURCES
The African Development Bank has launched open data platforms for 20 African countries. The platforms are “user-friendly [tools] for extracting data, creating and sharing own customized reports, and visualizing data across themes, sectors and countries in tables, charts and maps”.
Responding to the French government’s refusal to open up data on autism, Des données pour l’autisme is a crowdsourcing platform to make data on autism in France more readily available. Damien Bruno discusses the project, its goals, and its origins.
A prototype data portal for Rhineland-Palatinate has been launched. According to epsiplatform.eu, much of the data is still only available in PDF form.
The City and County of Honolulu has launched a new and improved data portal with a variety of interactive content. Included is the Honolulu Artmapper, which can be used to locate public artwork throughout Honolulu, and an “adopt-a-stream” program.
The Global Buildings Performance Network is a worldwide portal for data on building energy performance. Besides providing access to data, the portal features a “Policy Comparative Tool” to facilitate comparative analysis of building best practices. Read about the project in an OKFN blog post.