DIY Data Journalism: How to quickly investigate publicly available data

October 30, 2013 in Uncategorized

This is a guest post by Jurian Baas from Silk, a platform for sharing collections of information.

Professional data journalists have to find stories in the enormous amounts of data being released every day. They rigorously combine narrative skills with mathematics, a bit of programming, and visual insight. This might seem complicated, and it is. For one, the learning curve for tools that can create Guardian-quality material is very steep.

The prospect of analyzing a huge dataset with complicated programs might turn people off from data journalism. That’s a shame, because simpler tools let anyone visualize and analyze a modest dataset. The ability to sift through a pile of data and find hidden connections is valuable to many people: students, marketers, NGO staff, and so on. In this post, I will give you an example of great do-it-yourself data journalism to show you that it isn’t hard.

America’s Worst Charities

Rutger is a product manager who works for an NGO. Interested in the financial practices of charities, he came across the America’s Worst Charities project. The project lists 50 charities that squander their money by paying large amounts to professional solicitation companies to raise donations. Several watchdog organizations say charities should spend no more than 35 percent of the money they raise on fundraising expenses, and the charities on this list go well above that number.
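That 35 percent guideline is easy to check yourself once the numbers are in a spreadsheet. Here is a rough sketch in Python with pandas; the file name and the column names (charity, total_raised, fundraising_costs) are assumptions about how such a spreadsheet might be laid out, not the project’s actual headers.

```python
import pandas as pd

# Hypothetical layout: one row per charity, with assumed columns
# "charity", "total_raised" and "fundraising_costs".
charities = pd.read_csv("charities.csv")

# Share of raised money spent on fundraising, in percent.
charities["fundraising_pct"] = (
    charities["fundraising_costs"] / charities["total_raised"] * 100
)

# Watchdog guideline: no more than 35 percent on fundraising expenses.
over_limit = charities[charities["fundraising_pct"] > 35]
print(over_limit.sort_values("fundraising_pct", ascending=False)[
    ["charity", "fundraising_pct"]
])
```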

America’s Worst Charities is far from an amateur project: the Tampa Bay Times and the Center for Investigative Reporting worked hard to retrieve financial information about the charities and did a great job of analyzing it. But the presentation of the data itself was just a list. Rutger started to look at different ways to work with the data so that he could find more connections in it:

“I had the feeling that there were more interesting facts hidden in the data than was immediately apparent. I was not only interested in the absolute amount of money each charity spent and raised, but also in the people they paid, the solicitors. I wanted to know exactly how much they raised and kept. I believe that interactive visualizations, where you can change the parameters yourself, make it easier to see how numbers relate to each other.”

Rutger created a Silk site at americas-worst-charities.silk.co and imported two spreadsheet files with data from the America’s Worst Charities project.

Rutger: “Silk creates a web page for each row in the spreadsheet and tags the information, allowing me to create interactive tables and visualizations. I placed two tables on the homepage: one with the worst charities, and one with the worst solicitors. A lot of solicitors keep most of the money they raise. Three of them even seem to retain more money than they raise. That’s a big waste.”
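You can reproduce that check on the solicitor spreadsheet with a few lines of code. Again, a minimal sketch, assuming columns named solicitor, raised, and kept, which are guesses rather than the project’s real headers:

```python
import pandas as pd

# Assumed columns: "solicitor", "raised" and "kept" (guessed names).
solicitors = pd.read_csv("solicitors.csv")

# Share of the raised money that each solicitor kept.
solicitors["kept_share"] = solicitors["kept"] / solicitors["raised"]

# Solicitors that kept more than they raised show up above 1.0 here.
kept_more = solicitors[solicitors["kept_share"] > 1]
print(kept_more[["solicitor", "raised", "kept", "kept_share"]])
```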

“I created a map to see if there is any relation between overspending charities and location. Almost all offenders are located on the East Coast, specifically in Florida, New York, Washington, and Georgia. No surprise here: most charities are based in those states, so it makes sense that the worst offenders are also located there.”
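The tally behind a map like that is simply a count of charities per state. A quick sketch, assuming the charity spreadsheet has a state column (again a guessed name):

```python
import pandas as pd

# Assumed column "state" holding each charity's home state.
charities = pd.read_csv("charities.csv")

# How many of the worst charities are based in each state?
print(charities["state"].value_counts().head(10))
```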

“If you look at a detailed graph of the charities that raised the most, you can see that the amount that is raised and paid to fundraisers dwarfs the amount that actually goes to the charity’s cause. Even worse, the amount of money that the charity keeps for itself is much, much higher than what goes to the cause.”
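To put numbers on that comparison, you can express each amount as a share of the money raised. A rough sketch, assuming columns named charity, total_raised, paid_to_solicitors, and direct_cash_aid, which are placeholder names rather than the project’s actual headers:

```python
import pandas as pd

# Assumed columns: "charity", "total_raised", "paid_to_solicitors"
# and "direct_cash_aid" (placeholder names).
charities = pd.read_csv("charities.csv")

# Look at the charities that raised the most money.
top = charities.sort_values("total_raised", ascending=False).head(10).copy()

# Express both amounts as a share of the money raised, so the gap between
# fundraising costs and spending on the actual cause is easy to see.
top["solicitor_share"] = top["paid_to_solicitors"] / top["total_raised"]
top["cause_share"] = top["direct_cash_aid"] / top["total_raised"]

print(top[["charity", "solicitor_share", "cause_share"]])
```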

Rutger thinks his DIY data journalism experience was a success: “We have a lot of data to analyze and share, like most NGOs. It’s nice to dig into a dataset looking for interesting connections and stories. A great benefit of having the data on a website is that colleagues and other interested people can play around with the tables and visualizations themselves.”

Got you interested?

If Rutger’s story got you interested in the topic of data journalism, be sure to explore this blog for excellent posts like Data Journalist in a Day. If you already have some data and would like to check out the tool Rutger used, you can create a free Silk site at Silk.co.
