September 5, 2013 in Data Expeditions

Recently, we started our first round of mentor training at the School of Data. During the training, we’ll bring a group of volunteers up to speed so they can teach and use the different concepts around the School of Data. One of the first things we did with them was a mini data expedition.

We started out thinking about investigating corporations and corporate structures. Soon, the discussion shifted to the intricate details of public-private relationships: how are private companies related to government institutions? We discussed a project by the La Nación data team on subsidies to transport companies and how those companies are linked to current or former officials. Suddenly, one of the mentors asked: How about the NSA? We hear a lot about them and we know they work closely with companies. How about we look at them? Immediately, we jumped on it.

Who watches the watchmen?

While the NSA is very good at collecting data, where is the data collected about the NSA, in particular about officials and companies who worked with the NSA? We went out to scout. Alongside interesting news stories about tech companies funded by the CIA, we got the idea of looking at LinkedIn and Little Sis. You all know LinkedIn, a business network. Little Sis is a project to record the links between business and politics in the USA, which sounds like exactly what we need. We decided to take a stab at its API using OpenRefine. After registering for an API key, everything looked promising: we found the NSA entry and got a list of people and companies connected to it. Since we wanted to examine a network, we thought about adding one more level: the companies and people connected to the companies and people connected to the NSA. However, our approach of using OpenRefine to access the API failed. After searching around for a while, we realized that the Little Sis API checks what kind of software accesses it and doesn’t like Refine…
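For readers who hit a similar wall: one common way an API decides what kind of software is calling is the User-Agent header, and a script can set that header itself. The snippet below is only a sketch under that assumption. We did not confirm which check Little Sis applies, and the URL and client name in the usage comment are hypothetical placeholders, not real Little Sis endpoints.

```python
import urllib.request


def build_request(url, user_agent):
    """Prepare (but do not send) an HTTP request with an explicit User-Agent.

    Assumption: the server filters clients by this header. The caller
    supplies both values; nothing here is specific to any real API.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Hypothetical usage; the request is only sent once urlopen is called:
# req = build_request("https://api.example.org/entity/1", "my-expedition-script/0.1")
# with urllib.request.urlopen(req) as response:
#     body = response.read()
```

Building the request separately from sending it makes it easy to check the headers before touching the network.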

We decided to have a look at the people instead. In Little Sis, everyone is categorized by type: business person, public official, and so on. Here’s what we found for the 24 people directly connected to the NSA: half of them are business(wo)men, and there are also some lawyers and lobbyists.


After we finished our expedition, some of us did not want to give up there: I wrote a Python module to access the Little Sis API and scraped the network around the NSA (1 hop). The resulting data can be imported into Gephi, and we’ve come up with some visualizations… Why don’t you download the data and let us know whether you spot something interesting?
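If you would like to try a similar crawl yourself, here is a rough sketch of the idea. It is not the module mentioned above. It assumes you bring your own function, here called fetch_neighbours (a hypothetical name), that returns the names connected to a given entity, for example by calling an API. The sketch walks outward a fixed number of hops and writes an edge list with Source and Target columns, a layout Gephi's spreadsheet import understands.

```python
import csv


def crawl(seed, fetch_neighbours, hops=1):
    """Collect (source, target) edges up to `hops` steps away from `seed`.

    `fetch_neighbours` is a caller-supplied function (hypothetical here)
    that takes a node name and returns an iterable of connected names.
    """
    edges = set()
    seen = {seed}
    frontier = [seed]
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbour in fetch_neighbours(node):
                edges.add((node, neighbour))
                if neighbour not in seen:
                    seen.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return sorted(edges)


def write_edge_csv(edges, handle):
    """Write edges as a CSV edge list with a Source,Target header row."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)


# Hypothetical usage with your own lookup function:
# edges = crawl("NSA", my_lookup, hops=1)
# with open("edges.csv", "w", newline="") as f:
#     write_edge_csv(edges, f)
```

Passing the lookup function in as an argument keeps the crawl logic separate from any particular API, so it can be tested on a small hand-made graph first.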

