Jamming for Data – Open Data Day in the Philippines

Happy Feraren - March 23, 2015 in Community, Data for CSOs, Data Journalism

Some people spend Saturday afternoons going out with friends and watching a movie. Some spend it going to a park or working out. Some spend it in the house doing nothing. And some spend Saturday afternoons wrangling government-related data.

It was a joy to see people find people like themselves. “Met people who understand the horrors of PDF data sets and merged cells. I’m not crazy yay!” tweeted one participant. It was an even bigger joy to see people excited over the possibilities of Open Data. Though the Open Government Partnership (OGP) has been around in the Philippines for 4 years, the principles and ideas behind it are still not common knowledge in civil society.


We called, they answered: From magazine feature writers, to animators, and expert programmers who do data projects for fun, the event was a good mix of diverse personalities from different sectors as well (private sector, NGOs, government).

In celebration of International Open Data Day last February 21, we at Bantay.ph, together with our partners SEATTI, Open Knowledge Foundation, School of Data and Philippine Cyberpress, conducted the very first citizen-led “Data Jam.” It was attended by people who were all interested in using their skills to find stories within a given data set and ultimately shed light on the problem of government accountability.

During the Data Jam, we opened up our organization’s primary dataset for the first time. The dataset contains findings from our citizen monitoring work on red tape in Metro Manila’s local government units (LGUs). The actual survey questions and answers from on-site reports can be found in CSV format. It is the same data set used for the city scorecards published online through our website’s Red Tape Index. Users of the website can download the raw survey data of Bantay.ph and visualize it however they want. If, for example, a citizen wants to know more about how transparent processes are in different LGUs, they can look through the raw dataset and find answers. Moreover, if government offices themselves want to know how they are performing given a different set of indicators, they can use the dataset to help them identify their lapses.
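As a rough illustration of how a citizen might explore such a raw survey CSV, the sketch below computes, for each city, the share of “yes” answers to one survey question. The column names (`city`, `question`, `answer`), the question text and the sample rows are all hypothetical stand-ins; the actual Bantay.ph file may be laid out differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for the downloaded CSV; the real
# Bantay.ph column names and values may differ.
SAMPLE_CSV = """city,question,answer
City A,Is the process posted publicly?,yes
City A,Is the process posted publicly?,no
City B,Is the process posted publicly?,yes
City B,Is the process posted publicly?,yes
City B,Was a receipt issued?,no
"""


def yes_share_by_city(csv_text, question):
    """Return {city: fraction of 'yes' answers} for one survey question."""
    yes = defaultdict(int)
    total = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["question"] != question:
            continue  # skip answers to other questions
        total[row["city"]] += 1
        if row["answer"].strip().lower() == "yes":
            yes[row["city"]] += 1
    return {city: yes[city] / total[city] for city in total}


if __name__ == "__main__":
    shares = yes_share_by_city(SAMPLE_CSV, "Is the process posted publicly?")
    for city, share in sorted(shares.items()):
        print(f"{city}: {share:.0%}")
```

With a real download, the sample string would be replaced by the contents of the CSV file, and the question text by one of the actual survey questions.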

This is a trend we hope to start with other NGOs and CSOs. There are so many valuable datasets in the development sector and if we start opening them up, we give the general public a clearer picture of our reality. This gives us a good baseline if we want to improve and change failing systems. We cannot rely on government data alone. We’re also proud to announce that Bantay.ph now has an added feedback feature in the website that allows citizens of Metro Manila to write a review and rate their experience in a given city hall.

The 2014-2015 data set of the Contact Center ng Bayan (the national government feedback mechanism under the Civil Service Commission) was also used in the Data Jam. Participants were able to get an idea of how responsive different government offices were to complaints and grievances, what the most common complaints were in terms of government services, and what the most popular mode of feedback was. Government offices like the Civil Service Commission spend a lot of time processing and releasing this kind of data. The people behind the feedback mechanism are the same people who answer the hotline, encode, analyze, and visualize the data. Moreover, government still relies on mainstream media to pick up the data findings before they reach the general public. Through the event, participants realized that by simply opening up the data set, government can outsource the analysis and visualization part to the citizenry.
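The kind of question participants asked of this feedback data (most common complaint, most popular mode of feedback) can be sketched in a few lines. The records, field names and categories below are invented for illustration only and do not reflect the actual Contact Center ng Bayan data set.

```python
from collections import Counter

# Hypothetical feedback records; the real export will have its own
# fields and categories.
RECORDS = [
    {"agency": "Agency A", "complaint": "slow process", "mode": "SMS"},
    {"agency": "Agency A", "complaint": "discourtesy", "mode": "hotline"},
    {"agency": "Agency B", "complaint": "slow process", "mode": "SMS"},
    {"agency": "Agency B", "complaint": "slow process", "mode": "email"},
]


def top_values(records, field, n=1):
    """Return the n most frequent values of `field` as (value, count) pairs."""
    counts = Counter(record[field] for record in records)
    return counts.most_common(n)


if __name__ == "__main__":
    print("Most common complaint:", top_values(RECORDS, "complaint"))
    print("Most popular mode of feedback:", top_values(RECORDS, "mode"))
```

The same helper works for any categorical column, so it could equally be pointed at the agency field to see which offices receive the most feedback.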

Sentiments Visualized: Mich Rama of Dakila (a local NGO), one of the Data Jam participants, used the online tool Infogr.am to instantly create a word cloud and pie chart.



I find that the whole event was a good way to reflect on my organization’s internal process. We all have our blind spots, and by keeping things transparent, we get people to collaborate on solutions for the common good. This Data Jam also helped me see how we can improve our organization’s data collection processes and thus has given me a better insight on how to proceed strategically. When we open up datasets, we get to increase participation and engagement. Citizens are the end users of government service; they shouldn’t be left out of the conversation. We are, after all, the biggest stakeholders of this country.

During the event, data became the common language for everyone. Opinions were formed based on hard evidence instead of emotions and political agendas. In a span of 4 hours, people of different backgrounds were able to work together and help create a culture of transparency and accountability.


Bantay.ph is a civil society organization that uses technology to mobilize citizens to demand good governance. They want to help citizens get good government service and they do this primarily through awareness campaigns and performance monitoring of the different LGUs in Metro Manila. Currently, they have covered 9 out of 17 cities but aim to finish all of them this year.


Fellowship programme update: applications now closed!

Zara Rahman - March 11, 2015 in Fellowship

The call for 2015 Fellowships has now closed – thank you all for your applications! We have received many more applications than we had hoped for – in total, 563 fellowship applications from 82 different countries. We’re humbled and grateful that you all took the time to apply, and we wish that we had more places available!


Map of where Fellowship applicants are living

We want to read through all of the great applications properly; because of this, we’ve decided to postpone the start of the Fellowship to give us, and our local partners, enough time to review them with the care and energy they deserve.

Our new proposed start date for the fellowship is April 13th, and we will let you know of our decision by April 7th. We really hope this doesn’t affect anybody’s plans negatively, or impact their availability for the fellowship – if it does, please drop us an email to info[at]schoolofdata.org and we’ll do our best to accommodate you.

We will be updating the Fellowships FAQ with further information and in answer to your questions as they come in – anything else, please email us as above, and keep an eye on the FAQ page.

A huge thank you to all of our partners and the School of Data network who shared the call for Fellows very widely – we really appreciate your help in spreading the word!


May we Fetch you Some Data Skills? Exactly, at the Open Data Day in Abuja

olubabayemi - March 8, 2015 in Events

While everyone else whined about wanting more training time on tools that can improve their data skills at our last open data party in Abuja, we did something about it at the Open Data Day in Abuja on Saturday, February 21, 2015: 2 hours, 10 skill shares, 10 facilitators and 72 registered participants. View the screenshot below to read what some of our participants said at the end of the event.


Tagged a skill share day, the event kicked off at 11 a.m. It could hardly have seemed a worse day to choose: 30 minutes before the start, power was interrupted and, to make it worse, part of the facility was under renovation. Nevertheless, it began with an insanely charming activist from Devnovate/Social Good Nigeria, Esther Agbarakwe, handing each of the participants a list of what to expect and explaining how the minutes were going to roll by in a flash – 2 hours would soon be over!

Surprisingly, the participants remained excited as we chose the first five skill shares, which would run for the first hour (in a world cafe system). Thanks go to our facilitators, who quickly got their laptops ready to run for the next hour without power. The skill shares for the hour were: Data Analysis using Excel Spreadsheet, Data Coding, Using Google Spreadsheet for Collaboration, Facilitation Skills and Community Building Skills. “It was quite intriguing to learn some simple skills around analysis with excel, been having headache around this all this while until this 15 minutes session” said one of the participants.


As the heat of the sun settled in, it became obvious that moving the participants to another skill share every 15 minutes was an exercise that would soon be over. A new set of skill shares was introduced after 1 hour 15 minutes. “How come we already run through an hour and 15 minutes, that was in a dash” exclaimed Oladotun Fadeyiye. The new skill shares included using infographics, using Twitter for good, funding mechanisms, using ArcGIS for mapping, and creating a great blogging website.


Yusuf Suleiman taking on ArcGIS for mapping


Two hours and fifteen minutes went by, and we were impressed by some of the ingenious ways our facilitators came up with to teach without power! Did BRCK come to the rescue? No, not yet. Since the event was short, we had made sure all laptops were fully charged, so that the batteries would only start beeping by the time we finished. Internet came from shared sticks, and I also saw a facilitator using screenshots of web pages for his demonstration, making it look like he was browsing online – how intelligent!

As time was short, participants were instructed to scribble their questions on sticky notes, to be read at plenary after the “party” was over. We had questions such as: Why is it important to code data? How can a stammerer facilitate? How secure is it to share files with Google Docs? Can I use Google Drive if I do not have a Gmail account? We also had suggestions, such as allowing more time for the skill shares next time, for example dedicating one hour to a particular topic so that we can move on to advanced skills.


School of Data stickers up for grabs as a thank-you to all participants


By the end of the event I had fixed laptop batteries that were running out, collected sticky notes, shared sweeties and stickers, made sure sugar and energy drinks were available, and tried to fan away the heat from sweating participants; Little said I would make a great concierge. We kept our cool, bantered with participants, and made useful connections, admittedly largely while gathering feedback. When it is time for another data skill share, I will be prepared to serve. My first question is going to be whether participants miss their homes during the skill share hours ;)


School of Data in South Africa

Jason Norwood-Young - March 5, 2015 in Fellowship


The School of Data Fellowship Programme in South Africa was a six-month programme funded by Indigo Trust, managed by Code for South Africa (Code4SA), in partnership with School of Data. The objective of the programme was to recruit two fellows in South Africa – one in Johannesburg and one in Cape Town – in order to promote data literacy in partner organisations in each city. The programme ran from July 2014 to December 2014.

“We’re definitely more aware of data in our projects, and more aware of the process. We have a clearer idea of what you want to get out of a project at the beginning, and we’re not just putting data in a spreadsheet. … Now we think about the data pipeline first: getting, cleaning and analysing the data”, Sean Russell – Ndifuna Ukwazi

Our Fellows

Hannah Williams

Hannah Williams is a Cape Town-based designer who scored well on overall impression and had taken part in a School of Data training course earlier in the year, where she excelled in ability and enthusiasm. She proved to be a perfect choice, and as a designer, she brought a much-needed skill to the global School of Data fellowship.

Siyabonga Africa

Siyabonga Africa has a great enthusiasm for data journalism, and has long been the Hacks/Hackers organiser in Johannesburg. His career has its roots in public administration and journalism, which he studied at the University of Pretoria and Stellenbosch University respectively. He completed his master’s in new media design at Indiana University before returning to South Africa in 2012.

Highlights of their work

Our fellows were committed to working directly with partner organisations and helping them achieve concrete outputs. The selection of partner organisations was based on their work and their interest or capacity to commit to a two-day workshop and six-month programme. The organisations we worked with were: The Con; Media Monitoring Africa (MMA); Ndifuna Ukwazi, the International Budget Partnership and the Social Justice Coalition, who teamed up to work on a project on the City of Cape Town’s budget; and Black Sash.


Initially we held two workshops, in Johannesburg and Cape Town, to give our partner organisations the basics they need to produce great data stories. Our fellows led the workshops with the support of Michael from School of Data and Jason from Code for South Africa. They guided the participants through a series of exercises that helped them define their product, their audience, and their implementation plans. They also ran hands-on, practical sessions on data cleaning (teaching the ever-useful Open Refine), on data analysis with an emphasis on spreadsheets, and on specialised tools like budget tools and Wazimap.


Cape Town Budget – Ndifuna Ukwazi, International Budget Partnership and the Social Justice Coalition

As part of their Imali iYethu campaign, Ndifuna Ukwazi (NU) are working on making the City of Cape Town’s budget more accessible, to promote public participation (currently very low) and a more equitable budget. The City of Cape Town currently releases its budget as a series of lengthy PDF documents, so it’s practically impossible to get a clear overview of how the budget is being spent. We worked with NU first to extract the budget data from these PDFs, then to clean and analyse it (geographically, by specific local government service sector, and possibly by other factors as they become apparent), and finally to present it in a clear and understandable manner. The aim is for the public to know how their tax money is being spent in their local area, and to make it possible to determine whether the current budget allocation is equitable and, if not, how it could be improved.

Cape Town Budget

We hope that this project will encourage the City to reconsider how it releases its budget and encourage more active citizenship and engagement on all sides for future budgeting processes.

Hannah has created a map of the City’s budget, which will form an integral part of the campaign. http://capetownbudgetproject.org.za/

Social Audit Posters – Black Sash

Making All Voices Count (MAVC), managed by Hivos, is part of a global initiative which aims to empower Community Based Organisations (CBOs) to take ownership of and participate actively in citizen-based monitoring of three government services (Health, SASSA and Local Government) in 20 service sites across South Africa. A working relationship with the Department of Performance Monitoring and Evaluation is in place to pilot the integration and institutionalisation of citizen feedback into government’s performance monitoring and evaluation system. Black Sash conducted surveys with communities around these 20 pilot sites. Their challenge was how to effectively analyse and present this data in a manner that promotes dialogue between communities and service centres so that poor areas of service can be improved and good service acknowledged. As these communities have a low level of internet and technology penetration we decided to use low-tech methods of communicating the monitoring data to them.

Hannah turned that data into posters, using a simple design language, so that the data can be used to promote dialogue between the stakeholders at the monitored stations.


Read more at http://www.blacksash.org.za/index.php/sash-in-action/making-all-voices-count

The result of the data analysis will feed into Black Sash’s Hands Off Our Grants project, which protects the poorest of the poor from having their grant payments illegally garnished.

The Black Sash project was the most successful of the projects, resulting in multiple concrete outputs, and a very happy partner who uses the project and the School of Data programme as an example of data-driven advocacy when speaking to other civil society organisations.

AgendaSetter – Media Monitoring Africa

AgendaSetter is another project from MMA that uses Dexter as a data source. AgendaSetter aims to match the newsmakers with publishers to see who has the most influence in the news of the day. Using Dexter, MMA plans to mine published news to extract the most prevalent sources and see whether they match up with the top Twitter profiles in the fields of business, politics and news. They also plan to use this platform to analyse the content to find which stories get mainstream attention and from which sources.


Some of the parts of the AgendaSetter application have been completed, including a visualisation of the top newsmakers in the country using Dexter data (http://code4sa.org/sa-newsmakers), as well as monitoring Twitter for the top South African journalists’ tweets (http://journotweets.code4sa.org/).

Lessons learned

Proximity matters

Managing a project and a fellow remotely is very difficult. Siyabonga didn’t always let us know when he was struggling, and with what, and tried to make everything seem rosy when he needed help. If he had been closer, and we had seen him in person more often, I think it wouldn’t have been as much of an issue. Even though we didn’t see Hannah every day or even every week, the occasional sit-down was much more valuable than our regular Skype stand-ups. We also regularly had our Cape Town partners at our offices, often for the entire day. As a result, I feel that the Cape Town projects benefited from the entire Code4SA team, and from extra hands-on attention.

Technical solutions aren’t always the right solutions

The outcome I’m most proud of isn’t a website – it’s a poster. For the audience, it is clearly the right medium, and communicates in a way that a website cannot.

Making the most of a short amount of time

Hannah suggested in her report that we could have selected the partners before the programme started, which would have let us organise the workshop much earlier. It took us until September to get through the selection phases, and by then we had very little time to produce outputs.

You’ve got to get stuff out the door

Siyabonga quotes the following from Twitter in his report: “#ThingsToLiveBy: You are judged by the work you complete, not how much work you take on”

We encouraged our participants to come up with one grand project, which was often extremely ambitious. Instead, working with data and using the tools every day (as NU is now doing) creates a much greater organisational change that will have long-term benefits. NU now reports that they use Open Refine extensively, and that it has saved them weeks of work.


Learn data visualization and data-driven journalism in a real “Data Jam”

Happy Feraren - February 17, 2015 in Events

[Cross posted from GMA News Online; Press release by Bantay.ph]

On February 21 (International Open Data Day), Bantay.ph, a platform that uses technology in mobilizing citizens to demand good governance, will host the very first citizen-initiated “Data Jam.”

Done in partnership with the Southeast Asia Technology and Transparency Initiative (SEATTI), the Data Jam aims to get citizens to participate in governance via data analysis and visualization.
The event also aims to teach the general public and journalists alike the fundamentals of data visualization and data-driven journalism through a real hands-on experience.
“Data Journalism in the Philippines is still a wide-open field,” notes TJ Dimacali, Philippine Cyberpress president. “It’s an exciting frontier, especially for tech-savvy journalists. But it’s also something anyone can do, given the right tools.”
Bantay.ph co-founder and 2014 School of Data Fellow Happy Feraren explains: “The information that we hope to mine from the activity can give us an insight on how and where exactly our systems of governance are failing. It can help us identify what exactly is going wrong and instead of pointing fingers, we can use this information to improve the lapses of the bureaucracy.”
The Data Jam hopes to introduce the frontier of using data to raise awareness and give feedback to government. Feraren adds, “We will group writers, graphic designers, and data analysts together to come up with questions and find the answers together.”
The program and activity flow will be based on the international School of Data toolkit. Using the open datasets of Bantay.ph and the Civil Service Commission, the event wants to get people with the right skillsets to work together and discover new stories from the raw datasets provided. Overall, it’s a new way to shed light on national issues and is a slicker and more efficient way to give feedback to government.
The Philippines has been a signatory of the global Open Government Partnership (OGP) since 2011, which essentially encourages participatory governance and openness in the bureaucracy. One way that the OGP suggests is the use and application of open data provisions. Given the amount of public data, there should be a conscious effort to make these datasets available and easily accessible. And at the same time, citizens should make use of these datasets to ensure transparency is met.
“We want to promote that kind of culture where we make data-driven decisions, especially when it comes to matters of governance. There is so much we can do to track what government is doing and how they are performing. It’s one concrete way to tell them, as citizens, that ‘we are watching you,’ ” says Feraren. “It’s one way we can promote a culture of active citizenship – where we don’t just rely on mainstream media to know what’s really happening. There’s a whole lot of data out there that we don’t look at and given the right training and awareness, citizens CAN mine their own insights out of publicly available data.”
The Data Jam is organized by Bantay.ph and SEATTI, co-sponsored by the Open Knowledge Foundation, The School of Data – Philippines, and the Philippine Cyberpress.
Interested data analysts, storytellers, and graphic designers can RSVP via info@bantay.ph – LIMITED SLOTS ONLY and RSVP is a must. It will be held on February 21, 1-5pm at the AIM Conference Center. Full details will be sent to confirmed participants.


Highlights from the Ask Us Anything hangouts (part I)

Zara Rahman - February 17, 2015 in Fellowship

We carried out a couple of video hangouts with 2014 fellows to talk more about last year’s fellowship programme, and about the upcoming programme which has an open Call for Applications, closing on March 10th. For those who prefer reading to watching, here are some highlights and questions that came up during the video hangouts!

Question: What is the typical day of a School of Data fellow?

Happy: As a fellow, I spent a lot of time learning! The Fellowship really helped me to be brave and dive into the data… other than the events you have to do, a lot of it is a learning experience. It really never stops!

Yuandra: Usually, I meet with a lot of people – working with data is very new in my country, Indonesia, so there was lots of interest. I spent lots of time going from organisation to organisation, raising awareness of what they can do with data. Then, planning training – the materials, preparing them, thinking about how to package the materials in a way that people will understand.

Question: What skills are needed to be a School of Data fellow?

Milena: We’re looking for a diversity of skills among the fellows; we’re hoping each fellow will have a strong skill that they’ll be able to teach others, as well as be able to identify gaps in their own knowledge. We only have 7 spaces this year, which is fewer than last year, so it will (hopefully!) be a competitive process.

Codrina: It’s important to have some connections in your region, because the Fellowship (and School of Data) is not just about learning things for yourself, but then to take what you have learned and what you know, and spread it in your own geographical context. Or if you don’t already – be prepared to go around and meet lots of new organisations and build the community around you!

Yuandra: Community building is really important, you’ll be working with other organisations around you who definitely have the need for data. So is communication: my background is very technical, but this Fellowship taught me how to put my technical jargon aside, and explain issues in a simple way for newcomers to the topic.

Question: What kinds of projects did Fellows carry out?

Yuandra: I worked with Publish What You Pay (who work on extractive industries transparency), who previously only used data in Excel, and for reports. When I went there, one of my main points was to show them how they can use data in other ways, for example in visualisations and infographics. They’re still in an early stage of working with data, but they’ve come a long way!

Codrina: I’m a mapping person, so much of the work I did involved either building maps or teaching people how to use them, and how to stay away from usual map problems. I went to Bosnia & Herzegovina, and worked on election maps. If you’re ever curious about the most horrible election system in the world – take a look! We spent a week trying to work out how it works, we ended up asking people to explain the system in a 3 minute video, which worked really well.

Happy: I found that it’s hard to ‘sell’ open data to different CSOs just by explaining – so, I wanted to use my own organisation as a model, to demonstrate what exactly people can do with open data. It was a really good way actually for us to engage with government – you build trust, and partnerships with them, by teaching them what they can do with data. Now, the government are opening up datasets that they’ve never opened before – so this is really exciting for me.

Nisha: We did a data journalism workshop for people who are really not very technologically savvy – it was really rewarding because after a while of working with people who want to know more advanced stuff, you can forget there’s lots of people who still want to know the basics, so you get to open this whole new world to them. We also did a data expedition with an organisation that’s working in the urban space in Hyderabad, with data that they’d collected.

If you like the sound of what last year’s fellows got up to – why not apply yourself and join us as one of the 2015 Fellows? More details are available here, and if you have any further questions please drop us a line on info[at]schoolofdata.org or on @SchoolofData. Applications close on March 10th, and we look forward to hearing from you!


Ask Us Anything – watch it online now

Zara Rahman - February 16, 2015 in Fellowship

To talk through the fellowship programme and hear from last year’s fellows, we held a couple of online hangouts: you can watch them here, and if you have any further questions, feel free to drop us a line on info@schoolofdata.org, or tweet us @SchoolOfData

On Monday 16th February, our 2014 Fellows Codrina, Happy and Yuandra, from Romania, the Philippines, and Indonesia respectively, joined myself and Milena to talk through their experiences in last year’s fellowship.

Here’s the video online (just under an hour long):

And on Tuesday 17th February, Olu and Nisha, from Nigeria and India respectively, joined us to discuss their fellowship. Here’s their video, which is just over 30 minutes long:


It’s time to get data-savvy: host a School of Data fellow in 2015!

Zara Rahman - February 10, 2015 in Fellowship

Applications to host a School of Data fellow in 2015 are now closed! Thank you to everyone who applied – we will endeavour to get back to you as soon as possible.

We’re looking for local NGOs based in countries classified as low income, lower-middle income or upper-middle income to host our School of Data fellows.


We have funding for 7 School of Data fellows to take part in our 2015 Fellowship Programme, and from previous experience, we’ve found that the fellowships work best when there is an established local host.

Who are we looking for?

School of Data is promoting data literacy by working with local partners to create impactful data-driven projects. We’re looking for organisations that need support in using data more effectively and that are willing to work closely with one of our School of Data fellows over a 9-month period.

If you are selected, you’ll welcome a School of Data fellow in your office on a regular basis, to work on concrete projects and provide you with custom trainings and support, depending on what you need most. You’ll open up your data to the fellow and allow them to see how you work with data now, so that they can help guide your organisation towards being more data-savvy and using data to strengthen your work, be that in the field of advocacy, campaigning, journalism, or elsewhere within the civil society space. You’ll support the growth of the data-literate community by inviting those within your network to attend trainings, and by organising your own data expeditions, supported closely by the School of Data fellow.

What we expect you to contribute

This programme involves a great deal of resources and commitment from us, and we expect an equal amount of resources and commitment from our partners.

The ideal partner would be able to commit:

  • To support the fellow’s work, objectives and their overall work with your organisation without overburdening them or putting them in difficult situations
  • A good data driven project idea for what you want to achieve together with the fellow. This could be a specific data driven application (a web application, a website, an addition to an existing project or site, a mobile app), broader organisational support to use data, or any other feasible use for open data. The project must hold the potential to engage a large audience, to create a positive change for a community, region or country, and directly promote your organisation goals and objectives.
  • A team for the project. We want to create sustainable projects, and work with you to achieve systemic change within your organisation. We can’t do this with only one person. We would like to work with the relevant team of people in your organisation, depending on your needs and capacity.
  • In addition, we also welcome in-kind or financial support for our fellowship programme. Our programme funds the work of the fellow, including a part-time equivalent monthly stipend and some travel support, but we appreciate additional support that can complement our programme. Get in touch to understand more about the type of support you can provide.

What you’ll get in return

If you are accepted as our local partner, we’ll ask for your assistance in selecting the best applicant to be the School of Data fellow who will work with you. The fellow will support you by:

  • Evaluating your organisational capacity to work with data
  • Delivering custom training and support for your organisation depending on your needs
  • Working with you on a concrete data driven project

Here is just one example of what our 2014 fellow Hannah Williams worked on together with local partners from South Africa: http://capetownbudgetproject.org.za/

Interested? Get in touch.

Deadline: March 10th

You are also welcome to contact us on info@schoolofdata.org while you are preparing your application; we’d be happy to answer your questions and help you put together a good application.


Call for Applications: School of Data 2015 Fellowship programme now open!

Zara Rahman - February 10, 2015 in Fellowship

Applications are now closed! – thank you all for your applications, we are now reviewing them together with our partners and will get back to you as soon as possible!

We’re very happy to open our 2015 Call for School of Data Fellowships today!


Following our successful 2014 School of Data Fellowships, we’re opening today our Call for Applications for the 2015 Fellowship programme. As with last year’s programme, we’re looking to find new data trainers to spread data skills around the world.

As a School of Data fellow, you will receive data and leadership training, as well as coaching to organise events and build your community in your country or region. You will also be part of a growing global network of School of Data practitioners, benefiting from the network effects of sharing resources and knowledge and contributing to our understanding about how best to localise our training efforts.

As a fellow, you’ll be part of a nine-month training programme where you’ll work with us for an average of ten working days a month, including attending online and offline trainings, organising events, and being an active member of the thriving School of Data community.

Get the details

Our 2015 fellowship programme will run from April to December 2015. We’re asking for 10 days a month of your time – consider it to be a part-time role, and your time will be remunerated. To apply, you need to be living in a country in the low income, lower-middle income or upper-middle income categories as classified here.

Who are we looking for?

People who fit the following profile:

  • Data savvy: has experience working with data and a passion for teaching data skills.
  • Social change: understands and is interested in the role of Non-Governmental Organizations (NGOs) and the media in bringing positive change through advocacy, campaigns, and storytelling.
  • Has some facilitation skills and enjoys community-building (both online and offline) – or is eager to learn and develop their communication and presentation skills
  • Eager to learn from and be connected with an international community of data enthusiasts
  • Language: a strong knowledge of English – this is necessary in order to communicate with other fellows, to take part in the English-run online skillshares and the offline Summer Camp

To give you an idea of who we’re looking for, check out the profiles of our 2014 fellows – we welcome people from a diverse range of backgrounds, too, so people with new skillsets and ranges of experience are encouraged to apply.

This year, we’d love to work with people with a particular topical focus, especially those interested in working with extractive industries data, financial data, or aid data.

There are 7 fellowship positions open for the April to December 2015 School of Data training programme.

Geographical focus

We’re looking for people based in low-, lower-middle, and upper-middle income countries as classified by the World Bank, and we have funding for Fellows in the following geographic regions:

  • One fellow from Macedonia
  • One fellow from Central America – focus countries Costa Rica, Guatemala, Honduras, Nicaragua, but we welcome applications from all Central American countries.
  • One fellow from South America – focus countries Bolivia, Peru, Ecuador, and again, applications are really welcome from all South American countries.
  • Two fellows based in African countries (ie. two different countries)
  • Two fellows based in Asian countries (ie. two different countries)

What does the fellowship include?

As a School of Data fellow, you’ll be part of our 9-month programme, which includes the following activities:

  • Guided and independent online and offline skillshares and trainings, aimed at developing data and leadership skills
  • Individual mentoring and coaching
  • An appropriate stipend equivalent to a part-time role
  • Participation in the annual School of Data Summer Camp, which will take place in May 2015 – location to be confirmed
  • Participation in activities within a growing community of School of Data practitioners to ensure continuous exchange of resources, knowledge and best practices
  • Training and coaching of the fellow in participatory event management, storytelling, public speaking, impact assessment etc.
  • Opportunities for paid work – often training opportunities arise in the countries where the fellows are based
  • Potential work with one or more local civil society organisations to develop data driven campaigns and research

What did last year’s fellows have to say?

Check out the Testimonials page to see what the 2014 Fellows said about the programme, or watch our Summer Camp video to meet some of the community.


This year’s fellowships will be supported by the Partnership for Open Development (POD) OD4D, Hivos, and the Foreign and Commonwealth Office in Macedonia. We welcome more donors to contribute to this year’s fellowship programme! If you are a donor and are interested in this, please email us at info@schoolofdata.org.

Got questions? See more about the Fellowship Programme here and have a look at this Frequently Asked Questions (FAQ) page – or watch the Ask Us Anything Hangouts that we held in mid-February to take your questions and chat more about the fellowship.

Not sure if you fit the profile? Have a look at our 2013 and 2014 fellows’ profiles. Women and other minorities are encouraged to apply.

Convinced? Apply now to become a School of Data fellow. The application will be open until March 10th and the programme will start in April 2015.


Data expedition tutorial: UK and US video game magazines

Cédric Lombion - February 3, 2015 in Data Cleaning, HowTo, Spreadsheets, Storytelling, Workshop Methods

Data Pipeline

This article is part tutorial, part demonstration of the process I go through to complete a data expedition alone, or as a participant during a School of Data event. Each of the following steps will be detailed: Find, Get, Verify, Clean, Explore, Analyze, Visualize, Publish

Depending on your data, your source or your tools, the order in which you will be going through these steps might be different. But the process is globally the same.


Find

A data expedition can start from a question (e.g. how polluted are European cities?) or a data set that you want to explore. In this case, I had a question: Has the dynamism of the physical video game magazine market declined in the past few years? I have been studying the video game industry for the past few weeks and this is one of the many questions that I set out to answer. Obviously, I thought about many more questions, but it’s generally better to start focused and expand your scope at a later stage of the data expedition.

A search returned Wikipedia as the most comprehensive resource about video game magazines. They even have some contextual info, which will be useful later (context is essential in data analysis).

Screenshot of the Wikipedia table about video game magazines https://en.wikipedia.org/wiki/List_of_video_game_magazines


Get

The Wikipedia data is formatted as a table. Great! Scraping it is as simple as using the importHTML function in a Google spreadsheet. I could copy/paste the table, but that would be cumbersome with a big table and the result would have some minor formatting issues. LibreOffice and Excel have similar (but less seamless) web import features.

importHTML takes three arguments: the link to the page, the formatting of the data (table or list), and the rank of the table (or the list) in the page. If no rank is indicated, it grabs the first one.

Once I have the table, I do two things to help me work quicker:
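
For readers working outside a spreadsheet, here is a rough Python sketch of the same idea: pick a table out of a page by its rank. It is not how importHTML is implemented. The class name, the function name and the sample page are all invented for illustration, and the sketch ignores nested tables, which a real Wikipedia page may contain.

```python
from html.parser import HTMLParser


class TableGrabber(HTMLParser):
    """Collect the rows of every <table> in a page, in document order."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables and self.tables[-1]:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.tables[-1][-1].append("".join(self._cell).strip())
            self._cell = None


def import_html_table(html, rank=1):
    """Return the rows of the rank-th table (1-based), defaulting to the first."""
    grabber = TableGrabber()
    grabber.feed(html)
    return grabber.tables[rank - 1]


# Invented sample page; a real run would download the Wikipedia page first.
SAMPLE_PAGE = """
<table><tr><th>Note</th></tr><tr><td>navigation box</td></tr></table>
<table>
  <tr><th>Name</th><th>Founded</th><th>Country</th></tr>
  <tr><td>Example Gamer</td><td>1993</td><td>UK</td></tr>
</table>
"""

rows = import_html_table(SAMPLE_PAGE, rank=2)
for row in rows:
    print(row)
```

The rank argument matters on pages like this sample, where the first table is not the one you want.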

  • I change the font and cell size to the minimum so I can see more at once
  • I copy everything, then go to Edit→Paste Special→Paste values only. This way, the table is not linked to importHTML anymore, and I can edit it at will.


Verify

So, will this data really answer my question completely? I do have the basic data (name, founding date, closure date), but is it comprehensive? A double check with the French Wikipedia page about video game magazines reveals that many French magazines are missing from the English list. Most of the magazines represented are from the US and the UK, and probably only the most famous. I will have to take this into account going forward.


Clean

Editing your raw data directly is never a good idea. A good practice is to work on a copy or in a nondestructive way – that way, if you make a mistake and you’re not sure where, or want to go back and compare to the original later, it’s much easier. Because I want to keep only the US and UK magazines, I’m going to:

  • rename the original sheet as “Raw Data”
  • make a copy of the sheet and name it “Clean Data”
  • order alphabetically the Clean Data sheet according to the “Country” column
  • delete all the lines corresponding to non-UK or US countries.

Making a copy of your data is important

Tip: to avoid moving your column headers when ordering the data, go to Display→Freeze lines→Freeze 1 line.

Ordering the data to clean it

Some other minor adjustments have to be made, but they’re light enough that I don’t need to use a specialized cleaning tool like Open Refine. Those include:

  • Splitting the lines where 2 countries are listed (e.g. PC Gamer becomes PC Gamer UK and PC Gamer US)
  • Deleting the ref column, which adds no information
  • Deleting one line where the founding date is missing
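
The same cleaning steps can be sketched in Python, as a nondestructive alternative to editing a spreadsheet copy by hand. Everything in the sample is an assumption: the magazine names and dates are invented, the column names are guesses at the table headers, and the sketch supposes that two countries are written in one cell separated by a comma.

```python
import copy

# Invented rows standing in for the "Raw Data" sheet.
raw_data = [
    {"Name": "Alpha Gamer", "Founded": "1993", "Closed": "", "Country": "UK, US", "Ref": "[1]"},
    {"Name": "Beta Play", "Founded": "1988", "Closed": "1995", "Country": "France", "Ref": "[2]"},
    {"Name": "Gamma Zone", "Founded": "", "Closed": "", "Country": "US", "Ref": ""},
    {"Name": "Delta Power", "Founded": "2001", "Closed": "2010", "Country": "UK", "Ref": "[3]"},
]

KEPT_COUNTRIES = {"UK", "US"}


def clean(rows):
    """Return a cleaned copy of rows; the input list is left untouched."""
    cleaned = []
    for row in copy.deepcopy(rows):
        del row["Ref"]          # the ref column adds no information
        if not row["Founded"]:  # drop lines where the founding date is missing
            continue
        countries = [c.strip() for c in row["Country"].split(",")]
        kept = [c for c in countries if c in KEPT_COUNTRIES]
        for country in kept:
            new_row = dict(row, Country=country)
            if len(kept) > 1:   # one line per country, with the country in the name
                new_row["Name"] = f"{row['Name']} {country}"
            cleaned.append(new_row)
    return cleaned


clean_data = clean(raw_data)
for row in clean_data:
    print(row["Name"], row["Founded"], row["Country"])
```

Because clean works on a deep copy, raw_data plays the role of the “Raw Data” sheet and can still be compared with the result later.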


Explore

I call “explore” the phase where I start thinking about all the different ways my cleaned data could answer my initial question[1]. Your data story will become much more interesting if you attack the question from several angles.

There are several things that you could look for in your data:

  • Interesting Factoids
  • Changes over time
  • Personal experiences
  • Surprising interactions
  • Revealing comparisons

So what can I do? I can:

  • display the number of magazines in existence for each year, which will show me if there is a decline or not (changes over time)
  • look at the number of magazines created per year, to see if the market is still dynamic (changes over time)

For the purpose of this tutorial, I will focus on the second one, looking at the number of magazines created per year. Another tutorial will be dedicated to the first, because it requires a more complex approach due to the formatting of our data.

At this point, I have a lot of other ideas: Can I determine which year produced the most enduring magazines (surprising interactions)? Will there be anything to see if I bring in video game website data for comparison (revealing comparisons)? Which magazines have lasted the longest (interesting factoid)? This is outside of the scope of this tutorial, but those are definitely questions worth exploring. It’s still important to stay focused, but writing them down for later analysis is a good idea.


Analyze

Analysing is about applying statistical techniques to the data and questioning the (usually visual) results.

The quickest way to answer our question “How many magazines have been created each year?” is by using a pivot table.

  1. Select the part of the data that answers the question (columns name and founded)
  2. Go to Data->Pivot Table
  3. In the pivot table sheet, I select the field “Founded” as the column. The founding years are ordered and grouped, allowing us to count the number of magazines for each year starting from the earliest.
  4. I then select the field “Name” as the values. Because the pivot table expects numbers by default (it tries to apply a SUM operation), nothing shows. To count the number of names associated with each year, the correct operation is COUNTA. I click on SUM and select COUNTA from the drop down menu.
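
As a rough equivalent of this pivot table, the count of names per founding year can be sketched in Python with collections.Counter. The rows below are invented sample data, and the row with an empty name is there only to show the COUNTA behaviour of counting non-empty values.

```python
from collections import Counter

# Invented sample rows; "Name" and "Founded" mirror the two selected columns.
clean_data = [
    {"Name": "Alpha Gamer UK", "Founded": 1993},
    {"Name": "Alpha Gamer US", "Founded": 1993},
    {"Name": "Delta Power", "Founded": 2001},
    {"Name": "", "Founded": 2001},  # empty name: COUNTA would not count it
]

# Group by founding year and count the non-empty names in each group.
created_per_year = Counter(row["Founded"] for row in clean_data if row["Name"])

# Print the years in order, starting from the earliest.
for year in sorted(created_per_year):
    print(year, created_per_year[year])
```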

This data can then be visualized with a bar graph.

Video game magazine creation every year since 1981

The trendline seems to show a decline in the dynamic of the market, but it’s not clear enough. Let’s group the years by half-decade and see what happens:

The resulting bar chart is much clearer:

The number of magazines created every half-decade decreases a lot in the lead up to the 2000s. The slump of the 1986-1990 years is perhaps due to a lagging effect of the North American video game crash of 1982-1984.

Unlike what we could have assumed, the market is still dynamic, with one magazine founded every year for the last 5 years. That makes for an interesting, nuanced story.
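
The half-decade grouping used in this analysis can be sketched in Python as follows. The bucket boundaries (1981-1985, 1986-1990 and so on) are inferred from the “1986-1990” label in the text, and the yearly counts are invented for illustration, not taken from the real table.

```python
from collections import Counter


def half_decade(year):
    """Return the label of the five-year bucket containing year, e.g. 1986-1990."""
    start = (year - 1) // 5 * 5 + 1
    return f"{start}-{start + 4}"


# Invented yearly counts standing in for the pivot table output.
created_per_year = {1981: 2, 1984: 1, 1986: 1, 1990: 3, 1991: 1}

# Add each year's count to the bucket that contains it.
per_half_decade = Counter()
for year, count in created_per_year.items():
    per_half_decade[half_decade(year)] += count

for label in sorted(per_half_decade):
    print(label, per_half_decade[label])
```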


Visualize

In this tutorial the initial graphs created during the analysis are enough to tell my story. But if the results of my investigations required a more complex, unusual or interactive visualisation to be clear for my public, or if I wanted to tell the whole story, context included, with one big infographic, it would fall into the “visualise” phase.


Publish

Where to publish is an important question that you have to answer at least once. Maybe the question is already answered for you because you’re part of an organisation. But if you’re not, and you don’t already have a website, the answer can be more complex. Medium, a trendy publishing platform, only allows images at this point. WordPress might be too much for your needs. It’s possible to customize the JavaScript of Tumblr posts, so it’s a solution. Using a combination of GitHub Pages and Jekyll, for the more technically inclined, is another. If a light database is needed, take a look at tabletop.js, which allows you to use a Google spreadsheet as a quasi-database.

Any data expedition, of any size or complexity, can be approached with this process. Following it helps you avoid getting lost in the data. More often than not, there will be a need to get and analyze more data to make sense of the initial data, but it’s just a matter of looping the process.

[1] I formalized the “explore” part of my process after reading the excellent blog from MIT alumnus Rahul Bhargava http://datatherapy.wordpress.com
