
“Not a scary concept”: Reflections from the Standard Group Data Conference

- October 10, 2018 in Event report, Fellowship

In his first piece for the School of Data blog, our 2018 Fellow, Kelvin Wellington, reflects on his experiences at the Standard Group Conference in Accra in July 2018.

To date, the conversation around open data has been firmly centred on its importance and the implications of championing the cause. Is it a cause worth fighting for, and are policymakers doing the right thing by opening up data to the public eye? As citizens, is it important to know the finer details of how our country is run? These are questions that were lingering in my mind during an open data presentation that was part of a data conference held by the Standard Group in Accra, Ghana a few days ago. I will attempt to dissect some findings from this session.

What can open data do?

Open data should not be a scary concept, and should be embraced. It should not be seen as a means of taking off ‘protective shields’ on data. When we talk about open data, we should be looking at the following:

  • Empowerment: open data can give citizens of a country a stronger voice on public services they use and create a channel of dialogue between the citizen and local authorities or government.
  • Transparency: open data should be the next frontier in citizens’ quest for transparency. Freedom of information enables citizens to make informed decisions regarding their government, and allows us to better understand our world.
  • Participation: open data should bring about inclusiveness; from data providers to users. Everyone has a part to play in innovating with data and making a difference through building data-driven solutions.

 

Who should be driving open data?

Ideally, policy makers should be the driving force for open data in any setting. Policy makers in Ghana are, however, driving at a turtle’s pace. The public should be weighing in on the conversation as well, but at the moment thoughts are too scattered to produce a collective force. The private sector should also be heavily involved in the Open Data push since they have access to huge amounts of data.

In addition, the policies and procedures should be open as well, not just data. Ultimately, it is up to governments, public bodies, community groups, citizens and businesses to facilitate the growth of open data, propagate its benefits and see that it achieves its full potential.

The Open Data Initiative in Ghana has stalled: the online platform lacks up-to-date data, and data is unavailable for a good number of industries. Financial constraints have been pointed out as a major issue, and as citizens, we owe it to our country to challenge authorities to resolve this.

Why is it important?

We spend our time talking about making decisions without focusing on making data-driven decisions. If data is not being processed into knowledge and that knowledge does not become wisdom, then the purpose of data in itself is dead. Opening data gives us all a chance to contribute to creating more knowledge and making wiser decisions. Data has become a gold standard, and keeping an ‘open culture’ makes for a healthier ecosystem for policymakers and citizens alike.


2017 Summer Camp dispatch #1: Thank You

- October 3, 2017 in Event report


School of Data’s 2017 Summer Camp has reached an end and was, by most metrics, a resounding success! This is especially true when one considers the leap of faith we took on several aspects: the first three days included a new mix of sessions, which we tried to broadcast live; the last two days featured an Open Training section with 70 (!) participants, which required its own dedicated event planning to make it work; and the full camp was documented on a live agenda allowing for remote following and contribution.

The secret in making it work was to rely both on the skills of our staff and the power of our network. A round of thanks is consequently in order:

Joachim Mangilima, who has been School of Data’s conductor on the ground throughout the Summer Camp, and was able to produce the work of a full event team on his own;


Photo by Juan Casanueva, SocialTIC

Our own Meg Foulkes, who was the second magician working behind the scenes and who made sure, among other key contributions, that School of Data network members reached and left Tanzania safely;

SocialTIC, longtime School of Data network member and partner, who brought the Latin American team to the Summer Camp;

the IREX team, who has been involved from the very beginning and helped make the Open Training with YALI Fellows a reality;


The Data Collaboratives for Local Impact teams, who took responsibility for a huge part of the logistics involved in the Open Training, from the set up of the training space to the amazing barbecue!

And finally, of course, the dozens of participants from the combined networks of School of Data, DCLI, dLab and Data Zetu who all contributed to make this event a success.


A thousand times thank you!


Data is a Team Sport: Government Priorities and Incentives

- August 13, 2017 in Data Blog, Event report

Data is a Team Sport is a series of online conversations held with data literacy practitioners in mid-2017 that explores the ever evolving data literacy eco-system.

To subscribe to the podcast series, cut and paste the following link into your podcast manager: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher.

[soundcloud url=”https://api.soundcloud.com/tracks/337462183″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

The conversation in this episode focuses on the challenges of getting governments to prioritise data literacy both externally and internally, and on incentives to produce open data. It features:

  • Ania Calderon, Executive Director at the Open Data Charter, a collaboration between governments and organisations working to open up data based on a shared set of principles. For the past three years, she led the National Open Data Policy in Mexico, delivering a key presidential mandate. She established capacity building programs across more than 200 public institutions.
  • Tamara Puhovski, a sociologist, innovator, public policy junky and open government consultant. She describes herself as a time traveler journeying back to 19th and 20th century public policy centers and trying to bring them back to the future.

Notes from the conversation:

The conversation focused on the challenges they face in pressuring governments to open up their data and make it available for public use. In order for governments to develop and maintain open data programmes, there needs to be an ecosystem of data-literate actors that includes knowledgeable civil servants and incentivised elected officials being held to account by a critical-thinking citizenry supported by smart open data advocates. Governments’ incentives for open data can’t be based solely on budgetary or monetary savings; they need to be motivated to want to use data to improve the effectiveness of their programs.

Once officials are elected, it’s too late to educate and motivate them enough to push for open data programmes, as they have too many other priorities and pressures. They have to have a level of data literacy that will provide enough knowledge, motivation and commitment to open data before they are elected.

Making arguments for open data is not having adequate impact; we need to be able to provide good solid stories and examples of its benefits. Those advocating for open data have perhaps been too optimistic that citizens would find the data useful once it has been released. A ‘supply and demand’ frame is still an important way to look at open data projects and assess their ability to have impact.

Access to government-produced open data is critical for healthy functioning democracies, but governments’ ability to release open data is heavily dependent on their own capacity to produce and work with data. There is currently not enough technical support for public officials tasked with implementing open data projects.

Resources mentioned in the conversation:

Also, not mentioned, but be sure to check out Tamara’s work on Open Youth

View the full online conversation:

[youtube https://www.youtube.com/watch?v=kK8Pz4DZ06s]


Data is a Team Sport: Mentors Mediators and Mad Skills

- August 7, 2017 in Community, Data Blog, Event report

Data is a Team Sport is a series of online conversations held with data literacy practitioners in mid-2017 that explores the ever evolving data literacy eco-system.

To subscribe to the podcast series, cut and paste the following link into your podcast manager: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher.

[soundcloud url=”https://api.soundcloud.com/tracks/336474972″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

This episode features:

  • Emma Prest oversees the running of DataKind UK, leading the community of volunteers and building understanding about what data science can do in the charitable sector. Emma sits on the Editorial Advisory Committee at the Bureau of Investigative Journalism. She was previously a programme coordinator at Tactical Tech, providing hands-on help for activists using data in campaigns. 
  • Tin Geber has been working on the intersection of technology, art and activism for most of the last decade. In his previous role as Design and Tech Lead for The Engine Room, he developed role-playing games for human rights activists; collaborated on augmented reality transmedia projects; and helped NGOs around the world to develop creative ways to combine technology and human rights.

Notes from the conversation

In this episode we discussed ways to move organisations beyond data literacy and to the point of data maturity, where organisations are able to manage data-driven projects on their own. Training in itself can be helpful with hard skills, such as how to do analysis, but when it comes to learning how to run a data project, Emma asserts that you have to run a project with the organisation, as it takes a lot of hand-holding. There needs to be commitment within the entire organisation to implement a data project, as it will take support and inputs from all parts. The goal of DataKind UK’s long-term engagements is to help an organisation build an understanding of what good data practice is.

Tin points out how critical it is for organisations to be able to learn from others that are working in similar contexts and environments. While there are international networks and resources that are accessible, his biggest challenge is identifying local networks that his clients can connect with for peer support.

Another critical element for reaching data maturity is the existence of champions striving to develop good data practice within an organisation. Tin and Emma both acknowledge that these types of individuals are rare, have a unique skill set, and are often not in senior management positions. There’s a need for greater support for these individuals in the form of mentoring, networks of practice and training courses that focus on how other organisations have successfully run data projects.

Intermediaries are often focused on demystifying new technologies for civil society organisations. Currently there is a lot of emphasis on grappling with the implications of machine learning, but it tends to point out the negative impacts (e.g. Cathy O’Neil’s book ‘Weapons of Math Destruction’), and there needs to be greater examination of positive impacts and stories of CSOs using it well and contributing to social good.

Resources that are inspiring Tin’s work:

  • DataBasic.io – A suite of easy-to-use web tools for beginners that introduce concepts of working with data
  • Media Manipulation and Disinformation Online – Report from Data and Society on how false or misleading information is having real and negative effects on the public consumption of news.
  • Raw Graphs – The missing link between spreadsheets and data visualization

View the full online conversation:

[youtube https://www.youtube.com/watch?v=GmsPcLw4Mec&w=560&h=315]


Data is a Team Sport: One on One with Friedhelm Weinberg

- July 29, 2017 in Data Blog, Event report

Data is a Team Sport is a series of online conversations held with data literacy practitioners in mid-2017 that explores the ever evolving data literacy eco-system.

To subscribe to the podcast series, cut and paste the following link into your podcast manager: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher.

[soundcloud url=”https://api.soundcloud.com/tracks/335294348″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

Friedhelm Weinberg is the Executive Director of Human Rights Information and Documentation Systems (HURIDOCS), an NGO that supports organisations and individuals to gather, analyse and harness information to promote and protect human rights. 

Notes from the Conversation

We discussed what it takes to be both a tool developer and a capacity builder. While the two disciplines inform and build upon each other, Friedhelm strongly feels that the capacity building work needs to come first and be a foundation for tool development. The starting point for human rights defenders is to have a clear understanding of what they want to do with data before they start collecting it.

Where in the past they used external developers to create tools, they have recently hired developers to be on staff and work side by side with their capacity builders. They have also recently been building their own capacity to help human rights defenders utilise machine learning for processing large amounts of documents and extracting information about human rights abuses.

Specific projects within HURIDOCS he talked about:

  • Uwazi is an open-source solution for building and sharing document collections
  • The Collaboratory is their knowledge sharing network for practitioners focusing on information management and human rights documentation.

View the full online conversation:

[youtube https://www.youtube.com/watch?v=00WbqeDojP0&w=560&h=315]


Data is a Team Sport: One on One with Heather Leson

- July 19, 2017 in Community, Data Blog, Event report

Data is a Team Sport is a series of online conversations held with data literacy practitioners in mid-2017 that explores the ever evolving data literacy eco-system.

To subscribe to the podcast series, cut and paste the following link into your podcast manager: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher.

[soundcloud url=”https://api.soundcloud.com/tracks/333725312″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

This episode features a one on one conversation with Heather Leson, the Data Literacy Lead at the International Federation of Red Cross and Red Crescent Societies, where her mandate includes global data advocacy, data literacy and data training programs in partnership with the 190 national societies and the 13 million volunteers. She is a past Board Member at the Humanitarian OpenStreetMap Team (4 years) and Peace Geeks (1 year), and an Advisor for MapSwipe – using gamification systems to crowdsource disaster-based satellite imagery. Previously, she worked as Social Innovation Program Manager, Qatar Computing Research Institute (Qatar Foundation); Director of Community Engagement, Ushahidi; and Community Director, Open Knowledge (School of Data).

Notes from the Conversation:

Heather talked about the need for humanitarian organisations to lead their data projects with a ‘do no harm’ approach, and how keeping the data and individual information they collect safe is paramount. During her first 10 months developing a data literacy program for the Federation, she focused on identifying internal expertise and providing opportunities for peer exchange. She has relied heavily on external knowledge, expertise and resources that have been shared amongst data literacy practitioners through participating in networks and communities such as School of Data.

The full online conversation:
[youtube https://www.youtube.com/watch?v=Vq7JJE_U7sg&w=560&h=315]


Data is a Team Sport: Advocacy Organisations

- July 12, 2017 in Community, Data Blog, Event report

Data is a Team Sport is our open-research project exploring the data literacy eco-system and how it is evolving in the wake of post-fact, fake news and data-driven confusion.  We are producing a series of videos, blog posts and podcasts based on a series of online conversations we are having with data literacy practitioners.

To subscribe to the podcast series, cut and paste the following link into your podcast manager: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher.

[soundcloud url=”https://api.soundcloud.com/tracks/332772865″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

In this episode we discussed data-driven advocacy organisations with:

  • Milena Marin is Senior Innovation Campaigner at Amnesty International. She currently leads Amnesty Decoders – an innovative project aiming to engage digital volunteers in documenting human rights violations using new technologies. Previously she worked as programme manager of School of Data. She also worked for over 4 years with Transparency International where she supported TI’s global network to use technology in the fight against corruption.
  • Sam Leon is Data Lead at Global Witness, focusing on the use of data to fight corruption and how to turn this information into change-making stories. He is currently working with a coalition of data scientists, academics and investigative journalists to build analytical models and tools that enable anti-corruption campaigners to understand and identify corporate networks used for nefarious and corrupt practices.

Notes from the Conversation

In order to get their organisations to see the value and benefit of using data, they both have had to demonstrate results and have looked for opportunities where they could show effective impact. Advocates are often quick to see data and new technologies as easy answers to their challenges, yet have difficulty in foreseeing the realities of implementing complex projects that utilise them.

Data provides advocates with ways to reveal the extent of a problem and  provide depth to qualitative and individual stories.  Milena credits the work of School of Data for the fact that journalists now expect Amnesty to back up their stories with data. However, the term ‘fake news’ is used to discredit their work and as a result they work harder at presenting verifiable data.

Data projects can also provide additional benefit to advocacy organisations by engaging stakeholders. Amnesty’s Decoders project has involved 45,000 volunteers, and along with being able to extract data from a huge amount of video, it has also provided those volunteers with a deeper understanding of Amnesty’s work. Global Witness is striving to make their data publicly accessible so it can provide benefit to their allies. Global Witness acknowledges that they are still learning about ethical and privacy considerations before open data-sets can be a default. Both organisations are actively learning.

They also touched on how important it is for their organisations to learn from others. They  look to external consultants and intermediaries to help fill organisational gaps in expertise in using data. They find it critical for organisations like Open Knowledge and School of Data to convene practitioners from different disciplines to share methodologies and lessons learned. During the conversation, they offered to share their own internal curriculums with each other.

View the Full Conversation:
[youtube https://www.youtube.com/watch?v=Row0OtRhlao]


Data is a Team Sport: One on One with Daniela Lepiz

- July 3, 2017 in Community, Data Blog, Event report

Data is a Team Sport is a series of online conversations held with data literacy practitioners in mid-2017 that explores the ever evolving data literacy eco-system.

To subscribe to the podcast series, cut and paste the following link into your podcast manager: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher.

[soundcloud url=”https://api.soundcloud.com/tracks/331054739″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

This episode features a one on one conversation with Daniela Lepiz, a Costa Rican data journalist and trainer, who is currently the Investigation Editor for CENOZO, a West African Investigative Journalism Project that aims to promote and support cross-border data investigation and open data in the region. She has a master’s degree in data journalism from the Rey Juan Carlos University in Madrid, Spain. She was previously involved with OpenUP South Africa, working with journalists to produce data-driven stories. Daniela is also a trainer for the Tanzania Media Foundation and has been involved in many other projects with South African Media, La Nacion in Costa Rica and other international organisations.

Notes from the conversation

Daniela spoke to us from Burkina Faso and reflected on the importance of data-driven journalism in holding power to account. Her project aims to train and support journalists working across borders in West Africa to use data to expose corruption and human rights violations. To identify journalists to participate in the project, they seek individuals who are experienced, passionate and curious. The project engages media houses, such as Premium Times in Nigeria, to ensure that there are respected outlets to publish their stories. Daniela raised the following points:

  • As the media landscape continues to evolve, data literacy is increasingly becoming a required competency.
  • Journalists do not necessarily have a background in mathematics or statistics and are often intimidated by the idea of having to use these concepts in their stories.
  • Data stories are best done in teams of people with complementary skills. This can go against a traditional approach to journalism in which journalists work alone and tightly guard their sources.
  • It is important that data training programmes also work with journalists and better understand their needs.

The full online conversation:

[youtube https://www.youtube.com/watch?v=9l4SI6lm130]

Daniela’s bookmarks!

These are the resources she uses the most often.

.Rddj – Resources for doing data journalism with R
Comparing Columns in Google Refine | OUseful.Info, the blog…
Journalist datastores: where can you find them? A list. | Simon Rogers
AidInfoPlus – Mastering Aid Information for Change

Data skills

Mapping tip: how to convert and filter KML into a list with Open Refine | Online Journalism Blog
Mapbox + Weather Data
Encryption, Journalism and Free Expression | The Mozilla Blog
Data cleaning with Regular Expressions (NICAR) – Google Docs
NICAR 2016 Links and Tips – Google Docs
Teaching Data Journalism: A Survey & Model Curricula | Global Investigative Journalism Network
Data bulletproofing tips for NICAR 2016 – Google Docs
Using the command line tabula extractor tool · tabulapdf/tabula-extractor Wiki · GitHub
Talend Downloads

Github

Git Concepts – SmartGit (Latest/Preview) – Confluence
GitHub For Beginners: Don’t Get Scared, Get Started – ReadWrite
Kartograph.org
LittleSis – Profiling the powers that be

Tableau customized polygons

How can I create a filled map with custom polygons in Tableau given point data? – Stack Overflow
Using Shape Files for Boundaries in Tableau | The Last Data Bender
How to make custom Tableau maps
How to map geographies in Tableau that are not built in to the product (e.g. UK postcodes, sales areas) – Dabbling with Data
Alteryx Analytics Gallery | Public Gallery
TableauShapeMaker – Adding custom shapes to Tableau maps | Vishful thinking…
Creating Tableau Polygons from ArcGIS Shapefiles | Tableau Software
Creating Polygon-Shaded Maps | Tableau Software
Tool to Convert ArcGIS Shapefiles into Tableau Polygons | Tableau and Behold!
Polygon Maps | Tableau Software
Modeling April 2016
5 Tips for Making Your Tableau Public Viz Go Viral | Tableau Public
Google News Lab
HTML and CSS
Open Semantic Search: Your own search engine for documents, images, tables, files, intranet & news
Spatial Data Download | DIVA-GIS
Linkurious – Linkurious – Understand the connections in your data
Apache Solr –
Apache Tika – Apache Tika
Neo4j Graph Database: Unlock the Value of Data Relationships
SQL: Table Transformation | Codecademy
dc.js – Dimensional Charting Javascript Library
The People and the Technology Behind the Panama Papers | Global Investigative Journalism Network
How to convert XLS file to CSV in Command Line [Linux]
Intro to SQL (IRE 2016) · GitHub
Malik Singleton – SELECT needle FROM haystack;
Investigative Reporters and Editors | Tipsheets and links

SQL_PYTHON

More data

2016-NICAR-Adv-SQL/SQL_queries.md at master · taggartk/2016-NICAR-Adv-SQL · GitHub
advanced-sql-nicar15/stats-functions.sql at master · anthonydb/advanced-sql-nicar15 · GitHub
Malik Singleton – SELECT needle FROM haystack;
Statistical functions in MySQL • Code is poetry
Data Analysis Using SQL and Excel – Gordon S. Linoff – Google Books
Using PROC SQL to Find Uncommon Observations Between 2 Data Sets in SAS | The Chemical Statistician
mysql – Query to compare two subsets of data from the same table? – Database Administrators Stack Exchange
sql – How to add “weights” to a MySQL table and select random values according to these? – Stack Overflow
sql – Fast mysql random weighted choice on big database – Stack Overflow
php – MySQL: Select Random Entry, but Weight Towards Certain Entries – Stack Overflow
MySQL Moving average
Calculating descriptive statistics in MySQL | codediesel
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, …
R, MySQL, LM and quantreg
26318_AllText_Print.pdf
ddi-documentation-english-572 (1).pdf
Categorical Data — pandas 0.18.1+143.g3b75e03.dirty documentation
python – Loading STATA file: Categorial values must be unique – Stack Overflow
Using the CSV module in Python
14.1. csv — CSV File Reading and Writing — Python 3.5.2rc1 documentation
csvsql — csvkit 0.9.1 documentation
weight samples with python – Google Search
python – Weighted choice short and simple – Stack Overflow
7.1. string — Common string operations — Python v2.6.9 documentation
Introduction to Data Analysis with Python | Lynda.com
A Complete Tutorial to Learn Data Science with Python from Scratch
GitHub – fonnesbeck/statistical-analysis-python-tutorial: Statistical Data Analysis in Python
Verifying the email – Email Checker
A little tour of aleph, a data search tool for reporters – pudo.org (Friedrich Lindenberg)
Welcome – Investigative Dashboard Search
Investigative Dashboard
Working with CSVs on the Command Line
FiveThirtyEight’s data journalism workflow with R | useR! 2016 international R User conference | Channel 9
Six issue when installing package · Issue #3165 · pypa/pip · GitHub
python – Installing pip on Mac OS X – Stack Overflow
Source – Journalism Code, Context & Community – A project by Knight-Mozilla OpenNews
Introducing Kaggle’s Open Data Platform
NASA just made all the scientific research it funds available for free – ScienceAlert
District council code list | Statistics South Africa
How-to: Index Scanned PDFs at Scale Using Fewer Than 50 Lines of Code – Cloudera Engineering Blog
GitHub – gavinr/geojson-csv-join: A script to take a GeoJSON file, and JOIN data onto that file from a CSV file.
7 command-line tools for data science
Python Basics: Lists, Dictionaries, & Booleans
Jupyter Notebook Viewer

PYTHON FOR JOURNALISTS

New folder

Reshaping and Pivot Tables — pandas 0.18.1 documentation
Reshaping in Pandas – Pivot, Pivot-Table, Stack and Unstack explained with Pictures – Nikolay Grozev
Pandas Pivot-Table Example – YouTube
pandas.pivot_table — pandas 0.18.1 documentation
Pandas Pivot Table Explained – Practical Business Python
Pivot Tables In Pandas – Python
Pandas .groupby(), Lambda Functions, & Pivot Tables
Counting Values & Basic Plotting in Python
Creating Pandas DataFrames & Selecting Data
Filtering Data in Python with Boolean Indexes
Deriving New Columns & Defining Python Functions
Python Histograms, Box Plots, & Distributions
Resources for Further Learning
Python Methods, Functions, & Libraries
Python Basics: Lists, Dictionaries, & Booleans
Real-world Python for data-crunching journalists | TrendCT
Cookbook — agate 1.4.0 documentation
3. Power tools — csvkit 0.9.1 documentation
Tutorial — csvkit 0.9.1 documentation
4. Going elsewhere with your data — csvkit 0.9.1 documentation
2. Examining the data — csvkit 0.9.1 documentation
A Complete Tutorial to Learn Data Science with Python from Scratch
For Journalism
ProPublica Summer Data Institute
Percentage of vote change | CARTO
Data Science | Coursera
Data journalism training materials
Pythex: a Python regular expression editor
A secure whistleblowing platform for African media | afriLEAKS
PDFUnlock! – Unlock secured PDF files online for free.
The digital journalist’s toolbox: mapping | IJNet
Bulletproof Data Journalism – Course – LEARNO
Transpose columns across rows (grefine 2.5) ~ RefinePro Knowledge Base for OpenRefine
Installing NLTK — NLTK 3.0 documentation
1. Language Processing and Python
Visualize any Text as a Network – Textexture
10 tools that can help data journalists do better work, be more efficient – Poynter
Workshop Attendance
Clustering In Depth · OpenRefine/OpenRefine Wiki · GitHub
Regression analysis using Python
DataBasic.io
R for Every Survey Analysis – YouTube
Git – Book
NICAR17 Slides, Links & Tutorials #NICAR17 // Ricochet by Chrys Wu
Register for Anonymous VPN Services | PIA Services
The Bureau of Investigative Journalism
dtSearch – Text Retrieval / Full Text Search Engine
Investigation, Cybersecurity, Information Governance and eDiscovery Software | Nuix
How we built the Offshore Leaks Database | International Consortium of Investigative Journalists
Liz Telecom/Azimmo – Google Search
First Python Notebook — First Python Notebook 1.0 documentation
GitHub – JasonKessler/scattertext: Beautiful visualizations of how language differs among document types


Data is a Team Sport: Data-Driven Journalism

- June 20, 2017 in Community, Data Blog, Event report

Data is a Team Sport is a series of online conversations held with data literacy practitioners in mid-2017 that explores the ever evolving data literacy eco-system.

Cut and paste this link into your podcast app to subscribe: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher.


In this episode we speak with two veteran data literacy practitioners who have been involved with developing data-driven journalism teams.

Our guests:

  • Eva Constantaras is a data journalist specialized in building data journalism teams in developing countries. These teams have reported from across Latin America, Asia and East Africa on topics ranging from displacement and kidnapping by organized crime networks to extractive industries and public health. As a Google Data Journalism Scholar and a Fulbright Fellow, she developed a course for investigative and data journalism in high-risk environments.
  • Natalia Mazotte is Program Manager of School of Data in Brazil and founder and co-director of the digital magazine Gender and Number. She has a Master's degree in Communications and Culture from the Federal University of Rio de Janeiro and a specialization in Digital Strategy from Pompeu Fabra University (Barcelona/Spain). Natalia has been teaching data skills in different universities and newsrooms around Brazil. She also works as an instructor in online courses at the Knight Center for Journalism in the Americas, a project of the University of Texas at Austin, and writes for international publications such as SGI News, Bertelsmann-Stiftung, Euractiv and Nieman Lab.

Notes from this episode

Our first conversation on Data-Driven Journalism featured Eva Constantaras, on her work in developing data-driven journalism teams in Afghanistan and Pakistan, and Natalia Mazotte on her work in Brazil. They discussed what they have learned helping journalists think through how they can use data to drive social change. They agreed that good journalism necessarily includes data-driven approaches in order to uncover facts and the root causes of societal problems.

Eva strives to motivate journalists to look beyond the fact that corruption exists and dig deeper into the causes and impacts. She has seen data journalists in Europe and North America making a choice to focus, for example, on polling data rather than breaking down the data behind the candidates’ policies. Eva sees this as a mistake and is committed to making emerging data journalists understand why this is problematic. Finally, Eva critiqued the approach funders take in the field of data literacy, often putting too much emphasis on short-term solutions rather than investing in long-term data capacity building programmes. This is something that School of Data has long struggled with from third-party funders and clients alike. It’s clear that more work needs to be done to explain what short-term programmes can and, more importantly, cannot achieve.

Natalia primarily discussed School of Data Brazil’s Gender and Number project. The project was designed to use data to move the discussion on gender equality past arguments based on traditional roles. She is concerned about the growing ‘data literacy’ gap between those with power, such as government and corporations, and those without power, such as people living in the favelas. In Brazil, the media landscape is changing: the mainstream media is reporting on ‘what’s happened’ while independent media is doing the more investigative reporting on ‘why it’s happened’.

They wanted to plug:

Readings/Resources they find inspiring for their work.

Resources contributed from the participants:

View the online conversation in full:

https://www.youtube.com/watch?v=92kAsMxw-Q4


Data is a Team Sport: Enabling Learning

- June 6, 2017 in Community, Event report

Data is a Team Sport is a series of online conversations held with data literacy practitioners in mid-2017 that explores the ever-evolving data literacy ecosystem.

Cut and paste this link into your podcast app to subscribe: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store.

In this episode we speak with two veteran data literacy practitioners who have been involved with directly engaging learners to get beyond spreadsheets to build confidence and take agency in their own learning.


Our guests:

  • Rahul Bhargava is a researcher and technologist specializing in civic technology and data literacy. He creates interactive websites used by hundreds of thousands, playful educational experiences across the globe, and award-winning visualizations for museum settings. As a research scientist at the MIT Center for Civic Media, Rahul leads technical development on projects ranging from interfaces for quantitative news analysis to platforms for crowd-sourced sensing.
  • Lucy Chambers initially embarked on a career as a journalist, then took a few turns which led to a career at Open Knowledge teaching journalists how and why to work with data. She was one of the editors of the Data Journalism Handbook. She later led the highly successful School of Data programme, which extended technical training to non-profit organisations. Lately, she has focussed on delivery of software projects as a product manager. Most recently, she has been working in West Africa on health-related software.

Notes from this episode

Rahul talked to us about how his experience in leading workshops and data capacity building programmes led to the creation of DataBasic.io, while Lucy discussed the rationale behind School of Data shifting its focus from online curriculum to fellowships.

In order for his participants to begin to understand how to work with data, Rahul asks them to create drawings about what a dataset tells them. He then asks the participants to create a gallery of their drawings and encourages them to critically assess each other’s work. This process of having workshop participants create and examine data visualisations led to the creation of DataBasic.io, a set of tools specifically designed to allow users to visualise datasets in a number of different ways.

Lucy recalled School of Data’s initial struggles developing an online curriculum that attempted to provide ‘how-tos’ on a diverse set of data tools. Eventually, through focus groups and testing of the curriculum, School of Data began to understand that this tool-centred approach was unlikely to effectively serve its target audience. As a result, the team shifted their strategy and began to focus their attention on developing the capacity of future data trainers. This ultimately led School of Data to create the Fellowship Programme, a programme designed to take high-potential individuals and build their capacity to be data literacy trainers and leaders.

Finally, both shared their thoughts on building data literacy at the organisational level. They agreed that a common mistake made by organisations is thinking about data use as being exclusively part of the IT function of the organisation. Instead, they both look for ways to build an active data culture within organisations, where staff converse about and engage with data, rather than taking a passive approach such as producing dashboards.

They wanted to plug:

  • Rahul is building a cohort around further development of DataBasic.io. Ping him via Twitter to get more information on that.
  • Lucy’s blog is Tech to Human and she writes about her work and what she’s learning. She is working on a project for MySociety called EveryPolitician and writing about it on Medium.

Readings/Resources they find inspiring for data literacy work.

View the full online conversations:

https://www.youtube.com/watch?v=Cl7FGYNAmJc

https://www.youtube.com/watch?v=We7CIGycT-Y
