You are browsing the archive for Dirk Slater.

Data is a Team Sport

- December 20, 2017 in Announcement, Data Blog, Research

Data is a Team Sport is a series of online conversations held with data literacy practitioners in mid-2017 that explores the ever-evolving data literacy ecosystem. Our aim in producing ‘Data is a Team Sport’ was to surface learnings and present them in formats that would be accessible to data literacy practitioners.

Thanks to the efforts of governments, organizations and agencies to make their information more transparent, the amount of data entering the public domain has increased dramatically in recent years. As political and economic forces become more adept at using data, we enter a new era of data literacy where just being able to understand information is not enough. In this project, we aimed to engage data literacy practitioners to capture lessons learned and examine how their methodologies are shifting and adapting. We also wanted to better understand how they perceived the data literacy ecosystem: the diverse set of actors needed to enable and support the use of data in social change work.

Conversation | Guests | Podcast
Enabling Learning | Rahul Bhargava & Lucy Chambers | [soundcloud url=”https://api.soundcloud.com/tracks/326264327″ params=”color=#ff5500&inverse=false&auto_play=false&show_user=false” width=”100%” height=”20″ iframe=”true” /]
Data-Driven Journalism | Eva Constantaras & Natalia Mazotte | [soundcloud url=”https://api.soundcloud.com/tracks/328987340″ params=”color=#ff5500&inverse=false&auto_play=false&show_user=false” width=”100%” height=”20″ iframe=”true” /]
One on One with Daniela Lepiz | Daniela Lepiz | [soundcloud url=”https://api.soundcloud.com/tracks/331054739″ params=”color=#ff5500&inverse=false&auto_play=false&show_user=false” width=”100%” height=”20″ iframe=”true” /]
Advocacy Organisations | Milena Marin & Sam Leon | [soundcloud url=”https://api.soundcloud.com/tracks/332772865″ params=”color=#ff5500&inverse=false&auto_play=false&show_user=false” width=”100%” height=”20″ iframe=”true” /]
One on One with Heather Leson | Heather Leson | [soundcloud url=”https://api.soundcloud.com/tracks/333725312″ params=”color=#ff5500&inverse=false&auto_play=false&show_user=false” width=”100%” height=”20″ iframe=”true” /]
One on One with Friedhelm Weinberg | Friedhelm Weinberg | [soundcloud url=”https://api.soundcloud.com/tracks/335294348″ params=”color=#ff5500&inverse=false&auto_play=false&show_user=false” width=”100%” height=”20″ iframe=”true” /]
Mentors, Mediators and Mad Skills | Emma Prest & Tin Geber | [soundcloud url=”https://api.soundcloud.com/tracks/336474972″ params=”color=#ff5500&inverse=false&auto_play=false&show_user=false” width=”100%” height=”20″ iframe=”true” /]
Government Priorities and Incentives | Ania Calderon & Tamara Puhovski | [soundcloud url=”https://api.soundcloud.com/tracks/337462183″ params=”color=#ff5500&inverse=false&auto_play=false&show_user=false” width=”100%” height=”20″ iframe=”true” /]

Data is a Team Sport: Government Priorities and Incentives

- August 13, 2017 in Data Blog, Event report

To subscribe to the podcast series, cut and paste the following link into your podcast manager: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher.

[soundcloud url=”https://api.soundcloud.com/tracks/337462183″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

The conversation in this episode focuses on the challenges of getting governments to prioritise data literacy, both externally and internally, and on incentives to produce open data. It features:

  • Ania Calderon, Executive Director at the Open Data Charter, a collaboration between governments and organisations working to open up data based on a shared set of principles. For the past three years, she led the National Open Data Policy in Mexico, delivering a key presidential mandate. She established capacity building programs across more than 200 public institutions.
  • Tamara Puhovski, a sociologist, innovator, public policy junkie and open government consultant. She describes herself as a time traveler journeying back to 19th and 20th century public policy centers and trying to bring them back to the future.

Notes from the conversation:

The conversation focused on the challenges they face in pressuring governments to open up their data and make it available for public use. In order for governments to develop and maintain open data programmes, there needs to be an ecosystem of data-literate actors that includes knowledgeable civil servants and incentivised elected officials being held to account by a critical-thinking citizenry supported by smart open data advocates. Governments’ incentives for open data can’t be based solely on budgetary or monetary savings; they need to be motivated to want to use data to improve the effectiveness of their programs.

Once officials are elected, it’s too late to educate and motivate them enough to push for open data programmes, as they have too many other priorities and pressures. They have to have a level of data literacy that will provide enough knowledge, motivation and commitment to open data before they are elected.

Making arguments for open data is not having adequate impact; we need to be able to provide good, solid stories and examples of its benefits. Those advocating for open data have perhaps been too optimistic that citizens would find the data useful once it has been released. A ‘supply and demand’ frame is still an important way to look at open data projects and assess their ability to have impact.

Access to government-produced open data is critical for healthy functioning democracies, but governments’ ability to release open data is heavily dependent on their own capacity to produce and work with data. There is currently not enough technical support for public officials tasked with implementing open data projects.

Resources mentioned in the conversation:

Also, not mentioned, but be sure to check out Tamara’s work on Open Youth

View the full online conversation:

[youtube https://www.youtube.com/watch?v=kK8Pz4DZ06s]

Data is a Team Sport: Mentors, Mediators and Mad Skills

- August 7, 2017 in Community, Data Blog, Event report

[soundcloud url=”https://api.soundcloud.com/tracks/336474972″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

This episode features:

  • Emma Prest oversees the running of DataKind UK, leading the community of volunteers and building understanding about what data science can do in the charitable sector. Emma sits on the Editorial Advisory Committee at the Bureau of Investigative Journalism. She was previously a programme coordinator at Tactical Tech, providing hands-on help for activists using data in campaigns. 
  • Tin Geber has been working on the intersection of technology, art and activism for most of the last decade. In his previous role as Design and Tech Lead for The Engine Room, he developed role-playing games for human rights activists; collaborated on augmented reality transmedia projects; and helped NGOs around the world to develop creative ways to combine technology and human rights.

Notes from the conversation

In this episode we discussed ways to move organisations beyond data literacy to the point of data maturity, where organisations are able to manage data-driven projects on their own. Training in itself can be helpful with hard skills, such as how to do analysis, but in terms of learning how to run a data project, Emma asserts that you have to run a project with them, as it takes a lot of hand-holding. There needs to be commitment within the entire organisation to implement a data project, as it will take support and inputs from all parts. The goal of DataKind UK’s long-term engagements is to help an organisation build an understanding of what good data practice is.

Tin points out how critical it is for organisations to be able to learn from others that are working in similar contexts and environments. While there are international networks and resources that are accessible, his biggest challenge is identifying local networks that his clients can connect with and receive peer support.

Another critical element for reaching data maturity is the existence of champions striving to develop good data practice within an organisation. Tin and Emma both acknowledge that these types of individuals are rare, have a unique skill set, and are often not in senior management positions. There’s a need for greater support for these individuals in the form of mentoring, networks of practice and training courses that focus on how other organisations have successfully run data projects.

Intermediaries are often focused on demystifying new technologies for civil society organisations. Currently there is a lot of emphasis on grappling with the implications of machine learning, but it tends to point out the negative impacts (e.g. Cathy O’Neil’s book ‘Weapons of Math Destruction’), and there needs to be greater examination of positive impacts and stories of CSOs using it well and contributing to social good.

DataKind UK’s resources:

Tin’s resources:

Resources that are inspiring Emma’s Work:

Resources that are inspiring Tin’s work:

  • DataBasic.io – A suite of easy-to-use web tools for beginners that introduce concepts of working with data
  • Media Manipulation and Disinformation Online – Report from Data and Society on how false or misleading information is having real and negative effects on the public consumption of news.
  • Raw Graphs – The missing link between spreadsheets and data visualization

View the full online conversation:

[youtube https://www.youtube.com/watch?v=GmsPcLw4Mec&w=560&h=315]

Data is a Team Sport: One on One with Friedhelm Weinberg

- July 29, 2017 in Data Blog, Event report

[soundcloud url=”https://api.soundcloud.com/tracks/335294348″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

Friedhelm Weinberg is the Executive Director of Human Rights Information and Documentation Systems (HURIDOCS), an NGO that supports organisations and individuals to gather, analyse and harness information to promote and protect human rights. 

Notes from the Conversation

We discussed what it takes to be both a tool developer and a capacity builder. While the two disciplines inform and build upon each other, Friedhelm strongly feels that the capacity building work needs to come first and be a foundation for tool development. The starting point for human rights defenders is to have a clear understanding of what they want to do with data before they start collecting it.

Where in the past they used external developers to create tools, they have recently hired developers to be on staff and work side by side with their capacity builders. They have also recently been building their own capacity to help human rights defenders utilise machine learning for processing large amounts of documents and extracting information about human rights abuses.

Specific projects within HURIDOCS he talked about:

  • Uwazi is an open-source solution for building and sharing document collections
  • The Collaboratory is their knowledge sharing network for practitioners focusing on information management and human rights documentation.

Readings/Resources that are inspiring his work:

View the full online conversation:

[youtube https://www.youtube.com/watch?v=00WbqeDojP0&w=560&h=315]

Data is a Team Sport: One on One with Heather Leson

- July 19, 2017 in Community, Data Blog, Event report

[soundcloud url=”https://api.soundcloud.com/tracks/333725312″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

This episode features a one on one conversation with Heather Leson, the Data Literacy Lead at the International Federation of Red Cross and Red Crescent Societies, where her mandate includes global data advocacy, data literacy and data training programs in partnership with the 190 national societies and the 13 million volunteers. She is a past Board Member at the Humanitarian OpenStreetMap Team (4 years) and Peace Geeks (1 year), and an Advisor for MapSwipe – using gamification systems to crowdsource disaster-based satellite imagery. Previously, she worked as Social Innovation Program Manager, Qatar Computing Research Institute (Qatar Foundation); Director of Community Engagement, Ushahidi; and Community Director, Open Knowledge (School of Data).

Notes from the Conversation:

Heather talked about the need for humanitarian organisations to lead their data projects with a ‘do no harm’ approach, and how keeping the data and individual information they collect safe is paramount. During her first 10 months developing a data literacy program for the Federation, she focused on identifying internal expertise and providing opportunities for peer exchange. She has relied heavily on external knowledge, expertise and resources that have been shared amongst data literacy practitioners through participating in networks and communities such as School of Data.

Heather’s Resources

Blogs/websites

Heather’s work

The full online conversation:
[youtube https://www.youtube.com/watch?v=Vq7JJE_U7sg&w=560&h=315]

Data is a Team Sport: Advocacy Organisations

- July 12, 2017 in Community, Data Blog, Event report

Data is a Team Sport is our open-research project exploring the data literacy ecosystem and how it is evolving in the wake of post-fact, fake news and data-driven confusion. We are producing a series of videos, blog posts and podcasts based on a series of online conversations we are having with data literacy practitioners.

[soundcloud url=”https://api.soundcloud.com/tracks/332772865″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

In this episode we discussed data-driven advocacy organisations with:

  • Milena Marin is Senior Innovation Campaigner at Amnesty International. She currently leads Amnesty Decoders – an innovative project aiming to engage digital volunteers in documenting human rights violations using new technologies. Previously she worked as programme manager of School of Data. She also worked for over 4 years with Transparency International, where she supported TI’s global network to use technology in the fight against corruption.
  • Sam Leon is Data Lead at Global Witness, focusing on the use of data to fight corruption and how to turn this information into change-making stories. He is currently working with a coalition of data scientists, academics and investigative journalists to build analytical models and tools that enable anti-corruption campaigners to understand and identify corporate networks used for nefarious and corrupt practices.

Notes from the Conversation

In order to get their organisations to see the value and benefit of using data, they both have had to demonstrate results and have looked for opportunities where they could show effective impact. Advocates are often quick to see data and new technologies as easy answers to their challenges, yet have difficulty in foreseeing the realities of implementing complex projects that utilise them.

Data provides advocates with ways to reveal the extent of a problem and  provide depth to qualitative and individual stories.  Milena credits the work of School of Data for the fact that journalists now expect Amnesty to back up their stories with data. However, the term ‘fake news’ is used to discredit their work and as a result they work harder at presenting verifiable data.

Data projects can also provide additional benefit to advocacy organisations by engaging stakeholders. Amnesty’s Decoders project has involved 45,000 volunteers, and along with extracting data from a huge amount of video, it has given those volunteers a deeper understanding of Amnesty’s work. Global Witness is striving to make their data publicly accessible so it can benefit their allies. Global Witness acknowledges that they are still learning about ethical and privacy considerations before open data-sets can be a default. Both organisations are actively learning

They also touched on how important it is for their organisations to learn from others. They  look to external consultants and intermediaries to help fill organisational gaps in expertise in using data. They find it critical for organisations like Open Knowledge and School of Data to convene practitioners from different disciplines to share methodologies and lessons learned. During the conversation, they offered to share their own internal curriculums with each other.

More about their work

Milena

Sam

Resources and Readings

From FabRiders

View the Full Conversation:
[youtube https://www.youtube.com/watch?v=Row0OtRhlao]

Data is a Team Sport: One on One with Daniela Lepiz

- July 3, 2017 in Community, Data Blog, Event report

[soundcloud url=”https://api.soundcloud.com/tracks/331054739″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

This episode features a one on one conversation with Daniela Lepiz, a Costa Rican data journalist and trainer, who is currently the Investigation Editor for CENOZO, a West African Investigative Journalism Project that aims to promote and support cross-border data investigation and open data in the region. She has a master’s degree in data journalism from the Rey Juan Carlos University in Madrid, Spain. She was previously involved with OpenUP South Africa, working with journalists to produce data-driven stories. Daniela is also a trainer for the Tanzania Media Foundation and has been involved in many other projects with South African Media, La Nacion in Costa Rica and other international organisations.

Notes from the conversation

Daniela spoke to us from Burkina Faso and reflected on the importance of data-driven journalism in holding power to account. Her project aims to train and support journalists working across borders in West Africa to use data to expose corruption and human rights violations. To identify journalists to participate in the project, they seek individuals who are experienced, passionate and curious. The project engages media houses, such as Premium Times in Nigeria, to ensure that there are respected outlets to publish their stories. Daniela raised the following points:

  • As the media landscape continues to evolve, data literacy is increasingly becoming a required competency.
  • Journalists do not necessarily have a background in mathematics or statistics and are often intimidated by the idea of having to use these concepts in their stories.
  • Data stories are best done in teams of people with complementary skills. This can go against a traditional approach to journalism in which journalists work alone and tightly guard their sources.
  • It is important that data training programmes also work with journalists and better understand their needs.

Resources she finds inspiring

Her blog posts

The full online conversation:

[youtube https://www.youtube.com/watch?v=9l4SI6lm130]

Daniela’s bookmarks!

These are the resources she uses the most often.

.Rddj – Resources for doing data journalism with R
Comparing Columns in Google Refine | OUseful.Info, the blog…
Journalist datastores: where can you find them? A list. | Simon Rogers
AidInfoPlus – Mastering Aid Information for Change

Data skills

Mapping tip: how to convert and filter KML into a list with Open Refine | Online Journalism Blog
Mapbox + Weather Data
Encryption, Journalism and Free Expression | The Mozilla Blog
Data cleaning with Regular Expressions (NICAR) – Google Docs
NICAR 2016 Links and Tips – Google Docs
Teaching Data Journalism: A Survey & Model Curricula | Global Investigative Journalism Network
Data bulletproofing tips for NICAR 2016 – Google Docs
Using the command line tabula extractor tool · tabulapdf/tabula-extractor Wiki · GitHub
Talend Downloads

Github

Git Concepts – SmartGit (Latest/Preview) – Confluence
GitHub For Beginners: Don’t Get Scared, Get Started – ReadWrite
Kartograph.org
LittleSis – Profiling the powers that be

Tableau customized polygons

How can I create a filled map with custom polygons in Tableau given point data? – Stack Overflow
Using Shape Files for Boundaries in Tableau | The Last Data Bender
How to make custom Tableau maps
How to map geographies in Tableau that are not built in to the product (e.g. UK postcodes, sales areas) – Dabbling with Data
Alteryx Analytics Gallery | Public Gallery
TableauShapeMaker – Adding custom shapes to Tableau maps | Vishful thinking…
Creating Tableau Polygons from ArcGIS Shapefiles | Tableau Software
Creating Polygon-Shaded Maps | Tableau Software
Tool to Convert ArcGIS Shapefiles into Tableau Polygons | Tableau and Behold!
Polygon Maps | Tableau Software
Modeling April 2016
5 Tips for Making Your Tableau Public Viz Go Viral | Tableau Public
Google News Lab
HTML and CSS
Open Semantic Search: Your own search engine for documents, images, tables, files, intranet & news
Spatial Data Download | DIVA-GIS
Linkurious – Linkurious – Understand the connections in your data
Apache Solr –
Apache Tika – Apache Tika
Neo4j Graph Database: Unlock the Value of Data Relationships
SQL: Table Transformation | Codecademy
dc.js – Dimensional Charting Javascript Library
The People and the Technology Behind the Panama Papers | Global Investigative Journalism Network
How to convert XLS file to CSV in Command Line [Linux]
Intro to SQL (IRE 2016) · GitHub
Malik Singleton – SELECT needle FROM haystack;
Investigative Reporters and Editors | Tipsheets and links

SQL_PYTHON

More data

2016-NICAR-Adv-SQL/SQL_queries.md at master · taggartk/2016-NICAR-Adv-SQL · GitHub
advanced-sql-nicar15/stats-functions.sql at master · anthonydb/advanced-sql-nicar15 · GitHub
Malik Singleton – SELECT needle FROM haystack;
Statistical functions in MySQL • Code is poetry
Data Analysis Using SQL and Excel – Gordon S. Linoff – Google Books
Using PROC SQL to Find Uncommon Observations Between 2 Data Sets in SAS | The Chemical Statistician
mysql – Query to compare two subsets of data from the same table? – Database Administrators Stack Exchange
sql – How to add “weights” to a MySQL table and select random values according to these? – Stack Overflow
sql – Fast mysql random weighted choice on big database – Stack Overflow
php – MySQL: Select Random Entry, but Weight Towards Certain Entries – Stack Overflow
MySQL Moving average
Calculating descriptive statistics in MySQL | codediesel
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, …
R, MySQL, LM and quantreg
26318_AllText_Print.pdf
ddi-documentation-english-572 (1).pdf
Categorical Data — pandas 0.18.1+143.g3b75e03.dirty documentation
python – Loading STATA file: Categorial values must be unique – Stack Overflow
Using the CSV module in Python
14.1. csv — CSV File Reading and Writing — Python 3.5.2rc1 documentation
csvsql — csvkit 0.9.1 documentation
weight samples with python – Google Search
python – Weighted choice short and simple – Stack Overflow
7.1. string — Common string operations — Python v2.6.9 documentation
Introduction to Data Analysis with Python | Lynda.com
A Complete Tutorial to Learn Data Science with Python from Scratch
GitHub – fonnesbeck/statistical-analysis-python-tutorial: Statistical Data Analysis in Python
Verifying the email – Email Checker
A little tour of aleph, a data search tool for reporters – pudo.org (Friedrich Lindenberg)
Welcome – Investigative Dashboard Search
Investigative Dashboard
Working with CSVs on the Command Line
FiveThirtyEight’s data journalism workflow with R | useR! 2016 international R User conference | Channel 9
Six issue when installing package · Issue #3165 · pypa/pip · GitHub
python – Installing pip on Mac OS X – Stack Overflow
Source – Journalism Code, Context & Community – A project by Knight-Mozilla OpenNews
Introducing Kaggle’s Open Data Platform
NASA just made all the scientific research it funds available for free – ScienceAlert
District council code list | Statistics South Africa
How-to: Index Scanned PDFs at Scale Using Fewer Than 50 Lines of Code – Cloudera Engineering Blog
GitHub – gavinr/geojson-csv-join: A script to take a GeoJSON file, and JOIN data onto that file from a CSV file.
7 command-line tools for data science
Python Basics: Lists, Dictionaries, & Booleans
Jupyter Notebook Viewer

PYTHON FOR JOURNALISTS

New folder

Reshaping and Pivot Tables — pandas 0.18.1 documentation
Reshaping in Pandas – Pivot, Pivot-Table, Stack and Unstack explained with Pictures – Nikolay Grozev
Pandas Pivot-Table Example – YouTube
pandas.pivot_table — pandas 0.18.1 documentation
Pandas Pivot Table Explained – Practical Business Python
Pivot Tables In Pandas – Python
Pandas .groupby(), Lambda Functions, & Pivot Tables
Counting Values & Basic Plotting in Python
Creating Pandas DataFrames & Selecting Data
Filtering Data in Python with Boolean Indexes
Deriving New Columns & Defining Python Functions
Python Histograms, Box Plots, & Distributions
Resources for Further Learning
Python Methods, Functions, & Libraries
Python Basics: Lists, Dictionaries, & Booleans
Real-world Python for data-crunching journalists | TrendCT
Cookbook — agate 1.4.0 documentation
3. Power tools — csvkit 0.9.1 documentation
Tutorial — csvkit 0.9.1 documentation
4. Going elsewhere with your data — csvkit 0.9.1 documentation
2. Examining the data — csvkit 0.9.1 documentation
A Complete Tutorial to Learn Data Science with Python from Scratch
For Journalism
ProPublica Summer Data Institute
Percentage of vote change | CARTO
Data Science | Coursera
Data journalism training materials
Pythex: a Python regular expression editor
A secure whistleblowing platform for African media | afriLEAKS
PDFUnlock! – Unlock secured PDF files online for free.
The digital journalist’s toolbox: mapping | IJNet
Bulletproof Data Journalism – Course – LEARNO
Transpose columns across rows (grefine 2.5) ~ RefinePro Knowledge Base for OpenRefine
Installing NLTK — NLTK 3.0 documentation
1. Language Processing and Python
Visualize any Text as a Network – Textexture
10 tools that can help data journalists do better work, be more efficient – Poynter
Workshop Attendance
Clustering In Depth · OpenRefine/OpenRefine Wiki · GitHub
Regression analysis using Python
DataBasic.io
R for Every Survey Analysis – YouTube
Git – Book
NICAR17 Slides, Links & Tutorials #NICAR17 // Ricochet by Chrys Wu
Register for Anonymous VPN Services | PIA Services
The Bureau of Investigative Journalism
dtSearch – Text Retrieval / Full Text Search Engine
Investigation, Cybersecurity, Information Governance and eDiscovery Software | Nuix
How we built the Offshore Leaks Database | International Consortium of Investigative Journalists
Liz Telecom/Azimmo – Google Search
First Python Notebook — First Python Notebook 1.0 documentation
GitHub – JasonKessler/scattertext: Beautiful visualizations of how language differs among document types

Data is a Team Sport: Data-Driven Journalism

- June 20, 2017 in Community, Data Blog, Event report

[soundcloud url=”https://api.soundcloud.com/tracks/328987340″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

In this episode we speak with two veteran data literacy practitioners who have been involved with developing data-driven journalism teams.

Our guests:

  • Eva Constantaras is a data journalist specialized in building data journalism teams in developing countries. These teams have reported from across Latin America, Asia and East Africa on topics ranging from displacement and kidnapping by organized crime networks to extractive industries and public health. As a Google Data Journalism Scholar and a Fulbright Fellow, she developed a course for investigative and data journalism in high-risk environments.
  • Natalia Mazotte is Program Manager of School of Data in Brazil and founder and co-director of the digital magazine Gender and Number. She has a Master’s degree in Communications and Culture from the Federal University of Rio de Janeiro and a specialization in Digital Strategy from Pompeu Fabra University (Barcelona/Spain). Natalia has been teaching data skills in different universities and newsrooms around Brazil. She also works as an instructor in online courses at the Knight Center for Journalism in the Americas, a project of the University of Texas at Austin, and writes for international publications such as SGI News, Bertelsmann-Stiftung, Euroactiv and Nieman Lab.

Notes from this episode

Our first conversation on Data-Driven Journalism featured Eva Constantaras, on her work in developing data-driven journalism teams in Afghanistan and Pakistan, and Natalia Mazotte, on her work in Brazil. They discussed what they have learned helping journalists think through how they can use data to drive social change. They agreed that good journalism necessarily includes data-driven approaches in order to uncover facts and the root causes of societal problems.

Eva strives to motivate journalists to look beyond the fact that corruption exists and dig deeper into the causes and impacts. She has seen data journalists in Europe and North America making a choice to focus, for example, on polling data rather than breaking down the data behind the candidates’ policies. Eva sees this as a mistake and is committed to making emerging data journalists understand why this is problematic. Finally, Eva critiqued the approach funders take in the field of data literacy, which often puts too much emphasis on short-term solutions rather than investing in long-term data capacity building programmes. This is something that School of Data has long struggled with from third-party funders and clients alike. It’s clear that more work needs to be done to explain what short-term programmes can and, more importantly, cannot achieve.

Natalia primarily discussed School of Data Brazil’s Gender and Number project. The project was designed to use data to move the discussion on gender equality past arguments based on traditional roles. She is concerned about the growing ‘data literacy’ gap between those with power, such as government and corporations, and those without power, such as people living in the favelas. In Brazil, the media landscape is changing: the mainstream media is reporting on ‘what’s happened’, while independent media is doing the more investigative reporting on ‘why it’s happened’.

They wanted to plug:

Readings/Resources they find inspiring for their work.

Resources contributed from the participants:

View the online conversation in full:

[youtube https://www.youtube.com/watch?v=92kAsMxw-Q4]


Data is a Team Sport: Enabling Learning

- June 6, 2017 in Community, Event report

Data is a Team Sport is a series of online conversations held with data literacy practitioners in mid-2017 that explores the ever evolving data literacy eco-system.

Cut and paste this link into your podcast app to subscribe: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store.

In this episode we speak with two veteran data literacy practitioners who have been involved with directly engaging learners to get beyond spreadsheets to build confidence and take agency in their own learning.

[soundcloud url=”https://api.soundcloud.com/tracks/326264327″ params=”color=ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false” width=”100%” height=”166″ iframe=”true” /]

Our guests:

  • Rahul Bhargava is a researcher and technologist specializing in civic technology and data literacy. He creates interactive websites used by hundreds of thousands, playful educational experiences across the globe, and award-winning visualizations for museum settings. As a research scientist at the MIT Center for Civic Media, Rahul leads technical development on projects ranging from interfaces for quantitative news analysis to platforms for crowd-sourced sensing.
  • Lucy Chambers initially embarked on a career as a journalist, then took a few turns which led to a career at Open Knowledge teaching journalists how and why to work with data. She was one of the editors of the Data Journalism Handbook. She later led the highly successful School of Data programme which extended technical training to non-profit organisations. Lately, she has focussed on delivery of software projects as a product manager. Most recently, she has been working in West Africa on health-related software.

Notes from this episode

Rahul talked to us about how his experience in leading workshops and data capacity building programmes led to the creation of DataBasic.io, while Lucy discussed the rationale behind School of Data shifting its focus from online curriculum to fellowships.

In order for his participants to begin to understand how to work with data, Rahul asks them to create drawings about what a dataset tells them. He then asks the participants to create a gallery of their drawings and encourages them to critically assess each other’s work. This process of having workshop participants create and examine data visualisations led to the creation of databasic.io, a set of tools specifically designed to allow users to visualise datasets in a number of different ways.

Lucy recalled School of Data’s initial struggles developing an online curriculum that attempted to provide ‘how-to’s’ on a diverse set of data tools. Eventually, through focus groups and testing of the curriculum, School of Data began to understand that this tool-centred approach was unlikely to effectively serve School of Data’s target audience. As a result, the team shifted their strategy and began to focus their attention on developing the capacity of future data trainers. This ultimately led School of Data to create the Fellowship Programme, a programme designed to take high-potential individuals and build their capacity to be data literacy trainers and leaders.

Finally, both shared their thoughts on building data literacy at the organisational level. They agreed that a common mistake made by organisations is thinking about data use as being exclusively part of the IT function of the organisation. Rather, they both look for ways to build an active data culture within organisations, where staff converse about and engage with data, instead of taking a passive approach such as producing dashboards.

They wanted to plug:

  • Rahul is building a cohort around further development of databasic.io. Ping him via Twitter to get more information on that.
  • Lucy’s blog is Tech to Human, where she writes about her work and what she’s learning. She is working on a project for MySociety called EveryPolitician and writing about it on Medium.

Readings/Resources they find inspiring for data literacy work.

View the full online conversations:

[youtube https://www.youtube.com/watch?v=Cl7FGYNAmJc]

[youtube https://www.youtube.com/watch?v=We7CIGycT-Y]


Research Results Part 6: Data Literacy Research References and Resources

- February 11, 2016 in Uncategorized

Even though work in the field of data literacy can feel a bit lonely at times, the truth is that it is not entirely new or undocumented. During the research process that has been described over these blog posts, we have been lucky to come across valuable sources of information on the topic – researchers and practitioners have devoted writing time to data literacy in civil society and in academia.

To close off the blog posts sharing our main findings, we found it suitable to share a bit of information about the resources that informed the process.

A quick dive into the history of data literacy

Even though data literacy efforts in civil society might seem recent, they fit into a much longer history of numeracy, statistical literacy (and, of course, literacy in general). When looking into the broader literature, we found articles devoting time to narrow and define this field, especially as compared to others. We recommend taking a look at:

For a shorter (but comprehensive) account of broader research in this field, we found Data Pop Alliance’s Beyond Data Literacy: Reinventing Community Engagement and Empowerment in the Age of Data to be illuminating.

The origins of School of Data

If you were around at School of Data in 2012, the information in this section might be redundant for you… but many of the newer School of Data community members haven’t had the chance to learn how it all started.

We also want to point to Sam Leon’s blog post about his embedded fellowship at Global Witness – one of School of Data’s first experiments with longer-term processes.

Academic research meets data literacy work

Data literacy training efforts in civil society are similar to some of those documented by academic researchers, and that’s why we decided to take a look at how they are being discussed in the literature. Sources that we recommend:

Data literacy in civil society

Perhaps not in journal articles, but civil society organizations and individuals around the world have also devoted efforts to the documentation of their work in the field. Some of the highlights:

Thank you for participating and following the data literacy research process we underwent! Our blog post series has now been completed and we encourage you to take a look at it. If you want to send feedback or get in touch, please do so at dataliteracy [at] fabriders.net.
