A Gentle Introduction to Data Cleaning
This School of Data course is a gentle introduction to reducing errors by cleaning data. It was created by the Tactical Technology Collective and gives you a clear overview of what can go wrong in spreadsheets and how to fix it (if it does). Take this course if you want to learn why cleaning data is important and how to do it.
Curriculum
- 5 Sections
- 17 Lessons
- Lifetime
- Introduction (1 lesson)
  In this section, learn about the “horror stories” in which data errors in spreadsheets have led to real consequences, and how you can avoid them yourself.
- Section 1: Nuts and chewing gum (4 lessons)
  In this section, you’ll gain practical knowledge of common formatting and layout features of spreadsheets and some ideas about how to present your raw data to others so they will love you!
- Section 2: The Invisible Man is in your spreadsheet, messing with your data (4 lessons)
  In this section, you’ll learn how even characters that you cannot see can cause havoc in data analysis. The section will teach you to ghostbust white space and other non-printable characters such as carriage returns, spaces and tabs. Once you have removed them, you can get on with your analysis in peace.
- Section 3: Your data is a witch’s brew (4 lessons)
  Three is not a number. Nor is a million. At least not when they are typed in as text in a cell in a spreadsheet. Your spreadsheet is pedantic and needs everything to be precise and consistent. Read on to learn more.
- Section 4: Did you bring the wrong suitcase (again)? (4 lessons)
  All our spreadsheet wants from us is for each cell of data to be organised and neatly packed. In this section, you will analyse data to highlight problems in the way it has been structured, so that you can avoid them in your own work with data.

