Sometimes you’re not hungry for a background-heavy lunch and would rather snack
on something fast: here are some recipes! This section contains small snippets that
will help you in the process of data wrangling. They might be small useful tips or
full-blown tutorials on tools or topics. Come in and find out!
- How to find data
- Walkthrough: Downloading Data from the World Bank
- Liberating HTML Data Tables
- Scraping websites using the Scraper extension for Chrome
- A short introduction to HTML
- Scraping – Beyond the Basics
- Scraping multiple Pages using the Scraper Extension and Refine
- Extracting Data from PDFs using Tabula
- Cleaning up Data Scraped from the Web
- Sorting Data with Spreadsheets
- Filtering Data
- Spreadsheet Formulae
  - A quick introduction to common spreadsheet symbols
    - The symbols
    - Tip: Get your symbols in the right order
  - Walkthrough: Using spreadsheets to add up values.
  - Walkthrough: Using a spreadsheet to subtract values.
  - Walkthrough: Multiplication and Division
  - Walkthrough: Copying formulae sideways
  - Walkthrough: Minimum and Maximum Values
  - Walkthrough: Dealing with empty cells
- Geocoding Data in a Google Docs Spreadsheet
- Using a spreadsheet to clean up a dataset
  - Introduction
  - Problem 1: Showing the data plainly
  - Problem 2: Whitespace and new lines – data that shouldn’t be there
  - Problem 3: Blank cells – missing data that should be there
  - Problem 4: Fixing numbers that aren’t numbers
  - Problem 5: Structural problems – data in inconvenient places
  - Problem 6: from “banabas” to “bananas” – dealing with inconsistencies in data
  - Finishing touches
- Cleaning Data with Refine
- Creating Line Charts
- Walkthrough: Scatterplot
- Creating a Choropleth map
- Walkthrough: Presenting our information as a webpage
- Harvesting and Analyzing Tweets
Last updated on Dec 10, 2013.