The Webpage is the API – Scraping Resources

December 4, 2012 in HowTo


Webscraping is one of the most powerful techniques for obtaining data: it lets you treat nearly any website as a data source, so you can build your own data collections and work freely with the data. Webscraping is fairly easy for most people with some programming knowledge. Here we list some resources for learning it.
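To make the idea concrete, here is a minimal sketch in Python that turns an HTML table into rows of data using only the standard library. The names `TableScraper`, `scrape_table` and `SAMPLE_PAGE` are made up for this illustration, and the sample page is invented rather than taken from a real site. Real pages are usually messier, so treat this as a starting point and not a finished scraper.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_table(html):
    """Return all table rows found in the given HTML string."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented sample page, standing in for HTML downloaded from a website.
SAMPLE_PAGE = """
<html><body>
<table>
  <tr><th>Item</th><th>Count</th></tr>
  <tr><td>Apples</td><td>3</td></tr>
  <tr><td>Pears</td><td>5</td></tr>
</table>
</body></html>
"""

if __name__ == "__main__":
    for row in scrape_table(SAMPLE_PAGE):
        print(row)
```

For a live page you would first download the HTML, for example with `urllib.request.urlopen`, and pass the decoded text to `scrape_table`. The resources below cover that step and the libraries that make it more comfortable.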

ProPublica has a basic overview of tools and programs

Scraping without Programming:

ScraperWiki comes with tutorials for Python, Ruby and PHP

Scraping for Journalists – A good ebook on Scraping


JavaScript (Node.js):
Tilo Mitra: Web scraping with Node.js

If you have more resources to share, let us know in the comments.
