You are browsing the archive for Nisha Thompson.

Tool Review: WebScraper

- October 13, 2014 in Community, HowTo

Crossposted from DataMeet.org

Usually when I have any scraping to do, I ask Thej if he can do it and then take a nap. However, Thej is on vacation, so I could either wait for him to come back or try to do it myself. It was basic text, not much HTML, no images, and only a few pages, so I went for it with some non-coder tools.

I checked the School of Data scraping section for some tools, and they have a nice little section on using browser-based scraping tools. I did a Chrome store search and came across WebScraper.

I glanced through the video, sort of paying attention, got the gist of it, and started to play with the tool. It took a while for me to figure out. I highly recommend going through the tutorials very carefully. The videos take you through the process but are not very clear for complete newbies like me, so it took a few views to understand the hierarchy concept and how to adapt their example to the site I was scraping.

I got the hang of doing one page and then had to figure out how to tell it to go to another page. Again, I had to spend quite a bit of time rewatching the tutorial. At the end of the day I got the data in neat columns in a CSV without too much trouble. I would recommend WebScraper for people who want to do some basic scraping.

It is as visual as you can get, though the terminology is still very technical. You have to go into the developer tools, which can feel intimidating but is ultimately satisfying.

Though I’ll probably still call Thej.

Data Playlists

- September 25, 2014 in HowTo

Finding new ways to play and work with data is always a challenge. Workshops, courses, and sprints are a great way to learn from people, and the School of Data Fellows are doing that all over the world. In India there are many languages and different levels of literacy and technology adoption, so we want to experiment with different ways of sharing data skills. It can be difficult to put on a workshop or run a course, so we thought, let’s start creating videos that people can access. It was important that the videos be bite-size and easy to replicate, and that the format be flexible enough to accommodate different ways of teaching, so that we and others can experiment with different types of videos.

So instead of a single 10 minute video on how to use Excel, we are asking people to create playlists of videos between 2 and 5 minutes long, with one concept or process presented in each video.

Our first video is about formatting in Excel:

Don’t like Excel? Do one for Open Spreadsheets or Fusion Tables. English is not useful for your audience? Translate each video, add subtitles, or do your own version. Have a new way to do the same skill? Create a 2 minute video and we can add it to the playlist. Sharing your favorite tools and tricks for working with data is the main goal of this project.

If you want to do one, there are a few rules:

  1. Introduce yourself
  2. Break up the lesson by technique and keep each video to between 2 and 5 minutes.
  3. Make sure they are in a playlist.
  4. Upload them to YouTube and tag them DataMeet and School of Data
  5. Let us know!

If you have any feedback or a video request, please feel free to leave it in the comments. We hope to release 2 playlists every month.

Adapted from a post on DataMeet.org

Dispatch from India: Intro to Data Journalism Workshop in Bangalore

- September 12, 2014 in Uncategorized

This post is crossposted from DataMeet.org, an organization that promotes open data in India.

On Sunday, August 31st, DataMeet worked with Economic Times journalist Jayadevan PK, with support from School of Data, to design an intro to data journalism workshop. For a while now there has been quite a bit of interest in and discussion of data journalism in India. There are currently a few courses and events promoting data journalism, but we thought there was definitely room to start building a few modules on working with data for storytelling. Given that we have not done too many of these, we decided to do an introduction and limit it to a few people.

You can see the agenda with notes here, find the resources we shared on the data journalism resource wiki page, and refer to the data catalog that DataMeet has been putting together.

Thanks to Knolby Media for hosting us and to School of Data for the support. Thank you to Vikras Mishra for volunteering and taking notes, pictures, and video.

We had four storytellers with us, from various backgrounds. We spent the morning on introductions: what their experience with data was, what their definition of data journalism is, and why they wanted to take this workshop. Then we had them put up some expectations so we could gauge what the afternoon should focus on.

We then had Jaya go through the context of data journalism, both on a global scale and in the new era of digital journalism.

Then we spent some time going over examples of good and bad data journalism.

After that, we went through resources people can use to get data. We touched upon the legal and copyright issues around using data. Then we discussed accuracy and how to properly attribute sources.

Then we demonstrated a few tools:

Tableau
CartoDB
Scraping tools
– ScraperWiki
– iMacros
MapBox
QGIS

Visualization Roadmap
The participants thought understanding how to visualize would be helpful, so we went through a sort of visualization roadmap. Then we went through the stories they were working on to see how we would create a visualization for each, how to examine the data, and how to come up with a data strategy for each story.

Then we showed some more tools to address the suggestions from the exercise:
– Bhuvan
– Timelines
– Odyssey
– Fusion Tables
– BUMP

Feedback session

People wanted another day to absorb the lessons and more hands-on time with the tools. Also, even at the intro level, it is important to have people come prepared with stories so they have something to apply the ideas to.

To say we learned a lot is an understatement. We will definitely be planning more intro workshops and, hopefully, more advanced workshops in the future. We hope to continue learning what people think is important, and we will keep track of what kinds of stories come out of these learning sessions.

 
