Scraping the web with Ruby

  • Event: 2016 CAR Conference
  • Speaker: Jeff Ernsthausen of ProPublica
  • Date/Time: Saturday, Mar. 12 at 3:30pm
  • Location: Matchless

This course will cover the basics of scraping websites in Ruby, with a focus on strategies for getting the data that you want out of different kinds of websites. It will start with a description of how to load and parse simple sites, and, time permitting, provide strategies for working with the ASP.NET sites that many government agencies use.
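As a rough illustration of the ASP.NET case mentioned above, here is a minimal sketch, not taken from the session itself. ASP.NET Web Forms pages usually carry hidden inputs such as __VIEWSTATE and __EVENTVALIDATION that have to be sent back with each form submission. The sample HTML, the txtName field, and the helper names hidden_fields and postback_body are all hypothetical. The sketch uses only Ruby's standard library and a deliberately simple regex; a real scraper would normally use a proper HTML parser such as the Nokogiri gem.

```ruby
require "uri"

# Hypothetical page fragment standing in for an ASP.NET search form.
SAMPLE_HTML = <<~HTML
  <form method="post" action="Search.aspx">
    <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dDwtMTIzNDU2Nzg5Ozs+" />
    <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEWAgKx" />
    <input type="text" name="txtName" value="" />
  </form>
HTML

# Collect name => value pairs from every hidden <input> tag.
# Simplification: assumes quoted attributes and no HTML entities in values.
def hidden_fields(html)
  html.scan(/<input\b[^>]*>/i).each_with_object({}) do |tag, fields|
    next unless tag =~ /\btype\s*=\s*["']hidden["']/i
    name  = tag[/\bname\s*=\s*["']([^"']*)["']/i, 1]
    value = tag[/\bvalue\s*=\s*["']([^"']*)["']/i, 1]
    fields[name] = value.to_s if name
  end
end

# Build a URL-encoded POST body: the page's hidden state plus our own inputs.
def postback_body(html, user_fields)
  URI.encode_www_form(hidden_fields(html).merge(user_fields))
end

puts hidden_fields(SAMPLE_HTML).keys.inspect
puts postback_body(SAMPLE_HTML, "txtName" => "Smith")
```

The resulting string would be sent as the body of a POST request to the form's action URL (for example with Net::HTTP), and the hidden fields would be re-read from each response before the next request.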

Suggested technical skills: Experience with Ruby is useful but not necessary, since the strategies covered could just as easily be implemented in Python. Attendees should, however, be comfortable running commands from a command line (preferably in a Mac/Linux environment).

Speaker Bios

  • Jeff Ernsthausen is a data reporter at ProPublica. He joined ProPublica from the Atlanta Journal-Constitution, where he worked as a data reporter on the investigative team. Prior to his time in journalism, he worked as an economic analyst and researcher at the Federal Reserve.
