Best practices for scraping: From ethics to techniques

  • Event: 2016 CAR Conference
  • Speakers: Ricardo Brom of La Nación; David Eads of The Chicago Reporter; Amanda Hickman of Factful; Martin Burch of The Wall Street Journal
  • Date/Time: Thursday, Mar. 10 at 11:30am
  • Location: Colorado F

Moderated by Martin Burch, The Wall Street Journal

If you can see it online, you can download it to your computer, but should you? Our panelists share the consequences of their scraping efforts, good and bad, and invite your questions. We’ll review common ethical questions and show best technical practices as we walk through building a web scraper.

Read a recap of this session on the CAR Conference blog

Speaker Bios

  • Ricardo Brom is an electronic engineer who has worked for more than 25 years as IT manager at La Nación, a newspaper in Argentina. In recent years, with LNDATA, he has designed scraping tools for more than 200 sources (including the updating and data transformation processes) and provided data sets, technology and support to help journalists investigate and document findings, creating a journalistic knowledge base. He now works in the newsroom, dedicated full time to data.

  • Martin Burch is a data developer at The Wall Street Journal. @seecmb

  • David Eads is design and delivery editor at The Chicago Reporter, where he combines journalism with software development. In the early 00’s, he helped found the Invisible Institute, where he ran a site about Chicago public housing called The View From The Ground. He later helped create FreeGeek Chicago, a community-based computer recycling organization. He has worked on visual journalism teams at the Chicago Tribune, NPR Visuals, and ProPublica Illinois.

  • Amanda Hickman teaches data investigation techniques at UC Berkeley Graduate School of Journalism and ran BuzzFeed News' Open Lab for Journalism, Technology, and the Arts.
