Best practices for scraping: From ethics to techniques
**Moderated by Martin Burch, The Wall Street Journal**
If you can see it online, you can download it to your computer, but should you? Our panelists share the consequences of their scraping efforts, good and bad, and invite your questions. We’ll review common ethical questions and show best technical practices as we walk through building a web scraper.
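The session abstract pairs ethics with technique. One concrete practice that panels like this usually cover is checking a site's robots.txt and honoring its crawl delay before scraping. The sketch below uses Python's standard-library `urllib.robotparser` for that check; the robots.txt content, the `example-bot` user-agent string, and the example.com URLs are all hypothetical, not from the panel itself.

```python
from urllib import robotparser

# Hypothetical robots.txt for an example site; a real scraper would first
# fetch https://example.com/robots.txt and parse whatever it finds there.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

AGENT = "example-bot"  # hypothetical user-agent string for your scraper

# Public article pages are allowed; the /private/ tree is not.
print(rp.can_fetch(AGENT, "https://example.com/articles/1"))  # True
print(rp.can_fetch(AGENT, "https://example.com/private/x"))   # False

# The site asks for 2 seconds between requests; a polite scraper would
# call time.sleep() with this value between fetches.
print(rp.crawl_delay("*"))  # 2
```

Running the allow/deny check before every request, and sleeping for the advertised crawl delay between fetches, is a simple way to keep a scraper on the ethical side of "if you can see it, you can download it."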
Ricardo Brom is an electronic engineer who has worked for more than 25 years as IT manager at La Nación, an Argentine newspaper. In recent years, at LNDATA, he designed scraping tools for more than 200 sources (including update and data-transformation processes) and provided data sets, technology, and support to journalists investigating and documenting findings, building a journalistic knowledge base. He is now in the newsroom, dedicated full time to DATA.
Martin Burch is a data developer at The Wall Street Journal. @seecmb
David Eads is design and delivery editor at The Chicago Reporter, where he combines journalism with software development. In the early 2000s, he helped found the Invisible Institute, where he ran a site about Chicago public housing called The View From The Ground. He later helped create FreeGeek Chicago, a community-based computer recycling organization. He has worked on visual journalism teams at the Chicago Tribune, NPR Visuals, and ProPublica Illinois.
Amanda Hickman teaches data investigation techniques at UC Berkeley Graduate School of Journalism and ran BuzzFeed News' Open Lab for Journalism, Technology, and the Arts.