Converting PDFs to data (repeat)

  • Event: 2019 IRE Conference
  • Speaker: Darla Cameron of The Texas Tribune
  • Date/Time: Sunday, Jun. 16 at 9:00am
  • Location: River Oaks A
  • Audio file: No audio file available.

This class covers basic approaches for getting text out of PDF documents using powerful, freely available tools. We will introduce the core concepts and walk through common challenges posed by tricky PDF documents.

This session is good for: People who are unfamiliar with PDF-to-text tools or would like to learn how these tools can be used for extracting difficult text from images embedded in a PDF document.
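The listing does not name the tools used in the session, so the following is only a rough sketch of this kind of workflow, assuming the open-source `pdftotext` command-line utility (part of Poppler) is installed. The function names and the `report.pdf` file name are hypothetical, chosen for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build the argument list for Poppler's pdftotext (an assumed tool)."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to preserve the physical page layout,
        # which often keeps table columns lined up.
        cmd.append("-layout")
    cmd += [str(pdf_path), str(txt_path)]
    return cmd


def extract_text(pdf_path, keep_layout=True):
    """Run pdftotext on a PDF and return the extracted text."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install Poppler first")
    txt_path = Path(pdf_path).with_suffix(".txt")
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout), check=True)
    return txt_path.read_text(encoding="utf-8", errors="replace")


def needs_ocr(text):
    """Heuristic: no visible characters suggests the pages are scanned images."""
    return not text.strip()


if __name__ == "__main__":
    text = extract_text("report.pdf")  # hypothetical input file
    if needs_ocr(text):
        print("No text layer found; this PDF probably needs OCR.")
    else:
        print(text[:500])
```

When a PDF contains only scanned images, a text extractor like this typically returns nothing useful, and the pages need optical character recognition instead. That step is not shown here.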

Speaker Bios

  • Darla Cameron is the data visuals editor at The Texas Tribune in Austin, where she leads a team of developers at the intersection of graphics and news applications. Previously, she was a graphics editor at The Washington Post. She began her career in Florida at the Tampa Bay Times after completing a fellowship at the Poynter Institute. Darla is a Colorado native with a degree in journalism from the University of Missouri. 
