Intermediate PDF: Using OCR to extract data from PDFs

  • Event: 2017 IRE Conference
  • Speaker: Miguel Barbosa of CitizenAudit
  • Date/Time: Saturday, Jun. 24 at 4:30pm
  • Location: Pinnacle Peak 1
  • Audio file: No audio file available.

Learn how to use tools for extracting text from documents. The seminar will cover how to choose the best tool for the job, walk through web applications, and introduce Optical Character Recognition (OCR) for cracking tough cases.

This class is best for: People who are familiar with basic PDF extraction tools but would like to learn how to use OCR from the command line.

Speaker Bios

  • Miguel Barbosa is the co-founder and CEO of CitizenAudit.org, a tool designed to help journalists and investigators research nonprofits. Prior to this, Miguel worked as an analyst at a hedge fund in Chicago.

Related Tipsheets

No tipsheets have yet been uploaded for this event.