
Tips and tricks for creating your own data

Photo by Travis Hartman

By Diorlena Natera

 

So you have a great story idea, but no data to back it up? Sarah Cohen of The New York Times, Meghan Hoyer of USA TODAY and Matt Waite of the University of Nebraska shared advice for gathering your own data.

Advantages

It's your product. You're in charge when you make your own database. No more missing fields, sloppy data, etc.

You're serving the public interest. By creating your own data, you're giving people information that was once out of reach.

Tips

You got your hands on the paper files, but now you need to organize them. Don't work through them from beginning to end. Instead, start by picking pieces of paper or electronic files at random. Your questions are going to change as you analyze the content, and a random pass shows you early on how organizations start recording things differently over time.
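
As a rough illustration of the random-pick tip, here is a minimal Python sketch. It is not something the panelists presented. The function name `pick_review_sample`, the file names and the seed value are all made up for the example. Fixing the seed simply means a colleague can reproduce the same sample later.

```python
import random


def pick_review_sample(documents, k, seed=None):
    """Return up to k documents chosen at random, without repeats.

    `documents` is any collection of identifiers (file names, box
    numbers, page IDs). A fixed `seed` makes the pick reproducible.
    """
    docs = list(documents)
    rng = random.Random(seed)  # private generator, leaves global state alone
    return rng.sample(docs, min(k, len(docs)))


# Hypothetical inventory of 200 scanned reports (names are invented).
box = [f"report_{n:03d}.pdf" for n in range(1, 201)]

# Read these ten first, before designing the database fields.
sample = pick_review_sample(box, 10, seed=2014)
for name in sample:
    print(name)
```

Reading a sample like this before settling on fields makes it more likely you will spot format changes that a front-to-back read would only reveal late in the project.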

If you're working with websites, scrape the pages daily to catch repeated occurrences. Take screenshots when you start and finish tracking the information. Site administrators might try to hide things once they realize they're being watched.
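
One way to keep a record of what a page looked like on each visit is to save a timestamped copy and compare fingerprints between visits. The sketch below is only an assumption about how that might be done, not the panel's method. The helper names `fetch`, `save_snapshot` and `has_changed` are invented for this example, and it saves raw page content, which complements screenshots rather than replacing them.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from urllib.request import urlopen


def fetch(url):
    """Download a page and return its raw bytes (standard library only)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def save_snapshot(content, archive_dir, when=None):
    """Write content to a timestamped file and return (path, sha256 digest)."""
    when = when or datetime.now(timezone.utc)
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(content).hexdigest()
    # File name records when the copy was taken and a short fingerprint.
    path = archive / f"{when:%Y%m%dT%H%M%SZ}_{digest[:12]}.html"
    path.write_bytes(content)
    return path, digest


def has_changed(previous_digest, content):
    """True if the page content differs from the previously saved copy."""
    return hashlib.sha256(content).hexdigest() != previous_digest


# Typical daily use (hypothetical URL, run from a scheduler such as cron):
#   page = fetch("https://example.org/inspections")
#   path, digest = save_snapshot(page, "snapshots/inspections")
```

Pages that embed timestamps or ads will register as changed on every visit, so a real project would likely compare only the part of the page that holds the data.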

Know what's already been done on the subject and copy those methods to get the story. Get in touch with other scholars and copy existing code for databases. No need to start from scratch if similar databases already exist.

Ask yourself: Is this a one-time project or something you'll keep doing? Are you willing to commit to keeping up with it? Your organization has to support your project. What will it mean for them if you leave?

Warnings

It's all on you! If the data is inaccurate, you have no one else to blame. Double- and triple-check with experts and legal advisers if necessary.

Finding your own data requires serious time and effort. Make sure the project at hand is worth it. Don't overextend yourself. You want something that's manageable and doable.

Twitter is great, but be careful with crowdsourcing. It doesn't work well on stories that are time-sensitive. It's a really hard thing to get right because you're asking people to work with you for free and on a deadline.

 

Diorlena Natera is a Dominican-American born in the Bronx, NY and raised in Lee's Summit, MO. She is studying mass communications at Savannah State University. Diorlena is a 2014 CAR Conference Knight Scholar.

 

Photo: A standing-room-only crowd packs the talk entitled "When data don't exist" featuring Matt Waite of the University of Nebraska-Lincoln, Meghan Hoyer of USA Today and Sarah Cohen of The New York Times at NICAR 2014 in Baltimore on Friday, Feb. 27. The panel discussed potential advantages and pitfalls associated with constructing your own data sets. Photo by Travis Hartman.
