By Meredith McGrath
Want to make sure your data is bulletproof and fact-checked so there aren’t any holes? Arm yourself with these tips from Tisha Thompson, investigative reporter for ESPN, and Sandhya Kambhampati, data reporter for ProPublica Illinois.
When starting out, create a text file or a Word document and record basic information on the project. Make a file folder and name it after the story’s slug. Keep all your work related to the project in this folder, including PDFs of any emails you received from a FOIA officer. Save the raw data file here and make a copy of it. Work only in the copy and don’t touch the original, so you’ll always have the pure data on hand.
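One way to make the "don't touch the original" rule harder to break is to script the copy. The sketch below is only an illustration, not a method the speakers described: the function names and the folder layout are assumptions, and the checksum is simply a way to confirm the working copy starts out identical to the raw file.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def make_working_copy(raw_path: Path, work_dir: Path) -> Path:
    """Copy the raw file into work_dir and confirm the copy is byte-identical.

    The raw file is only read, never written. Both names here are
    hypothetical; use whatever folders your story slug calls for.
    """
    work_dir.mkdir(parents=True, exist_ok=True)
    working_path = work_dir / raw_path.name
    shutil.copy2(raw_path, working_path)
    if sha256_of(raw_path) != sha256_of(working_path):
        raise RuntimeError("Working copy does not match the raw file")
    return working_path
```

Recording the raw file's checksum in your project notes also lets you prove later that the original was never altered: run sha256_of on it again and compare.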
Keep a data diary
While it may sound labor-intensive, keep a data log or journal and track every change you make to your data set. This will help you reproduce your work, and if your data analysis is ever challenged, you’ll have a specific log of exactly what you did. Write down how you plan to clean your data before you start cleaning it.
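A data diary can be as simple as a text file you append to. Here is a minimal sketch, assuming you work in Python; the log_step name and the timestamp format are choices made for this example, not a standard.

```python
from datetime import datetime
from pathlib import Path


def log_step(diary_path: Path, note: str) -> str:
    """Append one timestamped entry to the diary file and return it.

    The file is opened in append mode, so earlier entries are never
    overwritten.
    """
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    entry = f"{stamp}  {note}"
    with diary_path.open("a", encoding="utf-8") as diary:
        diary.write(entry + "\n")
    return entry
```

A call such as log_step(Path("data_diary.txt"), "Removed the totals row at the bottom of the sheet") adds one line per change, which gives you a dated record to point to if the analysis is challenged.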
Check for smelly data
When you first get a data file, check to see what’s wrong with it. There’s no such thing as a perfectly clean data set. Always look for holes. Check for totals hidden at the bottom of your Excel file and extra characters hidden in cells. Look for nulls and missing values. Are there any spelling mistakes?
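Some of these smell checks can be scripted as a first pass. The sketch below assumes the data has been exported to CSV; the list of null markers and the word "total" as a sign of a totals row are guesses that will need adjusting for each data set. It will not catch spelling mistakes, which still call for reading through the distinct values in each column.

```python
import csv
import io

# Strings treated as missing values. This list is an assumption;
# agencies use many different placeholders.
NULL_MARKERS = {"", "null", "n/a", "na", "none", "-"}


def smell_check(rows):
    """Return (row_number, field, problem) tuples for suspicious cells.

    rows is a list of dicts, such as csv.DictReader produces. Row
    numbers count the header as row 1 and assume one record per line.
    """
    problems = []
    for number, row in enumerate(rows, start=2):
        for field, value in row.items():
            if field is None:
                # DictReader stores cells beyond the header under None.
                problems.append((number, "(no header)", "extra columns"))
                continue
            text = value if value is not None else ""
            if text.strip().lower() in NULL_MARKERS:
                problems.append((number, field, "missing value"))
            elif text != text.strip():
                problems.append((number, field, "extra whitespace"))
            if "total" in text.lower():
                problems.append((number, field, "possible totals row"))
    return problems


# A tiny made-up sample with a trailing space, a blank cell and a totals row.
sample = "agency,amount\nParks ,100\nFire,\nTotal,100\n"
for problem in smell_check(list(csv.DictReader(io.StringIO(sample)))):
    print(problem)
```

Treat every flag as a question to ask, not an answer: a cell containing "total" may be a legitimate value, and a blank may be meaningful.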
Obtain a data dictionary
Ask the agency that gave you the data for a dictionary, which will define fields. Don’t assume you know what each field is. If the agency won’t give you one, call them out on it and let them know you need this for accuracy.
Slow down, and don’t take shortcuts
After you’re done interviewing a person, you know them inside and out. Know your data the same way. If something stands out as an outlier, be aware that it could be a hidden mistake. Check on this. Be meticulous.
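To get a list of values worth a second look, one common rule of thumb flags anything more than 1.5 times the interquartile range beyond the middle half of the data. The speakers did not name a method; this rule is swapped in here purely as an illustration, and the function name is made up.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    Uses statistics.quantiles with its default method. A flagged value
    is not necessarily wrong; it is a prompt to go back and check.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [value for value in values if value < low or value > high]


# Made-up figures: six ordinary values and one that stands out.
print(flag_outliers([10, 12, 11, 13, 12, 11, 500]))
```

Whatever the script flags, the follow-up is reporting: call the agency, check the source document, and find out whether the number is a typo or the story.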
Talk through your work
Find someone who doesn’t use data (maybe your mom or grandma) and show your work to them. Explain what you did and why. Talking out loud helps you identify mistakes. Ask them what they want to know about the data.
Check in with academics
Consult with experts and researchers about your data analysis and any gaps you find. Have them poke holes in your analysis. Keep going back to them, especially if your work is complex.
Refrain from overloading numbers
Don’t overload a reader with a lot of numbers in your story. There’s a point where the reader glazes over and doesn’t think them through. Pick the most important information to include.
Replicate your work
Overestimate the amount of time it will take you to check your work. Re-import your data from scratch. Mistakes can happen while importing data. Have your colleagues replicate your queries. The more eyes that can look over your work, the better.
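A quick check after re-importing is to compare a few summary figures from the fresh import against the ones from your first pass. The sketch below assumes a CSV file with a numeric column that has already been cleaned; the summarize name and the choice of row count plus column total are illustrative, and a match on these two figures does not prove every row is identical.

```python
import csv
from decimal import Decimal
from pathlib import Path


def summarize(csv_path, amount_field):
    """Return (row_count, total) for one numeric column of a CSV file.

    Decimal avoids floating-point rounding when adding money-like
    figures. A blank or non-numeric cell raises an error rather than
    being skipped silently.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    total = sum((Decimal(row[amount_field]) for row in rows), Decimal("0"))
    return len(rows), total


def imports_match(first_path, second_path, amount_field):
    """True when both imports have the same row count and column total."""
    return summarize(first_path, amount_field) == summarize(second_path, amount_field)
```

The same comparison works for a colleague's replication: if their row count or total differs from yours, one of you has a mistake to find before publication.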
Stand by what you publish
Be brave enough to stand up for what you put out for public consumption. Add a “nerd box” to your story on your site that explains how you obtained your data and summarizes your analysis.
Meredith McGrath is a journalism student at the University of Missouri.