A nine-year-old girl was beaten to death by her foster brother in Kentucky, but the tragedy did not become a statistic. The state agency in charge of counting abused children didn’t categorize it as a death, attorney Jon Fleischaker said in the “Legal issues, access and big data” session.
Local and state governments go to great lengths to hide information they find unflattering from the people who want to see it, such as reporters, he said.
A legal battle eventually pried the data from the recalcitrant agency’s hands, but Fleischaker said the case is instructive.
“The data that goes into any statistical analysis or database, you have to look carefully how it’s compiled because agencies will consistently change definitions,” he said. Black is white, up is down and a death is not a death. “I think (reporters) have to be really, really careful.”
If you’re not cynical already, time to start, he said.
Since you don’t know how statistics have been compiled, Sarah Cohen of The New York Times recommends approaching the story from the ground up. Don’t start your reporting with an aggregated statistic. Start reporting with reporting: interviews, clip searches, observation. Once you have your baseline, go see if the data bears out the story.
Once you dig into the stats, ask, “compared to what?” Sure, you found something interesting, but how does it fit into a broader context? If the data does not tell the same story as your reporting, you might rethink your story’s angle.
Cohen also urged getting to know what is NOT represented by the data. Should what you’re looking for be in the database? Are you sure? Maybe the agency didn’t include all the data elements you requested. Maybe the agency doesn’t even compile those numbers in the database you requested.
Backstop your data with other records requests. Each record in the database has its non-data equivalent, which will help you understand the numbers even better.
“I think we have to view databases as the index to records, not the records themselves,” Cohen said.
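For reporters who work in code, the backstopping Cohen describes can be sketched as a simple cross-check between a database and the underlying records. This is only an illustration, not a method presented in the session: the case IDs, the field names and the `cross_check` function are all invented, and it assumes both sources share an identifier.

```python
# Hypothetical data: every ID, field name and value here is invented for illustration.
# Rows as they appear in the agency's database.
database_rows = [
    {"case_id": "A-101", "outcome": "injury"},
    {"case_id": "A-102", "outcome": "fatality"},
    {"case_id": "A-103", "outcome": "injury"},
]

# The same kind of cases, as described in files from a separate records request.
case_files = [
    {"case_id": "A-101", "outcome": "injury"},
    {"case_id": "A-103", "outcome": "fatality"},
    {"case_id": "A-104", "outcome": "fatality"},
]


def cross_check(database_rows, case_files):
    """Compare case files against database rows that share a case_id.

    Returns two lists: IDs found in the files but absent from the database,
    and (id, database outcome, file outcome) tuples where the two disagree.
    """
    db_outcomes = {row["case_id"]: row["outcome"] for row in database_rows}
    missing = []
    mismatched = []
    for record in case_files:
        case_id = record["case_id"]
        if case_id not in db_outcomes:
            missing.append(case_id)
        elif db_outcomes[case_id] != record["outcome"]:
            mismatched.append((case_id, db_outcomes[case_id], record["outcome"]))
    return missing, mismatched


missing, mismatched = cross_check(database_rows, case_files)
print("In the files but not the database:", missing)
print("Coded differently in the database:", mismatched)
```

Anything that turns up in either list is a question for the agency, not a finding in itself: a case may be missing or recoded for a legitimate reason, or because a definition quietly changed.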