Data with personally identifiable information are an invaluable tool for reporters nationwide. For beat reporters and veteran investigative journalists alike, information such as names, birth dates and addresses can make or break a story. But access to such information isn’t guaranteed, and laws that restrict the public’s access are a regular source of frustration. Bills adding to those restrictions are introduced regularly.

But short of lobbying state legislatures for changes in the law, what can journalists do to get the data they request? What are some techniques and methods that journalists have used to negotiate successfully for data?

In an attempt to answer that question, I emailed hundreds of reporters a questionnaire asking them to recount how they negotiated a records request for data with personally identifiable information. About 50 responded, all with valuable lessons on how to get the data that you need.

The 47 respondents have an average of 13 years of experience as journalists, and all but five work for newspapers. Reporters wrote mostly about requests for state or local data, with 17 and 21 requests, respectively. Law enforcement was the most popular category with 13 requests, followed by education with six. The most common request was for employee salary data, although requested data ran the gamut from hunting licenses to fire department dispatch data.

A general takeaway from the responses was that persistence, ingenuity and good relationships with records custodians can produce positive results. Of the 40 respondents who got either the data they requested or a version of it, 17 initially were denied. They succeeded using a number of techniques, including negotiation, compromise and finding other data sources.

Sometimes the only way to get the data you need is to find other avenues – often more time-consuming ones. Isaac Wolf of the Scripps Howard News Service wanted to see if owners of stores that had been banned from accepting food stamps were simply creating shell companies that bought the banned stores. To do that he requested a list of banned stores from the U.S. Department of Agriculture, which gave him most of what he sought except for owner names.

Without the names, Wolf had no easy way to check whether people were using shell companies in order to accept food stamps again. So instead of doing a comprehensive check on all banned stores, Wolf took a handful and tracked their ownership using corporate records, health inspections, liquor licenses, and other records. Although time-consuming and less comprehensive, the method was sufficient to prove his hypothesis: Some people owning stores that had lost the right to participate in the food stamp program simply created new store names and continued to participate. The first story based on the data was published in February 2012.

A major project that Chad Day of The Arkansas Democrat-Gazette worked on required similar creativity. Day and his colleagues wanted to find out how many Arkansans charged with murder or manslaughter had committed crimes while on parole. Virtually all personally identifiable information about parolees was exempt in Arkansas, so Day and his colleagues requested a court database with case information for everyone in Arkansas charged with murder or manslaughter, which they then joined with a prison database. The court data allowed Day to find parolees with pending criminal cases, as opposed to only those already convicted. It also had the arrest date for each offender, crucial for tracking a parolee’s path through the criminal justice system.

Among other things, the data revealed that an Arkansas Community Correction department policy released parolees from state supervision after their sentences were finished, even if they had previously stopped reporting to their parole officers. Some of those parolees went on to commit crimes, including murder. Articles based on the data started running in the summer of 2013.

Sometimes there is no way to get the data you need, in which case reporters might have to compromise.

When Andy Boyle was an intern at the Democrat-Gazette, he requested hunting license data from the Arkansas Game and Fish Commission, but was denied licensee dates of birth. He asked for month and year of birth instead, which the agency decided he could have. That was sufficient for him to join that table with a table of felons. He discovered there were convicted felons in Arkansas with hunting licenses. The paper published Boyle’s article based on the data in August 2008.
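A join like the one Boyle ran can be sketched in a few lines of Python. This is only an illustration of the general technique, not a reconstruction of his actual method: the field names (`name`, `birth_year`, `birth_month`, `license_no`, `case_no`), the helper functions and the sample rows are all made up for the example.

```python
from collections import defaultdict


def normalize_name(name):
    """Lowercase and collapse whitespace so trivial formatting
    differences do not block a match."""
    return " ".join(name.lower().split())


def match_key(record):
    # Join key: normalized name plus birth year and month.
    # The day of birth is deliberately absent, as in the compromise above.
    return (normalize_name(record["name"]),
            record["birth_year"],
            record["birth_month"])


def join_on_partial_birth_date(licenses, felons):
    """Return (license, felon) pairs sharing a name and a birth month and year."""
    felons_by_key = defaultdict(list)
    for felon in felons:
        felons_by_key[match_key(felon)].append(felon)

    matches = []
    for lic in licenses:
        for felon in felons_by_key.get(match_key(lic), []):
            matches.append((lic, felon))
    return matches


# Hypothetical sample rows, invented for this sketch.
licenses = [
    {"name": "John  Smith", "birth_year": 1970, "birth_month": 5, "license_no": "H-001"},
    {"name": "Jane Doe", "birth_year": 1982, "birth_month": 11, "license_no": "H-002"},
]
felons = [
    {"name": "john smith", "birth_year": 1970, "birth_month": 5, "case_no": "CR-17"},
    {"name": "Jane Doe", "birth_year": 1982, "birth_month": 3, "case_no": "CR-42"},
]

candidates = join_on_partial_birth_date(licenses, felons)
for lic, felon in candidates:
    print(lic["license_no"], felon["case_no"])
```

Because the key leaves out the day of birth, two different people can share a name and a birth month. Matches from a join like this are best treated as leads to confirm through other records before publication.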

Informal agreements can also help reporters get personally identifiable information. Linda Johnson of the Lexington (Ky.) Herald-Leader requested an employee salary database for everyone working at the University of Kentucky. The university agreed to give the Herald-Leader employee gender and race information on the condition that the paper publish this information only in aggregate.

In another example of compromise, while at The San Antonio Express-News, Joe Yerardi requested a database of city employees’ salaries over multiple years. He asked for the employee ID number as well to accurately track people as they changed positions or got married. But the city said that employees used their ID numbers to buy gasoline, and the Texas Attorney General had already determined that those IDs were not a matter of public record. Yerardi then asked the city to generate a random and unique string for each employee, which they did. This allowed him to track employees through the system while allaying the city’s fears that Yerardi or his colleagues would use the ID numbers to buy gasoline. He didn’t end up using the data for any stories, though he did have it on hand if needed.
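For reporters who want to propose something similar to an agency, the idea can be sketched as follows. This is a hypothetical illustration, not the city’s actual process: the field names (`employee_id`, `tracking_id`), the function names and the sample rows are assumptions made for the example. It uses Python’s standard `secrets` module to generate the random strings.

```python
import secrets


def build_pseudonym_map(employee_ids, nbytes=8):
    """Assign each distinct real ID one random token; the same ID always
    gets the same token, and no two IDs share a token."""
    mapping = {}
    used = set()
    for emp_id in employee_ids:
        if emp_id in mapping:
            continue
        token = secrets.token_hex(nbytes)
        while token in used:  # regenerate in the unlikely event of a collision
            token = secrets.token_hex(nbytes)
        used.add(token)
        mapping[emp_id] = token
    return mapping


def pseudonymize(rows, mapping):
    """Copy each row, replacing the real ID with its random token."""
    released = []
    for row in rows:
        public_row = {k: v for k, v in row.items() if k != "employee_id"}
        public_row["tracking_id"] = mapping[row["employee_id"]]
        released.append(public_row)
    return released


# Hypothetical multi-year salary rows, invented for this sketch.
rows = [
    {"employee_id": "1001", "year": 2011, "name": "A. Jones", "salary": 41000},
    {"employee_id": "1001", "year": 2012, "name": "A. Rivera", "salary": 43500},
    {"employee_id": "1002", "year": 2012, "name": "B. Lee", "salary": 52000},
]

mapping = build_pseudonym_map(row["employee_id"] for row in rows)
released = pseudonymize(rows, mapping)
```

In the sample rows, the employee who changed names between 2011 and 2012 keeps the same `tracking_id` in both years, while the real ID never leaves the agency. The agency would hold on to the mapping so that later releases use the same tokens.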

It’s also important to keep in mind that multiple agencies could have the data you need. Halle Stockton of PublicSource, an investigative news organization in Pennsylvania, wanted to find out which employers in her state had applied to pay disabled employees below minimum wage. The state’s labor department denied her request because the records contained names of disabled employees and the department wasn’t required to redact them. Stockton appealed the denial to the Pennsylvania Open Records Office, but it ruled in the department’s favor.

At the same time she filed a request with the state, however, Stockton also filed one with the U.S. Department of Labor, which collected many of the same records. She didn’t hold out much hope because the feds aren’t known for their timely responses to records requests. Still, within a few months, stacks of records came in the mail. Although she had to create her own database based on the paper records, Stockton got what she needed: the names and addresses of the employers as well as how much they paid their disabled employees.

If you can’t get the data you need, sometimes you need to settle for what you have. In 2009 Paul Monies of The Oklahoman tried to get the names and dates of birth of all Oklahoma state employees. The request led to a suit that landed in the Oklahoma Supreme Court, which ruled that state employee dates of birth were not public record.

The Oklahoman needed the data to make sure they weren’t misidentifying people in stories. When it wasn’t available, Monies and his colleagues decided to use a publicly available database of registered voters. That database is incomplete because not all state employees are registered voters, but the paper had to make do.

Personally identifiable information is an invaluable resource for journalists, and problems getting it aren’t going away anytime soon. Even though almost every records battle is unique, as is each agency and records custodian, there are numerous techniques and strategies that, in combination with persistence, allow journalists to keep fighting the good fight.


Fedor Zarkhin is a database reporter for the Palm Beach Post. He wrote this article as part of his master’s degree project at the Missouri School of Journalism. Contact him at fzarkhin@pbpost.com.