If you’re tasked with covering business, you might want to check out a new feature from Sqoop. The document-mining site recently announced a new geographic search/location filter feature for SEC filings. That means that reporters can look for all SEC form types, or just a subset such as Form Ds or 8-Ks, in their state. And if you want to get even more specific, you can narrow your location even further by proximity to a specific latitude/longitude.
You can also save these location-based searches as alerts, which means you’ll get an email whenever a new document is filed that matches your criteria. That means no more searching the same sites for the same keywords every week.
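For readers curious how a proximity filter of this kind can work, here is a minimal sketch. It is not Sqoop's implementation. The record fields (`form`, `company`, `lat`, `lon`), the helper names and the sample filings are all hypothetical, and the distance is computed with the standard haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def filings_near(filings, lat, lon, radius_km):
    """Keep filings whose (hypothetical) lat/lon fields fall within radius_km of a point."""
    return [f for f in filings if haversine_km(lat, lon, f["lat"], f["lon"]) <= radius_km]


# Made-up sample records for illustration; not real SEC data
filings = [
    {"form": "8-K", "company": "Example Co A", "lat": 39.7392, "lon": -104.9903},  # Denver
    {"form": "D", "company": "Example Co B", "lat": 40.0150, "lon": -105.2705},    # Boulder
    {"form": "8-K", "company": "Example Co C", "lat": 38.8339, "lon": -104.8214},  # Colorado Springs
]

# Filings within 50 km of downtown Denver
nearby = filings_near(filings, 39.7392, -104.9903, 50)
```

The same function could be rerun on a schedule against new filings to approximate the alert behavior described above.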
You can learn more about Sqoop’s new feature on their website.
Bill Hankes of Sqoop presented at the 2016 CAR Conference. You can download his tipsheet here.
Investigative broadcast journalists across the country gathered at the 2016 CAR Conference and shared some of their secrets for bringing data to life on TV.
1. Get raw data
KXAS-TV producer Eva Parks explained how her team requested complaint and violation records concerning noise at the Dallas Love Field airport. Unfortunately, the city of Dallas responded with simplified infographics summarizing the data. She said such documents are easy to look at, but not useful for journalists. Parks stressed the importance of getting raw data and looking at the numbers just as they would appear on the computer of the public servant working with the data.
2. Look into public meetings
Jeremy Jojola of KUSA-Denver talked about how his investigative team overcame the challenge of finding people to talk to about numbers for a six-part series, Citation Nation. Analyzing 270 different town budget and revenue reports, the KUSA team found that police fines and fees make up close to or more than 50 percent of the revenue for some towns in Colorado. Not surprisingly, KUSA struggled to find public officials who would explain the numbers.
Public meetings can be useful in these cases, Jojola said. Officials often feel less ambushed and respond better to questions when they happen in a public forum. Public hearings and meetings are also a great opportunity to observe public reaction and find members of the community concerned about the issue.
3. Create your own database
Sometimes, data investigations involve creating your own database from multiple documents. That’s what Jenna Susko of KNBC-Los Angeles did when her investigative unit had to look into how a mayor had been using his personal reward cards for employee travel. Because the city didn’t have an existing electronic database of public servant travel records, KNBC had to go through paper copies of travel receipts and build an electronic database of their own.
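A hand-built database like this does not need special software. The sketch below uses Python's built-in `sqlite3` module. The table layout, column names and sample rows are assumptions made for illustration, not KNBC's actual schema or data.

```python
import sqlite3

# In-memory database for the example; pass a filename to keep it on disk
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE receipts (
        id INTEGER PRIMARY KEY,
        employee TEXT NOT NULL,
        trip_date TEXT NOT NULL,      -- ISO format, e.g. 2015-03-14
        vendor TEXT,
        amount REAL NOT NULL,
        rewards_account TEXT,         -- whose reward card earned the points
        source_page TEXT              -- where the paper receipt can be found
    )"""
)

# Rows typed in from paper receipts (made-up values)
rows = [
    ("Employee 1", "2015-03-14", "Airline X", 412.50, "personal", "box 2, p. 17"),
    ("Employee 2", "2015-04-02", "Hotel Y", 289.00, "personal", "box 2, p. 31"),
    ("Employee 3", "2015-04-09", "Airline X", 150.25, "city", "box 3, p. 4"),
]
conn.executemany(
    "INSERT INTO receipts (employee, trip_date, vendor, amount, rewards_account, source_page) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# Number of receipts and total spending, grouped by which reward account was used
totals = conn.execute(
    "SELECT rewards_account, COUNT(*), ROUND(SUM(amount), 2) "
    "FROM receipts GROUP BY rewards_account ORDER BY SUM(amount) DESC"
).fetchall()
```

Keeping a `source_page` column makes it easier to check every figure against the original paper record before publication.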
4. Look for compelling stories in unexpected places
Susko also presented her investigative team’s story on the thriving industry of school filming in Los Angeles. KNBC examined school contracts and filming permit records to see which film companies were shooting school footage, what they were using it for, how much schools were paid and how the money was being spent. Susko said teachers and parents were very hesitant to talk to the reporters, which made it difficult for them to humanize the numbers. But then the reporters happened to see three students, including one wearing a high school T-shirt, in a video clip of a red carpet event for an adult film. So the journalists looked through high school yearbooks to find the students. They later learned that the high school had been allowing a company to film a pornographic film on campus. Obviously, this became a compelling hook for the story.
5. Let people illustrate the numbers
Some stories start with a human source that leads to a request for data. Other times, speakers said, they requested records and data without a clear idea of where that information might take them. But no matter how you start your reporting, the speakers agreed that human faces are crucial to making your numbers relatable to the people in your community.
Soo Rin Kim is a University of Missouri senior studying investigative reporting and data journalism. She interned at The Reporters Committee for Freedom of the Press and currently works as a student assistant in the IRE Resource Center.
Dear Friends:
I’m sorry to have to tell you that Mark Horvit has decided to take an exciting new job at the University of Missouri and will be stepping down as IRE’s executive director at the end of the summer.
While this represents a huge loss to IRE, the board is hopeful that we’ll still get to work with Mark from time to time through the university. Mizzou has been our home and partner for 40 years, and Mark will be just down the hall training the next generation of IRE members. More details on his new position will be announced by the school shortly.
We will establish a search committee in the next couple of weeks, which will then circulate an invitation to apply for the job. In the meantime, please contact me directly with any questions or concerns.
I can’t let this moment go by without reviewing some of Mark’s accomplishments at IRE.
In the last eight years, Mark built on the strong foundation he inherited from Brant Houston, leading the organization through a deep recession and industry crises that devastated our membership and finances.
Now IRE may be the world’s largest support system for investigative reporters who are working to make a difference in their communities. Our membership stands at an all-time high of more than 5,000 world-wide with a full-time staff of 16 and an annual budget of more than $2 million. Conferences in the last two years have been the largest and most diverse in IRE’s history, thanks largely to Mark’s leadership and the entire staff, with training directed by Jaimi Dowdell and events by Stephanie Sinn.
Our international reach has been expanding. Student involvement in IRE has skyrocketed through initiatives like the Campus Coverage Project, student membership drives and more. Mark has personally trained journalists in at least 15 countries during his tenure at IRE and has continued great partnerships with our longtime foundation supporters while adding new ones. While we have a long way to go, Mark has created programs like the fellowship for Historically Black Colleges and Universities and partnerships with sister organizations like the Asian American Journalists Association to help diversify our membership and leadership.
Past executive directors have been among IRE’s biggest cheerleaders and supporters. I know Mark will join that alumni association with the same spirit.
IRE’s annual meeting in New Orleans will be the 18th national conference that he’ll have helped oversee. We’ll say a more formal farewell there. For now, I hope you’ll join me in offering Mark our gratitude and good wishes.
Sarah Cohen
President of IRE’s board of directors
sarah.cohen@nytimes.com
Freelance journalist Samantha Sunne and Helena Bengtsson, data projects editor for The Guardian, spent an hour at the 2016 CAR Conference going over free, open-source tools that can replace expensive data cleaning, analysis and visualization programs.
Here are some of my favorite tools the speakers mentioned:
1. Google Sheets
Google Sheets is a widely known spreadsheet tool that is easy to use and accessible from any computer, making collaboration easy. But Google Sheets functions are not as extensive as those in Excel.
2. csvkit
csvkit is a powerful tool that lets journalists perform very specific manipulations on CSV files, but it may not be the most user-friendly. Because it runs on the command line, it can be unintuitive and difficult for non-programmers to use.
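To give a sense of the kind of manipulation involved, here is a rough Python sketch that filters rows and keeps selected columns using only the standard library's `csv` module. It is not csvkit's interface. The `filter_rows` helper and the sample data are hypothetical.

```python
import csv
import io

# Made-up sample data standing in for a CSV file on disk
raw = """name,agency,salary
A. Smith,Police,52000
B. Jones,Parks,41000
C. Lee,Police,61000
"""


def filter_rows(text, column, value, keep):
    """Return rows where `column` equals `value`, keeping only the `keep` columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [{k: row[k] for k in keep} for row in reader if row[column] == value]


# Only police employees, with just their name and salary
police = filter_rows(raw, "agency", "Police", ["name", "salary"])
```

Note that the `csv` module reads every value as text, so numeric columns such as `salary` need converting before any arithmetic.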
3. SublimeText
SublimeText is one of the most widely used text editors among coders, including me. It recognizes file types and highlights elements in different colors. The free version keeps asking you to purchase the paid version, but hey, we’re all used to advertisements now, aren’t we?
4. iFOIA
iFOIA.org, run by The Reporters Committee for Freedom of the Press, not only generates record request letters, but it also provides journalists with a detailed analysis of state records laws. I can’t help but be biased about iFOIA because I used to update contact information for state agencies on the website when I interned for RCFP. It’s a great resource for reporters.
5. Evernote
Evernote is a high-quality tool for taking notes, editing audio, clipping images, scanning documents and much more. The scanner is especially useful. All you need is a smartphone to take pictures with, and you can search for words in the uploaded images.
New tools I discovered in the session:
1. PostgreSQL
These days it’s hard to find SQL tools you don’t have to pay for, but PostgreSQL is one of them. I’ve been using MySQL because it was how I first learned about the world of SQL. PostgreSQL is very easy to install and even has a mapping function, but the interface is quite primitive.
2. Overview
Overview is a powerful tool that can handle millions of documents. It not only breaks down searches but also allows multi-search functions. It visualizes search results but has limited analysis functions.
3. CartoDB
If you’ve worn out Google Fusion Tables, check out CartoDB. This tool helps us create high-quality maps with customizable functions. The downside: If you’re on the free account, you can only have five maps at once.
4. Silk
This tool allows us to turn spreadsheets into visualizations, including charts and maps. It works especially well with Google Sheets and can automatically update if you make changes to your data. The website can be a little buggy, but, hey, it’s free!
5. TimelineJS
TimelineJS is an open-source tool developed by the Knight Lab that allows us to easily create timelines. This tool also uses Google Sheets as its source, and you can add text, images, video clips and more. While the default interface is quite extensive, it’s hard to customize. The Denver Post used TimelineJS as part of its Pulitzer-winning coverage of the Aurora theater shooting.
All these free tools are great resources for journalists, but the speakers stressed that when you use free tools provided by third parties, you sometimes have to sacrifice stability or security. Interactive visualizations that rely on third-party code can crash if anything goes wrong. Also, confidential information might not remain secure once it goes through a third-party tool like Google Sheets.
Soo Rin Kim is a University of Missouri senior studying investigative reporting and data journalism. She interned at The Reporters Committee for Freedom of the Press and currently works as a student assistant in the IRE Resource Center.
There are certain pieces of advice you hear over and over again at our annual computer-assisted reporting conference: Get out there, take risks and experiment. So at our recent conference in Denver, we sent University of Missouri journalism student Jack Howard to do just that. On this bonus episode, you’ll hear his five-minute experiment: capturing the NICAR experience and turning it into audible data.
As always, you can find us on Soundcloud, iTunes and Stitcher. If you have a story you think we should feature on the show, drop us a note at web@ire.org. We’d love to hear from you.
EPISODE NOTES
Looking for links to the stories, resources and events we discussed on this week's podcast? We've collected them for you.
CREDITS
Jack Howard reported and produced this episode. IRE Web Editor Sarah Hutchins edits the podcast. We are recorded in the studios of KBIA at the University of Missouri.
By Raven Nichols
Training a newsroom to look for data and interactive ideas isn’t always easy. At the 2016 CAR Conference, Dana Williams of the Honolulu Star-Advertiser and Rachel Schallom of Fusion explained how to introduce these concepts to your newsroom.
The first step is to hold a training session. Here are a few suggestions:
Host these training sessions frequently. Exactly how often you run them can sometimes depend on newsroom turnover.
Some people may not understand techniques for finding data. Showing examples and offering hands-on training can help mitigate this problem. It can also be useful to pair journalists with others in the newsroom who understand data journalism techniques.
Some reporters won't know how to incorporate data into their stories. Schallom suggested reporters create a mockup storyboard in Adobe Illustrator to visualize how the text would look in an interactive. This would be beneficial if, for example, you wanted to add a slideshow to your story.
A data journalist or editor sometimes has to veto an idea if it doesn’t make sense. There is a respectful way to do this. Tell the writer why the idea doesn’t work and show him or her examples of interactive or other ideas that have worked in the past.
Remember, this process takes time. Introducing new skills to a newsroom is never easy, but keeping an open mind will help with this process.
Raven Nichols is a sophomore mass communications major at Louisiana State University. She is the entertainment and news editor for lsunow.com.
By Kouichi Shirayanagi
There are a lot of photos circulating on social media. Some photos, such as the famous sharks in a flooded mall or sharks jumping at a rescue helicopter, you know are fake. But how do you verify that photos used in the press are the real deal?
Nikon awarded a prize in an amateur photo contest to an image that turned out to be a fake. The Ted Cruz presidential campaign manufactured a photo of Sen. Marco Rubio shaking hands with President Barack Obama to show the two were friends. The photo was a stock photo with the faces altered. The Pulitzer committee awarded a prize to an entry of a photo from Syria in which a cameraman’s video camera was edited out. The Associated Press cut all ties with the photographer.
You can detect photo alteration through duplicated pixels in a photo.
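One naive way to look for duplicated pixels is to compare small blocks of the image and flag any that are exactly identical. The sketch below does this on a tiny made-up grid of grayscale values. It is only an illustration of the idea: the `find_duplicate_blocks` helper is hypothetical, real forensic tools are far more tolerant of compression and noise, and uniform areas such as clear sky would produce false matches.

```python
from collections import defaultdict


def find_duplicate_blocks(pixels, size=2):
    """Return groups of top-left (row, col) positions whose size x size blocks are identical."""
    seen = defaultdict(list)
    rows, cols = len(pixels), len(pixels[0])
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            # Turn the block into a hashable tuple so identical blocks share a key
            block = tuple(tuple(pixels[r + i][c:c + size]) for i in range(size))
            seen[block].append((r, c))
    return [positions for positions in seen.values() if len(positions) > 1]


# A made-up 4 x 6 grayscale "image" where every pixel is different...
image = [
    [10, 11, 12, 13, 14, 15],
    [20, 21, 22, 23, 24, 25],
    [30, 31, 32, 33, 34, 35],
    [40, 41, 42, 43, 44, 45],
]
# ...until the top-left 2 x 2 patch is cloned into the bottom-right corner
image[2][4], image[2][5] = image[0][0], image[0][1]
image[3][4], image[3][5] = image[1][0], image[1][1]

clones = find_duplicate_blocks(image, size=2)
```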
Google’s reverse image search and TinEye can answer the following questions: Has the image appeared before? When and where did it first appear? Has it been modified?
TinEye finds exact and altered copies of images, including those that have been cropped, color adjusted, resized, heavily edited or slightly rotated.
A lot of images were shared on social media immediately after the Malaysia Airlines crash in Ukraine. Photo stills from the TV show “Lost” were photoshopped with the Malaysia Airlines logo over them within minutes of the crash. These photos circulated widely as though they were real, but TinEye quickly detected the photos as fake.
EXIF metadata is information about camera make, date, etc., on JPEG images. Using the EXIF metadata, you can see where the photo was taken and do a Google Maps search to see if the background of the photo is realistic or not. A growing number of professional cameras have GPS data, and almost all camera phones include it.
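EXIF stores GPS coordinates as degrees, minutes and seconds plus a hemisphere letter, while most mapping tools expect decimal degrees. The conversion can be sketched as follows. The `dms_to_decimal` helper and the sample coordinates are illustrative, and reading the values out of an image file would require a separate EXIF library.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere letter (N, S, E, W) to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation
    return -value if ref in ("S", "W") else value


# Made-up example values, roughly downtown Denver
lat = dms_to_decimal(39, 44, 21.12, "N")
lon = dms_to_decimal(104, 59, 25.08, "W")
```

The resulting pair can be pasted into a mapping site's search box to compare the surroundings with the background of the photo.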
To be a verified photo, an image should pass three tests:
Someone who fakes a photo that gets through all three tests is faking at a very high level. Tools such as those mentioned above are especially helpful for verifying photos that come from eyewitnesses who are the first at the scene of a news event.
Kouichi Shirayanagi is a graduate student at the Missouri School of Journalism. Last summer he worked at the business page of the St. Louis Post-Dispatch.
By Maggie Angst
For journalists working with audio or video, it can sometimes be challenging to find the best way to display data in a story.
Joe Wertz, an environment and energy reporter for StateImpact Oklahoma, emphasized that although we really want a character to tell a story, sometimes the data can be the character.
For example, Wertz and Michael Corey, senior news applications developer for Reveal, worked on a data story about the increase in Oklahoma earthquakes.
In 2014, Oklahoma had nearly three times as many earthquakes as California. The reason, they found, had to do with the oil and gas boom. Unfamiliar with radio at first, Corey wondered, “How are we going to make the data felt on the radio?”
To tell the story, Corey worked with a sound engineer and ran the earthquake waves through synthesizers to create a beep whenever there was an earthquake. Using the Musical Instrument Digital Interface (MIDI) standard, Corey was able to map a set of musical notes to the time and magnitude of each earthquake.
“Just like other visualizations, you need to ask yourself, is this conveying it clearly?” Corey said. “Follow the rules you follow for visual charting, which usually come down to taking more out.”
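A mapping from earthquakes to notes can be sketched in a few lines. This is not the mapping Corey used. The function name, the choice of higher pitch for larger quakes, the note range and the sample events are all assumptions. The only fixed fact is that MIDI note numbers run from 0 to 127, with 60 as middle C.

```python
def quakes_to_notes(quakes, seconds_per_day=0.5, low_note=36, high_note=96,
                    min_mag=2.0, max_mag=6.0):
    """Map (day_offset, magnitude) pairs to (onset_seconds, midi_note) pairs."""
    notes = []
    for day, mag in quakes:
        # Clamp the magnitude to the expected range, then scale it to 0..1
        clamped = max(min_mag, min(max_mag, mag))
        frac = (clamped - min_mag) / (max_mag - min_mag)
        # Assumption: bigger quakes get higher notes
        note = low_note + round(frac * (high_note - low_note))
        # Compress calendar time so a year of data plays in a few minutes
        notes.append((day * seconds_per_day, note))
    return notes


# Made-up events: (days since start of the dataset, magnitude)
sample_quakes = [(0, 2.0), (3, 4.0), (10, 6.0)]
notes = quakes_to_notes(sample_quakes)
```

The resulting list of onsets and note numbers could then be handed to a MIDI library or a sound engineer to render as audio.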
Not all stories make for a good video, but Kavya Sukumar, a developer at Vox Media, provided some examples of ones that work well:
- Long-form stories that allow you to compress the whole narrative into a visual format
- A concept explainer for complex issues
- Promo videos for a big series that’s just starting to run
Journalists have several ways to make these videos:
- Overlays
- Hand-drawn, which means you don’t need design or programming skills
- Keyframe animation, which can be made using software like Adobe After Effects
Nancy Watzman, managing editor of the Political TV Ad Archive for the Internet Archive, also showed the audience how to use a new tool in order to enhance their video stories.
The Political TV Ad Archive captures political ads that journalists can download and embed into their stories. Reporters can also download a spreadsheet of data including:
- Which ads are airing in a certain area
- Which ads are airing the most
- Which TV programs are targeted
- Who is sponsoring the ad
“We have all this data, and we are just starting to scratch the surface of what you can do with it,” Watzman said.
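As a small illustration of what can be done with a spreadsheet like this, the sketch below counts which ads aired most often in one market. The column names (`ad_id`, `sponsor`, `market`, `program`) and the rows are hypothetical and may not match the archive's actual download.

```python
from collections import Counter

# Made-up rows standing in for the downloaded spreadsheet, one row per airing
airings = [
    {"ad_id": "ad-001", "sponsor": "Committee A", "market": "Denver", "program": "Evening News"},
    {"ad_id": "ad-001", "sponsor": "Committee A", "market": "Denver", "program": "Morning Show"},
    {"ad_id": "ad-001", "sponsor": "Committee A", "market": "Denver", "program": "Evening News"},
    {"ad_id": "ad-002", "sponsor": "Committee B", "market": "Denver", "program": "Evening News"},
    {"ad_id": "ad-002", "sponsor": "Committee B", "market": "Denver", "program": "Late Show"},
    {"ad_id": "ad-003", "sponsor": "Committee C", "market": "Tampa", "program": "Evening News"},
]


def top_ads(rows, market=None, n=3):
    """Count airings per ad, optionally limited to one market, most frequent first."""
    counts = Counter(r["ad_id"] for r in rows if market is None or r["market"] == market)
    return counts.most_common(n)


denver_top = top_ads(airings, market="Denver")
```

Swapping `ad_id` for `sponsor` or `program` in the counter answers the other questions in the list above.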
In the end, all four panelists, like Watzman, emphasized the importance of experimenting with new things and figuring out the most engaging video or audio technique to use in order to tell a story.
Maggie Angst is a senior studying watchdog convergence journalism at the University of Missouri.
By Brittany Crocker
Big data has changed the way we look at housing. We no longer have to view a house through the lens of a real estate agent.
Zillow has a database of 110 million homes that includes data for buyers and sellers, renters, homeowners, real estate agents, property managers, mortgage providers…and reporters.
Skylar Olsen, a senior economist at Zillow, showed CAR Conference attendees how to navigate the site and how different datasets and statistics can be used to develop stories.
Olsen said Zillow data is a good way to look at inequality and livelihood. Housing reporters can show where home values are rising the fastest. A business reporter could examine how those home values change in relation to a company moving to or leaving a town. A school reporter could see if test scores vary in relation to home values.
The Wall Street Journal used Zillow data to highlight how different demographics are recovering from the housing bust.
Features reporters can offer consumers tips on buying or renting and point out neat patterns, like the most popular street names in each state.
Reporting with housing data can be tricky though, because different websites use different methodologies and data sources. So, the metrics we choose can tell completely different stories.
The most obvious example would be median rental values from Zillow compared with standard median listed rent indexes.
A median listed rent index generally lists median rent for what is available, or listed. Zillow’s median rental value index adjusts for the type of housing and controls for availability variables by using estimates for both listed and unlisted housing.
Zillow does this to capture what is happening to rental values over time, instead of a single point.
Olsen said they’ve found a lot of other interesting patterns over time using similar methodological approaches.
For instance, Olsen said they’ve found home appreciation values vary widely across the country. By looking into aggressively appreciating metros, reporters can avoid over-leveraging housing bubble and bust cycles.
Zillow has also found that different levels of rent burden that fall under 30 percent still show the same savings rate. But for people with high rent burdens, the savings rate is more erratic. Zillow data can be combined with census or American Community Survey data to tell a more in-depth story.
Olsen said Zillow data indicates that the affordability crisis is an urban one, and that the first thing to go for a person (or family) living in a rent crisis is dental care.
That’s a story right there, but Zillow data isn’t limited to in-depth enterprise stories. The company also provides research briefs and data analysis as well as local market overviews to supplement a shorter or breaking news story.
They also have a media room where you can subscribe to get research highlights twice a week. A press team also offers custom analysis for more specific stories and answers general inquiries. And, it’s free.
So whether you’re just getting your feet wet in data reporting or you’re a veteran data sleuth, Zillow’s housing database is one to put in the toolbox.
Brittany Crocker is a graduate student in investigative and computer assisted reporting at the Missouri School of Journalism.
By Jinghong Chen
During a session at the 2016 CAR Conference, Kevin Crowe of the Milwaukee Journal Sentinel and Jamie Grey of KOMU-Columbia shared tips and data projects that journalists everywhere can try.
1. Infrastructure
Crowe said that people in lots of places, especially aging cities, are worried about potholes, water main breaks and pavement quality.
The basic information needed for this kind of reporting includes locations, date reported, date repaired, political district and census tract. Crowe also recommended keeping an eye on data from surrounding areas and benchmarks, such as how fast the city thinks it should repair potholes and how often it meets this goal.
You can get basic information from a city’s public works department. For pavement quality data, go to your city, county or state and ask for a database with grades for each section of the street.
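Once you have the reported and repaired dates, checking a city against its own benchmark takes only a few lines. This is a minimal sketch. The field names, the three-day goal and the sample requests are made up for illustration.

```python
import statistics
from datetime import date


def repair_stats(requests, goal_days):
    """Summarize repair times for requests with 'reported' and 'repaired' dates.

    'repaired' is None for requests that are still open.
    """
    durations = [(r["repaired"] - r["reported"]).days
                 for r in requests if r["repaired"] is not None]
    if not durations:
        return {"closed": 0, "open": len(requests), "median_days": None, "share_on_time": None}
    on_time = sum(1 for d in durations if d <= goal_days)
    return {
        "closed": len(durations),
        "open": len(requests) - len(durations),
        "median_days": statistics.median(durations),
        "share_on_time": on_time / len(durations),
    }


# Made-up pothole requests
requests = [
    {"reported": date(2016, 1, 4), "repaired": date(2016, 1, 6)},
    {"reported": date(2016, 1, 10), "repaired": date(2016, 1, 20)},
    {"reported": date(2016, 2, 1), "repaired": date(2016, 2, 4)},
    {"reported": date(2016, 2, 15), "repaired": None},
]

# Assumed city goal: repair within three days of the report
stats = repair_stats(requests, goal_days=3)
```

Grouping the requests by political district or census tract before calling the function would show whether some neighborhoods wait longer than others.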
2. Delinquent taxes
The panelists suggested looking into people and organizations and whether or not they still owe taxes. For example, does the state have people on its payroll who owe the government money?
- This can be a huge issue in any city or county stuck in a recession.
- How good is the government about collecting taxes?
- Check out who’s getting fined for what.
- What happens if people don’t pay?
- Use ordinance violation fines to track down potential slumlords.
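The payroll question above comes down to matching two lists. Here is a rough sketch that joins a payroll file to a delinquent-tax list on a normalized name. The field names and records are hypothetical, and name-only matches are leads rather than findings, since different people can share a name.

```python
def normalize(name):
    """Uppercase a name and strip commas, periods and extra spaces."""
    return " ".join(name.upper().replace(",", " ").replace(".", " ").split())


def match_payroll_to_delinquents(payroll, delinquents):
    """Return payroll entries whose normalized name appears on the delinquent list."""
    owed_by_name = {}
    for d in delinquents:
        key = normalize(d["name"])
        # Add up multiple delinquent accounts under the same name
        owed_by_name[key] = owed_by_name.get(key, 0) + d["amount_owed"]
    return [
        {"name": p["name"], "agency": p["agency"],
         "amount_owed": owed_by_name[normalize(p["name"])]}
        for p in payroll
        if normalize(p["name"]) in owed_by_name
    ]


# Made-up records for illustration
payroll = [
    {"name": "Jane Q. Public", "agency": "Parks"},
    {"name": "John Doe", "agency": "Police"},
]
delinquents = [
    {"name": "JANE Q PUBLIC", "amount_owed": 1200.0},
    {"name": "JANE Q PUBLIC", "amount_owed": 300.5},
    {"name": "Richard Roe", "amount_owed": 50.0},
]

matches = match_payroll_to_delinquents(payroll, delinquents)
```

Each match should be confirmed with a second identifier, such as an address or date of birth, before it goes into a story.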
3. Airport usage
Ask if your local airport is being used as much as the Chamber of Commerce says it is. If there’s talk of expansion, ask if it’s practical or a necessity. You can get data from the U.S. Department of Transportation’s bureau of transportation statistics.
4. Schools: Cafeteria inspections and standardized tests
Do your schools have dirty cafeterias? Are they being inspected? Panelists suggested checking with your state’s health department.
For standardized tests, look into whether or not students or teachers are cheating. You can get standardized test discrepancy data from the U.S. Department of Education.
5. Snow days
How often do your school officials cancel classes? Is that number going up or down? You can compare the snow days in your school district with National Weather Service data about snowfall.
6. Claims and lawsuits
Look for the forms people have to file with the government before they can sue. You can also get claims, lawsuits and expense data. The data can be found at your city’s risk management division, which is usually housed under the finance department.
7. Other ideas:
Jinghong Chen is a graduate student at the Missouri School of Journalism focusing on data and international journalism.