Freelance journalist Samantha Sunne and Helena Bengtsson, data projects editor for The Guardian, spent an hour at the 2016 CAR Conference going over free, open-source tools that can replace expensive data cleaning, analysis and visualization programs.
Here are some of my favorite tools the speakers mentioned:
Google Sheets is a widely known spreadsheet tool that is easy to use and accessible from any computer, which simplifies collaboration. But its functions are not as extensive as those in Excel.
This is a very powerful tool that lets journalists perform very specific manipulations on CSV files. But because it runs in the command line, it’s not the most intuitive tool and can be difficult for non-programmers to use.
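To give a sense of the kind of CSV manipulation described above, here is a minimal sketch in Python. It does not use the tool itself; it relies only on Python's standard csv module, and the function name select_rows and the sample data are made up for illustration.

```python
import csv
import io


def select_rows(csv_text, columns, where_column, where_value):
    """Keep only `columns`, and only rows where `where_column` equals `where_value`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Values read from a CSV file are strings, so compare against a string.
        if row[where_column] == where_value:
            writer.writerow({name: row[name] for name in columns})
    return out.getvalue()


# Hypothetical sample data, for illustration only.
SAMPLE = "agency,state,requests\nParks,MO,12\nPolice,MO,40\nParks,KS,7\n"

if __name__ == "__main__":
    # Keep two columns and only the Missouri rows.
    print(select_rows(SAMPLE, ["agency", "requests"], "state", "MO"))
```

Command-line CSV tools typically bundle steps like these into single commands, which is what makes them fast once you get past the learning curve.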
SublimeText is one of the most widely used text editors among coders, including me. It recognizes file types and highlights elements in different colors. The free version keeps asking you to purchase the paid version, but hey, we’re all used to advertisements now, aren’t we?
iFOIA.org, run by The Reporters Committee for Freedom of the Press, not only generates record request letters, but it also provides journalists with a detailed analysis of state records laws. I can’t help but be biased about iFOIA because I used to update contact information for state agencies on the website when I interned for RCFP. It’s a great resource for reporters.
Evernote is a high-quality tool for taking notes, editing audio, clipping images, scanning documents and much more. The scanner is especially useful. All you need is a smartphone to take pictures with, and you can search for words in the uploaded images.
New tools I discovered in the session:
These days it’s hard to find SQL tools you don’t have to pay for, but PostgreSQL is one of them. I’ve been using MySQL because that’s how I first learned about the world of SQL. PostgreSQL is very easy to install and even has a mapping function, but the interface is quite primitive.
Overview is a powerful tool that can handle millions of documents. It not only breaks down searches but also allows multi-search functions. It visualizes search results but has limited analysis functions.
If you’ve worn out Google Fusion Tables, check out CartoDB. This tool helps us create high-quality maps with customizable functions. The downside: If you’re on the free account, you can only have five maps at once.
This tool allows us to turn spreadsheets into visualizations, including charts and maps. It works especially well with Google Sheets and can automatically update if you make changes to your data. The website can be a little buggy, but, hey, it’s free!
TimelineJS is an open-source tool developed by the Knight Lab that allows us to easily create timelines. This tool also uses Google Sheets as its source, and you can add text, images, video clips and more. While the default interface is quite extensive, it’s hard to customize. The Denver Post used TimelineJS as part of its Pulitzer-winning coverage of the Aurora theater shooting.
All these free tools are great resources for journalists, but the speakers stressed that when you use free tools provided by third parties, you sometimes have to sacrifice stability or security. Interactive visualizations that rely on third-party code can crash if anything goes wrong. Also, confidential information might not remain secure once it goes through a third-party tool like Google Sheets.
Soo Rin Kim is a University of Missouri senior studying investigative reporting and data journalism. She interned at The Reporters Committee for Freedom of the Press and currently works as a student assistant in the IRE Resource Center.