Archiving data journalism

  • Event: 2018 CAR Conference
  • Speakers: Katherine Boss of New York University; Meredith Broussard of New York University; Nora Paul of University of Minnesota; Ben Welsh of Los Angeles Times
  • Date/Time: Sunday, Mar. 11 at 11:30am
  • Location: Addison

Remember that story you read online in 2005, the one with the cool Flash graphics? How about that amazing interactive data visualization that you saw way back when, the one that made you want to level up your news nerd game? Good luck finding those stories today. Data journalism is disappearing from the web. 

Data journalism is more fragile than most people realize. Every time a news organization reorganizes its staff or updates its CMS or stops paying the bill for the data team’s servers, complex data journalism projects are lost. Conventional archiving methods, like the Internet Archive’s crawlers or the automated archiving feeds of companies like Lexis-Nexis, are no longer sufficient to capture projects that involve big data, databases, streaming data or interactive graphics.

In this session, we’ll discuss why data journalism is the new digital ephemera, and we’ll explore the state of the art for archiving. We’ll talk about strategies data journalists can use to preserve their own work and how news organizations can better preserve their valuable digital assets. Finally, we’ll report on how journalists, librarians and scholars are thinking about future-proofing the news.

Speaker Bios

  • Katherine Boss is the Librarian for Journalism, Media, Culture and Communication at New York University. Her research focuses on the challenges of archiving data journalism. She is currently part of a team working on a grant-funded project to build an emulation-based web archiving tool. Follow her on Twitter @katy_boss.

  • Meredith Broussard is an assistant professor at the Arthur L. Carter Journalism Institute of New York University and the author of Artificial Unintelligence: How Computers Misunderstand the World. Her research focuses on artificial intelligence in investigative reporting, with a particular interest in using data analysis for social good. Follow her on Twitter @merbroussard.

  • Nora Paul is co-author of Future-Proofing the News: Preserving the First Draft of History. She is the former director of the Minnesota Journalism Center at the University of Minnesota, where she also taught classes on information strategies. She was previously a faculty member at the Poynter Institute and, before that, ran the news research library at the Miami Herald. She is now blissfully retired but happy to share her perspective on the archiving panel.

  • Ben Welsh is the editor of the Data and Graphics Department in the Los Angeles Times newsroom, a team of reporters and computer programmers who work together to collect, organize, analyze and present large amounts of information. He is also a co-founder of the California Civic Data Coalition, a network of journalists and computer programmers dedicated to opening up public data, and the leader of PastPages, an open-source effort to better archive digital news. Follow him on Twitter @palewire.
