Skip to content

By Nakylah Carter, IRE & NICAR

This edition of “Show Your Work” compiles five essential data journalism resources from Ben Welsh’s website, palewire. Welsh, a longtime IRE member, frequently speaks at IRE and NICAR conferences. His website offers transparent guidance for others to build their data journalism skills, while also providing tips and tricks for tackling some of data journalists’ most prevalent problems. 

A photo of Ben Welsh with a light blue background.
Ben Welsh

Born in Iowa, Welsh is a New York City-based reporter, editor and computer programmer working at Reuters. At Reuters, Welsh founded the organization’s News Application Desk, where he leads “the development of dashboards, databases and automated systems that benefit clients, inform readers, empower reporters and serve the public interest.” Prior to joining Reuters, Welsh spent 15 years at the Los Angeles Times where he contributed to a Pulitzer Prize-winning project and co-founded the Time’s first digitally-focused projects team and went on to lead the modernization of the newspaper’s graphics department

“In those roles, I helped to create the most popular pages in the history of latimes.com, including the site’s first live election results, a mapping platform that set a new standard for defining L.A. neighborhoods, custom designs for dozens of flagship projects, an award-winning wildfire tracker and the most complete resource on the spread of COVID-19 in California,” he wrote on his website. 

His coding skills are robust, with experience in contributing to open-source software projects, including Project Jupyter, Django, IPython, Observable, pandas and Altair, and teaching experience at numerous universities. Pulled from Welsh’s palewire website, the following resources come recommended by journalists on IRE’s training team. All are free and open for public use.

First Python Notebook

In one of his most popular talks, Welsh details a step-by-step guide to analyzing data with Python and Juptyer notebooks. This guide contains information about the Python computer language that can be understood by beginners, including how to read, filter, join, group, aggregate and rank structured data with pandas. This guide also includes information on how to record, remix and republish work using Project Jupyter and how to explore data using the Altair Python package for generating charts.  

This course was first developed by Welsh for an October 2016 watchdog workshop organized by IRE at San Diego State University’s School of Journalism and Media Studies.

First Visual Story

This tutorial serves as a guide to publishing a standalone visual story from a dataset that can translate to large audiences. The guide will provide hands-on experience to journalists to transcend content management systems to publish data stories on deadline and explore writing JavaScript, HTML and CSS within a Node.js static-site framework. 

This guide was first prepared by Dana Amihere, Armand Emamdjomeh and Welsh for a training session at the 2018 NICAR Conference in Chicago.

First GitHub Scraper

In this guide, Welsh offers a step-by-step introduction to web scraping with GitHub’s free Actions feature. In this guide, you will learn how to create a GitHub repository to store your code, use Python to scrape data from the web, configure GitHub Actions to schedule the scrape, automatically save the results to the repository, and send a Slack notification when new data arrive.

This guide was prepared for a training session at the 2022 NICAR Conference in Atlanta. The authors are Iris Lee, Aadit Tambe and Welsh. The tutorial is published as open-source software.

First Automated Chart

This guide informs journalists on how to use Python and the Datawrapper API to create numerous charts for data-driven stories. This tutorial covers creating a key that allows you to edit charts using the Datawrapper API, creating a chart with the Python datawrapper library, writing a template function that can create a chart for each item in a list and how to regularly update charts on a schedule.

This guide was prepared by Welsh and Sergio Sanchez Zavala for a training session at the 2024 NICAR conference in Baltimore. Some of the copy was written with the assistance of GitHub’s Copilot, an AI-powered text generator. The materials are available as free and open source on GitHub.

First LLM Classifier

This guide will help you gain hands-on experience with creating a large-language model (LLM) that can read and categorize datasets. Journalists will learn how to submit large-language model prompts with the Python programming language, write structured prompts that can classify text into predefined categories, submit dozens of prompts at once as part of an automated routine, evaluate results using a rigorous, scientific approach, and improve results by training the model with rules and examples.

Welsh and Derek Willis prepared this guide for a training session at the 2025 NICAR conference in Minneapolis. Some of the copy was written with the assistance of GitHub’s Copilot, an AI-powered text generator. The materials are available as free and open source on GitHub.

In his own words

For more information about Welsh’s path into data journalism, be sure to listen to his appearance on the IRE Radio Podcast in May 2025. 

Scroll To Top