Data Fusion for data journalism: Enriching datasets with a graph database

  • Event: 2017 IRE Conference
  • Speaker: William Lyon of Neo4j
  • Date/Time: Saturday, Jun. 24 at 9:00am
  • Location: Pinnacle Peak 1
  • Audio file: No audio file available.

This hands-on workshop will show how we can use the Neo4j graph database for data journalism. Starting with an initial dataset of information about public officials and their connections to companies, we will show how to extend the data, enriching it with other public datasets including federal government contract awards, nonprofit tax filings, and corporate registry information. 

Graph databases are a tool for modeling, storing and querying complex data. The main benefits of using graph databases in data journalism are 1) an intuitive data model and query language, and 2) the ability to easily combine datasets and query across them. We will focus on the second point, showing how we can combine datasets in a graph database and ask questions of the data by querying across them using Cypher, the query language for graphs.
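
As a rough illustration of the second point, the sketch below loads two small datasets into one graph and then asks a question that spans both. It is plain Python standing in for a graph database, not Neo4j or Cypher, and every record, label, and function name in it (for example `build_graph`, `CONNECTED_TO`, `AWARDED_BY`) is a hypothetical placeholder rather than anything taken from the workshop materials.

```python
from collections import defaultdict

# Hypothetical sample records; all names and amounts are invented for illustration.
officials = [
    {"official": "Official A", "company": "Acme Corp"},
    {"official": "Official B", "company": "Globex LLC"},
]
contracts = [
    {"company": "Acme Corp", "agency": "Agency X", "amount": 250000},
    {"company": "Acme Corp", "agency": "Agency Y", "amount": 100000},
    {"company": "Initech", "agency": "Agency X", "amount": 50000},
]


def build_graph(officials, contracts):
    """Load both datasets into one adjacency map keyed by (label, name) nodes."""
    graph = defaultdict(list)  # node -> list of (relationship, target node, properties)
    for row in officials:
        person = ("Official", row["official"])
        company = ("Company", row["company"])
        graph[person].append(("CONNECTED_TO", company, {}))
    for row in contracts:
        # The same ("Company", name) key joins this dataset to the first one.
        company = ("Company", row["company"])
        agency = ("Agency", row["agency"])
        graph[company].append(("AWARDED_BY", agency, {"amount": row["amount"]}))
    return graph


def contracts_linked_to_officials(graph):
    """Follow Official -> Company -> Agency paths across the two datasets."""
    results = []
    for node, edges in list(graph.items()):
        if node[0] != "Official":
            continue
        for rel, company, _ in edges:
            if rel != "CONNECTED_TO":
                continue
            for rel2, agency, props in graph.get(company, []):
                if rel2 == "AWARDED_BY":
                    results.append((node[1], company[1], agency[1], props["amount"]))
    return results


graph = build_graph(officials, contracts)
rows = contracts_linked_to_officials(graph)
for official, company, agency, amount in rows:
    print(f"{official} -> {company} -> {agency}: {amount}")
```

The shared company node is what joins the two datasets: once both are loaded, the question "which contract awards went to companies connected to an official?" becomes a two-step path traversal. In a graph database, a traversal like this would typically be written as a single pattern-matching query instead of nested loops.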

This session is geared toward those with some basic familiarity with databases and data analysis, but will start at an introductory level for those new to graph databases.


Speaker Bios

  • William Lyon is a software developer at Neo4j, the open source graph database. He builds integrations between Neo4j and other technologies, helps users build graph applications, and leads the Neo4j Data Journalism Accelerator Program. He holds a master's degree in Computer Science from the University of Montana. You can find him online at @lyonwj.

Related Tipsheets

No tipsheets have yet been uploaded for this event.