How and why to make your data analysis reproducible
You understand how you processed your data. Does your editor? Your reader? You, in six months? Without a replicable approach to extracting, transforming and loading data, we are often frustrated in our efforts to share or update our work. Join us for a panel discussion of reproducible data workflows. We’ll talk about why we use standardized processes for collecting, cleaning and analyzing data, and share practices that work for us. We’ll also discuss strategies for smart human intervention (i.e. reporting, logging and documentation) in automated workflows.
Hannah Cushman is a wayward journalist turned software developer. She cut her teeth on public life in mid-Missouri, covering municipal economic development and elections. An alumna of the Missouri School of Journalism and a veteran of the Associated Press, Hannah remains deeply interested in how information is consumed, shared, and acted upon. https://hancush.github.io
Ryann Grochowski Jones is the data editor at ProPublica. Previously, she was deputy editor for data at ProPublica and a data reporter at inewsource in San Diego. She received her master's degree from the University of Missouri School of Journalism, where she was a data librarian for IRE/NICAR. Ryann began her career as a municipal beat reporter for her hometown newspaper in Wilkes-Barre, Pennsylvania. @ryanngro
Hannah Recht is a data journalist at Bloomberg News. She likes scraping obscure insurance filings and wrote an R package that accesses Census data. She previously worked at the Urban Institute as a researcher and data visualization developer. @hannah_recht
Jeremy Singer-Vine is the data editor at BuzzFeed News. He also publishes Data Is Plural, a weekly newsletter of useful/curious datasets. www.jsvine.com
BuzzFeed open resources
An index of all BuzzFeed's open-source data, analysis, libraries, tools, and guides. https://github.com/BuzzFeedNews/everything