Many tasks in investigative journalism boil down to classification problems. Is my police department cooking its crime stats by assigning incident reports to the wrong categories? Of the thousands of planes in the air each day, which ones might be involved in government surveillance? How can we identify political ads on Facebook?
Drawing on examples including the LA Times' investigation into the LAPD's misclassification of violent crimes, BuzzFeed News' identification of spy planes operating in U.S. airspace, and ProPublica's tracking of political ads on Facebook, we'll consider practical questions such as: I'm a reporter, not a data scientist, so what's in it for me? What types of stories or reporting tasks can machine learning help with? When is machine learning *not* the answer? Which algorithm should I choose? How can I structure my data to give the algorithm more to work with?