
News Apps: Where Code Meets Copy

A new specialty is emerging in newsrooms that’s giving data new reach. It’s separate from computer-assisted reporting (CAR) but shares much of the same DNA. Like CAR, it involves working intensively with data, but the end product is a web-based software application, not a story. A formal name for this field hasn’t completely gelled, but at ProPublica and some other newsrooms we call these products “news applications.”

What are news applications? How do they relate to CAR? How can CAR nerds work with news apps nerds?

Simply put, news applications are journalism done with software development, much like photojournalism ...

Read more ...

Tech Tip: Data Access With Python tuples

NOTE: This tutorial assumes a basic familiarity with the Python language and interpreter, an interactive environment for running code. It also requires Python 2.6 or later.

As a database editor, my daily routine often involves pulling data from spreadsheets and databases and processing it for a variety of other tasks: automated email alerts, visualizations, mashups with data from other sources.

The moment data leaves the confines of Microsoft Excel spreadsheets or a relational database manager, you lose all those handy column headers that help sort, filter and otherwise make sense of information. This recipe shows how, using Python, you ...

Read more ...
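
The teaser above stops before the recipe itself, so the following is only a minimal sketch of one way to keep column headers attached to rows once data has left a spreadsheet. It assumes the technique is collections.namedtuple, which was added to the standard library in Python 2.6; that is a guess based on the title and the version note, not the author's confirmed method. The sample data and the read_rows helper are hypothetical, and the code is written for Python 3.

```python
import csv
import io
from collections import namedtuple

# Hypothetical sample data standing in for a spreadsheet export.
SAMPLE_CSV = """name,city,amount
Acme Corp,Des Moines,1200
Widget Inc,Ames,850
"""


def read_rows(csv_file):
    """Yield each data row as a namedtuple whose fields come from the header row."""
    reader = csv.reader(csv_file)
    headers = next(reader)            # first line holds the column names
    Row = namedtuple("Row", headers)  # headers must be valid Python identifiers
    for values in reader:
        yield Row(*values)


rows = list(read_rows(io.StringIO(SAMPLE_CSV)))
for row in rows:
    # Fields are reachable by name instead of by position.
    print(row.name, row.city, row.amount)
```

Because each row is still a tuple, it can be unpacked or indexed by position as usual. Values stay as strings, so numeric columns such as amount would need an explicit int() or float() conversion.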

Web maps localize Iowa air pollution story

Des Moines Register reporters Chase Davis and Perry Beeman spent months compiling and making sense of data for a series on air pollution in Iowa. But, with more than 1,600 polluting facilities across the state, there simply wasn’t space in the stories to mention any but the most noteworthy. That’s where data editor James Wilkerson and digital projects editor Michael Corey came in. They developed an interactive map that allowed users to see information about the facilities near them. "It localized the story to basically every community in Iowa," Davis said of the map. It also gave ...

Read more ...

Homemade database boosts disciplined nurses probe

California's Board of Registered Nursing oversees more licensees, some 350,000, than any state nursing agency in the country. It is responsible for ensuring that nurses at patients' bedsides are not only competent, but sober, sane and law-abiding. So when we became suspicious that the board was fumbling its duties, leaving members of the public at risk, we wanted to ground our reporting in more than anecdotes, although those were rich and plentiful. Figuring out how to do this proved both time-consuming and hugely rewarding.

We first became interested in the board after we spent much of 2003 and ...

Read more ...

Senate Votes in XML

One of my personal annoyances came to a quiet end last week, when the U.S. Senate decided to begin publishing vote information in XML rather than the HTML that had been its format for years. The House, usually the institutionally more nimble of the chambers, began publishing vote information in XML back in 2003 (view the source on this page to see an example). Here's a Senate vote - it has information on the date and time of the vote, plus all of the individual positions. This makes it easier to parse the information into a spreadsheet or database ...

Read more ...
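
To illustrate why XML is easier to load into a spreadsheet or database, here is a rough sketch using Python's standard xml.etree.ElementTree module. The sample document is made up, and its element names (roll_call_vote, vote_date, member, vote_cast and so on) are assumptions for illustration rather than a verified copy of the Senate's schema.

```python
import xml.etree.ElementTree as ET

# Made-up sample; element names are assumptions, not a verified Senate schema.
SAMPLE_XML = """<roll_call_vote>
  <vote_date>January 1, 2000</vote_date>
  <members>
    <member>
      <last_name>Smith</last_name>
      <state>IA</state>
      <vote_cast>Yea</vote_cast>
    </member>
    <member>
      <last_name>Jones</last_name>
      <state>CA</state>
      <vote_cast>Nay</vote_cast>
    </member>
  </members>
</roll_call_vote>"""


def parse_votes(xml_text):
    """Return one (date, last name, state, position) tuple per member."""
    root = ET.fromstring(xml_text)
    vote_date = root.findtext("vote_date")
    rows = []
    for member in root.iter("member"):
        rows.append((
            vote_date,
            member.findtext("last_name"),
            member.findtext("state"),
            member.findtext("vote_cast"),
        ))
    return rows


votes = parse_votes(SAMPLE_XML)
for vote in votes:
    print(vote)
```

Each tuple maps directly onto a spreadsheet row or a database insert, with no pattern matching against page markup.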

Scraping vs. Parsing

I almost never do any Web scraping any more. That's not because it's not useful - it's one of the most powerful tools I've picked up - but because when it comes to making data out of HTML, scraping by matching patterns of characters doesn't always make a lot of sense. So instead of relying strictly on regular expressions to match patterns within a Web page, I now use HTML parsers to locate and extract information. The difference is precision. Often when I use regexes, I spend too much time testing out lengthy and complex regular expressions ...

Read more ...
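
As a sketch of the parsing approach described above, the example below pulls table cells out of HTML with the standard library's html.parser module instead of a regular expression. The teaser does not say which parser the author uses, so this choice is an assumption, and the sample table and the TableCellParser class are hypothetical.

```python
from html.parser import HTMLParser

# Made-up sample page fragment.
SAMPLE_HTML = """<table>
<tr><td>Plant A</td><td>Polk</td></tr>
<tr><td>Plant B</td><td>Story</td></tr>
</table>"""


class TableCellParser(HTMLParser):
    """Collect the text of every <td>, grouped by the <tr> that contains it."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self._in_cell = False
            self._row[-1] = self._row[-1].strip()
        elif tag == "tr" and self._row is not None:
            self._in_cell = False
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self._row[-1] += data


parser = TableCellParser()
parser.feed(SAMPLE_HTML)
parser.close()
for row in parser.rows:
    print(row)
```

The parser reacts to the document's structure (rows and cells) rather than to character patterns, so changes in whitespace or attribute order do not break the extraction.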

Programming 101: The loop

Since I've taken on the label of journalist-programmer, I've noticed two things: Journalists think programming is a lot more complicated than it is, and the amount of actual programming we use is very, very minimal. Most of the programming I do is really just variations on the same few themes. If you've been doing CAR for any length of time, this should sound familiar to you. We use a fraction of the tool, and what we do really isn't that hard or complicated. So, in that spirit, let's look at one of the most basic ...

Read more ...
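
For readers who have not seen one, a loop can be sketched in a few lines. The teaser does not show which language the full article uses, so Python here is an assumption, and the records are made up for illustration.

```python
# Made-up records: (county, number of facilities).
records = [("Polk", 12), ("Story", 7), ("Linn", 9)]

total = 0
for county, count in records:
    # The indented body runs once for each record in the list.
    total += count
    print(county, count)

print("Total:", total)
```

The same pattern (take a collection, do one thing to each item) covers most day-to-day data work, whether the items are spreadsheet rows, files in a folder or pages on a website.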