Some imaginative election "gaming" from USC and the Annenberg Center
Jun 19th, 2007 by JTJ

From All Points Blog

Monday, June 18, 2007


The Redistricting Game

University of Southern California students developed the online game for the Annenberg Center for Communication to teach about the challenges (and partisanship) of redistricting. Along the way, players learn that keeping their candidates in office may require them to confront ethical issues. The game is Flash-based.

From the game's site: "The Redistricting Game is designed to educate, engage, and empower citizens around the issue of political redistricting. Currently, the political system in most states allows the state legislators themselves to draw the lines. This system is subject to a wide range of abuses and manipulations that encourage incumbents to draw districts which protect their seats rather than risk an open contest."


 

NYT needs to install a "math checker" on every copy editor's desk
May 27th, 2007 by JTJ

This weekend, friend-of-the-IAJ Joe Traub sent the following to the editor of the New York Times.  Here's the story Joe is talking about: “White House….

To the Editor:

The headline on page 1 on May 26 states "White House Said to Debate '08 Cut in Troops by 50%." The article reports a possible reduction to 100,000 troops from 146,000. That's 31.5%, not 50%. NPR's Morning Edition picked up the story from the NYT and also reported 50% erroneously.

Joseph F. Traub
The writer is a Professor of Computer Science at Columbia University.

The headline error is bad enough (it's only in the hed, not in the story) — and should be a huge embarrassment to the NYT.  But the error gets compounded because, while the Times no longer sets the agenda for the national discussion, it is still thought of (by most?) as the paper of record.  Consequently, as other colleagues have pointed out, the reduction percentage gets picked up by other journalists who don't bother to do the math (or who cannot do the math).
 
See, for example:
* CBS News — "Troop Retreat In '08?" — (This video has a shot of the NYT story even though the percentage is not mentioned.  Could it be that the TV folks don't think viewers can do the arithmetic?)
(NB: We could not yet find on the NPR site the transcript of the radio story that picked up the 50 percent error.  But run a Google search with “cut in Troops by 50%” and note the huge number of bloggers who also went with the story without doing the math.)

Colleague Steve Doig has queried the reporter of the piece, David Sanger, asking if the mistake is that of the NYT or the White House.  No answer yet received, but Doig later commented: "Sanger's story did talk about reducing brigades from 20 to 10. That's how they'll justify the '50% reduction' headline, I guess, despite the clear reference higher up to cutting 146,000 troops to 100,000."
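For the record, both numbers are easy to check; here is a minimal sketch of the arithmetic (plain Python, our own illustration, not anything from the Times or the White House):

    # Percent reduction implied by the troop figures in the story
    old_troops, new_troops = 146_000, 100_000
    troop_cut = (old_troops - new_troops) / old_troops
    print(f"Troop reduction: {troop_cut:.1%}")      # -> 31.5%

    # Percent reduction implied by the brigade figures Doig cites
    old_brigades, new_brigades = 20, 10
    brigade_cut = (old_brigades - new_brigades) / old_brigades
    print(f"Brigade reduction: {brigade_cut:.1%}")  # -> 50.0%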

Either way, it is a serious blunder of a fundamental sort on an issue most grave.  It should have been caught, but then most journalists are WORD people and only word people, we guess.

We would also point out the illogical construction that the NYT uses consistently in relaying statistical change over time.  To wit: “… could lower troop levels by the midst of the 2008 presidential election to roughly 100,000, from about 146,000…”  We wince. 

English is read from left to right.  Most English calendars and horizontal timelines are read from left to right.  When writing about statistical change, the same convention should be followed: the oldest dates and data precede the newest or future dates and data.  The sentence would therefore be better written: "…could lower troop levels from about 146,000 to roughly 100,000 by the midst of the 2008 presidential election."


A semi-"by the numbers" tutorial on data visualization
Feb 14th, 2007 by JTJ

Juan C. Dürsteler, in Barcelona, Spain, edits a fine online magazine devoted to information graphics.  The current issue describes "…the diagram for the process of Information Visualisation as seen by Yuri Engelhardt and the author after a series of discussions about its nature and the process that leads from Data to Understanding."

And it is available in English and Spanish.  Check out http://www.infovis.net/printMag.php?num=187&lang=2



Hey, bunky, you say you need a story for tomorrow, and the well is dry
Jan 2nd, 2007 by JTJ

No story?  Then check out Swivel, a web site rich with data — and displays of data — that you didn't know about and that is pregnant with possibilities for a good news feature, often one that could be localized.

Here, for example, is a posting from the SECRECY REPORT CARD 2005 illustrating the changing trends in the classification and de-classification of U.S. government data.  (You can probably guess the direction of the curves.)

Spotlight: "What is the US Government Not Telling Us?"

The number of classified documents is steadily increasing, while the number of pages being declassified is dwindling. These data were uploaded by mcroydon.



The Quick and the Dead
Nov 9th, 2006 by JTJ

Paul Parker, of the Providence (Rhode Island) Journal, is the Quick, and an impressive list of folks on the state's voter registration rolls are the Dead this week.  Below is a note Parker posted to the NICAR-L listserv.  The great thing about this is the recipe Parker provides for an analytic journalist's cookbook.  Said he:

Nothing new or innovative, but we ran a dead voters story today, and
it's getting tons of buzz. I would recommend — no, URGE — everyone on
the list do the same for your area.

Here's the link:
http://www.projo.com/extra/election/content/deadvoters9_11-09-06_DN2P2GR.33b46ef.html

I know it's CAR101, but I'll outline how we did it (which is also
explained in the story):

1. Get your state's central voter registration database.
2. Get your state slice of the Social Security Administration's Death
Master File from IRE/NICAR.
3. Run a match on First Name, Last Name and Date of Birth.
4. Exclude matches where middle initials conflict. (Allow P=PETER or
P=NULL, but not P=G.)
5. Calculate a per capita rate for each city/town by dividing the number
of dead people by the total registered.
6. Interview the biggest offenders about why they're the biggest offenders.

This was so easy, and now everyone at the paper thinks I'm some sort of
journalism deity. (And the voter registration people called to ask,
"Where do I get a copy of that Social Security list?")

As for the possibility of false positives, we pointed this out in the
story, which I think sufficed because the odds are low enough. I also
hand checked a few against our obituary archives.


Paul Parker
Reporter
The Providence Journal
75 Fountain Street
Providence, RI 02902
401-277-7360
pparker@projo.com
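For readers who want to try Parker's recipe on their own state's files, here is a minimal sketch of steps 1 through 5 in Python with pandas. The file names, column names, and middle-initial logic are our assumptions, not Parker's or NICAR's; adjust them to however your state structures its voter rolls.

    import pandas as pd

    # Steps 1-2: the state voter file and the state slice of the SSA Death Master File.
    # File and column names here are hypothetical.
    voters = pd.read_csv("voters.csv")      # columns: first, last, dob, middle, city
    deaths = pd.read_csv("ssa_deaths.csv")  # columns: first, last, dob, middle

    # Step 3: match on first name, last name, and date of birth.
    matches = voters.merge(deaths, on=["first", "last", "dob"], suffixes=("_v", "_d"))

    # Step 4: drop matches whose middle initials conflict
    # (P vs. PETER or P vs. NULL is fine; P vs. G is not).
    def initials_agree(a, b):
        if pd.isna(a) or pd.isna(b):
            return True
        return str(a)[0].upper() == str(b)[0].upper()

    keep = [initials_agree(a, b) for a, b in zip(matches["middle_v"], matches["middle_d"])]
    matches = matches[keep]

    # Step 5: per-capita rate by city/town -- dead registrants divided by total registered.
    dead_by_city = matches.groupby("city").size()
    total_by_city = voters.groupby("city").size()
    rate = (dead_by_city / total_by_city).fillna(0).sort_values(ascending=False)
    print(rate.head(10))  # the "biggest offenders" to interview for step 6

As Parker notes, hand-checking a sample of matches against obituaries (or, as David Heath suggests below, matching on address as well) is the essential guard against false positives.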

Then David Heath, at the Seattle Times, layered in his experience.  Said he:

We did a dead-voter story last year after a squeaker of a governor's race.

Our story looked for dead people actually voting. At first, we were surprised by the number of matches. But very few of them withstood scrutiny. Matching a name and a birthdate will get you lots of false matches. You really need to include address, which you can do in our state where the death-certificate database is public.

We then went to the county election board and got the actual page voters signed when they voted. We even looked at absentee ballots. What we discovered were a lot of cases where a vote was recorded for a person because someone else accidentally signed the wrong line on the page — John R. Smith signing on John P. Smith's line, for example. Or cases where the person scanning the data with a bar-code reader into the database missed and scanned the wrong line. We also found cases where parents and children had the same name; the parent died but the son or daughter was mistakenly scrubbed from the registry.

We did find a few cases of dead people voting. Usually it was a recent death and someone in the family turned in an absentee ballot and forged the signature. But you have to be careful that a story about dead voters isn't really a story about dirty data.


David Heath
The Seattle Times





Something less than half a measure
Oct 17th, 2006 by JTJ

A brief comment was passed along on the NICAR-L (National Institute for Computer-Assisted Reporting) listserv this morning by Daniel Lathrop, of the Seattle Post-Intelligencer.  Said he:

Really interesting story on lobbyists-related-to-lawmakers in The USA
Today. I think those of us who cover money-in-politics should all have
a little story envy on this one.



http://www.usatoday.com/news/washington/2006-10-16-lobbyist-family-cover_x.htm


Daniel Lathrop
Seattle P-I


Well, yeah.  An interesting story, but also one demonstrating why newspapers as institutions simply do not grasp the shift in power inherent in the Digital Age, a shift away from institutions and to citizens. 

First, the story reports: "The family connections between lobbying and lawmaking are prompting complaints that Congress is not doing enough to police itself."  Fair enough, but can't you SHOW us, in the online version, the evidence to support this sweeping generalization of "prompting complaints"?  Why should we take your word for it, guys, when the evidence must be at hand?

Second, "…USA TODAY reviewed thousands of pages of financial disclosures and lobbyist registrations, property records, marriage announcements and other public documents to identify which lawmakers and staffers had relatives in the lobbying business."  WOW!  Would I like to see those pages, and even drill down into them to see if there's anything there related to my representative.  But nooooooooo.  The paper must have had some way to manage all this public-record data, some way to cross-reference it, to search it, to retrieve documents and content.  Why not put all that up on the web and let readers peruse their own subjects of interest?

Ironically, an example of the power shift mentioned above turns up, buried in a sidebar to the story, “Little Accountability in Earmarks.”  There we find reference to something called the Sunlight Foundation.  I had not heard of the Sunlight Foundation, but, hey, it's only been around since the first of the year.  It turns out this organization is doing just what newspapers should be doing: leveraging the power of the digital environment to connect people to the data and tools needed to analyze that data so they can make informed decisions.

Another opportunity missed by the industry, and tragically so.




Teasing out attitudes from text
Oct 5th, 2006 by JTJ

Eric Lipton has a piece in Wednesday's (4 Oct. 2006) NYTimes about some "new" research efforts to come up with software "that would let the [U.S.] government monitor negative opinions of the United States or its leaders in newspapers and other publications overseas."  (See "Software Being Developed to Monitor Opinions of U.S.")  Surely this is an interesting problem, and one made especially difficult when the translation factor kicks in.

This is not, however, the first attempt to gin up such software.  We have long admired the work done some years ago at the Pacific Northwest National Laboratory on the ThemeRiver™ visualization.

It "…helps users identify time-related patterns, trends, and relationships across a large collection of documents. The themes in the collection are represented by a 'river' that flows left to right through time. The river widens or narrows to depict changes in the collective strength of selected themes in the underlying documents. Individual themes are represented as colored 'currents' flowing within the river. The theme currents narrow or widen to indicate changes in individual theme strength at any point in time."  Status: An interactive proof-of-concept prototype has been developed, and a 20MB QuickTime video about ThemeRiver is available for download from PNNL.


We hope PNNL will continue to give us more of this intriguing tool.
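For anyone who wants a feel for the layout, a ThemeRiver-style stream can be approximated with matplotlib's stackplot and its "wiggle" baseline. This is our own minimal sketch, not PNNL's tool; the theme names and counts are invented purely for illustration.

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical monthly counts of three "themes" in a document collection.
    months = np.arange(24)
    rng = np.random.default_rng(0)
    themes = {
        "economy":  rng.poisson(20, 24),
        "security": rng.poisson(12, 24),
        "trade":    rng.poisson(8, 24),
    }

    # baseline="wiggle" centers the stack, producing the river-like stream in which
    # each colored band widens or narrows with the strength of its theme.
    plt.stackplot(months, list(themes.values()), labels=themes.keys(), baseline="wiggle")
    plt.legend(loc="upper left")
    plt.xlabel("month")
    plt.ylabel("theme strength (document count)")
    plt.title("ThemeRiver-style view of theme strength over time")
    plt.show()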



Tracking the bucks all the way to court
Oct 2nd, 2006 by JTJ

Another unique investigation by The New York Times gets A1 play in this Sunday's edition (1 Oct. 2006) under the hed “Campaign Cash Mirrors a High Court's Rulings.”  Adam Liptak and Janet Roberts (who probably did the heavy lifting on the data analysis) took a long-term look at who contributed to the campaigns of Ohio's Supreme Court justices.  It ain't a pretty picture if one believes the justices should be above lining their own pockets, whether it's a campaign fund or otherwise.

In any event, there seems to be a clear correlation between contributions — and their sources — and the outcomes of too many cases.  A sidebar, "Case Studies: West Virginia and Illinois," would suggest there is much to be harvested by reporters in other states.

There is, thankfully, a fine description of how the data for the study was collected and analyzed.  See "How Information Was Collected."

There are two accompanying infographics; one ("Ruling on Contributors' Cases") is much more informative than the other ("While the Case Is Being Heard, Money Rolls In"), which is a good, but confusing, attempt to illustrate difficult concepts and relationships.

At the end of the day, though, we are grateful for the investigation, data crunching and stories.



Library on the moon
Sep 21st, 2006 by Tom Johnson

Friend Laura Soto-Bara posts the following to the NewsLib listserv:

Library on the moon
http://www.boingboing.net/2006/09/20/library_on_the_moon.html
Wednesday, September 20, 2006

The moon might be a good place for a massive storehouse of digital
information, sort of a Lunar Library of Alexandria. That's the idea
proposed by NASA scientist David McKay, who ten years ago led the team
that announced that a Mars meteorite contained evidence of life.
According to the New Scientist blog, McKay says the lunar library could
be stored on computers buried in the ground, placed inside craters, or
located in hollow lava tubes….  From the post:

The benefits of lunar storage are that there is no oxygen to erode the
material, constant sub-freezing temperature and the Moon is currently
free of all of the havoc wreaked by humankind…

Families could even pay a fee to preserve photographs in the lunar
library for future civilizations. McKay calls it the “ultimate time
capsule.”







Statistically speaking….
Sep 20th, 2006 by Tom Johnson

Any discipline has its internal arguments, typically about definitions, methodologies, process or significance.  Statistics, of course, is no different.  Below is an interesting article from the Washington Monthly about what constitutes statistical significance.  The article is OK, but the commentary below it is even better.  See http://www.blogware.com/admin/index.cgi/cmd=post_article

LIES, DAMN LIES, AND….Via Kieran Healy, here's something way off the beaten path: a new paper by Alan Gerber and Neil Malhotra titled “Can political science literatures be believed? A study of publication bias in the APSR and the AJPS.”
It is, at first glance, just what it says it is: a study of publication
bias, the tendency of academic journals to publish studies that find
positive results but not to publish studies that fail to find results.
The reason this is a problem is that it makes positive results look
more positive than they really are. If two researchers do a study, and
one finds a significant result (say, tall people earn more money than
short people) while the other finds nothing, seeing both studies will
make you skeptical of the first paper's result. But if the only paper
you see is the first one, you'll probably think there's something to it.



The chart on the right shows G&M's basic result. In statistics jargon, a significant result is anything with a "z-score" higher than 1.96, and if journals accepted articles based solely on the quality of the work, with no regard to z-scores, you'd expect the z-scores of studies to resemble a bell curve. But that's not what Gerber and Malhotra found. Below a z-score of 1.96 there are far fewer studies than you'd expect. Apparently, studies that fail to show significant results have a hard time getting published.
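To make the mechanism concrete, here is a small simulation of our own (it is not Gerber and Malhotra's method): generate z-scores for many hypothetical studies, let journals publish the "significant" ones far more often than the rest, and compare how common significance looks among published studies versus among all studies.

    import numpy as np

    rng = np.random.default_rng(42)

    # Hypothetical z-scores for 10,000 studies: mostly true nulls, some real effects.
    z = np.concatenate([rng.normal(0.0, 1.0, 7000),   # no real effect
                        rng.normal(2.0, 1.0, 3000)])  # modest real effects

    # Journals publish 90% of "significant" results (|z| > 1.96) but only 20% of the rest.
    significant = np.abs(z) > 1.96
    publish_prob = np.where(significant, 0.9, 0.2)
    published = z[rng.random(z.size) < publish_prob]

    # The published literature looks far more "significant" than the research actually was.
    print(f"Significant among all studies:       {significant.mean():.1%}")
    print(f"Significant among published studies: {(np.abs(published) > 1.96).mean():.1%}")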


