Alfredo Covaleda,
Bogota, Colombia
Stephen Guerin,
Santa Fe, New Mexico, USA
James A. Trostle,
Trinity College, Hartford, Connecticut, USA
Check out Google-Refine at http://code.google.com/p/google-refine/
Google Refine is a power tool for working with messy data, cleaning it up, transforming it from one format into another, extending it with web services, and linking it to databases like Freebase.
From Flowingdata.com …..
Inspired by Shan Carter's simple data converter, appropriately named Mr. Data Converter, Matthew Ericson just put Mr. People online. The tool lets you paste a list of names, and it will parse the first and last name, suffix, title, and other parts for you. You can even have multiple names in a single row.
Years ago, while trying to clean up the names of donors in campaign finance data from the Federal Election Commission, I hacked together a Perl module — loosely based on the Lingua-EN-NameParse module — to standardize names. One port to Ruby later, I've finally put together a Web front end for it.
Getting data in the right format, whether for analysis or visualization, can be a huge pain. Imagine. All the data you need is right in front of you, but you can't do anything with it yet, because as often is the case, it's not in a nice and pretty rectangular format. So anything that makes this easier and quicker is an instant bookmark for me.
[Mr. People via @mericson]
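For readers curious what this kind of name parsing involves, here is a minimal Python sketch. It is not Mr. People's actual algorithm (that is a Ruby port of a Perl module). The `TITLES` and `SUFFIXES` word lists and the `parse_name` function are hypothetical, and the approach assumes names arrive in "Title First Middle Last, Suffix" order.

```python
# Hypothetical word lists; a real parser would need far longer ones.
TITLES = {"mr", "mrs", "ms", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "phd", "md"}


def parse_name(raw):
    """Split one name string into title, first, middle, last and suffix."""
    # Drop surrounding periods and commas from every token.
    tokens = [t.strip(".,") for t in raw.split()]
    tokens = [t for t in tokens if t]
    parts = {"title": "", "first": "", "middle": "", "last": "", "suffix": ""}
    # A known title can only appear first, a known suffix only last.
    if tokens and tokens[0].lower() in TITLES:
        parts["title"] = tokens.pop(0)
    if tokens and tokens[-1].lower() in SUFFIXES:
        parts["suffix"] = tokens.pop()
    # Whatever remains is first name, last name, and anything in between.
    if tokens:
        parts["first"] = tokens.pop(0)
    if tokens:
        parts["last"] = tokens.pop()
    parts["middle"] = " ".join(tokens)
    return parts


print(parse_name("Dr. Martin Luther King, Jr."))
```

Campaign finance records are much messier than this (for example "Last, First" ordering, or several people in one field), which is why a dedicated tool is worth bookmarking.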
The Journal of Artificial Societies and Social Simulation (http://jasss.soc.surrey.ac.uk) has been around for 13 years, and it has become increasingly important for analytic journalists who believe that simulation modeling is, and increasingly will be, a keystone perspective for serious, value-added journalism. And it's one of the FREE e-journals available. Support it if you can.
In the meantime, if you're not familiar with agent-based modeling, check out this article by Lynne Hamill; it will point you to some useful concepts and tools:
Agent-Based Modelling: The Next 15 Years
by Lynne Hamill
http://jasss.soc.surrey.ac.uk/13/4/7.html
Abstract: This short note makes recommendations for the future direction of research in agent-based modelling (ABM). It is a personal view based on my experience as a policy adviser who has recently come to ABM. I suggest that to promote the use of ABM, the ABM community needs to demonstrate the value of modelling to other social scientists by showing-by-doing and offering training projects; and to produce tools, guidance on good-practice and basic building blocks. Then the policy contexts most likely to benefit from ABM need to be identified along with any new data requirements, so that the usefulness of ABM can be demonstrated to policy analysts. This is, in my view, the challenge facing the ABM community for the next 15 years.
Census tract data and maps, while better than nothing, can often deceive because the size of the tract is greatly influenced by population size, not area. It is not uncommon that natural and constructed barriers — mountains or freeways — influence the movements and spatial demographics of a tract. Ah, but BLOCK data, now there is some fine, fine-grained data that we can use to extract insights and meaning. Once again, Flowingdata.com tips us off to a good visualization of population data and the resulting maps.
Instead of breaking up demographics by defined boundaries, Bill Rankin uses dots to show the more subtle changes across neighborhoods in a map of Chicago using block-specific US Census data.
Any city-dweller knows that most neighborhoods don't have stark boundaries. Yet on maps, neighborhoods are almost always drawn as perfectly bounded areas, miniature territorial states of ethnicity or class. This is especially true for Chicago, where the delimitation of Chicago's official “community areas” in the 1920s was one of the hallmarks of the famous Chicago School of urban sociology.
Each dot represents 25 people of the map color's corresponding ethnicity.
Eric Fischer takes the next step and applies the same method to forty major cities. Here are the maps for Los Angeles, San Francisco, and New York, respectively. Same color-coding applies. You definitely see the separation, but zoom in and you see much more subtle transitions.
[Eric Fischer via Data Pointed]
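The dot-density idea described above (one dot per 25 people, colored by group) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Rankin's or Fischer's code: `dots_for_block` is a hypothetical helper, dots are scattered inside a block's bounding box rather than its true polygon, and any remainder under 25 people is simply dropped.

```python
import random


def dots_for_block(counts, bbox, people_per_dot=25, seed=0):
    """Return a list of (x, y, group) dots for one census block.

    counts: dict mapping group name to population in the block.
    bbox:   (min_x, min_y, max_x, max_y), a stand-in for the block's shape.
    """
    rng = random.Random(seed)  # fixed seed so the map is reproducible
    min_x, min_y, max_x, max_y = bbox
    dots = []
    for group, population in counts.items():
        # One dot per full `people_per_dot` residents of this group.
        for _ in range(population // people_per_dot):
            x = rng.uniform(min_x, max_x)
            y = rng.uniform(min_y, max_y)
            dots.append((x, y, group))
    return dots


# Hypothetical block: 60 people in one group, 24 in another.
print(dots_for_block({"group_a": 60, "group_b": 24}, (0.0, 0.0, 1.0, 1.0)))
```

Because dots are placed at random within each small block, the transitions between neighborhoods come from the data itself instead of from a boundary someone drew.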
Patrick Cain, who correctly describes himself as “a journalist who makes maps for the Web,” has posted a couple of neat sets of tips to his blog. Basically, they suggest ways to tweak some of Google's code to improve presentation. Check out his blog tips at
I’ve never been a fan of the way Google Maps handles local labels (neighbourhoods, for example) – they are often redundant, inconsistent and wrong, as well as cluttering the map visually.
These examples didn’t take long to collect:
From FlowingData….
“You can get pretty far with data graphics with just limited statistical knowledge, but if you want to take your skills, resume, and portfolio to the next level, you should learn standard data practices. Of all places, UK Parliament has some short and free guides to help you with basic statistical concepts. They provide 13 notes, each only two or three pages long that can help you with stuff like how to adjust for inflation, confidence intervals and statistical significance, or basic graph suggestions [pdf]. I like.”
A good piece on googlegeodevelopers.blogspot.com explains how the Wall Street Journal crew created a fine set of maps illustrating various major-city marathons. Go here for complete piece.
Sunday, August 08, 2010
The following guest blog post was written by Albert Sun of the Wall Street Journal. He takes us behind the scenes in the creation of a recent news graphic titled: “Going the Distance: Comparing Marathons”.
The Google Maps API has been a great boon for news websites and a great help in creating all kinds of interactive graphics involving maps. Here at the WSJ we're big fans of the API and happy that Google continues to improve it and roll out new features.
We got the idea to map out the routes of Marathons from a story by Kevin Helliker about how despite the beautiful scenic route of the race, the San Francisco marathon was still very unpopular. The difficulty and the hilly terrain kept people from attempting it. To help people see this better, we decided to compare the San Francisco marathon to the big three US marathons: Boston, New York and Chicago.
The code for our marathons graphic grew out of a similar graphic we did for our coverage of the Tour De France. In this one, we managed to incorporate many improvements. Two new features of the Google Maps API played a big role in this graphic. The Elevation API let us quickly and easily get a comparison between the different routes.
Styled Maps let us give the map more of a distinctive WSJ look. We have a distinctive style for our maps in print, and there is some reluctance to run maps online that deviate from that style. Styled Maps lets us get close enough for what we're trying to show. When Styled Maps first became available we used the Styled Map Wizard to create a set of different looks for different types of maps, trying to recreate our own maps style.
Along with the Google Maps API, we used jQuery for its wealth of convenience functions and how much easier it makes writing programs in JavaScript. The core of the graphic is a basic Polyline drawn in Google Maps showing the route. [more]
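To give a feel for the kind of comparison the post describes (route length and hilliness), here is a small Python sketch. It is not the WSJ's code and makes no Google Maps API calls. It assumes you already have a route as a list of (latitude, longitude, elevation in meters) points, and the helper names `haversine_km` and `route_summary` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def route_summary(points):
    """Return (distance_km, total_climb_m) for (lat, lon, elevation_m) points."""
    distance = 0.0
    climb = 0.0
    for prev, cur in zip(points, points[1:]):
        distance += haversine_km(prev[:2], cur[:2])
        # Count only uphill segments toward the total climb.
        climb += max(0.0, cur[2] - prev[2])
    return distance, climb


# Made-up sample points, not a real marathon course.
print(route_summary([(0.0, 0.0, 10.0), (0.0, 1.0, 30.0), (0.0, 2.0, 20.0)]))
```

Running the same summary over each course gives two numbers per race that can be compared side by side, which is roughly what an elevation profile lets a reader do visually.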
Jack Kinsella, of the Bolivian Express, writes:
“The Bolivian Express, an English language magazine in La Paz, Bolivia, set up by Bolivian graduates in collaboration with students from around the world. We are a subsidiary of the Grupo Express Press, which publishes another magazine in Bolivia, Revista Metro (http://www.metrobolivia.com/metro/default.asp). We would love if you could include us in your database of journalism internships and feature us on your website (http://journalism.nyu.edu/careerservices/internships/postintern.html).
“The Bolivian Express has just started an ongoing journalism internship program in Bolivia where interns take Spanish classes, journalism classes, photography classes and cinematography classes. Participants are paired with Bolivians in La Paz and are then expected to explore Bolivian culture, eventually producing four pages of content for our magazine each month. This content is then passed to our editors who offer feedback, helping to improve our interns' writing skills. Due to the large number of classes offered, our internship is perfect for students with a strong passion for learning.
“Our magazine is distributed on the ground, in the skies and, within the next week, online.
“For more information see our website: http://www.bolivianexpress.org
If you've acquired a spreadsheet file with a bunch of addresses, you can quickly map them using BatchGeo. We haven't tried it yet with a huge data set, but it works nicely with a couple hundred addresses. Check out BatchGeo at http://batchgeo.com/
“Have locations in a spreadsheet? Well try this free and unique tool to…
Get started by following the steps below, or check out our video tutorials
… What could I use this for?
15 Crazy Useful JavaScript Solutions for Charts and Graphs
Graphs and charts are a great way to break down the information at hand to the user in a descriptive and visually enticing manner. These visual structures allow you to easily simplify complex data and output easier to understand content. Everyone can use a graph or chart, however, not everyone has the right tools to create an effective one. Below we’ve compiled the best JavaScript graphs and chart solutions. We chose to put a list of JavaScript graphs because of their flexibility and functionality.