Alfredo Covaleda,
Bogota, Colombia
Stephen Guerin,
Santa Fe, New Mexico, USA
James A. Trostle,
Trinity College, Hartford, Connecticut, USA
We are finding O'Reilly's Radar an increasingly valuable site/blog to keep up with interesting developments in Web 2.0, publishing and the general Digital Revolution. Brady Forrest's contribution below is an example.
See http://radar.oreilly.com/archives/2007/05/trends_of_onlin.html
Trends of Online Mapping Portals
Posted: 21 May 2007 04:34 PM CDT
By Brady Forrest
Last week several announcements were made that show the direction of the online mapping portals. Satellite images and slippy maps are no longer differentiators for attracting users; everyone has them, and as I noted last week, there are now companies that have cropped up to serve companies that want their own maps. Some of these new differentiators are immersive experiences, owning the stack, and data!
Immersive experience within the browser – A couple of weeks ago Google Maps added building frames that are visible at street level in some cities. These 2.5D frames are very clean and useful when trying to place something on a street.
Now the Mercury News (warning: annoying reg required; found via TechCrunch) is reporting that these buildings will soon be fully fleshed out.
The Mercury News has learned that Google has quietly licensed the sensing technology developed by a team of Stanford University students that enabled Stanley, a Volkswagen Touareg R5, to win the 2005 DARPA Grand Challenge. In that race, the Stanford robotic car successfully drove more than 131 miles through the Mojave Desert in less than seven hours.
The technology will enable Google to map out photo-realistic 3-D versions of cities around the world, and possibly regain ground it has lost to Microsoft's 3-D mapping application known as Virtual Earth.
The license will be exclusive, but don't think Google will be the only one with 3-D in the browser. Microsoft has had 3-D for a while now (unfortunately, it requires the .NET framework; my assumption is that the team is busy converting it to Silverlight). 3-D is going to become a standard part of mapping applications. The trick will be making sure that the extra data doesn't get in the way of the user's quest to get information. Buildings are slow to render and can obscure directions.
This strategy is a nice complement to their current strategy of gathering and harnessing 3-D models from users. Currently these are only available in Google Earth. The primary place to get them is Google's 3D Warehouse. I suspect that we will start to see user-contributed models on Google Maps.
No word on how many cities Google will roll out their 3D models in or when the new data will be available via their API.
Data, Data, & More Data – Until recently, search engines did not provide neighborhoods as a way of searching cities. Neighborhoods are an incredibly useful, if hard to delineate, way of describing an area of a city.
Google has now added neighborhood data to their index, but they have not really done much with it. If you know the neighborhood name, then you can use it to supplement searching a city. However, if you are uncertain or if you are unaware of the feature, then you are SOL. There is no indication that the feature exists, how widespread it is, or what the boundaries of the neighborhood are. I hope that they continue to expand on this feature.
Ask, on the other hand, has done a great job with this feature. They surface nearby neighborhood names for easy follow-on searches, and they show you the bounds of the neighborhood quite clearly.
Ask is using data from SF startup Urban Mapping. Urban Mapping claims complete coverage of ~300 urban areas in the US and Canada (with Europe coming). This isn't an easy problem. Urban Mapping has been working at it for quite some time and is known for having a good data set. They have also been aggregating transit data. An interesting thing to note is that many of the same neighborhoods available on Ask are also available on Google Maps (examples: Tenderloin, SF: Google, Ask; Civic Center, SF: Google, Ask). No word yet on whether any of the other big engines are going to add neighborhood data, but my guess is that it will soon become a standard feature; it's too useful not to have.
Own the Stack – Until recently, Yahoo! used deCarta to handle creating directions (or routing). They have announced that they have taken ownership of this part of the stack and have built their own routing engine. Ask and Google still use deCarta. Microsoft has always had their own. Yahoo! is hoping to make their new engine a differentiator. In some ways this is analogous to Microsoft's purchase of Vexcel, a 3D imagery provider. Microsoft did not want the same 3D data as Google Earth or any other search engine for its 3D world.
I think that any vendor servicing Google, Microsoft, Ask, Yahoo or MapQuest will have to keep an eye on its next source of revenue. Those contracts aren't necessarily going to last long. The geostack is too valuable to outsource.
There is only one part of the stack that I think *might* be too expensive for any one of the engines to buy or build outright. That's the street data, and it's a data source primarily supplied by two companies, NAVTEQ and Tele Atlas. NAVTEQ has a market cap of 3.5 billion dollars as of this writing; Tele Atlas has one of 1.4 billion pounds. These would be spendy purchases. Microsoft is currently working closely with Facet Technology Corporation to collect street data for cities to add a street-level 3D layer (see Facet's SightMap for a preview), but Facet is not collecting data to match the other players. It will be interesting to see if Yahoo! parlays its partnership with OpenStreetMap into a data play.
An interesting piece of analysis and visual infographics posted today on the O'Reilly Radar site. See http://radar.oreilly.com/archives/2007/05/baseball_team_overpaid.html
Is your baseball team overpaid? Assuming you have one, Ben Fry will let you answer that question. He has created a tool for visualizing the salary of Major League Baseball teams versus their performance in 2007. As he explains:
This sketch looks at all 30 Major League Baseball Teams and ranks them on the left according to their day-to-day standings. The lines connect each team to their 2007 salary, listed on the right.
Drag the date at the top to move through the season. The first ten days of the season are omitted because the rankings up to (at least) that point are statistically silly. You can also use the arrow keys on the keyboard to move forward or backward one day.
A steep blue line means that the team is doing well for its money, which reflects well on the team's General Manager. A steep red line implies that the team is throwing away money. The thickness of the line is proportional to the team's salary relative to the others.
The images above are captures of the beginning of the season rankings (left) as compared to now (right). It looks like Boston is now at a break-even point whereas the Yankees are sinking and a bit over-paid. I wonder if any of the GM compensation decisions are made based on this tool.
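For readers who want to tinker with the idea, here is a minimal sketch of the same slopegraph mechanic, written in Python with matplotlib rather than Fry's Processing: standings rank on the left, payroll on the right, blue for teams outperforming their payroll, red for the reverse, and line thickness proportional to salary. The team names and figures are invented placeholders, not the real 2007 data.

```python
# Minimal slopegraph sketch in the spirit of Ben Fry's piece.
# All team names, standings, and payrolls below are made-up placeholders.
import matplotlib.pyplot as plt

teams = {
    # name: (standings rank, payroll in $ millions) -- hypothetical numbers
    "Red Sox": (1, 143),
    "Yankees": (5, 190),
    "Indians": (2, 61),
    "Royals":  (6, 67),
}

# Rank teams by payroll (1 = highest payroll).
salary_rank = {name: r + 1 for r, (name, _) in
               enumerate(sorted(teams.items(), key=lambda kv: -kv[1][1]))}
max_payroll = max(p for _, p in teams.values())

fig, ax = plt.subplots(figsize=(4, 6))
for name, (standing, payroll) in teams.items():
    # Blue = doing well for the money, red = overpaid relative to standing.
    color = "blue" if standing < salary_rank[name] else "red"
    width = 4 * payroll / max_payroll          # thickness ~ relative salary
    ax.plot([0, 1], [standing, salary_rank[name]], color=color, lw=width)
    ax.text(-0.05, standing, name, ha="right", va="center")
    ax.text(1.05, salary_rank[name], f"${payroll}M", va="center")

ax.invert_yaxis()                              # rank 1 at the top
ax.set_xlim(-0.6, 1.6)
ax.set_xticks([0, 1])
ax.set_xticklabels(["standings", "salary"])
ax.set_yticks([])
plt.tight_layout()
plt.show()
```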
We're at the UCLA conference center attending the 4th Lake Arrowhead Conference on Human Complex Systems. First take:
Bill Lawless' interesting work finds that groups operating on a “consensus model” are less effective and efficient when compared to “majority model” decision-making groups.
Chasparis' work has implications for journalism institutions IF they understand that they can (should?) be the hub (or node) for facilitating transactions between users and those with the desired resources, and/or between the journalistic institution and the community. The presentation is complicated and laden with equations (after all, the authors are in mechanical engineering), but its implications for how networks are created and emerge are well worth studying.
What this presentation suggests is that we could model circulation/promotion campaigns by “selling” one subscription to an individual household. Then, having planted that seed of recognition and brand AND assuming that there is neighbor-to-neighbor communication, we fertilize that seed by delivering our paper for free to the immediately adjacent neighbors. And, perhaps, we use stick-on/peel-off labels to publicize something special for that node of concentration. Now we have created a potential point of commonality for the neighbors to talk about and, we hope, appreciate. The question then becomes “How can we create added value?” for that cluster of subscribers. A toy sketch of this kind of model follows.
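Here is a minimal sketch of such a model; every probability in it is an invented assumption, purely for illustration. Each household is a grid cell, one household is “sold” a subscription, its immediate neighbors get free copies, and each week a household's chance of subscribing rises with the number of subscribing neighbors.

```python
# Hypothetical word-of-mouth sketch of the "seed one subscriber, paper the
# neighbors" idea. All probabilities are invented for illustration only.
import random

SIZE, WEEKS = 20, 12
BASE_P, PER_NEIGHBOR_P, FREE_COPY_P = 0.01, 0.05, 0.10   # assumed values
random.seed(7)

subscribed = [[False] * SIZE for _ in range(SIZE)]
subscribed[SIZE // 2][SIZE // 2] = True            # the single "sold" household
free_copies = {(SIZE // 2 + dr, SIZE // 2 + dc)    # neighbors getting free papers
               for dr in (-1, 0, 1) for dc in (-1, 0, 1)}

def neighbors(r, c):
    """Yield the in-bounds adjacent cells of (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < SIZE and 0 <= c + dc < SIZE:
                yield r + dr, c + dc

for week in range(WEEKS):
    new = []
    for r in range(SIZE):
        for c in range(SIZE):
            if subscribed[r][c]:
                continue
            n = sum(subscribed[nr][nc] for nr, nc in neighbors(r, c))
            p = BASE_P + PER_NEIGHBOR_P * n
            if (r, c) in free_copies:
                p += FREE_COPY_P                   # assumed boost from free paper
            if random.random() < p:
                new.append((r, c))
    for r, c in new:
        subscribed[r][c] = True
    print(f"week {week + 1}: {sum(map(sum, subscribed))} subscribers")
```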
Second point raised: Can we model the optimum term for subscription offers? Is 13 weeks best, or five? Let's find out.
See Gessler's homepage — http://gessler.bol.ucla.edu/ — for an excellent collection of visual and dynamic tools for modeling.
Presentation on residential segregation modeling. “Schelling suggests that segregation can emerge at the aggregate level even if it is not sought by the residents.” Later findings (Bruch and Mare): Segregation increases with indifference to segregation. Why? Not really a lack of indifference. Also, equal granularity in the multicultural function. (See also: http://paa2006.princeton.edu/download.aspx?submissionId=60143)
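For readers unfamiliar with Schelling's model, here is a minimal toy sketch (our own illustration, not the presenters' code): two types of agents sit on a grid, and an agent moves to a random empty cell whenever too few of its neighbors share its type. Even a mild tolerance threshold tends to produce visibly segregated clusters at the aggregate level.

```python
# Toy Schelling segregation model -- an illustrative sketch only.
# Two agent types on a grid; an agent moves to a random empty cell if fewer
# than THRESHOLD of its occupied neighbors share its type.
import random

SIZE, EMPTY_FRAC, THRESHOLD, STEPS = 30, 0.1, 0.3, 50
random.seed(1)

cells = [1, 2] * int(SIZE * SIZE * (1 - EMPTY_FRAC) / 2)
cells += [0] * (SIZE * SIZE - len(cells))            # 0 = empty cell
random.shuffle(cells)
grid = [cells[i * SIZE:(i + 1) * SIZE] for i in range(SIZE)]

def unhappy(r, c):
    """True if too few occupied neighbors share this agent's type."""
    me, same, occupied = grid[r][c], 0, 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == dc == 0:
                continue
            nr, nc = (r + dr) % SIZE, (c + dc) % SIZE   # wrap-around grid
            if grid[nr][nc]:
                occupied += 1
                same += grid[nr][nc] == me
    return occupied > 0 and same / occupied < THRESHOLD

for _ in range(STEPS):
    movers = [(r, c) for r in range(SIZE) for c in range(SIZE)
              if grid[r][c] and unhappy(r, c)]
    empties = [(r, c) for r in range(SIZE) for c in range(SIZE) if not grid[r][c]]
    random.shuffle(movers)
    for r, c in movers:
        if not empties:
            break
        er, ec = empties.pop(random.randrange(len(empties)))
        grid[er][ec], grid[r][c] = grid[r][c], 0
        empties.append((r, c))

# Print the final grid: '.' empty, 'X' type 1, 'O' type 2.
print("\n".join("".join(".XO"[v] for v in row) for row in grid))
```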
Conclusions:
Interesting discussion of what he terms “discourse communities,” i.e., the “dynamic interplay of cultural resources and situated identities.”
His approach is to apply a number of theoretical metrics (15 models) to building a “society” (based on good anthro data) and see which works best. An approach closely related to exploratory data analysis that analytic journalists often use.
Commonalities of models that worked well: 1) Agents were quasi-optimal (smart); 2) Agents were nonetheless diverse (heterogeneous, e.g., individual agents doing different things).
Good presentation on simulation (computational modeling?) of the tuberculosis cycle in Tijuana, plus looking at models of corruption. He points out that the Chinese population in Tijuana is growing very fast. Interesting, and valuable, application of Maslow's pyramid-of-needs concepts (i.e., moving from physical needs to social to moral needs).
Working on integrating Beer's Viable Systems Model with transactional analysis models.
Fifth Session
Objective: to make logistics systems work in/as complex adaptive models. [Essentially, this is about the best — most efficient — way to receive raw materials and deliver the finished product to customers of various types. Could have direct application for the publishing industry, if it only knew about such methods.]
They are researching how to build RFID chips into products like cars, to imbue the product with enough intelligence to, for example, figure out the optimal way to get itself to a truck or ship.
PlaSMA: Multiagent-based simulation for logistics
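As a purely hypothetical illustration of a “smart product” making its own logistics decision (this is not PlaSMA's actual API, and the carriers, costs, and times are invented), a sketch like the following picks the cheapest transport leg that still meets a delivery deadline:

```python
# Hypothetical sketch of a product agent choosing its own transport leg.
from dataclasses import dataclass

@dataclass
class Leg:
    carrier: str
    cost: float      # shipping cost
    hours: float     # transit time

def choose_leg(legs, deadline_hours):
    """Pick the cheapest leg that still meets the delivery deadline."""
    feasible = [leg for leg in legs if leg.hours <= deadline_hours]
    return min(feasible, key=lambda leg: leg.cost) if feasible else None

options = [Leg("truck A", 120.0, 30), Leg("truck B", 95.0, 48), Leg("ship", 60.0, 200)]
print(choose_leg(options, deadline_hours=36))    # only "truck A" meets the 36 h deadline
```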
This doesn't have anything to do with Analytic Journalism per se, but while flying from Cairo to Dubai recently I looked out the window at 39,000 feet somewhere over the sands of central Saudi Arabia. What to my wondering eyes did appear, but an expanse of pie charts.
Of course these are irrigated crops. A friend in Dubai, who grew up in Saudi Arabia, said the reason they are not all completely filled circles is because some growers don't have enough money (yet) to buy the equipment necessary to complete the 360-degree irrigation.
Our thanks to someone somewhere who pointed us to “Flashearth,” an interesting site under development that supplies links to multiple mapping programs that draw on global satellite imagery. They are: Google Maps; Microsoft VE (aerial); Microsoft VE (labels); Yahoo Maps; Ask.com (aerial); Ask.com (physical); OpenLayers; NASA Terra (daily). The sites vary in the degree of “zoomability,” but each offers slightly different capability and data. In any event, it is most likely worthy of a bookmark.
We realize there is a robust handful of very good infographic reporters and designers working out there for many different publications, but the gang at the NY Times just keeps on keepin' on with innovative and, 98 percent of the time, highly informative infographics and visual displays of data. Today's (25 Feb 2007) edition is a basket rich with fine examples:
* “Truck Sales Slip, Tripping Up Chrysler” (Business Section, p. 8). Offers up a complex (they often are) “treemap” of vehicle sales.
* “Who Do You Think We Are?” (Week in Review – Op-Art, p. 15). Ben Schott, author of “Schott's Original Miscellany” and “Schott's Almanac 2007,” a yearbook of American society, presents some basic line and bar charts, but on subjects of interest to AJ readers. Specifically, “Confidence in Institutions” (the “press” is the lowest, even below Congress) and “Newspaper Readership.” (And you already know what that graph looks like.)
* “How Two Rights Can Make a Wrong” (Week in Review – p. 5). Howard Markel, M.D., and Bill Marsh give us a fine graphic illustrating complex drug interactions.
Juan C. Dürsteler, in Barcelona, Spain, edits a fine online magazine devoted to information graphics. The current issue describes “… the diagram for the process of Information Visualisation as seen by Yuri Engelhardt and the author after a series of discussions about its nature and the process that leads from Data to Understanding.”
And it is available in English and Spanish. Check out http://www.infovis.net/printMag.php?num=187&lang=2
The IAJ — and the cause for analytic journalism — picked up some good publicity in Venezuela this week. The IAJ's managing director, Tom Johnson, is in Maracaibo to give three days of lectures and workshops to students and professors from the University of Zulia and a handful of local journalists. The largest newspaper in Maracaibo, Panorama, gave good inside play on Monday to an interview with Johnson.
[Ed. Note: Strange as it may seem, the policy of this newspaper, Panorama, is to remove or de-activate the links to stories 24 hours after they appear. No, it is not a money thing, because the newspaper's web site doesn't even have a “search” tool. So we've taken the liberty to post the entire article below.]
INTERVIEW. THE AMERICAN JOURNALIST TEACHES A COURSE
Tom Johnson: “We are living through a digital revolution”
Text: Ricardo Pineda Toledo
12 Feb. 2007
Just a few hours after landing in Maracaibo, the American journalist Tom Johnson visited PANORAMA to share some details of the course on analytic journalism in the digital age that he will teach from today through Wednesday.
The professor at the University of San Francisco [sic: San Francisco State University], California, and one-time journalist for Time and Fortune magazines, among others, was accompanied by University of Zulia professors María Isabel Neuman and Ángel Páez. He said he is very much looking forward to working with journalism professionals and students in Zulia.
—What are the main problems that, in your view, today's newspaper newsrooms face? —The main problem is not a new one: it is the resistance of governments, in every country, to letting the public get at the data they need to make their own decisions. Everywhere in the world, politicians want to keep things secret. These are very difficult times in the United States under the current administration (George W. Bush's). Second, the traditional institutions of journalism, print and broadcast alike, do not invest, in my opinion, enough money in the education of their employees. We are in a digital revolution; the information environment is literally transforming every day.
—How do you think technological resources should be incorporated into university journalism schools? —Traditionally, communications departments have emphasized teaching how to write. Writing is very important and absolutely necessary, but by itself it is not enough.
Writing is one of the tools every journalist must master; what I will discuss in this seminar is how journalists should think qualitatively about their writing and, at the same time, quantitatively, so that they apply analytic numerical skills.
For example, if someone says that Maracaibo is growing and expanding, that is a qualitative judgment; but journalists need to measure how fast those changes will occur and to what degree.
—What percentage of web content does the average reporter use? —I can't give a precise example, except for my wife, who is an anthropologist and a lawyer. A year ago she finished a book on the international legal system. It took her five years, and at the start she thought perhaps half of the research could be done on the internet. Time passed, and by the end of the book all of her research was based entirely on the web.
Reporters can work the same way; it is just a matter of doing the initial reporting out on the street and then using the web to fill in the rest. You have to go out on the street because you cannot do everything from the office. A journalist always has to have human contact, talk with sources, and find the most important aspects of a story.
—The internet and journalism: name some advantages and caveats.
—The first advantage is the speed of access to data, and by data I do not mean only quantitative data but qualitative data as well. We will see more audio and video data. The ability to contact a variety of sources around the world about a problem is getting faster and faster.
As for the caveats, or rather the disadvantages, I think the youngest journalists tend to assume that everything on the web is true, and that may not be the case. So we, as educators, have to try to instill in our students a greater sense of skepticism, even toward official sources.
—What skills should today's journalist cultivate to be more competitive in his or her career? —As I suggested earlier, quantitative skills. Journalism is the first refuge of the math-phobic, or at least it has been. People who are afraid of arithmetic tell themselves they would rather be journalists. That no longer works. We need those quantitative and analytic skills; that would be the first.
The second most important, I believe, is the use of geographic information systems, because everything people do or say has a geographic point: demographics, plans, traffic flow... all of these can give us a basis for understanding different kinds of phenomena and for learning something, through their arithmetic, from observation and location.
—In your experience, what is the most curious thing that has happened to you in this era of rapid technological change? —I have lived in San Francisco since 1975 and, going back to my days as a journalist at Time magazine, I have watched this computer revolution develop within our profession. I had the chance to attend the first computer fair in San Francisco, in 1980.
I have seen so many changes since then, such as instant translation in a chat among colleagues from different countries and languages. That speed is the simultaneous contact I am referring to, with people who have never met in their lives. That is magic.
—In the case of Latin America, what challenges do editors face with respect to the web? —I think the challenges are the same: how do we keep producing newspapers every day and still find time for the kind of training I mentioned at the beginning? I think that for the next 20 years Latin America, Africa, and southwest Asia have growth potential in terms of “ink and paper,” in contrast with Europe and North America, where we see circulation declining, not only in the number of potential readers but also in the scarcity of young students entering journalism. You do not have that problem; we and Europe do.
—Blogs: a fad or a new form of interaction on the web? —Both. I prefer to see them as an interesting technology, because it is very accessible and requires few skills. That is why you have millions of people writing a kind of virtual diary. It becomes a matter of ego, but that same technology is creating opportunities across communities to let people know what is happening around them. Anyone, with minimal effort, can contribute to a community conversation.
—With blogs and the ease of producing and distributing information, people say that everyone can do journalism. Is that dangerous? —I do not think so. There is a self-correcting mechanism in the flow of information. There can be brief moments when an amateur captures the essence of a news event as well as any journalist, though perhaps not as artfully as a professional photographer or writer.
When something happens, citizens can take photos with their cell phones. Is that journalism? I believe it is. Citizens, in the long run, become journalists by chance.
—What services and features must an online newspaper offer to be truly competitive? —Maps: show the reader the geographic point where the story is unfolding (read: infographics). Second, the more primary data related to their lives we can offer readers, the more they will want to visit the website.
There have been successful efforts to put real-time traffic-map images on newspaper sites, so readers can see whether there is an accident or a traffic jam. That forewarns them.
Newspapers in this new era must become a marketplace for transactions: the public should be able to get the information that interests them, whether it is politics, movie reviews, or the price of a bar of soap. After all, how much do we do to connect people in their communities with our potential advertisers? ###
Pardon the expression, but there seems to be a real “surge” in infographics and visual statistics news in recent days. This post on Tim O'Reilly's blog (an increasingly informative site, I find) points us to some interesting tools out of the IBM shop. Be sure to check out the site for “Many Eyes.” Impressive, and highly informative, visualization of useful data.
By Tim O'Reilly
IBM today announced Many Eyes, a site for sharing and commenting on visualizations. Martin Wattenberg, who developed the original version of the treemap we use for our book market visualizations as well as the awesome baby name voyager, and Fernanda Viegas, who worked with him on the equally awesome history flow visualizations of Wikipedia, are the geniuses behind this project.
As with swivel, users can upload any data set, but the tools for visualizing and graphing the data are much richer. The visualization options include US and World maps, line graphs, stack graphs, bar charts, block histograms, bubble diagrams, scatter plots, network diagrams, pie charts, and treemaps. The site isn't yet live, but should be very shortly. Meanwhile, you can get a good sense of the types of graphs available by checking out the visualization gallery.
I asked Martin and Fernanda how they compared themselves to swivel, and Fernanda replied:
You also asked if we see our site as “Swivel for visualization”. That phrase isn't quite accurate (any more than Swivel is “Many Eyes for data” ;-). Both our site and Swivel are examples of a broader phenomenon, which we call “social data analysis,” where playful, social exploration of data leads to serious analysis. At the same time the two sites fall on different ends of a spectrum. Swivel seems to have some neat data mining technology that finds correlations automatically. By contrast, we've placed our emphasis on the power of human visual intelligence to find patterns. My guess is that both approaches will be successful because social data analysis is a powerful idea.
Martin added:
In Many Eyes our goal is to “democratize” visualization by offering it as a simple service. We also think that there's something special about visualizations that gets people talking, so we placed a big emphasis in design and technology to let people have conversations around the visualizations.
Personally, I'd love to see swivel and manyeyes working together, as swivel already has some great data sets, but has only a limited number of graphing tools. But that's an exercise for the future. For now, data wonks can just rejoice that both sites exist, and should start exploring, and as Martin says, conversing about what they find. I love both of these sites.
Thanks to our friend at the University of Zulia in Maracaibo, Prof. María Isabel Neuman, we just learned about this Rosetta Stone of data visualization. This is a must-see: “A Periodic Table of Visualization Methods.” http://www.visual-literacy.org/pages/documents.htm These guys in Switzerland at the Visual-Literacy Project have pulled together, in a wonderfully coherent fashion, the multiple concepts that many of us have been working on for years. Be sure to also take a look at the paper by Lengler and Eppler at the bottom of the “Maps” page. It's a good, tight explanation of what they are up to. We like their definition:
But we're not so sure that “permanent” is crucial or should even be included. If they are referring to “method,” then that would seem to limit the opportunity for refinements over time. And if they are talking about the resulting displays of data, might not that reduce the possibility of dynamic data displays, say real-time traffic flows or changes in the stock market? Simulations? Oh, well, a refinement ripe for discussion.