High up on the 28th floor of the New York Times, a pair of researchers have been poring over the newspaper’s data, looking to understand the way influence plays out online. What Mark Hansen, a UCLA statistics professor on sabbatical, and Jer Thorp, a data artist in residence at the Times, have found is that stories take on a life of their own, which can be mapped and visualized in some startlingly beautiful ways. The work, still “crazy” preliminary, shows how organizations are looking to mine their data to find ways to improve their operations. And it also shows the challenges that lie ahead in trying to turn the data into clear actions.
Hansen and Thorp, who talked at a TimesOnline event last night, took two weeks of August data from the paper, looking at how stories were shared through the Times’ site, Bit.ly and Twitter. The pair built a tool that allowed them to see the life of a story, from where it first began as a URL tweeted by the Times to being retweeted and shared again and again. The tool can render a simple timeline, a wheel with spokes or a radar view showing spikes of tweets. But it can also go 3-D, creating a funnel that expands over time as stories keep getting shared.
By visualizing the data, Hansen and Thorp were able to isolate “cascades,” chains of events that extend the life of a story, and to identify who has the influence online to keep it going. For example, a column by Paul Krugman inspired modest sharing but took off when Tim O’Reilly, founder of O’Reilly Media, retweeted it. In other cases, like the story of the flight attendant who escaped down the plane’s slide, the cascades are more dynamic and complicated.
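To make the idea of a cascade concrete, here is a minimal sketch in Python. It is not the Times' tool: the event format, the account names and the "reach" measure (how many later shares trace back to a given account) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical share events: (sharer, shared_from, minutes after publication).
# shared_from is None for the original tweet. All names and times are made up.
events = [
    ("nytimes", None, 0),
    ("alice", "nytimes", 5),
    ("bob", "nytimes", 9),
    ("influencer", "nytimes", 40),
    ("carol", "influencer", 42),
    ("dave", "influencer", 43),
    ("erin", "carol", 50),
]

def build_cascade(events):
    """Map each account to the accounts that shared the story directly from it."""
    children = defaultdict(list)
    for sharer, source, _minute in events:
        if source is not None:
            children[source].append(sharer)
    return children

def downstream_count(children, user):
    """Count every share that traces back to `user`, directly or indirectly."""
    return sum(1 + downstream_count(children, child)
               for child in children.get(user, []))

def rank_by_reach(events):
    """Return (account, reach) pairs, largest reach first.

    Assumes each account shares the story at most once.
    """
    children = build_cascade(events)
    reach = {sharer: downstream_count(children, sharer)
             for sharer, _source, _minute in events}
    return sorted(reach.items(), key=lambda pair: pair[1], reverse=True)

for account, reach in rank_by_reach(events):
    print(account, reach)
```

In this toy data the original tweet has the largest reach, as it must, but the account labeled "influencer" is the one whose retweet sets off a second wave, which is the kind of pattern the Krugman example describes.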
While it’s still quite early, Hansen said the next steps will be to make the project handle both real-time and archived information. The hope is that the Times can suss out which factors affect a story’s life, whether it’s the section it’s in or the time it’s released. But this is where the tough part begins. It’s not enough to get the data; now the paper has to ask the right questions of it. As Michael Driscoll, founder of Dataspora and co-founder of Metamarkets (see disclosure below), said in a previous story, analytics is the key to tapping the potential of big data. The ingestion and visualization of data are critical elements, but analysis is where companies make their money.
Think of using data as a three-step process. One has to have the data, then ask the data the right questions, then act upon the information. But with more data available to people, the number of questions that can be asked expands. It’s kind of like suddenly...