A Twitter analytics company that said it detected Osama bin Laden’s death before it was reported by the news media has signed a partnership with Twitter, and is expanding the availability of its service for notifying financial firms and government clients of highly unusual events.
The company, Dataminr, described its technology at the Twitter Devnest conference last May, shortly after its service used Twitter data to report bin Laden’s death to its clients before the story hit major media outlets. Today, Dataminr is announcing a partnership with Twitter allowing it greater access to tweets and their metadata, and is expanding availability of the service.
Dataminr and Twitter did not make their executives available for phone interviews, saying Dataminr customers are concerned about revealing too much information, but gave us an early copy of the press release being issued this morning. The announcement says that “Dataminr has just signed a partnership with Twitter, which includes access to the full Twitter Firehose in real-time,” and that it is unveiling “its novel technology for using Twitter’s public Tweets to create actionable signals for enterprise clients.”
Financial services firms, naturally, want to know as soon as possible what major events will move the markets. Bin Laden’s death certainly qualified, and at the time it occurred Dataminr said it already had three of the top five “bulge-bracket investment banks” and a $15 billion equities hedge fund testing the service in beta.
The early detection of the bin Laden death prompted Dataminr to start selling to government clients, a company spokesperson told Ars. (Correction: We are now being told the bin Laden death is not what prompted the company to sell to government clients, but Dataminr does indeed have government customers.) The exact identity of clients hasn’t been revealed, but the company says they include “buy-side and sell-side financial firms, as well as municipal and federal clients in the government sector.” If consistently accurate and early, this type of analytics could be useful for high-frequency trading, in which every second counts.
How Dataminr spotted bin Laden’s death
So, how did Dataminr learn about bin Laden’s death before most of the rest of the world? It came down primarily to 19 tweets seen in a five-minute period. Linguistic analysis, sentiment classification, analysis of Twitter metadata, and monitoring of spikes in volume, merged with unspecified “third-party and client proprietary data,” are used to detect reports of major events and gauge their reliability.
“Dataminr sent an alert in this instance at 10:20 [p.m. Eastern time on May 1, 2011], based on only 19 messages,” Dataminr founder and CEO Ted Bailey told the Twitter Devnest crowd. “This was, for our clients, the earliest warning system in the entire financial industry for this event, which had a dramatic effect on the market once it hit the financial radar. This was also one of the most viral events in Twitter’s history. Messages went from 19 in this 5 minute period where we caught it, up to 20,000 per minute in just half an hour.”
Bailey said the first move in S&P Futures caused by bin Laden’s death occurred at around 10:39 p.m., the US dollar index moved at 10:41, and the New York Times and Bloomberg started reporting the death at 10:43.
There were some earlier, vague indications. A Barack Obama press conference to be held at 10:30 p.m. Eastern time was announced at 9:45, but without details on what the president would say.
At 10:24, pretty solid confirmation came in a tweet from Keith Urbahn, former chief of staff to former defense secretary Donald Rumsfeld. “So I’m told by a reputable person they have killed Osama Bin Laden. Hot damn,” Urbahn wrote.
But that tweet was a few minutes after Dataminr said it was able to break the news. Bailey said various Twitter leaks, combined with the announcement of the Obama press conference and other details, helped make it clear to the company’s system that there was a high statistical likelihood that the bin Laden death rumor was true.
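Dataminr has not published how its detection works, so the following is only a minimal sketch of one ingredient named in its description, monitoring of spikes in volume. The `SpikeDetector` class, the five-minute window, and the threshold of 19 mentions are assumptions chosen to mirror the figures Bailey cited. They are not the company’s actual parameters or method, and a real system would also weigh language, sentiment, metadata, and source reliability.

```python
from collections import deque
from datetime import datetime, timedelta


class SpikeDetector:
    """Flags a topic once enough mentions land inside a sliding time window.

    Hypothetical illustration: the window length and threshold are assumed
    values, not Dataminr's. Timestamps must be passed in chronological order.
    """

    def __init__(self, window=timedelta(minutes=5), threshold=19):
        self.window = window
        self.threshold = threshold
        self.timestamps = deque()

    def observe(self, ts):
        """Record one mention at time ts; return True if the window is 'hot'."""
        self.timestamps.append(ts)
        # Drop mentions that have aged out of the sliding window.
        cutoff = ts - self.window
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps) >= self.threshold
```

Feeding the detector one mention every 15 seconds would trip the threshold on the 19th mention, while one mention per minute would never do so, because no more than five mentions ever fall inside the same five-minute window.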
Since then, Dataminr’s service has improved, it says.
“On any given day, Dataminr alerts its clients to numerous relevant events that are either pre-news or off the mainstream radar,” the company’s press release said. “In recent days, these ranged from an assassination attempt on high-ranking Arab leaders in Tajikistan, to a tsunami warning in Chile, to panic-buying of fuel by the UK public in reaction to an oil tank driver strike.”
Access to the Twitter Firehose will “unlock new value,” Dataminr said. In addition to Dataminr, the social media data companies Gnip and DataSift were previously granted full access to the Firehose, which can help overcome streaming API limits.
The Firehose is not a publicly available resource, and the level of access it provides is required by very few applications, according to Twitter. There is a publicly available version of the Firehose feed called “Spritzer,” but it provides a random sample composed of just 1 percent of all public statuses. Another feed, with limited availability, is called the “Gardenhose” and provides about 10 percent of tweets, making it more suitable for data mining. The Firehose returns all public tweets, billions each month.
The Dataminr stack includes Hadoop for batch processing and MySQL for providing quick data access to users (in the form of desktop widgets), and it is hosted on Amazon’s Elastic Compute Cloud.
While Dataminr tracks certain topics on an ongoing basis, and was tracking tweets about Osama bin Laden prior to his death, its analysis is wide-ranging enough that it can detect quick changes and long-term changes in nearly any topic. Aside from major news events, Dataminr says it tracks signals from “localized on-the-ground chatter, consumer product reactions, discussion shifts in niche online communities, and growth and decay patterns in public attention.”
Phi Beta Iota: Amusing at multiple levels. N-Grams was doing this in 1985, but the CIA refused to be serious then about open source monitoring and the opportunity was lost. The capability has value, but it is not a substitute for intelligence with integrity. There is no connection between the detection of loose lips from a Rumsfeld aide and Bin Laden’s death many years ago. What would be really interesting is an overlay of Twitter lies versus Twitter truths based on veracity scores that are built up over time and cross-fertilized by transparency, truth, and trust circles.