5.0 out of 5 starsWorld-Changing Book Documenting Intersection of Humans, Technology, and Policy-Ethics, February 2, 2015
This is a hugely important work, one that responds to the critical needs outlined by Micah Sifry in The Big Disconnect: Why The Internet Hasn’t Transformed Politics (Yet) and others such as myself writing these past 25 years on the need to reform the pathologically dysfunctional US secret intelligence community that is in constant betrayal of the public trust.
Digital Humanitarians are BURYING the secret world. For all the bru-ha-ha over NSA’s mass surveillance and the $100 billion a year we spend doing largely technical spying (yet only processing 1% of what we waste money on in collection), there are two huge facts that this book, FOR THE FIRST TIME, documents:
Crowds—rather than sole individuals—are increasingly bearing witness to disasters large and small. Instagram users, for example, snapped 800,000 #Sandy pictures during the hurricane last year. One way to make sense of this vast volume and velocity of multimedia content—Big Data—during disasters is with PhotoSynth, as blogged here. Another perhaps more sophisticated approach would be to use CrowdOptic, which automatically zeros in on the specific location that eyewitnesses are looking at when using their smartphones to take pictures or recording videos.
“Once a crowd’s point of focus is determined, any content generated by that point of focus is automatically authenticated, and a relative significance is assigned based on CrowdOptic’s focal data attributes […].” These include: (1) Number of Viewers; (2) Location of Focus; (3) Distance to Epicenter; (4) Cluster Timestamp, Duration; and (5) Cluster Creation, Dissipation Speed.” CrowdOptic can also be used on live streams and archival images & videos. Once a cluster is identified, the best images/videos pointing to this cluster are automatically selected.
My colleague Fernando Diaz has continued working on an interesting Wikipedia project since he first discussed the idea with me last year. Since Wikipedia is increasingly used to crowdsource live reports on breaking news such as sudden-onset humanitarian crisis and disasters, why not mine these pages for structured information relevant to humanitarian response professionals?
In computing-speak, Sequential Update Summarization is a task that generates useful, new and timely sentence-length updates about a developing event such as a disaster. In contrast, Value Tracking tracks the value of important event-related attributes such as fatalities and financial impact. Fernando and his colleagues will be using both approaches to mine and analyze Wikipedia pages in real time. Other attributes worth tracking include injuries, number of displaced individuals, infrastructure damage and perhaps disease outbreaks. Pictures of the disaster uploaded to a given Wikipedia page may also be of interest to humanitarians, along with meta-data such as the number of edits made to a page per minute or hour and the number of unique editors.
Fernando and his colleagues have recently launched this tech challenge to apply these two advanced computing techniques to disaster response based on crowdsourced Wikipedia articles. The challenge is part of the Text Retrieval Conference (TREC), which is being held in Maryland this November. As part of this applied research and prototyping challenge, Fernando et al. plan to use the resulting summarization and value tracking from Wikipedia to verify related crisis information shared on social media. Needless to say, I’m really excited about the potential. So Fernando and I are exploring ways to ensure that the results of this challenge are appropriately transferred to the humanitarian community. Stay tuned for updates.
See also: Web App Tracks Breaking News Using Wikipedia Edits [Link]
As part of QCRI’s Artificial Intelligence for Monitoring Elections (AIME) project, I liaised with Kaggle to work with a top notch Data Scientist to carry out a proof of concept study. As I’ve blogged in the past, crowdsourced election monitoring projects are starting to generate “Big Data” which cannot be managed or analyzed manually in real-time. Using the crowdsourced election reporting data recently collected by Uchaguzi during Kenya’s elections, we therefore set out to assess whether one could use machine learning to automatically tag user-generated reports according to topic, such as election-violence. The purpose of this post is to share the preliminary results from this innovative study, which we believe is the first of it’s kind.
My colleague Hemant Purohit at QCRI has been working with us on automatically extracting needs and offers of help posted on Twitter during disasters. When the 2-mile wide, Category 4 Tornado struck Moore, Oklahoma, he immediately began to collect relevant tweets about the Tornado’s impact and applied the algorithms he developed at QCRI to extract needs and offers of help.