5.0 out of 5 stars: World-Changing Book Documenting the Intersection of Humans, Technology, and Policy-Ethics, February 2, 2015
This is a hugely important work, one that responds to the critical needs outlined by Micah Sifry in The Big Disconnect: Why The Internet Hasn’t Transformed Politics (Yet), and by others such as myself who have spent the past 25 years writing on the need to reform the pathologically dysfunctional US secret intelligence community, which is in constant betrayal of the public trust.
Digital Humanitarians are BURYING the secret world. For all the brouhaha over NSA’s mass surveillance and the $100 billion a year we spend on largely technical spying (yet processing only 1% of what we spend so much to collect), there are two huge facts that this book, FOR THE FIRST TIME, documents:
Crowds—rather than sole individuals—are increasingly bearing witness to disasters large and small. Instagram users, for example, snapped 800,000 #Sandy pictures during the hurricane last year. One way to make sense of this vast volume and velocity of multimedia content—Big Data—during disasters is with PhotoSynth, as blogged here. Another, perhaps more sophisticated, approach is to use CrowdOptic, which automatically zeros in on the specific location that eyewitnesses are looking at when using their smartphones to take pictures or record videos.
“Once a crowd’s point of focus is determined, any content generated by that point of focus is automatically authenticated, and a relative significance is assigned based on CrowdOptic’s focal data attributes […].” These include: (1) Number of Viewers; (2) Location of Focus; (3) Distance to Epicenter; (4) Cluster Timestamp, Duration; and (5) Cluster Creation, Dissipation Speed. CrowdOptic can also be used on live streams and archival images & videos. Once a cluster is identified, the best images/videos pointing to this cluster are automatically selected.
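CrowdOptic’s actual algorithms are proprietary, so purely by way of illustration, here is a minimal Python sketch of the underlying geometric idea: each eyewitness contributes a position and a compass bearing, and the crowd’s point of focus falls where those sightlines intersect. The flat-plane coordinates and the witnesses list below are invented assumptions, not CrowdOptic’s API.

```python
import math
from itertools import combinations

def sightline_intersection(p1, b1, p2, b2):
    """Intersect two sightlines, each given as an (x, y) position plus a
    compass bearing in degrees. Returns the (x, y) focal point, or None
    if the lines are (nearly) parallel. Flat-plane approximation."""
    # Convert compass bearings (clockwise from north) to direction vectors.
    d1 = (math.sin(math.radians(b1)), math.cos(math.radians(b1)))
    d2 = (math.sin(math.radians(b2)), math.cos(math.radians(b2)))
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None
    t = ((p2[0] - p1[0]) * d2[1] - (p2[1] - p1[1]) * d2[0]) / denom
    if t < 0:  # intersection would be behind the first eyewitness
        return None
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# Hypothetical eyewitnesses: (x, y) position in meters plus camera bearing.
witnesses = [((0, 0), 45), ((100, 0), 315), ((50, -80), 0)]

# Pairwise intersections approximate the crowd's point of focus; a real
# system would then cluster these points and weight by witness count.
focal_points = [
    pt for (p1, b1), (p2, b2) in combinations(witnesses, 2)
    if (pt := sightline_intersection(p1, b1, p2, b2)) is not None
]
print(focal_points)
```

A production system would of course work in geographic coordinates and cluster the intersection points over time, which is roughly what attributes like Cluster Timestamp and Dissipation Speed describe.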
My colleague Fernando Diaz has continued working on an interesting Wikipedia project since he first discussed the idea with me last year. Since Wikipedia is increasingly used to crowdsource live reports on breaking news such as sudden-onset humanitarian crises and disasters, why not mine these pages for structured information relevant to humanitarian response professionals?
In computing-speak, Sequential Update Summarization is a task that generates useful, new and timely sentence-length updates about a developing event such as a disaster. In contrast, Value Tracking tracks the value of important event-related attributes such as fatalities and financial impact. Fernando and his colleagues will be using both approaches to mine and analyze Wikipedia pages in real time. Other attributes worth tracking include injuries, number of displaced individuals, infrastructure damage and perhaps disease outbreaks. Pictures of the disaster uploaded to a given Wikipedia page may also be of interest to humanitarians, along with meta-data such as the number of edits made to a page per minute or hour and the number of unique editors.
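To make Value Tracking concrete, here is a minimal sketch of what tracking a single attribute might look like. The regular expression, the track_fatalities helper and the sample updates are all invented for illustration; this is not Fernando’s actual system.

```python
import re

# A minimal sketch of the value-tracking idea: scan timestamped update
# sentences for an attribute (here, fatalities) and record each newly
# reported value as the event develops.
FATALITY_PATTERN = re.compile(
    r"(\d[\d,]*)\s+(?:people\s+)?(?:dead|killed|fatalities|deaths)",
    re.IGNORECASE,
)

def track_fatalities(updates):
    """updates: list of (timestamp, sentence) pairs, assumed time-ordered.
    Returns the evolving fatality count as reported over time."""
    history = []
    for ts, sentence in updates:
        match = FATALITY_PATTERN.search(sentence)
        if match:
            history.append((ts, int(match.group(1).replace(",", ""))))
    return history

# Invented example updates mimicking Wikipedia edits during a disaster.
updates = [
    ("2013-11-08T06:00", "A powerful typhoon has made landfall."),
    ("2013-11-08T18:00", "Officials confirm 138 people dead."),
    ("2013-11-10T09:00", "The death toll has risen to 1,774 dead."),
]
print(track_fatalities(updates))
# [('2013-11-08T18:00', 138), ('2013-11-10T09:00', 1774)]
```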
Fernando and his colleagues have recently launched this tech challenge to apply these two advanced computing techniques to disaster response based on crowdsourced Wikipedia articles. The challenge is part of the Text Retrieval Conference (TREC), which is being held in Maryland this November. As part of this applied research and prototyping challenge, Fernando et al. plan to use the resulting summarization and value tracking from Wikipedia to verify related crisis information shared on social media. Needless to say, I’m really excited about the potential. So Fernando and I are exploring ways to ensure that the results of this challenge are appropriately transferred to the humanitarian community. Stay tuned for updates.
See also: Web App Tracks Breaking News Using Wikipedia Edits [Link]
As part of QCRI’s Artificial Intelligence for Monitoring Elections (AIME) project, I liaised with Kaggle to work with a top-notch Data Scientist on a proof-of-concept study. As I’ve blogged in the past, crowdsourced election monitoring projects are starting to generate “Big Data” which cannot be managed or analyzed manually in real time. Using the crowdsourced election reporting data recently collected by Uchaguzi during Kenya’s elections, we therefore set out to assess whether one could use machine learning to automatically tag user-generated reports according to topic, such as election violence. The purpose of this post is to share the preliminary results from this innovative study, which we believe is the first of its kind.
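For readers curious what such automatic tagging involves, below is a baseline sketch using TF-IDF features and logistic regression. The four toy reports and their labels are invented stand-ins for the Uchaguzi data, and the study’s actual model may well differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled reports standing in for the Uchaguzi dataset; labels mark
# whether a report concerns election violence.
reports = [
    "Youths clashed with police outside the polling station",
    "Voting proceeded peacefully in our ward this morning",
    "Gunshots heard near the tallying center",
    "Long but orderly queues at the primary school",
]
labels = [1, 0, 1, 0]  # 1 = election violence, 0 = other

# TF-IDF features plus logistic regression: a common baseline for
# short-text classification, not necessarily the model used in the study.
classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                           LogisticRegression())
classifier.fit(reports, labels)

# Tag a new incoming report automatically.
print(classifier.predict(["Rioting reported near the polling station"]))
```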
My colleague Hemant Purohit at QCRI has been working with us on automatically extracting needs and offers of help posted on Twitter during disasters. When the two-mile-wide Category 4 tornado struck Moore, Oklahoma, he immediately began to collect relevant tweets about the tornado’s impact and applied the algorithms he developed at QCRI to extract needs and offers of help.
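By way of illustration only, a crude keyword-based version of the need/offer distinction might look like the sketch below. Hemant’s algorithms are far more sophisticated (trained classifiers rather than keyword lists), so treat the cue lists and example tweets as invented placeholders.

```python
# A rule-based sketch of the need/offer idea, not Hemant's actual approach:
# simple keyword cues assign each tweet to "need", "offer" or "other".
NEED_CUES = ("need", "needs", "require", "looking for", "running out of")
OFFER_CUES = ("donate", "offering", "can provide", "willing to", "giving away")

def label_tweet(text):
    lowered = text.lower()
    if any(cue in lowered for cue in NEED_CUES):
        return "need"
    if any(cue in lowered for cue in OFFER_CUES):
        return "offer"
    return "other"

tweets = [  # invented examples
    "Moore shelter urgently needs bottled water and blankets",
    "Offering free rides to anyone stranded near I-35",
    "Thoughts with everyone in Oklahoma tonight",
]
print([(t, label_tweet(t)) for t in tweets])
```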
My colleague Samia Kallidis is launching a brilliant self-help app to facilitate community-based disaster recovery efforts. Samia is an MFA Candidate at the School of Visual Arts in New York. While her work on this peer-to-peer app began as part of her thesis, she has since been accepted to the NEA Studio Incubator Program to make her app a reality. NEA provides venture capital to help innovative entrepreneurs build transformational initiatives around the world. So huge congrats to Samia on this outstanding accomplishment. I was already hooked back in February when she presented her project at NYU and am even more excited now. Indeed, there are exciting synergies with the MatchApp project I’m working on with QCRI and MIT-CSAIL, which is why we’re happily exploring ways to collaborate & complement our respective initiatives.
My colleagues at the United Nations Office for the Coordination of Humanitarian Affairs (OCHA) have just published a groundbreaking must-read study on Humanitarianism in the Network Age; an important and forward-thinking policy document on humanitarian technology and innovation. The report “imagines how a world of increasingly informed, connected and self-reliant communities will affect the delivery of humanitarian aid. Its conclusions suggest a fundamental shift in power from capital and headquarters to the people [that] aid agencies aim to assist.” The latter is an unsettling prospect for many. To be sure, Humanitarianism in the Network Age calls for “more diverse and bottom-up forms of decision-making—something that most Governments and humanitarian organizations were not designed for. Systems constructed to move information up and down hierarchies are facing a new reality where information can be generated by anyone, shared with anyone and acted on by anyone.”
The purpose of this blog post (available as a PDF) is to summarize the 120-page OCHA study. In this summary, I specifically highlight the most important insights and profound implications. I also fill what I believe are some of the report’s most important gaps. I strongly recommend reading the OCHA publication in full, but if you don’t have time to leaf through the study, reading this summary will ensure that you don’t miss a beat. Unless otherwise stated, all quotes and figures below are taken directly from the OCHA report.
An iRevolution reader very kindly pointed me to this excellent conceptual study: “The Theory of Crowd Capital”. The authors’ observations and insights resonate with me deeply given my experience in crowdsourcing digital humanitarian response. Over two years ago, I published this blog post in which I wrote that, “The value of Crisis Mapping may at times have less to do with the actual map and more with the conversations and new collaborative networks catalyzed by launching a Crisis Mapping project. Indeed, this in part explains why the Standby Volunteer Task Force (SBTF) exists in the first place.” I was not very familiar with the concept of social capital at the time, but that’s precisely what I was describing. I’ve since written extensively about the very important role that social capital plays in disaster resilience and digital humanitarian response. But I hadn’t taken the obvious next step: “Crowd Capital.”
My team and I at QCRI have just had this paper (PDF) accepted at the World Wide Web (WWW 2013) conference in Rio next month. The paper relates directly to our Artificial Intelligence for Disaster Response (AIDR) project. One of our main missions at QCRI is to develop open source and freely available next-generation humanitarian technologies to better manage Big (Crisis) Data. Over 20 million tweets and half a million Instagram pictures were posted during Hurricane Sandy, for example. In Japan, more than 2,000 tweets were posted every second the day after the devastating earthquake and tsunami struck the eastern coast. Recent empirical studies have shown that an important percentage of tweets posted during disasters are informative and even actionable. The challenge before us is how to find those proverbial needles in the haystack and to do so in as close to real time as possible.
So we analyzed disaster tweets posted during Hurricane Sandy (2012) and the Joplin Tornado (2011). We demonstrate that disaster-relevant information can be automatically extracted from these datasets. The results indicate that 40% to 80% of tweets that contain disaster-related information can be automatically detected. We also demonstrate that we can correctly identify the type of disaster information 80% to 90% of the time. Because these classifiers are developed using machine learning, they get more accurate with more data. This explains why we are building AIDR. Our aim is not to replace human involvement and oversight but to significantly lessen the load on humans.
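To illustrate how such accuracy figures are typically estimated, here is a minimal sketch of the first stage (informative vs. not informative) evaluated with cross-validation. The eight toy tweets are invented, and the paper’s actual features and models may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Stage 1 of a two-stage setup like the paper's: filter informative tweets
# first, then classify the information type. The tweets below are invented;
# the real study used large labeled Sandy and Joplin datasets.
tweets = [
    "Power lines down on 5th street, avoid the area",  # informative
    "Red Cross shelter open at Lincoln High gym",       # informative
    "Bridge on route 66 flooded and impassable",        # informative
    "Hospital asking for O-negative blood donors",      # informative
    "Stay safe everyone, praying for Joplin",           # not informative
    "Can't believe this weather, so scary",             # not informative
    "Watching storm coverage all night",                # not informative
    "Sending love to everyone out east",                # not informative
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
# Cross-validation yields the kind of accuracy estimate behind the
# 40-80% / 80-90% figures (computed on far larger, expert-labeled data).
scores = cross_val_score(model, tweets, labels, cv=4)
print(scores.mean())
```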
Both humanitarian and development organizations are completely unprepared to deal with the rise of “Big Crisis Data” & “Big Development Data.” But many still hope that Big Data is just an illusion. Not so, as I’ve already blogged here, here and here. This explains why I’m on a quest to tame the Big Data Beast. Enter Zooniverse. I’ve been a huge fan of Zooniverse for as long as I can remember, and certainly long before I first mentioned them in this post from two years ago. Zooniverse is a citizen science platform that evolved from Galaxy Zoo in 2007. Today, Zooniverse “hosts more than a dozen projects which allow volunteers to participate in scientific research” (1). So, why do I have a major “techie crush” on Zooniverse?
Oh let me count the ways. Zooniverse interfaces are absolutely gorgeous, making them a real pleasure to spend time with; they really understand user-centered design and motivations. The fact that Zooniverse is conversant in multiple disciplines is incredibly attractive. Indeed, the platform has been used to produce rich scientific data across multiple fields such as astronomy, ecology and climate science. Furthermore, this citizen science beauty has a user base of some 800,000 registered volunteers—with an average of 500 to 1,000 new volunteers joining every day! To place this into context, the Standby Volunteer Task Force (SBTF), a digital humanitarian group, has about 1,000 volunteers in total. The open source Zooniverse platform also scales like there’s no tomorrow, enabling hundreds of thousands to participate in a single deployment at any given time. In short, the software supporting these pioneering citizen science projects is well tested and rapidly customizable.
One of the most attractive features of many microtasking platforms such as Zooniverse is quality control. Think of slot machines. The only way to win big is by having three matching figures such as the three yellow bells in the picture above (righthand side). Hit the jackpot and the coins will flow. Get two out of three matching figures (lefthand side), and some slot machines may toss you a few coins for your efforts. Microtasking uses the same approach. Only if three participants tag the same picture of a galaxy as being a spiral galaxy does that data point count. (Of course, you could decide to change the requirement from 3 volunteers to 5 or even 20 volunteers.) This important feature allows microtasking initiatives to ensure a high standard of data quality, which may explain why many Zooniverse projects have resulted in major scientific breakthroughs over the years.
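In code, this slot-machine style of quality control reduces to a simple agreement rule. The sketch below is my own illustration, not Zooniverse’s implementation:

```python
from collections import Counter

def consensus_label(labels, required=3):
    """Accept a label only if at least `required` volunteers agree,
    mirroring the slot-machine analogy: three matching tags or no jackpot."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= required else None

# Invented volunteer tags for two galaxy images.
print(consensus_label(["spiral", "spiral", "spiral", "elliptical"]))  # 'spiral'
print(consensus_label(["spiral", "elliptical", "merger"]))            # None
```

Raising `required` trades throughput for data quality, which is exactly the knob the paragraph above describes.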
Resilience is often defined as the capacity for self-organization, which in essence is cooperation without hierarchy. In turn, such cooperation implies mutuality: reciprocation, mutual dependence. This is what the French politician, philosopher, economist and socialist Pierre-Joseph Proudhon “had in mind when he first used the term ‘anarchism,’ namely, mutuality, or cooperation without hierarchy or state rule” (1).
As renowned Yale Professor James Scott explains in his latest book, Two Cheers for Anarchism, “Forms of informal cooperation, coordination, and action that embody mutuality without hierarchy are the quotidian experience of most people.” To be sure, “most villages and neighborhoods function precisely because of the informal, transient networks of coordination that do not require formal organization, let alone hierarchy. In other words, the experience of anarchistic mutuality is ubiquitous.”
The existence, power and reach of the nation-state over the centuries may have undermined the self-organizing capacity (and hence resilience) of individuals and small communities. Indeed, “so many functions that were once accomplished by mutuality among equals and informal coordination are now state organized or state supervised.” In other words, “the state, arguably, destroys the natural initiative and responsibility that arise from voluntary cooperation.”
This goes to the heart of what James Scott argues in his new book, which he does in a very compelling manner. Says Scott: “I am suggesting that two centuries of a strong state and liberal economies may have socialized us so that we have largely lost the habits of mutuality and are in danger now of becoming precisely the dangerous predators that Hobbes thought populated the state of nature. Leviathan may have given birth to its own justification.” And yet, we also see a very different picture of reality, one in which solidarity thrives and mutual aid remains the norm: we see this reality surface over & over during major disasters—a reality facilitated by mobile technology and social media networks.
GeoFeedia was not originally designed to support humanitarian operations. But last year’s blog post on the potential of GeoFeedia for crisis mapping caught the interest of CEO Phil Harris. So he kindly granted the Standby Volunteer Task Force (SBTF) free access to the platform. In return, we provided his team with feedback on what features (listed here) would make GeoFeedia more useful for digital disaster response. This was back in summer 2012. I recently learned that they’ve been quite busy since. Indeed, I had the distinct pleasure of sharing the stage with Phil and his team at this superb conference on social media for emergency management. After listening to their talk, I realized it was high time to publish an update on GeoFeedia, especially since we had used the tool just two months earlier in response to Typhoon Pablo, one of the worst disasters to hit the Philippines in the past 100 years.
Humanitarian donors and organizations are increasingly championing innovation and the use of new technologies for humanitarian response. DfID, for example, is committed to using “innovative techniques and technologies more routinely in humanitarian response” (2011). In a more recent strategy paper, DfID confirmed that it would “continue to invest in new technologies” (2012). ALNAP’s important report on “The State of the Humanitarian System” documents the shift towards greater innovation, “with new funds and mechanisms designed to study and support innovation in humanitarian programming” (2012). A forthcoming landmark study by OCHA makes the strongest case yet for the use and early adoption of new technologies for humanitarian response (2013).
These strategic policy documents are game-changers and pivotal to ushering in the next wave of humanitarian technology and innovation. That said, the reports are limited by the very fact that the authors are humanitarian professionals and thus not necessarily familiar with the field of advanced computing. The purpose of this post is therefore to set out a more detailed research framework for next generation humanitarian technology and innovation—one with a strong focus on information systems for crisis response and management.
My colleagues and I at QCRI partnered with the World Bank several months ago to develop an automated GeoTagger platform to increase the transparency and accountability of international development projects by accelerating the process of opening key development and finance data. We are proud to launch the first version of the GeoTagger platform today. The project builds on the Bank’s Open Data Initiatives promoted by former President Robert Zoellick and continued under the current leadership of Dr. Jim Yong Kim.
The Bank has accumulated an extensive amount of socio-economic data as well as a massive amount of data on Bank-sponsored development projects worldwide. Much of this data, however, is not directly usable by the general public due to numerous data format, quality and access issues. The Bank therefore launched their “Mapping for Results” initiative to visualize the location of Bank-financed projects to better monitor development impact and improve aid effectiveness and coordination while enhancing transparency and social accountability. The geo-tagging of this data, however, has been especially time-consuming and tedious. Numerous interns were required to manually read through tens of thousands of dense World Bank project documents, safeguard documents and results reports to identify and geocode exact project locations. But there are hundreds of thousands of such PDF documents. To make matters worse, these documents make seemingly “random” passing references to project locations, with no sign of any standardized reporting structure whatsoever.
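At its simplest, the automated alternative boils down to matching document text against a gazetteer of known place names. The sketch below illustrates that one step only; the three Kenyan entries are invented stand-ins, and the actual GeoTagger platform is considerably more sophisticated (it must disambiguate places and cope with noisy PDF text):

```python
# A simplified sketch of the geotagging step: scan project-document text for
# known place names in a gazetteer. The three entries below are invented
# stand-ins; a real system would use far larger reference data.
GAZETTEER = {
    "Mombasa": (-4.05, 39.67),
    "Kisumu": (-0.09, 34.77),
    "Nakuru": (-0.30, 36.07),
}

def geotag(text):
    """Return (place, (lat, lon)) for every gazetteer entry mentioned."""
    return [(place, coords) for place, coords in GAZETTEER.items()
            if place.lower() in text.lower()]

doc = ("The project will rehabilitate water infrastructure serving "
       "communities around Kisumu and Nakuru counties.")
print(geotag(doc))
# [('Kisumu', (-0.09, 34.77)), ('Nakuru', (-0.3, 36.07))]
```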
Professor Muki Haklay kindly shared with me this superb new study in which he questions the alleged democratization effects of Neogeography. As my colleague Andrew Turner explained in 2006, “Neogeography means ‘new geography’ and consists of a set of techniques and tools that fall outside the realm of traditional GIS, Geographic Information Systems. […] Essentially, Neogeography is about people using and creating their own maps, on their own terms and by combining elements of an existing toolset. Neogeography is about sharing location information with friends & visitors, helping shape context, and conveying understanding through knowledge of place.” To this end, as Muki writes, “it is routinely argued that the process of producing and using geographical information has been fundamentally democratized.” For example, as my colleague Nigel Snoad argued in 2011, “[…] Google, Microsoft and OpenStreetMap have really democratized mapping.” Other CrisisMappers, including myself, have made similar arguments over the years.
Over 1 million unique users posted more than 2.7 million tweets in just 3 days following the triple bomb blasts that struck Mumbai on July 13, 2011. Out of these, over 68,000 tweets were “original tweets” (in contrast to retweets) and related to the bombings. An analysis of these tweets yielded some interesting patterns. (Note that the Ushahidi Map of the bombings captured ~150 reports; more here).
One unique aspect of this study (PDF) is the methodology used to assess the quality of the Twitter dataset. The number of tweets per user was graphed in order to test for a power law distribution. The graph below shows the log distribution of the number of tweets per user. The straight line suggests power law behavior. This finding is in line with previous research done on Twitter. So the authors conclude that the quality of the dataset is comparable to the quality of Twitter datasets used in other peer-reviewed studies.
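For readers who want to replicate this kind of quality check, here is a minimal sketch using synthetic Zipf-distributed counts in place of the Mumbai dataset (which I don’t have access to). A straight line on the log-log plot, with slope near the Zipf exponent, signals power law behavior.

```python
import numpy as np
import matplotlib.pyplot as plt

# Sketch of the paper's quality check: if tweets-per-user follows a power
# law, the log-log frequency plot is roughly a straight line. The synthetic
# counts below stand in for the Mumbai dataset.
rng = np.random.default_rng(42)
tweets_per_user = rng.zipf(a=2.0, size=100_000)  # Zipf = discrete power law

# Count how many users posted each number of tweets, then fit a line in
# log-log space; the fitted slope should be roughly -2 for Zipf(a=2).
values, counts = np.unique(tweets_per_user, return_counts=True)
slope, intercept = np.polyfit(np.log(values), np.log(counts), deg=1)
print(f"fitted log-log slope: {slope:.2f}")

plt.loglog(values, counts, marker=".", linestyle="none")
plt.xlabel("tweets per user")
plt.ylabel("number of users")
plt.show()
```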