I’m headed to the Philippines this week to collaborate with the UN Office for the Coordination of Humanitarian Affairs (OCHA) on humanitarian crowdsourcing and technology projects. I’ll be based in the OCHA Offices in Manila, working directly with colleagues Andrej Verity and Luis Hernando to support their efforts in response to Typhoon Yolanda. One project I’m exploring in this respect is a novel radio-SMS-computing initiative that my colleague Anahi Ayala (Internews) and I began drafting during ICCM 2013 in Nairobi last week. I’m sharing the approach here to solicit feedback before I land in Manila.
The “Radio + SMS + Computing” project is firmly grounded in GSMA’s official Code of Conduct for the use of SMS in Disaster Response. I have also drawn on the Bellagio Big Data Principles when writing up the ins and outs of this initiative with Anahi. The project is first and foremost a radio-based initiative that seeks to answer the information needs of disaster-affected communities.
The project: Local radio stations in the Philippines would create and broadcast radio programs inviting local communities to serve as “community journalists” and describe how the Typhoon has impacted their communities. The radio stations would provide a free SMS short-code and invite these communities to text in their observations. Each radio station would include a unique 2-letter identifier in its broadcasts and would ask those texting in to start their SMS with that identifier. The stations would also emphasize that text messages should not include any Personal Identifying Information (PII) or location information. Messages that do include PII would be deleted.
Text messages sent to the SMS short code would be automatically triaged by radio station (using the 2-letter identifier) and forwarded to the respective radio stations via SMS. (At this point, few local radio stations in the disaster-affected areas have web access.) These radio stations would be funded to create radio programs based on the SMS’s received. These programs would conclude by asking local communities to text in their information needs, again using the unique radio identifier as a prefix in the text messages. Radio stations would then create follow-up programs to address the information needs texted in by local communities (“news you can use”). This cycle could be repeated on a weekly basis and extended to the post-disaster reconstruction phase.
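To make the triage step concrete, here is a minimal Python sketch of how incoming texts could be sorted by their 2-letter station prefix. This is only an illustration, not a description of any existing system: the station codes in `STATIONS`, the function names `route_sms` and `triage`, and the rule that the prefix must be followed by a space or punctuation are all assumptions of mine.

```python
import re
from collections import defaultdict

# Hypothetical station identifiers; real codes would be assigned per station.
STATIONS = {"AB": "Example Station AB", "CD": "Example Station CD"}

# Two letters at the start of the message, followed by a word boundary,
# then optional separators, then the body of the message.
PREFIX = re.compile(r"^\s*([A-Za-z]{2})\b[\s:,.\-]*(.*)$", re.DOTALL)


def route_sms(text):
    """Return (station_code, body); station_code is None if no known prefix."""
    match = PREFIX.match(text)
    if match and match.group(1).upper() in STATIONS:
        return match.group(1).upper(), match.group(2).strip()
    return None, text.strip()


def triage(messages):
    """Group message bodies by station code; unmatched texts go under None."""
    queues = defaultdict(list)
    for message in messages:
        code, body = route_sms(message)
        queues[code].append(body)
    return dict(queues)


if __name__ == "__main__":
    inbox = ["ab roof gone in our village", "CD: no clean water", "Abandoned houses everywhere"]
    print(triage(inbox))
```

Messages without a recognized prefix are kept in a separate queue (under `None`) rather than dropped, so that a person could review them; whether that is the right policy would depend on the volume of texts received.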
In parallel, the text messages documenting the impact of the Typhoon at the community level would be categorized by Cluster—such as shelter, health, education, etc. Each classified SMS would then be forwarded to the appropriate Cluster Leads. This is where advanced computing comes in: the application of microtasking and machine learning. Trusted Filipino volunteers would be invited to tag each SMS by Cluster-category (and also translate relevant text messages into English). Once enough text messages have been tagged per category, the use of machine learning classifiers would enable the automatic classification of incoming SMS’s. As explained above, these classified SMS’s would then be automatically forwarded to a designated point of contact at each Cluster Agency.
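The hand-off from volunteer tagging to automatic classification could look roughly like the Python sketch below. It uses a small multinomial Naive Bayes classifier written with the standard library only. To be clear, this is an assumption for illustration: it is not how AIDR or any other platform is implemented, and the class name `ClusterClassifier`, the threshold `MIN_TAGGED_PER_CLUSTER`, and the example messages are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical threshold: how many volunteer-tagged SMS each Cluster needs
# before automatic classification is switched on.
MIN_TAGGED_PER_CLUSTER = 3


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class ClusterClassifier:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # cluster -> token counts
        self.doc_counts = Counter()              # cluster -> tagged SMS count
        self.vocab = set()

    def add_tagged(self, text, cluster):
        """Record one SMS that a volunteer has tagged with a Cluster."""
        tokens = tokenize(text)
        self.word_counts[cluster].update(tokens)
        self.doc_counts[cluster] += 1
        self.vocab.update(tokens)

    def ready(self):
        """True once at least two Clusters each have enough tagged SMS."""
        return len(self.doc_counts) >= 2 and all(
            n >= MIN_TAGGED_PER_CLUSTER for n in self.doc_counts.values()
        )

    def classify(self, text):
        """Return the most likely Cluster, or None if still in manual mode."""
        if not self.ready():
            return None  # keep sending this SMS to volunteers
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for cluster, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[cluster].values())
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                count = self.word_counts[cluster][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best, best_score = cluster, score
        return best


if __name__ == "__main__":
    clf = ClusterClassifier()
    for sms in ["our roof is gone", "houses destroyed need tarpaulin", "no shelter roof collapsed"]:
        clf.add_tagged(sms, "shelter")
    for sms in ["many people sick with fever", "clinic has no medicine", "children sick need doctor"]:
        clf.add_tagged(sms, "health")
    print(clf.classify("roof destroyed"))
```

Until every Cluster has enough tagged examples, `classify` returns `None` and the SMS stays with the volunteers. A real deployment would also need to handle messages in local languages and to check the classifier's accuracy before trusting it.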
This process would be repeated for SMS’s documenting the information needs of local communities. In other words, information needs would be classified by Cluster category and forwarded to Cluster Leads. The latter would share their responses to stated information needs with the radio stations who in turn would complement their broadcasts with the information provided by the humanitarian community, thus closing the feedback loop.
The radio-SMS project would be strictly opt-in. Radio programs would clearly state that the data sent in via SMS would be fully owned by local communities, who could call in or text in at any time to have their SMS deleted. Phone numbers would only be shared with humanitarian organizations if the individuals texting to radio stations consented (via SMS) to their numbers being shared. Inviting communities to act as “community journalists” rather than asking them to report their needs may help manage expectations. Radio stations can further manage these expectations during their programs by taking questions from listeners calling in. In addition, the project seeks to limit the number of SMS’s that communities have to send. The greater the amount of information solicited from disaster-affected communities, the more challenging managing expectations may be. The project also makes a point of focusing on local information needs as the primary entry point. Finally, the data collection limits the geographical resolution to the village level for the purposes of data privacy and protection.
It remains to be seen whether this project gets funded, but I’d welcome any feedback iRevolution readers may have in any event, since this approach could also be used in future disasters. In the meantime, my QCRI colleagues and I are looking to modify AIDR to automatically classify SMS’s (in addition to tweets). My UNICEF colleagues have already told me of their need to automatically classify millions of text messages for their U-Report project, so I believe that many other humanitarian and development organizations will benefit from a free and open source platform for automatic SMS classification. At the technical level, this means adding “batch-processing” to AIDR’s current “streaming” feature. We hope to have an update on this in the coming weeks. Note that a batch-processing feature will also allow users to upload their own datasets of tweets for automatic classification.