The Multi-Cluster/Sector Initial Rapid Assessment (MIRA) is the methodology used by UN agencies to assess and analyze humanitarian needs within two weeks of a sudden onset disaster. A detailed overview of the process, methodologies and tools behind MIRA is available here (PDF). These reports are particularly insightful when comparing them with the processes and methodologies used by digital humanitarians to carry out their rapid damage assessments (typically done within 48-72 hours of a disaster).
Take the November 2013 MIRA report for Typhoon Haiyan in the Philippines. I am really impressed by how transparent the report is vis-à-vis the very real limitations behind the assessment. For example:
Cryptocurrency: Yesterday, the bitcoin exchange MtGox – riddled by problems – issued a press release saying the bitcoin protocol was to blame for its ongoing problems. That statement, which caused the markets to nosedive temporarily, is outright false. The problem is, and was, bad code hygiene in the MtGox exchange itself. Here are the details.
Yesterday, when MtGox blamed “transaction malleability” as the cause of MtGox’ problems, implying that the problems at MtGox affected all exchanges and everything bitcoin, that was a sign of a very elastic relationship with facts. It’s true that transaction malleability was a factor, but not nearly in the way that MtGox implied. (We’ll be returning to what the “malleability” is.)
Here’s the real problem: MtGox is running its own homebuilt bitcoin software, and has not cared to update and upgrade that software along with the developments of the bitcoin protocol. Recently, after a very long grace period, the bitcoin protocol tightened slightly in order to disallow unnecessary information in transaction records, and did this to fix the malleability problem that MtGox blamed.
So the problem of malleability remained at MtGox, while having been fixed in the rest of the world. This – the discrepancy itself – was the root cause of the problem, because it meant that MtGox started issuing invalid transaction records for bitcoin withdrawals. Obviously, they were rejected by the bitcoin network.
The article How To Do Predictive Analytics with Limited Data from Datameer on Slideshare suggests that Limited Data may replace Big Data in import. The idea of “semi-supervised learning” is presented to handle the difficulties associated with creating predictions based on limited data such as expense and manageability and simply missing key data. The overview states,
“As it turns out, recent research on machine learning techniques has found a way to deal effectively with such situations with a technique called semi-supervised learning. These techniques are often able to leverage the vast amount of related, but unlabeled data to generate accurate models. In this talk, we will give an overview of the most common techniques including co-training regularization. We first explain the principles and underlying assumptions of semi-supervised learning and then show how to implement such methods with Hadoop.”
The presentation summarizes possible approaches to semi-supervised learning and the assumptions it is possible to make about unlabeled data (these include such models as clustering, low density and manifold assumptions). It also covers the concepts of Label Propagation and Nearest Neighbor Join. However, as inviting as it is to forget Big Data, and switch to predictive analytics with Limited Data the suggestion may sound too much like Bayes-Laplace.