“Arguing that Big Data isn’t all it’s cracked up to be is a straw man, pure and simple—because no one should think it’s magic to begin with.” Since citing this point in my previous post on Big Data for Disaster Response: A List of Wrong Assumptions, I’ve come across more mischaracterizations of Big (Crisis) Data. Most of these fallacies originate from the Ivory Towers; from social scientists who have carried out one or two studies on the use of social media during disasters and repeat their findings ad nauseam as if their conclusions are the final word on a very new area of research.
The mischaracterization of “Big Data and Sample Bias”, for example, typically arises when academics point out that marginalized communities do not have access to social media. First things first: I highly recommend reading “Big Data and Its Exclusions,” published by Stanford Law Review. While the piece does not address Big Crisis Data, it is nevertheless instructive when thinking about social media for emergency management. Secondly, identifying who “speaks” (and who does not speak) on social media during humanitarian crises is of course imperative, but that’s exactly why the argument about sample bias is such a straw man—all of my humanitarian colleagues know full well that social media reports are not representative. They live in the real world where the vast majority of data they have access to is unrepresentative and imperfect—hence the importance of drawing on as many sources as possible, including social media. Random sampling during disasters is a Quixotic luxury, which explains why humanitarian colleagues seek “good enough” data and methods.
Some academics also seem to believe that disaster responders ignore all other traditional sources of crisis information in favor of social media. This means, to follow their argument, that marginalized communities have no access to other communication life lines if they are not active on social media. One popular observation is the “revelation” that some marginalized neighborhoods in New York posted very few tweets during Hurricane Sandy. Why some academics want us to be surprised by this, I know not. And why they seem to imply that emergency management centers will thus ignore these communities (since they apparently only respond to Twitter) is also a mystery. What I do know is that social capital and the use of traditional emergency communication channels do not disappear just because academics chose to study tweets. Social media is simply another node in the pre-existing ecosystem of crisis information.
Furthermore, the fact that very few tweets came out of the Rockaways during Hurricane Sandy can be valuable information for disaster responders, a point that academics often overlook. To be sure, monitoring social media footprints during disasters can help humanitarians get a better picture of the “negative space” and thus infer what they might be missing, especially when comparing these “negative footprints” with data from traditional sources. Indeed, knowing what you don’t know is a key component of situational awareness. No one wants blind spots, and knowing who is not speaking on social media during disasters can help correct said bing spots. Moreover, the contours of a community’s social media footprint during a disaster can shed light on how neighboring areas (that are not documented on social media) may have been affected. When I spoke about this with humanitarian colleagues in Geneva this week, they fully agreed with my line of reasoning and even added that they already apply “good enough” methods of inference with traditional crisis data.
My PopTech colleague Andrew Zolli is fond of saying that we shape the world by the questions we ask. My UN colleague Andrej Verity recently reminded me that one of the most valuable aspects of social media for humanitarian response is that it helps us to ask important questions (that would not otherwise be posed) when coordinating disaster relief. So the next time you hear an academic go on about issues of bias and exclusion, feel free to share the above along with this list of wrong assumptions. Most importantly, tell them this: “Arguing that Big Data isn’t all it’s cracked up to be is a straw man, pure and simple—because no one should think it’s magic to begin with.” It is high time we stop mischaracterizing Big Crisis Data. What we need instead is a can-do, problem-solving attitude. Otherwise we’ll all fall prey to the Smart-Talk trap.