“Arguing that Big Data isn’t all it’s cracked up to be is a straw man, pure and simple—because no one should think it’s magic to begin with.” Since citing this point in my previous post on Big Data for Disaster Response: A List of Wrong Assumptions, I’ve come across more mischaracterizations of Big (Crisis) Data. Most of these fallacies originate from the Ivory Towers; from [a small number of] social scientists who have carried out one or two studies on the use of social media during disasters and repeat their findings ad nauseam as if their conclusions are the final word on a very new area of research.
The mischaracterization of “Big Data and Sample Bias”, for example, typically arises when academics point out that marginalized communities do not have access to social media. First things first: I highly recommend reading “Big Data and Its Exclusions,” published by Stanford Law Review. While the piece does not address Big Crisis Data, it is nevertheless instructive when thinking about social media for emergency management. Secondly, identifying who “speaks” (and who does not speak) on social media during humanitarian crises is of course imperative, but that’s exactly why the argument about sample bias is such a straw man—all of my humanitarian colleagues know full well that social media reports are not representative. They live in the real world where the vast majority of data they have access to is unrepresentative and imperfect—hence the importance of drawing on as many sources as possible, including social media. Random sampling during disasters is a Quixotic luxury, which explains why humanitarian colleagues seek “good enough” data and methods.
Some academics also seem to believe that disaster responders ignore all other traditional sources of crisis information in favor of social media. This means, to follow their argument, that marginalized communities have no access to other communication lifelines if they are not active on social media. One popular observation is the “revelation” that some marginalized neighborhoods in New York posted very few tweets during Hurricane Sandy. Why some academics want us to be surprised by this, I know not. And why they seem to imply that emergency management centers will thus ignore these communities (since they apparently only respond to Twitter) is also a mystery. What I do know is that social capital and the use of traditional emergency communication channels do not disappear just because academics chose to study tweets. Social media is simply another node in the pre-existing ecosystem of crisis information.
Furthermore, the fact that very few tweets came out of the Rockaways during Hurricane Sandy can be valuable information for disaster responders, a point that academics often overlook. To be sure, monitoring social media footprints during disasters can help humanitarians get a better picture of the “negative space” and thus infer what they might be missing, especially when comparing these “negative footprints” with data from traditional sources. Indeed, knowing what you don’t know is a key component of situational awareness. No one wants blind spots, and knowing who is not speaking on social media during disasters can help correct said blind spots. Moreover, the contours of a community’s social media footprint during a disaster can shed light on how neighboring areas (that are not documented on social media) may have been affected. When I spoke about this with humanitarian colleagues in Geneva this week, they fully agreed with my line of reasoning and even added that they already apply “good enough” methods of inference with traditional crisis data.
My PopTech colleague Andrew Zolli is fond of saying that we shape the world by the questions we ask. My UN colleague Andrej Verity recently reminded me that one of the most valuable aspects of social media for humanitarian response is that it helps us to ask important questions (that would not otherwise be posed) when coordinating disaster relief. So the next time you hear an academic go on about [a presentation on] issues of bias and exclusion, feel free to share the above along with this list of wrong assumptions.
Most importantly, tell them [say] this: “Arguing that Big Data isn’t all it’s cracked up to be is a straw man, pure and simple—because no one should think it’s magic to begin with.” It is high time we stop mischaracterizing Big Crisis Data. What we need instead is a can-do, problem-solving attitude. Otherwise we’ll all fall prey to the Smart-Talk trap.
“One of the most valuable aspects of social media for humanitarian response is that it helps us to ask important questions (that would not otherwise be posed) when coordinating disaster relief.” Well put. I seem to remember that was the opening slide on the analysis training for the Standby Task Force too. We’ve known this all along; I’m not sure why people keep worrying that we will draw hard conclusions from this type of data.
Many thanks for confirming that I’m not totally off on this, Helena, always reassuring! 🙂
This reminds me of a recent flood simulation we had with official responders. One of those old firemen from a tiny city, who has no idea what Twitter, Facebook or crowdsourcing is, told us:
“During crisis, for us, the firemen, it is like having a dark house where only some rooms are lit (ie information from mayors and other official local sources in villages and cities affected). What you do, is that you are lightning up more rooms for us. So don’t worry, it is enough. Let it be us to figure out what is going on in those rooms.”
Brilliant, many thanks for sharing, Jaro! I’ve actually got a follow-up blog post on exactly this notion of “lightning up more rooms”, and your example is perfect for that as well 🙂
Dear Patrick, I can’t really agree with you on this one. Your arguments don’t hold their ground very well unless you mention the specific studies your criticism is aimed at (“Ivory Tower”, “social scientists”, seriously?). There is widespread hype around Big Data. I am not saying that the crisis management or digital humanitarian community ever fell victim to it; on the contrary! But the hype is there, and there are issues with “Big” Data, just as there are issues with every type of data or data set. Identifying these issues is the first step in dealing with them; in fact, it is the very prerequisite of the problem-solving attitude you (correctly) demand. I am sure that there are studies for which your criticism is absolutely warranted. However, I can’t think of one off the top of my head, and I have read a couple. You should name and shame them directly, instead of addressing a vague group of scientists.
Thanks for reading, Frank. I prefer not to publicly name the individuals/studies in question for reasons I can explain offline. But I imagine you can easily guess said reasons yourself. In any event, I don’t understand why my aversion to publicly naming and shaming (which you recommend I do) undermines any of my arguments.
Let me be more precise: Your “positive” argument on the value of social media for crisis management is not affected, because you have a substantial body of research and practice supporting your case. The same goes for simply stating that anyone who accuses disaster responders of being unaware of issues with Big Data / Social Media, and of focusing too much on them, is wrong. You can disprove that easily. But if you claim that specific academics/scientists/studies are wrong in their argument and use straw men (and from your post it is very clear that you have specific ones in mind), then you should refer directly to them. If you don’t want to do that for whatever reason, that’s fine, but then I would focus more on your argument and less on them. My two cents. 🙂
I enjoyed the post, Patrick, and I agree with your argument about the weakness of the “straw man” thesis; and although I see a lot of criticism about big data in the academic community, I would characterize little of it in this way. For that reason, I side with Frank’s argument about the need for greater clarity regarding which studies specifically have been guilty of the claims you’re alleging. How many are we dealing with? How much influence have these studies had on scholarly, policy and public discourse? This shouldn’t be treated as a “naming and shaming” exercise — let’s leave that for truly morally repugnant stuff. However, if the goal is to sharpen our understanding of the strengths and limitations of big data for emergency response by introducing more nuance and perspective (I think this is what you’re calling for) it’s important to know where these claims reside, to assess the grounds on which they’re made, and to evaluate their impact and significance. Otherwise, I fear we run the risk of producing one set of caricatures to undermine another.
Thanks Josh, and point taken, especially on the risk of producing one set of caricatures to undermine another. In a way, I really wish the individuals/studies were caricatures. I’m reacting with this blog post because it has become clear to me after several months that they are not caricatures. In any event, I am not looking for confrontation, so I prefer not to name names. I just emailed Frank (who commented on my post as well) to give him more of the context.
Fair enough, Patrick. I’m not trying to force a confrontation or push the discussion into public view when that’s not your goal or intention. However, I’m interested in knowing which articles you have in mind for research purposes, as I’m working in this area as well. If you’re willing to share the context and point me to the articles you have in mind, I’d be grateful, and would treat that information confidentially. Cheers.
I should look things up in a dictionary when in doubt – “name and shame” seems to have stronger negative connotations than I thought… maybe “expose weak arguments” fits better… because just to clarify, it wasn’t my intention to stir up some public confrontation or fight here…
Thanks Frank 🙂
Great post, Patrick. I would add that there are plenty of practitioners who also look for any reason to dismiss the type of data we exploit. Typically, they cite the validity canard.
Thanks for your support, David, I really appreciate it.
Patrick, you’re doing great work. When people see something big, and getting lots of attention, they want to be the one to poke holes and predict its downfall. Your arguments are always well supported, nuanced, and readily acknowledge weaknesses and limitations where they exist.
Thanks very much for your kind words and support, Tasha, really appreciate it.