Tag Archives: Haiti

Twitter, Crises and Early Detection: Why “Small Data” Still Matters

My colleagues John Brownstein and Rumi Chunara at Harvard University's HealthMap project are continuing to break new ground in the field of Digital Disease Detection. Using data obtained from tweets and online news, the team was able to identify a cholera outbreak in Haiti weeks before health officials acknowledged the problem publicly. Meanwhile, my colleagues from UN Global Pulse partnered with Crimson Hexagon to forecast food prices in Indonesia by carrying out sentiment analysis of tweets. I had actually written this blog post on Crimson Hexagon four years ago to explore how the platform could be used for early warning purposes, so I'm thrilled to see this potential realized.

There is a lot that intrigues me about the work that HealthMap and Global Pulse are doing. But one point that really struck me vis-a-vis the former is just how little data was necessary to identify the outbreak. To be sure, not many Haitians are on Twitter, and my impression is that most humanitarians have not really taken to Twitter either (I'm not sure about the Haitian Diaspora). This would suggest that accurate, early detection is possible even without Big Data; even with "Small Data" that is neither representative nor verified. (Interestingly, Rumi notes that the Haiti dataset is actually larger than datasets typically used for this kind of study.)

In related news, a recent peer-reviewed study by the European Commission found that the spatial distribution of crowdsourced text messages (SMS) following the earthquake in Haiti was strongly correlated with building damage. Again, the dataset of text messages was relatively small. And again, this data was neither collected using random sampling (i.e., it was crowdsourced) nor verified for accuracy. Yet the analysis of this small dataset still yielded some particularly interesting findings that have important implications for rapid damage detection in post-emergency contexts.

While I'm no expert in econometrics, what these studies suggest to me is that detecting change over time is ultimately more critical than having a large-N dataset, let alone one obtained via random sampling or vetted for quality control purposes. That doesn't mean the latter factors are unimportant; it simply means that the outcome of the analysis is relatively less sensitive to these specific variables. Changes in the baseline volume and location of tweets on a given topic appear to be strongly correlated with offline dynamics.
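To make the point concrete, here is a minimal sketch of what such change-over-time detection might look like on a daily tweet-count series. The rolling window and threshold are illustrative assumptions of mine, not values from either study.

```python
from statistics import mean, stdev

def detect_spikes(daily_counts, window=14, z_threshold=3.0):
    """Flag days whose tweet volume deviates sharply from a rolling baseline.

    daily_counts: list of (date, count) tuples in chronological order.
    Returns the (date, count) pairs whose count exceeds the rolling mean
    by z_threshold standard deviations -- a crude proxy for "something
    changed offline", e.g. a spike in cholera-related tweets.
    """
    alerts = []
    for i in range(window, len(daily_counts)):
        baseline = [c for _, c in daily_counts[i - window:i]]
        mu, sigma = mean(baseline), stdev(baseline)
        date, count = daily_counts[i]
        if sigma > 0 and (count - mu) / sigma > z_threshold:
            alerts.append((date, count))
    return alerts
```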

What are the implications for crowdsourced crisis maps and disaster response? Could similar statistical analyses be carried out on Crowdmap data, for example? How small can a dataset be and still yield actionable findings like those mentioned in this blog post?

Some Thoughts on Real-Time Awareness for Tech@State

I've been invited to present at Tech@State in Washington DC to share some thoughts on the future of real-time awareness. So I thought I'd use my blog to brainstorm and invite feedback from iRevolution readers. The organizers of the event have shared the following questions with me as a way to guide the conversation: Where is all of this headed? What will social media look like in five to ten years and what will we do with all of the data? Knowing that the data stream can only increase in size, what can we do now to prepare and prevent being overwhelmed by the sheer volume of data?

These are big, open-ended questions, and I will only have 5 minutes to share some preliminary thoughts. I shall thus focus on how time-critical crowdsourcing can yield real-time awareness and expand from there.

Two years ago, my good friend and colleague Riley Crane won DARPA’s $40,000 Red Balloon Competition. His team at MIT found the location of 10 weather balloons hidden across the continental US in under 9 hours. The US covers more than 3.7 million square miles and the balloons were barely 8 feet wide. This was truly a needle-in-the-haystack kind of challenge. So how did they do it? They used crowdsourcing and leveraged social media—Twitter in particular—by using a “recursive incentive mechanism” to recruit thousands of volunteers to the cause. This mechanism would basically reward individual participants financially based on how important their contributions were to the location of one or more balloons. The result? Real-time, networked awareness.
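For readers curious about the mechanics, the payout rule has been described publicly as a geometric split up the referral chain: the finder of a balloon received $2,000, the person who recruited the finder $1,000, that person's recruiter $500, and so on, with the remainder donated. A minimal sketch of that rule, with hypothetical participant names:

```python
def split_reward(referral_chain, finder_share=2000.0):
    """Pay the finder, then halve the payout at each step up the recruitment chain.

    referral_chain: participants ordered from the balloon's finder up to the
    root recruiter, e.g. ["finder", "recruiter", "recruiters_recruiter"].
    Returns a {participant: payout} dict.
    """
    payouts, share = {}, finder_share
    for person in referral_chain:
        payouts[person] = share
        share /= 2.0
    return payouts

# split_reward(["alice", "bob", "carol"])
# -> {"alice": 2000.0, "bob": 1000.0, "carol": 500.0}
```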

Around the same time that Riley and his team celebrated their victory at MIT, another novel crowdsourcing initiative was taking place just a few miles away at The Fletcher School. Hundreds of students were busy combing through social and mainstream media channels for actionable and mappable information on Haiti following the devastating earthquake that had struck Port-au-Prince. This content was then mapped on the Ushahidi-Haiti Crisis Map, providing real-time situational awareness to first responders like the US Coast Guard and US Marine Corps. At the same time, hundreds of volunteers from the Haitian Diaspora were busy translating and geo-coding tens of thousands of text messages from disaster-affected communities in Haiti who were texting in their location & most urgent needs to a dedicated SMS short code. Fletcher School students filtered and mapped the most urgent and actionable of these text messages as well.

One year after Haiti, the United Nations Office for the Coordination of Humanitarian Affairs (OCHA) asked the Standby Volunteer Task Force (SBTF), a global network of 700+ volunteers, for a real-time map of crowdsourced social media information on Libya in order to improve their own situational awareness. Thus was born the Libya Crisis Map.

The result? The Head of OCHA’s Information Services Section at the time sent an email to SBTF volunteers to commend them for their novel efforts. In this email, he wrote:

“Your efforts at tackling a difficult problem have definitely reduced the information overload; sorting through the multitude of signals on the crisis is no easy task. The Task Force has given us an output that is manageable and digestible, which in turn contributes to better situational awareness and decision making.”

These three examples from the US, Haiti and Libya demonstrate what is already possible with time-critical crowdsourcing and social media. So where is all this headed? You may have noted from each of these examples that their success relied on the individual actions of hundreds and sometimes thousands of volunteers. This is primarily because automated solutions to filter and curate the data stream are not yet available (or rather accessible) to the wider public. Indeed, these solutions tend to be proprietary, expensive and/or classified. I thus expect to see free and open source solutions crop up in the near future; solutions that will radically democratize the tools needed to gain shared, real-time awareness.

But automated natural language processing (NLP) and machine learning alone are not likely to succeed, in my opinion. The data stream is not really a stream; it is a massive torrent of non-indexed information, a 24-hour global firehose of real-time, distributed multi-media data that continues to outpace our ability to produce actionable intelligence from this torrential downpour of 0's and 1's. To turn this data tsunami into real-time shared awareness will require that our filtering and curation platforms become more automated and collaborative. I believe the key is thus to combine automated solutions with real-time collaborative crowdsourcing tools—that is, platforms that enable crowds to collaboratively filter and curate information in real time.
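As a toy illustration of the hybrid approach argued for above, here is a minimal sketch in which an automated first pass handles the obvious cases and routes the ambiguous middle to human volunteers. The keyword list and thresholds are placeholders of my own, not any platform's actual model.

```python
def triage(message, relevant_terms=("trapped", "water", "medical", "collapsed"),
           auto_accept=3, auto_reject=0):
    """Hybrid triage: machines filter the obvious, the crowd judges the rest.

    Scores a message by how many relevant terms it contains. High scores are
    published automatically, zero-score messages are dropped, and everything
    in between is queued for human curation.
    """
    score = sum(term in message.lower() for term in relevant_terms)
    if score >= auto_accept:
        return "auto_publish"
    if score <= auto_reject:
        return "discard"
    return "send_to_crowd"
```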

Right now, when we comb through Twitter, for example, we do so on our own, sitting behind our laptop, isolated from others who may be seeking to filter the exact same type of content. We need to develop free and open source platforms that allow for the distributed-but-networked, crowdsourced filtering and curation of information in order to democratize the sense-making of the firehose. Only then will the wider public be able to win the equivalent of Red Balloon competitions without needing $40,000 or a degree from MIT.

I’d love to get feedback from readers about what other compelling cases or arguments I should bring up in my presentation tomorrow. So feel free to post some suggestions in the comments section below. Thank you!

Tracking Population Movements using Mobile Phones and Crisis Mapping: A Post-Earthquake Geospatial Study in Haiti

I’ve been meaning to blog about this project since it was featured on BBC last month: “Mobile Phones Help to Target Disaster Aid, says Study.” I’ve since had the good fortune of meeting Linus Bengtsson and Xin Lu, the two lead authors of this study (PDF), at a recent strategy meeting organized by GSMA. The authors are now launching “Flowminder” in affiliation with the Karolinska Institutet in Stockholm to replicate their excellent work beyond Haiti. If “Flowminder” sounds familiar, you may be thinking of Hans Rosling’s “Gapminder” which also came out of the Karolinska Institutet. Flowminder’s mission: “Providing priceless information for free for the benefit of those who need it the most.”

As the authors note, "population movements following disasters can cause important increases in morbidity and mortality." That is why the UN sought to develop early warning systems for refugee flows during the 1980s and 1990s. These largely didn't pan out; forecasting is not a trivial challenge. Nowcasting, however, may be easier. That said, "no rapid and accurate method exists to track population movements after disasters." So the authors used "position data of SIM cards from the largest mobile phone company in Haiti (Digicel) to estimate the magnitude and trends of population movements following the Haiti 2010 earthquake and cholera outbreak."

The geographic locations of SIM cards were determined by the location of the mobile phone towers that the SIM cards were connecting to when calling. The authors followed the daily positions of 1.9 million SIM cards for 42 days prior to the earthquake and 158 days following the quake. The results of the analysis reveal that an estimated 20% of the population of Port-au-Prince left the city within three weeks of the earthquake. These findings corresponded well with a large retrospective, population-based survey carried out by the UN.
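The core of the method lends itself to a short sketch: take each SIM's most-used tower per day as its position, then track what share of SIMs are observed in a given area before and after the quake. The data model below (simple tuples plus a tower-to-region lookup) is my own illustrative simplification of the study's approach.

```python
from collections import Counter, defaultdict

def daily_positions(call_records, tower_to_region):
    """Assign each SIM a region per day: the region of its most-used tower.

    call_records: iterable of (sim_id, day, tower_id) call events.
    Returns {(sim_id, day): region}.
    """
    counts = defaultdict(Counter)
    for sim, day, tower in call_records:
        counts[(sim, day)][tower_to_region[tower]] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}

def share_in_region(positions, day, region="Port-au-Prince"):
    """Fraction of SIMs observed on `day` whose position is `region`."""
    on_day = [r for (sim, d), r in positions.items() if d == day]
    return sum(r == region for r in on_day) / len(on_day) if on_day else None
```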

“To demonstrate feasibility of rapid estimates and to identify areas at potentially increased risk of outbreaks,” the authors “produced reports on SIM card movements from a cholera outbreak area at its immediate onset and within 12 hours of receiving data.” This latter analysis tracked close to 140,000 SIM cards over an 8-day period. In sum, the “results suggest that estimates of population movements during disasters and outbreaks can be delivered rapidly and with potentially high validity in areas with high mobile phone use.”

I’m really keen to see the Flowminder team continue their important work in and beyond Haiti. I’ve invited them to present at the International Conference of Crisis Mappers (ICCM 2011) in Geneva next month and hope they’ll be able to join us. I’m interested to explore the possibilities of combining this type of data and analysis with crowdsourced crisis information and satellite imagery analysis. In addition, mobile phone data can also be used to estimate the hardest hit areas after a disaster. For more on this, please see my previous blog post entitled “Analyzing Call Dynamics to Assess the Impact of Earthquakes” and this post on using mobile phone data to assess the impact of building damage in Haiti.

OpenStreetMap’s New Micro-Tasking Platform for Satellite Imagery Tracing

The Humanitarian OpenStreetMap Team's (HOT) response to Haiti remains one of the most remarkable examples of what's possible when volunteers, open source software and open data intersect. When the 7.0 magnitude earthquake struck on January 12th, 2010, the Google Map of downtown Port-au-Prince was simply too incomplete to be used for humanitarian response. Within days, however, several hundred volunteers from the OpenStreetMap (OSM) community used satellite imagery to trace roads, shelters and other important features to create the most detailed map of Haiti ever made.

OpenStreetMap – Project Haiti from ItoWorld on Vimeo.

The video animation above shows just how spectacular this initiative was. More than 1.4 million edits were made to the map during the first month following the earthquake. These individual edits are highlighted as bright flashes of light in the video. This detailed map went a long way to supporting the humanitarian community's response in Haiti. In addition, the map enabled my colleagues and me at The Fletcher School to geo-locate reports from crowdsourced text messages from Mission 4636 on the Ushahidi Haiti Map.

HOT's response was truly remarkable. They created wikis to facilitate mass collaboration, such as this page on "What needs to be mapped?" They also used this "OSM Matrix" to depict which areas required more mapping.

The purpose of OSM’s new micro-tasking platform is to streamline mass and rapid collaboration on future satellite image tracing projects. I recently reached out to HOT’s Kate Chapman and Nicolas Chavent to get an overview of their new platform. After logging in using my OSM username and password, I can click through a list of various on-going projects. The one below relates to a very neat HOT project in Indonesia. As you can tell, the region that needs to be mapped on the right-hand side of the screen is divided into a grid.

After I click on “Take a task randomly”, the screen below appears, pointing me to one specific cell in the grid above. I then have the option of opening and editing this cell within JOSM, the standard interface for editing OpenStreetMap. I would then trace all roads and buildings in my square and submit the edit. (I was excited to also see a link to WalkingPapers which allows you to print out and annotate that cell using pen & paper and then digitize the result for import back into OSM).
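For readers curious what such a micro-tasking grid involves under the hood, here is a minimal sketch of the two basic operations: splitting a bounding box into task cells and handing a random unfinished cell to a volunteer. This is my own simplification, not the actual code behind HOT's Tasking Server.

```python
import random

def make_grid(min_lon, min_lat, max_lon, max_lat, rows, cols):
    """Split a bounding box into rows x cols task cells, each its own bbox."""
    dlon = (max_lon - min_lon) / cols
    dlat = (max_lat - min_lat) / rows
    return [(min_lon + c * dlon, min_lat + r * dlat,
             min_lon + (c + 1) * dlon, min_lat + (r + 1) * dlat)
            for r in range(rows) for c in range(cols)]

def take_task_randomly(cells, completed):
    """Return (index, cell) for a random cell not yet completed, or None."""
    remaining = [(i, cell) for i, cell in enumerate(cells) if i not in completed]
    return random.choice(remaining) if remaining else None
```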

There’s no doubt that this new Tasking Server will go a long way to coordinate and streamline future live tracing efforts such as for Somalia. For now, the team is mapping Somalia’s road network using their wiki approach. In the future, I hope that the platform will also enable basic feature tagging and back-end triangulation for quality assurance purposes—much like Tomnod. In the meantime, however, it’s important to note that OSM is far more than just a global open source map. OSM’s open data advocacy is imperative for disaster preparedness and response: open data saves lives.

An Open Letter to the Good People at Benetech

Dear Good People at Benetech,

We're not quite sure why Benetech went out of their way in an effort to discredit ongoing research by the European Commission (EC) that analyzes SMS data crowdsourced during the disaster response to Haiti. Benetech's area of expertise is in human rights (rather than disaster response), so why go after the EC's findings, which had nothing to do with human rights? For readers who would like some context, feel free to read this blog post of mine along with these replies by Benetech's CEO:

Issues with Crowdsourced Data Part 1
Issues with Crowdsourced Data Part 2

The short version of the debate is this: the EC’s exploratory study found that the spatial pattern of text messages from Mission 4636 in Haiti was positively correlated with building damage in Port-au-Prince. This would suggest that crowdsourced SMS data had statistical value in Haiti—in addition to their value in saving lives. But Benetech’s study shows a negative correlation. That’s basically it. If you’d like to read something a little more spicy though, do peruse this recent Fast Company article, fabulously entitled “How Benetech Slays Monsters with Megabytes and Math.” In any case, that’s the back-story.

So let's return to the Good People at Benetech. I thought I'd offer some of my humble guidance in case you feel threatened again in the future—I do hope you don't mind and won't take offense at my unsolicited and certainly imperfect advice. So by all means feel free to ignore everything that follows and focus on the more important work you do in the human rights space.

Next time Benetech wants to try and discredit the findings of a study in some other discipline, I recommend making sure that your own counter-findings are solid. In fact, I would suggest submitting your findings to a respected peer-reviewed journal—preferably one of the top tier scientific journals in your discipline. As you well know, after all, this really is the most objective and rigorous way to assess scientific work. Doing so would bring much more credibility to Benetech’s counter-findings than a couple blog posts.

My reasoning? Benetech prides itself (and rightly so) on carrying out some of the most advanced, cutting-edge quantitative research on patterns of human rights abuses. So if you want to discredit studies like the one carried out by the EC, this would have been an opportunity to publicly demonstrate your advanced expertise in quantitative analysis. But Benetech decided to use a simple non-spatial model to discredit the EC's findings. Why use such a simplistic approach? Your response would have been more credible had you used statistical models for spatial point data instead. But granted, had you used more advanced models, you would have found evidence of a positive correlation. So you probably won't want to read this next bit: a more elaborate "Tobit" correlation analysis actually shows the significance of SMS patterns as an explanatory variable in the spatial distribution of damaged buildings. Oh, and the correlation is (unfortunately) positive.

But that’s really beside the point. As my colleague Erik Hersman just wrote on the Ushahidi blog, one study alone is insufficient. What’s important is this: the last thing you want to do when trying to discredit a study in public is to come across as sloppy or as having ulterior motives (or both for that matter). Of course, you can’t control what other people think. If people find your response sloppy, then they may start asking whether the other methods you do use in your human rights analysis are properly peer-reviewed. They may start asking whether a strong empirical literature exists to back up your work and models. They may even want to know whether your expert statisticians have an accomplished track record and publish regularly in top-tier scientific journals. Other people may think you have ulterior motives and will believe this explains why you tried to discredit the EC’s preliminary findings. This doesn’t help your cause either. So it’s important to think through the implications of going public when trying to discredit someone’s research. Goodness knows I’ve made some poor calls myself on such matters in the past.

But let's take a step back for a moment. If you're going to try and discredit research like the EC's, please make sure you correctly represent the other side's arguments. Skewing them or fabricating them is unlikely to make you very credible in the debate. For example, the EC study never concluded that Search and Rescue teams should rely only on SMS to save people's lives. Furthermore, the EC study never claimed that using SMS is preferable to using established data on building density. It's surely obvious—and you don't need to demonstrate this statistically—that using a detailed map of building locations would provide a far better picture of potentially damaged buildings than crowdsourced SMS data. But what if this map is not available in a timely manner? As you may know, data layers of building density are not very common. Haiti was a good example of how difficult, expensive and time-consuming the generation of such a detailed inventory is. The authors of the study simply wanted to test whether the SMS spatial pattern matched the damage analysis results, which it does. All they did was propose that SMS patterns could help in structuring the efforts needed for a detailed assessment, especially because SMS data can be received shortly after the event.

So to summarize, no one (that I know of) has ever claimed that crowdsourced data should replace established methods for information collection and analysis. This has never been an either-or argument. And it won't help your cause to turn it into a black-and-white debate, because people familiar with these issues know full well that the world is more complex than the picture you are painting for them. They also know that people who take an either-or approach often do so when they have either run out of genuine arguments or had few to begin with. So none of this will make you look good. In sum, it's important to (1) accurately reflect the other side's arguments, and (2) steer clear of creating an either-or, polarized debate. I know this isn't easy to do; I'm guilty myself… on multiple counts.

I've got a few more suggestions—hope you don't mind. They follow from the previous ones. The authors of the EC study never used their preliminary findings to extrapolate to other earthquakes, disasters or contexts. These findings were specific to the Haiti quake, and the authors never claimed that their model was globally valid. So why did you extrapolate to human rights analysis when that was never the objective of the EC study? Regardless, this just doesn't make you look good. I understand that Benetech's focus is on human rights and not disaster response, but the EC study never sought to undermine your good work in the field of human rights. Indeed, the authors of the study hadn't even heard of Benetech. So in the future, I would recommend not extrapolating findings from one study and assuming they will hold in your own field of expertise, or that they even threaten your area of expertise. That just doesn't make any sense.

There are a few more tips I wanted to share with you. Everyone knows full well that crowdsourced data has important limitations—nobody denies this. But a number of us happen to think that some value can still be derived from crowdsourced data. Even Mr. Moreno-Ocampo, the head of the International Criminal Court (ICC), who I believe you know well, has pointed to the value of crowdsourced data from social media. In an interview with CNN last month, Mr. Moreno-Ocampo emphasized that Libya was the first time that the ICC was able to respond in real time to allegations of atrocities, partially due to social-networking sites such as Facebook. He added that, “this triggered a very quick reaction. The (United Nations) Security Council reacted in a few days; the U.N. General Assembly reacted in a few days. So, now because the court is up and running we can do this immediately,” he said. “I think Libya is a new world. How we manage the new challenge — that’s what we will see now.”

Point is, you can’t control the threats that will emerge or even prevent them, but you do control the way you decide to publicly respond to these threats. So I would recommend using your response as an opportunity to be constructive and demonstrate your good work rather than trying to discredit others and botching things up in the process.

But going back to the ICC and the bit in the Fast Company article about mathematics demonstrating the culpability of the Guatemalan government. Someone who has been following your work closely for years emailed me because they felt somewhat irked by all this. By the way, this is yet another unpleasant consequence of trying to publicly discredit others: new critics of your work will emerge. The critic in question finds the claim that your mathematics demonstrated the culpability of the Guatemalan government a "little far fetched." "There already was massive documented evidence of the culpability of the Guatemalan government in the mass killings of people. If there is a contribution from mathematics it is to estimate the number of victims who were never documented. So the idea is that documented cases are just a fraction of total cases and you can estimate the gap between the two. In order to do this estimation, you have to make a number of very strong assumptions, which means that the estimate may very well be unreliable anyway."

Now, I personally think that's not what you, Benetech, meant when you spoke with the journalist, because goodness knows the number of errors that journalists have made writing about Haiti.

In any case, the critic had this to add: “In a court of law, this kind of estimation counts for little. In the latest trial at which Benetech presented their findings, this kind of evidence was specifically rejected. Benetech and others claim that in an earlier trial they nailed Milosevic. But Milosevic was never nailed in the first place—he died before judgment was passed and there was a definite feeling at the time that the trial wasn’t going well. In any case, in a court of law what matters are documented cases, not estimates, so this argument about estimates is really beside the point.”

Now I’m really no expert on any of these issues, so I have no opinion on this case or the statistics or the arguments involved. They may very well be completely wrong, for all I know. I’m not endorsing any of the above statements. I’m simply using them as an illustration of what might happen in the future if you don’t carefully plan your counter-argument before going public. People will take issue and try to discredit you in turn, which can be rather unpleasant.

In conclusion, I would like to remind the Good People at Benetech about what Ushahidi is and isn’t. The Ushahidi platform is not a methodology (as I have already written on iRevolution and the Ushahidi blog). The Ushahidi platform is a mapping tool. The methodology that people choose to use to collect information is entirely up to them. They can use random sampling, controlled surveys, crowdsourcing, or even the methodology used by Benetech. I wonder what the good people at Benetech would say if some of their data were to be visualized on an Ushahidi platform. Would they dismiss the crisis map altogether? And speaking of crisis maps, most Ushahidi maps are not crisis maps. The platform is used in a very wide variety of ways, even to map the best burgers in the US. Is Benetech also going to extrapolate the EC’s findings to burgers?

So to sum up, in case it's not entirely clear: we know full well that there are important limitations to crowdsourced data in disaster response, and we have never said that the methodology of crowdsourcing should replace existing methodologies in the human rights space (or any other space for that matter). So please, let's not continue going in circles endlessly.

Now, where do we go from here? Well, I’ve never been a good pen pal, so don’t expect any more letters from me in response to the Good People at Benetech. I think everyone knows that a back and forth would be unproductive and largely a waste of time, not to mention an unnecessary distraction from the good work that we all try to do in the broader community to bring justice, voice and respect to marginalized communities.

Sincerely,

New Publications on Haiti, Crowdsourcing and Crisis Mapping

Two new publications that may be of interest to iRevolution readers:

MIT’s Journal, Innovations: Technology, Governance, Globalization just released a special edition focused on Haiti which includes lead articles by President Bill Clinton and Digicel’s CEO Denis O’Brien. My colleague Ida Norheim-Hagtun and I were invited to contribute the following piece: Crowdsourcing for Crisis Mapping in Haiti. The edition also includes articles by Mark Summer from Inveneo and my colleague Josh Nesbit from Medic:Mobile.

The SAIS Review of International Affairs recently published a special edition on the cyber challenge (threats and opportunities in a networked world), which includes an opening article on Internet Freedom by Alec Ross. My colleague Robert Munro and I were invited to write the following piece: The Unprecedented Role of SMS in Disaster Response, which focuses specifically on Haiti. Colleagues from Harvard University's Berkman Center also had a piece on Political Change in the Digital Age, which I reviewed here.

Please feel free to get in touch if you’d like copies of the articles on Haiti. In the meantime, here is a must-read for everyone working in Haiti: “Foreign Friends, Leave January 12th to Haitians.”

How Crowdsourced Data Can Predict Crisis Impact: Findings from Empirical Study on Haiti

One of the inherent concerns about crowdsourced crisis information is that the data is not statistically representative and hence "useless" for any serious kind of statistical analysis. But my colleague Christina Corbane and her team at the European Commission's Joint Research Center (JRC) have come up with some interesting findings that prove otherwise. They used the reports mapped on the Ushahidi-Haiti platform to show that this crowdsourced data can help predict the spatial distribution of structural damage in Port-au-Prince. The results were presented at this year's Crisis Mapping Conference (ICCM 2010).

The data on structural damage was obtained using very high resolution aerial imagery. Some 600 experts from 23 different countries joined the World Bank-UNOSAT-JRC team to assess the damage based on this imagery. This massive effort took two months to complete. In contrast, the crowdsourced reports on Ushahidi-Haiti were mapped in near-real time and could "hence represent an invaluable early indicator on the distribution and on the intensity of building damage."

Corbane and her co-authors “focused on the area of Port-au-Prince (approximately 9 by 9 km) where a total of 1,645 messages have been reported and 161,281 individual buildings have been identified, each classified into one of the 5 different damage grades.” Since the focus of the study is the relationship between crowdsourced reports and the intensity of structural damage, only grades 4 and 5 (structures beyond repair) were taken into account. The result is a bivariate point pattern consisting of two variables: 1,645 crowdsourced reports and 33,800 damaged buildings (grades 4 and 5 combined).

The above graphic simply serves as an illustrative example of the possible relationships between simulated distributions of SMS and damaged buildings. The two figures below represent the actual spatial distribution of crowdsourced reports and damaged buildings according to the data. The figures show that both crowdsourced data and damage patterns are clustered even though the latter is more pronounced. This suggests that some kind of correlation exists between the two distributions.

Corbane and colleagues therefore used spatial point process statistics to better understand and characterize the spatial structures of crowdsourced reports and building damage patterns. They used Ripley's K-function, which is often considered "the most suitable and functional characteristic for analyzing point processes." The results clearly demonstrate the existence of a statistically significant correlation between the spatial patterns of crowdsourced data and building damage at "distances ranging between 1 and 3 to 4 km."
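For readers who want to see what this kind of statistic looks like in practice, below is a naive (no edge correction) estimate of the bivariate, or cross, K-function between two point patterns. The study itself used properly edge-corrected estimators and, further on, a marked Gibbs point process model; this is only a rough sketch of the underlying idea.

```python
from math import hypot

def cross_k(sms_points, damage_points, radii, area):
    """Naive bivariate Ripley's K estimate (no edge correction).

    K_12(r) = area / (n1 * n2) * number of (sms, damage) pairs within distance r.
    Under spatial independence K_12(r) is roughly pi * r**2, so values well
    above that curve suggest the two patterns cluster together. Points are
    (x, y) tuples in a projected coordinate system (e.g. metres).
    """
    n1, n2 = len(sms_points), len(damage_points)
    estimates = {}
    for r in radii:
        pairs = sum(1 for sx, sy in sms_points for dx, dy in damage_points
                    if hypot(sx - dx, sy - dy) <= r)
        estimates[r] = area * pairs / (n1 * n2)
    return estimates

# Compare the estimates against the independence benchmark pi * r**2 at each radius.
```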

The co-authors then used the marked Gibbs point process model to “derive the conditional intensity of building damage based on the pairwise interactions between SMS [crowdsourced reports] and building damages.” The resulting model was then used to compute the predicted damage intensity values, which is depicted below with the observed damage intensity.

The figures clearly show that the similarity between the patterns exhibited by the predictive model and the actual damage pattern is particularly strong. This visual inspection is confirmed by the computed correlation between the observed and predicted damage patterns shown below.

In sum, the results of this empirical study demonstrate the existence of a spatial dependence between crowdsourced data and damaged buildings. The results of the analysis also show how statistical interactions between the patterns of crowdsourced data and building damage can be used to model the intensity of structural damage to buildings.

These findings are rather stunning. Data collected using unbounded crowdsourcing (non-representative sampling) largely in the form of SMS from the disaster affected population in Port-au-Prince can predict, with surprisingly high accuracy and statistical significance, the location and extent of structural damage post-earthquake.

The World Bank-UNOSAT-JRC damage assessment took 600 experts 66 days to complete. The cost probably figured in the hundreds of millions of dollars. In contrast, Mission 4636 and Ushahidi-Haiti were both ad-hoc, volunteer-based projects and virtually all the crowdsourced reports used in the study were collected within 14 days of the earthquake (most within 10 days).

But what does this say about the quality and reliability of crowdsourced data? The authors don't make this connection, but I find the implications particularly interesting since the actual content of the 1,645 crowdsourced reports was not factored into the analysis, simply the GPS coordinates, i.e., the metadata.

Here Come the Crowd-Sorcerers: “No We Can’t, No We Won’t” says Muggle Master

Sigh indeed. Yawn, even.

The purpose of this series is not to make it about Paul and Patrick. That’s boring as heck. The idea behind the series was not simply to provoke and use humorous analogies but to dispel confusion about crowdsourcing and thereby provide a more informed understanding of this methodology. I fear this is getting completely lost.

Recall that it was a humanitarian colleague who came up with the label “Crowd Sorcerer”. It made me laugh so I figured we’d have a little fun by using the label Muggle in return. But that’s all it is, good fun. And of course many humanitarians see eye to eye with the Crowd Sorcerer approach, so apologies to those who felt they were wrongly placed in the Muggle category. We’ll use the Sorting Hat next time.

Henry and Erik from Ushahidi

This is not about a division between Crowd Sorcerers and Muggles. As a colleague recently noted, “the line lies somewhere else, between effective implementation of new tools and methodologies versus traditional ways of collecting crisis information.” There are plenty of humanitarians who see value in trying out new approaches. Of course, there are some who simply say “No We Can’t, No We Won’t.”

There's no point going back and forth with Paul on every one of his issues because many of them actually have little to do with crowdsourcing and more to do with him being provoked. In this post, I'm going to stick to the debate about the ins and outs of crowdsourcing in humanitarian response.

On Verification

Muggle Master: And of course the way in which Patrick interprets those words bears little relation to what those words actually said, which is this: “Unless there are field personnel providing “ground truth” data, consumers will never have reliable information upon which to build decision support products.”

I disagree. Again, the traditional mindset here is that unless you have field personnel (your own people) in charge, there is no way to get accurate information. This implies that disaster-affected populations are all liars, which is clearly untrue.

Verification is of course important—no one said the contrary. Why would Ushahidi be dedicating time and resources to the Swift platform if the group didn't think that verification was important?

The reality here is that verification is not always possible regardless of which methodology is employed. So it boils down to this: is having information that is not immediately verified better than having no information at all? If your answer is yes or "it depends," then you're probably a Crowd Sorcerer. If your answer is "let's try to test some innovative ways to make rapid verification possible," then again, you likely are a Crowd Sorcerer/ette.

Incidentally, no one I know has advocated for the use of crowdsourced data at the expense of any other information. Crowd Sorcerers (and many humanitarians) are simply suggesting that it be considered one of multiple feeds. Also, as I've argued before, a combined approach of bounded and unbounded crowdsourcing is the way to go.

On Impact Evaluation

The Fletcher Team has commissioned an independent evaluation of the Ushahidi deployment in Haiti to go beyond the informal testimonies of success provided by first responders. This is a four-week evaluation led by Dr. Nancy Mock, a seasoned humanitarian and M&E expert with over 30 years of experience in the humanitarian and development field.

Nathan Morrow will be working directly with Nancy. Nathan is a geographer who has worked extensively on humanitarian and development information systems. He is a member of the European Evaluation Society and like Nancy a member of the American Evaluation Association. Nathan and Nancy will be aided by a public health student who has several years of experience in community development in Haiti and is a fluent Haitian Creole speaker.

The evaluation team has already gone through much of the data and been in touch with many of the first responders as well as other partners. Their job is to do as rigorous an evaluation as possible and to do so fully transparently. Nancy plans to present her findings publicly at the 2010 Crisis Mappers Conference, where we've dedicated a roundtable to reviewing these findings, as well as other reviews.

As for background, the ToR (available here) was drafted by graduate students specializing in M&E and reviewed closely by Professor Cheyanne Church, who teaches advanced graduate courses on M&E. She is considered a leading expert on the subject. The ToR was then shared on a number of listservs including ReliefWeb, the CrisisMappers Group and Pelican (a listserv for professional evaluators).

Nancy and Nathan are both experienced in the method known as utilization-focused evaluation (UFE), an approach chosen by The Fletcher Team to ensure that the evaluation is useful to all primary users as well as the humanitarian field. The UFE approach means that the ToR is a living document and being adapted as necessary by the evaluators to ensure that the information gathered is useful and actionable, not just interesting.

We don't have anything to hide here, Muggles. This was a complete first in terms of live crisis mapping and mobile crowdsourcing. Unlike the humanitarian community, we weren't prepared at all, nor trained, nor did we have prior experience with live crisis mapping and mobile crowdsourcing, nor with the use of crowdsourcing for near real-time translation, nor with managing hundreds of unpaid volunteers. Nor did the vast majority of those volunteers have any background in disaster response, nor were most able to focus on this full time because of their under/graduate coursework and mid-term exams, nor did they have direct links or contacts with first responders prior to the deployment, nor did many responders know they existed and/or who they were. In sum, they had all the odds stacked against them.

If the evaluation shows that the deployment and the Fletcher Team’s efforts didn’t save lives or are unlikely to have saved any lives, rescued people, had no impact, etc., none of us will dispute this. Will we give up? Of course not, Crowd Sorcerers don’t give up. We’ll learn and do better next time.

One of the main reasons for having this evaluation is not only to assess the impact of the deployment but to create a concrete list of lessons learned so that what didn’t work then is more likely to work in the future. The point here is to assess the impact just as much as it is to assess the potential added value of the approach for future deployments.

How can anyone innovate in a space riddled with a "No We Can't, No We Won't" mindset? Trial and error is not allowed; iterative learning and adaptation are as illegal as the dark arts. Some Muggles really need to read this post, "On Technology and Learning, or Why the Wright Brothers Did Not Create the 747." If die-hard Muggles had had their way, they would have forced the brothers to close up shop after just their first attempt because it "failed."

Incidentally, the majority of development, humanitarian, aid, etc., projects are never evaluated in any rigorous or meaningful way (if at all, even). But that’s ok because these are double (Muggle) standards.

On Communicating with Local Communities

Concerns over security need not always be used as an excuse for not communicating with local communities. We need to find a way not to exclude potentially important informants. A little innovation and creative thinking wouldn’t hurt. Humanitarians working with Crowd Sorcerers could use SMS to crowdsource reports, triangulate as best as possible using manual means combined with Swift River, cross-reference with official information feeds and investigate reports that appear the most clustered and critical.

That way, if you see a significant number of text messages reporting the lack of water in an area of Port-au-Prince, then at least this gives you an indication that something more serious may be happening in that location, and you can cross-reference your other sources to check whether the issue has already been picked up. Again, it's this clustering effect that can provide important insights on a given situation.
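As a simple illustration of the clustering idea, the sketch below bins geo-tagged SMS reports into grid cells and ranks the cells by volume; the busiest cells are the ones worth cross-referencing first. The cell size and data format are my own assumptions.

```python
from collections import Counter

def hotspot_cells(reports, cell_size_deg=0.01):
    """Bin geo-tagged reports into lat/lon grid cells and rank cells by volume.

    reports: iterable of (lat, lon, text) tuples. A cell size of 0.01 degrees
    is roughly 1 km near the equator -- an illustrative choice, not a standard.
    Returns [(cell, count), ...] sorted by count, highest first.
    """
    counts = Counter((round(lat / cell_size_deg), round(lon / cell_size_deg))
                     for lat, lon, _ in reports)
    return counts.most_common()
```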

This would provide a mechanism to allow Haitians to report problems (or complaints for that matter) via SMS, phone, etc. Imogen Wall and other experienced humanitarians have long called for this to change. Hence the newly founded group Communicating with Disaster Affected Communities (CDAC).

Confusion to the End

Me: Despite what some Muggles may think, crowdsourcing is not actually magic. It’s just a methodology like any other, with advantages and disadvantages.

Muggle Master: That’s exactly what “Muggles” think.

Haha, well if that's exactly what Muggles think, then this is yet more evidence of confusion in the land of Muggles. Crowdsourcing is just a methodology to collect information. There's nothing new about non-probability sampling. Understanding the advantages and disadvantages of this methodology doesn't require an advanced degree in statistical physics.

Muggle Master: Crowdsourcing should not form part of our disaster response plans because there are no guarantees that a crowd is going to show up. Crowdsourcing is no different from any other form of volunteer effort, and the reason why we have professional aid workers now is because, while volunteers are important, you can’t afford to make them the backbone of the operation. The technology is there and the support is welcome, but this is not the future of aid work.

This just reinforces what I've already observed: many in the humanitarian space are still confused about crowdsourcing. The crowd is always there. Haitians were always there. And crowdsourcing is not about volunteering. Again, crowdsourcing is just a methodology to collect information. When the UN does its rapid needs assessments, does the crowd all of a sudden vanish into thin air? Of course not.

As for volunteers, the folks at Fletcher and SIPA are joining forces to work together on deploying live crisis mapping projects in the future. They're setting up their own protocols, operating procedures, etc., based on what they've learned over the past 6 months in order to replicate the "surge mapping capacity" they demonstrated in response to Haiti and Chile. (Swift River will also reduce the need for large numbers of volunteers.)

And pray tell who in the world has ever said that volunteers should be the backbone of a humanitarian operation? Please, do tell. That would be a nice magic trick.

Muggle Master: “The technology is there and the support is welcome, but this is not the future of aid work.”

The support is welcome? Great! But who said that crowdsourcing was the future of aid work? It’s just a methodology. How can one sole methodology be the future of aid work?

I’ll close with this observation. The email thread that started this Crowd-Sorcerer series ended with a second email written by the same group that wrote the first. That second email was far more constructive and conducive to building bridges. I’m excited by the prospects expressed in this second email and really appreciate the positive tone and interest they expressed in working together. I definitely look forward to working with them and learning more from them as we proceed forward in this space and collaboration.

Patrick Philippe Meier

Here Come the Crowd-Sorcerers: Highlighting Some Misunderstandings

Welcome back, folks. Here is the third episode in our “Crowd-Sorcerers Series.” You can read the first episode on “How Technology is Disrupting the Humanitarian Space and Why It’s Easy” right here. The second episode, which in a tongue-in-cheek way asks “Is it Possible to Teach an Old (Humanitarian) Dog New Tech’s?” is available here. Those episodes will highlight what this new “Crowd-Sorcerer Series” is all about.

Oh, but just before we go to episode 3: it seems someone following this series doesn't have the good sense to recognize the sarcasm and humorous tone in my posts and has thereby missed the point entirely. I'm just using these silly analogies and metaphors to get some points across. I'm drawing a caricature, so to speak, as some of these points often get overlooked in aid/dev speak.

This is not personal at all, and I very much welcome an open conversation with all interested, i.e., the point of this series. The Muggles and Crowd-Sorcerer comparison is just for fun; it isn't about classy/not-classy, it's about getting a point or two across to more than just a narrow segment of the aid/dev industry. So again, as I wrote in my first blog post in the series, let's please not take ourselves too seriously, ok?

Muggles: Internet-based platforms may be generating good data within a certain segment of the IT community, such as Open Street Maps, and others like Ushahidi are providing an interesting alternative to real-time news channels, but this data is not getting to where it is needed in an operational sense – the guy/gal sitting in the tent with no Internet connection trying to plan a (name your Cluster/sector/need) survey.

The Ushahidi platform allows end-users to subscribe to alerts via SMS. That core feature is not new to the Haiti deployment; it's been there for a good while. Not only can users get automated SMS alerts with the Haiti deployment, they can also define exactly the type of alerts they wish to receive by setting geographic parameters, tags and even keywords. Thanks to a new plugin, visual voicemail is also an option for the Ushahidi platform.

In addition, a group deploying the Ushahidi platform can respond to incoming text messages directly from the same interface, allowing for near real-time, two-way communication with the disaster affected communities. See this blog post to find out how that all works.
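To make the subscription idea concrete, here is a minimal sketch of how geographic and keyword alert matching might work. This is not Ushahidi's actual implementation; the field names and the radius-plus-keyword rule are illustrative assumptions.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def matches_subscription(report, sub):
    """True if a report falls within a subscriber's radius and matches a keyword.

    report: {"lat": ..., "lon": ..., "text": ...}
    sub:    {"lat": ..., "lon": ..., "radius_km": ..., "keywords": [...]}
    An empty keyword list means "send me everything in my area".
    """
    close_enough = haversine_km(report["lat"], report["lon"],
                                sub["lat"], sub["lon"]) <= sub["radius_km"]
    keyword_hit = (not sub["keywords"] or
                   any(k.lower() in report["text"].lower() for k in sub["keywords"]))
    return close_enough and keyword_hit
```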

By the way, not all guys/gals will be sitting in a tent and/or have no Internet access. Also, not all data need to go to guys/gals in tents in the first place.

On Ushahidi being an alternative to real-time news channels: the vast majority of the information mapped on the Haiti platform during the first 5 days (before the 4636 SMS short code was set up) came from:

  1. Mainstream media (television, radio, online newspapers)
  2. The Haitian Diaspora
  3. Social media (Twitter, Facebook, Flickr)
  4. Humanitarian sources (emails, situation reports, skype chats, phone calls)

One member of the Diaspora had this to say: “We are the country’s middle and upper class and Haitians living abroad. We do we monitor the Haiti radio, Facebook feeds and Twitter from all our contacts. Filter it and redistribute it […]. We also have a few contacts on the ground in Haiti. All the information we post has been confirmed to the best of our ability.” (Thanks to Rob Munro of Mission 4636 for sharing this).

Muggles: The crucial link that is required, and that the [Humanitarian Information Management] community seems to be drifting farther and farther from as we are collectively distracted by shiny objects and/or the latest, greatest thing since sliced bread, is field-based NGOs equipped with proper information-sharing platform(s) that can be used even when there is no Internet connectivity or Washington-based (or London-based or Paris-based) IT, mapping and GIS skills and support available.

Mobile phones are not new and shiny. Nor is Google Maps. Integrating the two is not new either; SMS/map integration has been around for half a decade. The fact that the humanitarian community faces a challenge in innovating and keeping up with technology is certainly a problem. Free and open source platforms wouldn't be filling a technology-information void if a gap didn't exist in the first place.

Crowd-Sorcerers want to help (the ones I know at least), and they realize full well that they're new to this space and don't have all the answers. They want to (and actually do) partner with a number of humanitarian organizations on joint projects. But are the rest of the Muggles ready to join forces with sorcerers? Or will it take a disaster like Voldemort to make that happen? (Just in case someone missed the humorous tone here, that was a joke.) Incidentally, I never mentioned the humanitarian organization (from the email thread) in my blog posts. So they are completely anonymous unless they choose otherwise.

Actually, the same Muggles that started the email exchange wrote a second email which was far, far more constructive and conducive to building bridges between Muggles and Crowd-Sorcerers than the first. I'm excited by the prospects and really appreciate the positive tone and interest they expressed in working together. I definitely look forward to working with them and learning more from them as we proceed forward in this space and collaboration.

Patrick Philippe Meier

Here Come the Crowd-Sorcerers: Is it Possible to Teach an Old (Humanitarian) Dog New Tech’s?

Thanks for joining us for the second episode in the new “Crowd-Sorcerers Series.” If you missed the season premiere on “How Technology is Disrupting the Humanitarian Space and Why It’s Easy,” you can read it here. For a quick synopsis of what this is all about, I’m responding to some initial “anti-crowdsourcing” remarks made by a frustrated humanitarian group in a recent email exchange. I’m referring to this group as Muggles after they christened the Crowdsourcing Community as “Crowd-Sorcerers.” The name calling is of course all in good fun.

Here’s more from the original email exchange:

Muggles: [Our] view is that the focus [on crowdsourcing] needs to be turned around. Don’t use crowdsourcing as technology to collect data, but as a means to distribute verified, accurate and reliable information that has been collected according to recognized/accepted standards.

Well, well, well. Isn’t this interesting? Writing that “crowdsourcing is a technology” reveals how out of touch Muggles are. Crowdsourcing is a methodology, not a technology. See my blog posts on “Demystifying Crowdsourcing” and “Know What Ushahidi Is? Think Again.” Worse, to write that crowdsourcing should be used to disseminate information shows just how much confusion exists in the humanitarian space.

The importance of information dissemination has long been documented and has nothing to do with crowdsourcing! Perhaps the term they’re looking for is “crowdfeeding” but I coined this to highlight the need for technologies that promote information dissemination by the crowd for the crowd.

Confession: I shudder when reading language like “according to recognized/accepted standards.” Not because standards are not important, but just because I’m weary of the exclusive and at times elitist attitude that tends to come with this language. I get flashbacks from “Seeing Like a State.”

Perhaps an astute reader will have recognized that the title of this blog post series ("Here Come the Crowd-Sorcerers") is inspired by Clay Shirky's book "Here Comes Everybody: The Power of Organizing Without Organizations." I won't try to summarize all of Clay's many lucid observations here, but I do highly recommend the book to Muggles (along with Seeing Like a State).

This type of tension between regulation and innovation has been playing out in several other sectors as well, including banking (vs. mobile banking) and perhaps most notably in journalism (vs. citizen journalism). But the tensions there have matured somewhat (at least relatively). In the latter case, people are increasingly recognizing the value of citizen journalism while better understanding its limits—so much so that large media companies have themselves started to leverage crowdsourcing for content in their programming.

The journalism community’s initial reaction against bloggers is not too dissimilar to the frustration expressed by Muggles who keep hoping that crowdsourcing will just go away if they pout and stamp their feet hard enough. (Reminds me of the way that some Muggles freaked out at the invention of the printing press and later the telephone).

Here’s the bad news folks, you’ve seen nothing yet. The Crowd-Sorcerers are just getting warmed up. The level of crowdsourcing we’ve seen to date is just the tip of the wand. Haiti was a first, just a first. User-generated content is not about to vanish any time soon. In fact, it will continue growing exponentially. The vast majority of content available on the web will soon be user-generated.

The good news? Muggles can take this as an opportunity to demonstrate leadership and share their savoir-faire. What should Muggles not do? Let me share a real example from another sector: election monitoring. One of the world's leading election monitoring groups actively discouraged local NGOs in a developing country from contributing any reports to an Ushahidi deployment that was run in-country by a local civil society network—let's call them the Gryffindors.

The Gryffindors discovered this interference when they spoke with other local NGOs. They want to partner with these NGOs for the next elections, but these groups are now hesitant. So here we have a Western (i.e., external) group directly interfering by telling local NGOs they cannot participate in a local initiative to document their own elections in their own country. (Sound familiar to the LogBase example from Episode 1? Naturally.) Who do the elections belong to? Citizens or foreigners?

Muggles have the opportunity to provide unique thought leadership here. Make Crowd-Sorcerers part of the solution, not the problem.

There’s more good news. Despite what some Muggles may think, crowdsourcing is not actually magic. It’s just a methodology like any other, with advantages and disadvantages. At the end of the day, you’re just collecting information and this information can also be triangulated and verified like any other type of information.

That’s the whole point behind Swift River, to provide a free and open source platform that can help validate large quantities of information in near real time. Is it the silver bullet that we’ve all been dreaming of? Of course not, this ain’t Hogwarts. What Swift River does, however, is make the triangulation of crowdsourced information far more efficient for Muggles than ever before. So to suggest that crowdsourced information is inherently unverifiable is rather shortsighted.

Was the technology community's response to Haiti perfect? Not even close, hence the current M&E on the Ushahidi deployment and the blog posts I wrote up earlier this year.

In fact, much of my own frustration during the emergency period stemmed from the reckless behavior of some in the technology community. In addition, some tech folks who mean well end up producing tech solutions that don’t solve anything and never get used. So as I’ve blogged about before, tech folks need to get up to speed and get their act together. Hacking away every other weekend is all fine and well as long as the tech produced is actually in line with the needs of the humanitarian and disaster affected communities.

But let's be clear that the humanitarian community's response to Haiti was hardly stellar (cf. John Holmes's leaked email, etc.). No one's perfect, of course, and that includes Crowd-Sorcerers. The volunteer community that mobilized around the Ushahidi platform had never done anything like this before (because nothing quite like this had happened before); they had no prior training, nor did they have much (if any) humanitarian experience to speak of. I, for one, had never launched an Ushahidi platform before. So boy did we all learn a heck of a lot.

Haiti was a complete first as far as live crisis mapping and mobile crowdsourcing go. Yet Muggles blame Crowd-Sorcerers for not getting everything right on their first try. The importance of standards is repeatedly voiced by Muggles, as noted above. Well, I call this a double standard.

Stay tuned for Episode 3 in the new series: “Here Come the Crowd-Sorcerers: Highlighting Some Misunderstandings.”

Patrick Philippe Meier