Handbook: How to Catalyze Humanitarian Innovation in Computing Research Institutes

This research was commissioned by the World Humanitarian Summit (WHS) Innovation Team, which I joined last year. An important goal of the Summit’s Innovation Team is to identify concrete innovation pathways that can transform the humanitarian industry into a more effective, scalable and agile sector. I have found that discussions on humanitarian innovation can sometimes tend towards conceptual, abstract and academic questions. This explains why I took a different approach vis-a-vis my contribution to the WHS Innovation Track.

The handbook below provides practical collaboration guidelines for both humanitarian organizations & computing research institutes (CRIs) on how to catalyze humanitarian innovation through successful partnerships. These actionable guidelines are directly applicable now and draw on extensive interviews with leading humanitarian groups and CRIs, including the International Committee of the Red Cross (ICRC), United Nations Office for the Coordination of Humanitarian Affairs (OCHA), United Nations Children’s Fund (UNICEF), United Nations High Commissioner for Refugees (UNHCR), UN Global Pulse, Carnegie Mellon University (CMU), International Business Machines (IBM), Microsoft Research, the Data Science for Social Good Program at the University of Chicago and others.

This handbook, which is the first of its kind, also draws directly on years of experience and lessons learned from the Qatar Computing Research Institute’s (QCRI) active collaboration and unique partnerships with multiple international humanitarian organizations. The aim of this blog post is to actively solicit feedback on this first, complete working draft, which is available here as an open and editable Google Doc. So if you’re interested in sharing your insights, kindly insert your suggestions and questions by using the Insert/Comments feature. Please do not edit the text directly.

I need to submit the final version of this report on July 1, so I very much welcome constructive feedback via the Google Doc before this deadline. Thank you!

Could This Be The Most Comprehensive Study of Crisis Tweets Yet?

I’ve been looking forward to blogging about my team’s latest research on crisis computing for months; the delay was due to the laborious process of academic publishing, but I digress. I’m now able to make their findings public. The goal of their latest research was to “understand what affected populations, response agencies and other stakeholders can expect—and not expect—from [crisis tweets] in various types of disaster situations.”

As my colleagues rightly note, “Anecdotal evidence suggests that different types of crises elicit different reactions from Twitter users, but we have yet to see whether this is in fact the case.” So they meticulously studied 26 crisis-related events between 2012 and 2013 that generated significant activity on Twitter. The lead researcher on this project, my colleague & friend Alexandra Olteanu from EPFL, also appears in my new book.

Alexandra and team first classified crisis-related tweets based on the following categories (each selected based on previous research & peer-reviewed studies):

Written in long form: Caution & Advice; Affected Individuals; Infrastructure & Utilities; Donations & Volunteering; Sympathy & Emotional Support; and Other Useful Information. Below are the results of this analysis sorted by descending proportion of Caution & Advice related tweets.

The category with the largest number of tweets is “Other Useful Info.” On average 32% of tweets fall into this category (minimum 7%, maximum 59%). Interestingly, it appears that most crisis events that are spread over a relatively large geographical area (i.e., they are diffuse) tend to be associated with the lowest number of “Other” tweets. As my QCRI colleagues rightly note, “it is potentially useful to know that this type of tweet is not prevalent in the diffused events we studied.”

Tweets relating to Sympathy and Emotional Support are present in each of the 26 crises. On average, these account for 20% of all tweets. “The 4 crises in which the messages in this category were more prevalent (above 40%) were all instantaneous disasters.” This finding may imply that “people are more likely to offer sympathy when events […] take people by surprise.”

On average, 20% of tweets in the 26 crises relate to Affected Individuals. “The 5 crises with the largest proportion of this type of information (28%–57%) were human-induced, focalized, and instantaneous. These 5 events can also be viewed as particularly emotionally shocking.”

Tweets related to Donations & Volunteering accounted for 10% of tweets on average. “The number of tweets describing needs or offers of goods and services in each event varies greatly; some events have no mention of them, while for others, this is one of the largest information categories.”

Caution and Advice tweets constituted on average 10% of all tweets in a given crisis. The results show a “clear separation between human-induced hazards and natural: all human induced events have less caution and advice tweets (0%–3%) than all the events due to natural hazards (4%–31%).”

Finally, tweets related to Infrastructure and Utilities represented on average 7% of all tweets posted in a given crisis. The disasters with the highest number of such tweets tended to be flood situations.

In addition to the above analysis, Alexandra et al. also categorized tweets by their source:

The results depicted below are sorted by descending proportion of eyewitness tweets.

On average, about 9% of tweets generated during a given crisis were written by Eyewitnesses, a figure that increased to 54% for the haze crisis in Singapore. “In general, we find a larger proportion of eyewitness accounts during diffused disasters caused by natural hazards.”

Traditional and/or Internet Media were responsible for 42% of tweets on average. “The 6 crises with the highest fraction of tweets coming from a media source (54%–76%) are instantaneous, which make ‘breaking news’ in the media.”

On average, Outsiders posted 38% of the tweets in a given crisis while NGOs were responsible for about 4% of tweets and Governments 5%. My colleagues surmise that these low figures are due to the fact that both NGOs and governments seek to verify information before they release it. The highest levels of NGO and government tweets occur in response to natural disasters.

Finally, Businesses account for 2% of tweets on average. The Alberta floods of 2013 saw the highest proportion (9%) of tweets posted by businesses.

All the above findings are combined and displayed below. The figure depicts the “average distribution of tweets across crises into combinations of information types (rows) and sources (columns). Rows and columns are sorted by total frequency, starting on the bottom-left corner. The cells in this figure add up to 100%.”
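For readers who want to reproduce this kind of table from their own labeled tweets, here is a minimal sketch of one way to compute it. It assumes (my assumption, not something stated in the study) that the average is an unweighted mean of per-crisis distributions. The function names and the handful of labeled pairs are made up purely for illustration and are not data from the paper.

```python
from collections import Counter

def joint_distribution(labeled_tweets):
    """Fraction of one crisis's tweets in each (information type, source) cell."""
    counts = Counter(labeled_tweets)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}

def average_distribution(per_crisis):
    """Unweighted mean of the per-crisis distributions, cell by cell."""
    cells = set().union(*per_crisis)
    return {
        cell: sum(dist.get(cell, 0.0) for dist in per_crisis) / len(per_crisis)
        for cell in cells
    }

# Toy labels for two hypothetical crises (illustrative only, not real data).
crisis_a = [
    ("Caution & Advice", "Government"),
    ("Caution & Advice", "Media"),
    ("Sympathy & Emotional Support", "Outsiders"),
    ("Other Useful Information", "Media"),
]
crisis_b = [
    ("Sympathy & Emotional Support", "Outsiders"),
    ("Affected Individuals", "Media"),
]

average = average_distribution(
    [joint_distribution(crisis_a), joint_distribution(crisis_b)]
)
for cell, share in sorted(average.items(), key=lambda kv: -kv[1]):
    print(f"{cell[0]} / {cell[1]}: {share:.1%}")
```

Because each per-crisis distribution sums to 1, the averaged cells also sum to 1, which matches the note that the cells in the figure add up to 100%.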

The above analysis suggests that “when the geographical spread [of a crisis] is diffused, the proportion of Caution and Advice tweets is above the median, and when it is focalized, the proportion of Caution and Advice tweets is below the median. For sources, […] human-induced accidental events tend to have a number of eyewitness tweets below the median, in comparison with intentional and natural hazards.” Additional analysis carried out by my colleagues indicates that “human-induced crises are more similar to each other in terms of the types of information disseminated through Twitter than to natural hazards.” In addition, crisis events that develop instantaneously also look the same when studied through the lens of tweets.

In conclusion, the analysis above demonstrates that “in some cases the most common tweet in one crisis (e.g. eyewitness accounts in the Singapore haze crisis in 2013) was absent in another (e.g. eyewitness accounts in the Savar building collapse in 2013). Furthermore, even two events of the same type in the same country (e.g. Typhoon Yolanda in 2013 and Typhoon Pablo in 2012, both in the Philippines), may look quite different vis-à-vis the information on which people tend to focus.” This suggests the uniqueness of each event.

“Yet, when we look at the Twitter data at a meta-level, our analysis reveals commonalities among the types of information people tend to be concerned with, given the particular dimensions of the situations such as hazard category (e.g. natural, human-induced, geophysical, accidental), hazard type (e.g. earthquake, explosion), whether it is instantaneous or progressive, and whether it is focalized or diffused. For instance, caution and advice tweets from government sources are more common in progressive disasters than in instantaneous ones. The similarities do not end there. When grouping crises automatically based on similarities in the distributions of different classes of tweets, we also realize that despite the variability, human-induced crises tend to be more similar to each other than to natural hazards.”

Needless to say, these are exactly the kind of findings that can improve the way we use MicroMappers & other humanitarian technologies for disaster response. So if you want to learn more, the full study is available here (PDF). In addition, all the Twitter datasets used for the analysis are available at CrisisLex. If you have questions on the research, simply post them in the comments section below and I’ll ask my colleagues to reply there.

In the meantime, there is a lot more on humanitarian technology and computing in my new book Digital Humanitarians. As I note in said book, we also need enlightened policy making to tap the full potential of social media for disaster response. Technology alone can only take us so far. If we don’t actually create demand for relevant tweets in the first place, then why should social media users supply a high volume of relevant and actionable tweets to support relief efforts? This OCHA proposal on establishing specific social media standards for disaster response, and this official social media strategy developed and implemented by the Filipino government are examples of what enlightened leadership looks like.

Computing Research Institutes as an Innovation Pathway for Humanitarian Technology

The World Humanitarian Summit (WHS) is an initiative by United Nations Secretary-General Ban Ki-moon to improve humanitarian action. The Summit, which is to be held in 2016, stands to be one of the most important humanitarian conferences in a decade. One key pillar of WHS is humanitarian innovation. “Transformation through Innovation” is the WHS Working Group dedicated to transforming humanitarian action by focusing explicitly on innovation. I have the pleasure of being a member of this working group where my contribution focuses on the role of new technologies, data science and advanced computing. As such, I’m working on an applied study to explore the role of computing research institutes as an innovation pathway for humanitarian technology. The purpose of this blog post is to invite feedback on the ideas presented below.

I first realized that the humanitarian community faced a “Big Data” challenge in 2010, just months after I had joined Ushahidi as Director of Crisis Mapping, and just months after co-founding CrisisMappers: The Humanitarian Technology Network. The devastating Haiti Earthquake resulted in a massive overflow of information generated via mainstream news, social media, text messages and satellite imagery. I launched and spearheaded the Haiti Crisis Map at the time and, together with hundreds of digital volunteers from all around the world, went head-to-head with Big Data. As noted in my forthcoming book, we realized there and then that crowdsourcing and mapping software alone were no match for Big (Crisis) Data.

This explains why I decided to join an advanced computing research institute, namely QCRI. It was clear to me after Haiti that humanitarian organizations had to partner directly with advanced computing experts to manage the new Big Data challenge in disaster response. So I “embedded” myself in an institute with leading experts in Big Data Analytics, Data Science and Social Computing. I believe that computing research institutes (CRIs) can & must play an important role in fostering innovation in next generation humanitarian technology by partnering with humanitarian organizations on research & development (R&D).

There is already some evidence to support this proposition. We (QCRI) teamed up with the UN Office for the Coordination of Humanitarian Affairs (OCHA) to create the Artificial Intelligence for Disaster Response platform, AIDR, as well as MicroMappers. We are now extending AIDR to analyze text messages (SMS) in partnership with UNICEF. We are also spearheading efforts around the use and analysis of aerial imagery (captured via UAVs) for disaster response (see the Humanitarian UAV Network: UAViators). On the subject of UAVs, I believe that this new technology presents us (in the WHS Innovation team) with an ideal opportunity to analyze in “real time” how a new, disruptive technology gets adopted within the humanitarian system. In addition to UAVs, we catalyzed a partnership with Planet Labs and teamed up with Zooniverse to take satellite imagery analysis to the next level with large scale crowd computing. To this end, we are working with humanitarian organizations to enable them to make sense of Big Data generated via social media, SMS, aerial imagery & satellite imagery.

The incentives for humanitarian organizations to collaborate with CRIs are obvious, especially if the latter (like QCRI) commit to making the resulting prototypes freely accessible and open source. But why should CRIs collaborate with humanitarian organizations in the first place? Because the latter come with real-world challenges and unique research questions that many computer scientists are very interested in for several reasons. First, carrying out scientific research on real-world problems is of interest to the vast majority of computer scientists I collaborate with, both within QCRI and beyond. These scientists want to apply their skills to make the world a better place. Second, the research questions that humanitarian organizations bring enable computer scientists to differentiate themselves in the publishing world. Third, the resulting research can help advance the field of computer science and advanced computing.

So why are we not seeing more collaboration between CRIs & humanitarian organizations? Because of this cognitive surplus mismatch. It takes a Director of Social Innovation (or related full-time position) to serve as a translational leader between CRIs and humanitarian organizations. It takes someone (ideally a team) to match the problem owners and problem solvers; to facilitate and manage the collaboration between these two very different types of expertise and organizations. In sum, CRIs can serve as an innovation pathway if the following three ingredients are in place: 1) Translational Leader; 2) Committed CRI; and 3) Committed Humanitarian Organization. These are necessary but not sufficient conditions for success.

While research institutes have a comparative advantage in R&D, they are not the best place to scale humanitarian technology prototypes. In order to take these prototypes to the next level, make them sustainable and have them develop into enterprise level software, they need to be taken up by for-profit companies. The majority of CRIs (QCRI included) actually do have a mandate to incubate start-up companies. As such, we plan to spin off some of the above platforms as independent companies in order to scale the technologies in a robust manner. Note that the software will remain free to use for humanitarian applications; other uses of the platform will require a paid license. Therein lies the end-to-end innovation path that computing research institutes can offer humanitarian organizations vis-a-vis next generation humanitarian technologies.

As noted above, part of my involvement with the WHS Innovation Team entails working on an applied study to document and replicate this innovation pathway. As such, I am looking for feedback on the above as well as on the research methodology described below.

I plan to interview Microsoft Research, IBM Research, Yahoo Research, QCRI and other institutes as part of this research. More specifically, the interview questions will include:

  • Have you already partnered with humanitarian organizations? Why/why not?
  • If you have partnered with humanitarian organizations, what was the outcome? What were the biggest challenges? Was the partnership successful? If so, why? If not, why not?
  • If you have not yet partnered with humanitarian organizations, why not? What factors would be conducive to such partnerships and what factors serve as hurdles?
  • What are your biggest concerns vis-a-vis working with humanitarian groups?
  • What funding models did you explore if any?

I also plan to interview humanitarian organizations to better understand the prospects for this potential innovation pathway. More specifically, I plan to interview ICRC, UNHCR, UNICEF and OCHA using the following questions:

  • Have you already partnered with computing research groups? Why/why not?
  • If you have partnered with computing research groups, what was the outcome? What were the biggest challenges? Was the partnership successful? If so, why? If not, why not?
  • If you have not yet partnered with computing research groups, why not? What factors would be conducive to such partnerships and what factors serve as hurdles?
  • What are your biggest concerns vis-a-vis working with computing research groups?
  • What funding models did you explore if any?

My plan is to carry out the above semi-structured interviews in February-March 2015 along with secondary research. My ultimate aim with this deliverable is to develop a model to facilitate greater collaboration between computing research institutes and humanitarian organizations. To this end, I welcome feedback on all of the above (feel free to email me and/or add comments below). Thank you.

See also:

  • Research Framework for Next Generation Humanitarian Technology and Innovation [link]
  • From Gunfire at Sea to Maps of War: Implications for Humanitarian Innovation [link]

Automatically Classifying Text Messages (SMS) for Disaster Response

Humanitarian organizations like the UN and Red Cross often face a deluge of social media data when disasters strike areas with a large digital footprint. This explains why my team and I have been working on AIDR (Artificial Intelligence for Disaster Response), a free and open source platform to automatically classify tweets in real-time. Given that the vast majority of the world’s population does not tweet, we’ve teamed up with UNICEF’s Innovation Team to extend our AIDR platform so users can also automatically classify streaming SMS.

After the Haiti Earthquake in 2010, the main mobile network operator there (Digicel) offered to send an SMS to each of their 1.4 million subscribers (at the time) to accelerate our disaster needs assessment efforts. We politely declined since we didn’t have any automated (or even semi-automated) way of analyzing incoming text messages. With AIDR, however, we should (theoretically) be able to classify some 1.8 million SMS (and tweets) per hour. Enabling humanitarian organizations to make sense of “Big Data” generated by affected communities is obviously key for two-way communication with said communities during disasters, hence our work at QCRI on “Computing for Good”.

AIDR/SMS applications are certainly not limited to disaster response. In fact, we plan to pilot the AIDR/SMS platform for a public health project with our UNICEF partners in Zambia next month and with other partners in early 2015. While the platform is still experimental, I hope it will eventually be robust enough for use in response to major disasters, allowing humanitarian organizations to poll affected communities and to make sense of resulting needs in near real-time. Millions of text messages could be automatically classified according to the Cluster System, for example, and the results communicated back to local communities via community radio stations, as described here.

These are still very early days, of course, but I’m typically an eternal optimist, so I hope that our research and pilots do show promising results. Either way, we’ll be sure to share the full outcome of said pilots publicly so that others can benefit from our work and findings. In the meantime, if your organization is interested in piloting and learning with us, then feel free to get in touch.

Video: Humanitarian Response in 2025

I gave a talk on “The future of Humanitarian Response” at UN OCHA’s Global Humanitarian Policy Forum (#aid2025) in New York yesterday. More here for context. A similar version of the talk is available in the video presentation below.

Some of the discussions that ensued during the Forum were frustrating, albeit an important reality check. Some policy makers still think that disaster response is about them and their international humanitarian organizations. They are still under the impression that aid does not arrive until they arrive. And yet, empirical research in the disaster literature points to the fact that the vast majority of lives saved during disasters are the result of local agency, not external intervention.

In my talk (and video above), I note that local communities will increasingly become tech-enabled first responders, thus taking pressure off the international humanitarian system. These tech-savvy local communities already exist. And they already respond to both “natural” and manmade disasters, as noted in my talk vis-a-vis the information products produced by tech-savvy local Filipino groups. So my point about the rise of tech-enabled self-help was a more diplomatic way of conveying to traditional humanitarian groups that humanitarian response in 2025 will continue to happen with or without them; and perhaps increasingly without them.

This explains why I see OCHA’s Information Management (IM) Team increasingly taking on the role of “Information DJ”, mixing both formal and informal data sources for the purposes of both formal and informal humanitarian response. But OCHA will certainly not be the only DJ in town nor will they be invited to play at all “info events”. So the earlier they learn how to create relevant info mixes, the more likely they’ll still be DJ’ing in 2025.

Humanitarian Crisis Computing 101

Disaster-affected communities are increasingly becoming “digital” communities. That is, they increasingly use mobile technology & social media to communicate during crises. I often refer to this user-generated content as Big (Crisis) Data. Humanitarian crisis computing seeks to rapidly identify informative, actionable and credible content in this growing stack of real-time information. The challenge is akin to finding the proverbial needle in the haystack since the vast majority of reports posted on social media is often not relevant for humanitarian response. This is largely a result of the demand versus supply problem described here.

In any event, the few “needles” of information that are relevant can relay information that is vital and indeed life-saving for relief efforts, both traditional top-down efforts and more bottom-up grassroots efforts. When disaster strikes, we increasingly see social media traffic explode. We know there are important “pins” of relevant information hidden in this growing stack of information but how do we find them in real-time?

Humanitarian organizations are ill-equipped to manage the deluge of Big Crisis Data. They tend to sift through the stack of information manually, which means they aren’t able to process more than a small volume of information. This is represented by the dotted green line in the picture below. Big Data is often described as filter failure. Our manual filters cannot manage the large volume, velocity and variety of information posted on social media during disasters. So all the information above the dotted line, Big Data, is completely ignored.

This is where Advanced Computing comes in. Advanced Computing uses Human and Machine Computing to manage Big Data and reduce filter failure, thus allowing humanitarian organizations to process a larger volume, velocity and variety of crisis information in less time. In other words, Advanced Computing helps us push the dotted green line up the information stack.

In the early days of digital humanitarian response, we used crowdsourcing to search through the haystack of user-generated content posted during disasters. Note that said content can also include text messages (SMS), like in Haiti. Crowd-sourcing crisis information is not as much fun as the picture below would suggest, however. In fact, crowdsourcing crisis information was (and can still be) quite a mess and a big pain in the haystack. Needless to say, crowdsourcing is not the best filter to make sense of Big Crisis Data.

Recently, digital humanitarians have turned to microtasking crisis information as described here and here. The UK Guardian and Wired have also written about this novel shift from crowdsourcing to microtasking.

Microtasking basically turns a haystack into little blocks of stacks. Each micro-stack is then processed by one or more digital humanitarian volunteers. Unlike crowdsourcing, a microtasking approach to filtering crisis information is highly scalable, which is why we recently launched MicroMappers.

The smaller the micro-stack, the easier the tasks and the faster they can be carried out by a greater number of volunteers. For example, instead of having 10 people classify 10,000 tweets based on the Cluster System, microtasking makes it very easy for 1,000 people to classify 10 tweets each. The former would take hours while the latter mere minutes. In response to the recent earthquake in Pakistan, for example, some 100 volunteers used MicroMappers to classify 30,000+ tweets in about 30 hours.
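To make the arithmetic concrete, here is a minimal sketch of splitting a haystack into micro-stacks and handing them out to volunteers. This is not how MicroMappers is actually implemented; the function names, the round-robin assignment and the placeholder tweet IDs are all assumptions for illustration.

```python
from itertools import islice

def make_microtasks(items, task_size):
    """Split items into consecutive micro-stacks of at most task_size items."""
    if task_size < 1:
        raise ValueError("task_size must be at least 1")
    iterator = iter(items)
    tasks = []
    while True:
        chunk = list(islice(iterator, task_size))
        if not chunk:
            break
        tasks.append(chunk)
    return tasks

def assign_round_robin(tasks, volunteers):
    """Hand micro-stacks to volunteers in turn; returns {volunteer: [task, ...]}."""
    assignment = {volunteer: [] for volunteer in volunteers}
    for index, task in enumerate(tasks):
        assignment[volunteers[index % len(volunteers)]].append(task)
    return assignment

# Placeholder IDs standing in for real tweets and volunteers (hypothetical).
tweets = [f"tweet-{i}" for i in range(10_000)]
volunteers = [f"volunteer-{i}" for i in range(1_000)]

tasks = make_microtasks(tweets, task_size=10)
assignment = assign_round_robin(tasks, volunteers)
tweets_per_volunteer = {
    volunteer: sum(len(task) for task in volunteer_tasks)
    for volunteer, volunteer_tasks in assignment.items()
}
print(len(tasks), max(tweets_per_volunteer.values()))
```

With 10,000 tweets, micro-stacks of 10 and 1,000 volunteers, every volunteer ends up with a single stack of 10 tweets. By the same arithmetic, the Pakistan figures work out to roughly 300 tweets per volunteer, or about 10 per volunteer per hour.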

Machine Computing, in contrast, uses natural language processing (NLP) and machine learning (ML) to “quantify” the haystack of user-generated content posted on social media during disasters. This enables us to automatically identify relevant “needles” of information.

An example of a Machine Learning approach to crisis computing is the Artificial Intelligence for Disaster Response (AIDR) platform. Using AIDR, users can teach the platform to automatically identify relevant information from Twitter during disasters. For example, AIDR can be used to automatically identify individual tweets that relay urgent needs from a haystack of millions of tweets.
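To give a flavor of what “teaching” a classifier means, here is a toy sketch of supervised text classification using a simple multinomial naive Bayes model. This is not AIDR’s actual code or necessarily its algorithm; the class name, the two labels and the six training messages are invented for illustration only.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase a message and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.label_counts = Counter()            # messages seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()

    def train(self, text, label):
        """Update the counts with one human-labeled example."""
        self.label_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocabulary.add(word)

    def predict(self, text):
        """Return the label with the highest log-posterior for this text."""
        total = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)
            label_total = sum(self.word_counts[label].values())
            for word in tokenize(text):
                smoothed = self.word_counts[label][word] + 1
                score += math.log(smoothed / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training messages (hypothetical, not real crisis tweets).
examples = [
    ("we need water and food urgently", "urgent_need"),
    ("family trapped need rescue and medical help", "urgent_need"),
    ("children need shelter and clean water", "urgent_need"),
    ("thoughts and prayers for everyone affected", "other"),
    ("watching the news coverage of the storm", "other"),
    ("so sad to see the photos from the storm", "other"),
]

classifier = NaiveBayesTextClassifier()
for text, label in examples:
    classifier.train(text, label)

prediction = classifier.predict("people here need water and medical help")
print(prediction)
```

The more human-labeled examples the model sees, the better its word statistics become, which is the basic idea behind having users (or volunteers) tag a sample of tweets so that the machine can take over the rest.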

The pictures above are taken from the slide deck I put together for a keynote address I recently gave at the Canadian Ministry of Foreign Affairs.

New! Humanitarian Computing Library

The field of “Humanitarian Computing” applies Human Computing and Machine Computing to address major information-based challenges in the humanitarian space. Human Computing refers to crowdsourcing and microtasking, which is also referred to as crowd computing. In contrast, Machine Computing draws on natural language processing and machine learning, amongst other disciplines. The Next Generation Humanitarian Technologies we are prototyping at QCRI are powered by Humanitarian Computing research and development (R&D).

My QCRI colleagues and I just launched the first-ever Humanitarian Computing Library, which is publicly available here. The purpose of this library, or wiki, is to consolidate existing and future research that relates to Humanitarian Computing in order to support the development of next generation humanitarian tech. The repository currently holds over 500 publications that span topics such as Crisis Management, Trust and Security, Software and Tools, Geographical Analysis and Crowdsourcing. These publications are largely drawn from (but not limited to) peer-reviewed papers submitted at leading conferences around the world. We invite you to add your own research on humanitarian computing to this growing collection of resources.

Many thanks to my colleague ChaTo (project lead) and QCRI interns Rahma and Nada from Qatar University for spearheading this important project. And a special mention to student Rachid who also helped.
