Category Archives: Humanitarian Technologies

Help Tag Tweets from Typhoon Pablo to Support UN Disaster Response!

Update: Summary of digital humanitarian response efforts available here.

The United Nations Office for the Coordination of Humanitarian Affairs (OCHA) has just activated the Digital Humanitarian Network (DHN) to request support in response to Typhoon Pablo. They also need your help! Read on!


The UN has asked for pictures and videos of the damage to be collected from tweets posted over the past 48 hours. These pictures/videos need to be geo-tagged if at all possible, and time-stamped. The Standby Volunteer Task Force (SBTF) and Humanity Road (HR), both members of the Digital Humanitarian Network, are thus collaborating to provide the UN with the requested data, which needs to be submitted by 11pm New York time today (5am Geneva time tomorrow). Given this very short turnaround time, we only have 10 hours (!), so the Digital Humanitarian Network needs your help!

Pybossa Philippines

The SBTF has partnered with colleagues at PyBossa to launch this very useful microtasking platform for you to assist the UN in these efforts. No prior experience necessary. Click here or on the display above to see just how easy it is to support the disaster relief operations on the ground.

A very big thanks to Daniel Lombraña González from PyBossa for turning this around at such short notice! If you have any questions about this project or with respect to volunteering, please feel free to add a comment to this blog post below. Even if you only have time to tag one tweet, it counts! Please help!

Some background information on this project is available here.

Digital Humanitarian Response to Typhoon Pablo in Philippines

Update: Please help the UN! Tag tweets to support disaster response!

The purpose of this post is to keep notes on our efforts to date with the aim of revisiting these at a later time to write a more polished blog post on said efforts. By “Digital Humanitarian Response” I mean the process of using digital technologies to aid disaster response efforts.

pablo-photos

My colleagues and I at QCRI have been collecting disaster-related tweets on Typhoon Pablo since Monday. More specifically, we’ve been collecting those tweets with the hashtags officially endorsed by the government. There were over 13,000 relevant tweets posted on Tuesday alone. We then paid Crowdflower workers to micro-task the tagging of these hash-tagged tweets based on the following categories (click picture to zoom in):

Crowdflower
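As a rough illustration of the collection step described above, here is a minimal sketch of filtering tweets by government-endorsed hashtags. The hashtag list and sample tweets are assumptions for illustration only, not QCRI's actual pipeline.

```python
# Hypothetical sketch: keep only tweets carrying one of the
# government-endorsed hashtags. Hashtags and tweets are invented.
ENDORSED_HASHTAGS = {"#pabloph", "#reliefph", "#rescueph"}

def is_relevant(tweet_text):
    """Return True if the tweet mentions any endorsed hashtag."""
    tokens = tweet_text.lower().split()
    return any(token.strip(".,!?") in ENDORSED_HASHTAGS for token in tokens)

tweets = [
    "Flooding in Davao, roads impassable #PabloPH",
    "Lovely weather today in Manila",
    "Volunteers needed for relief packing #reliefPH!",
]
relevant = [t for t in tweets if is_relevant(t)]
```

The relevant tweets would then be handed off to Crowdflower workers for tagging.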

Several hundred tweets were processed during the first hour. On average, about 750 tweets were processed per hour. Clearly, we’d want that number to be far higher (hence the need to combine micro-tasking with automated algorithms, as explained in the presentation below). In any event, the micro-tasking could also be accelerated if we increased the pay to Crowdflower workers. As it is, the total cost for processing the 13,000+ tweets came to about $250.

The database of processed tweets was then shared (every couple of hours) with the Standby Volunteer Task Force (SBTF). SBTF volunteers (“Mapsters”) only focused on tweets that had been geo-tagged and tagged as relevant (e.g., “Casualties,” “Infrastructure Damage,” “Needs/Asks,” etc.) by Crowdflower workers. SBTF volunteers then mapped these tweets on a Crowdmap as part of a training exercise for new Mapsters.

Geofeedia Pablo

We’re now talking with a humanitarian colleague in the Philippines who asked whether we can identify pictures/videos shared on social media that show damage, bridges down, flooding, etc. The catch is that these need to have a location and time/date for them to be actionable. So I went on Geofeedia and scraped the relevant content available there (which Mapsters then added to the Crowdmap). One constraint of Geofeedia (and many other such platforms), however, is that they only map content that has been geo-tagged by users posting said content. This means we may be missing the majority of relevant content.

So my colleagues at QCRI are currently pulling all tweets posted today (Wednesday) and running an automated algorithm to identify tweets with URLs/links. We’ll ask Crowdflower workers to process the most recent tweets (and work backwards) by tagging those that: (1) link to pictures/video of damage/flooding, and (2) have geographic information. The plan is to have Mapsters add those tweets to the Crowdmap and to share the latter with our humanitarian colleague in the Philippines.
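The URL-detection step could be sketched with a simple regex over raw tweet text; this is a hedged illustration, not the actual QCRI algorithm, and the pattern and sample tweets are assumptions.

```python
import re

# Illustrative sketch: flag tweets that contain at least one http(s) link.
URL_PATTERN = re.compile(r"https?://\S+")

def has_link(tweet_text):
    """True if the tweet contains at least one http(s) URL."""
    return bool(URL_PATTERN.search(tweet_text))

tweets = [
    "Bridge down in Compostela Valley http://pic.example/abc #PabloPH",
    "Praying for everyone affected #PabloPH",
]
with_links = [t for t in tweets if has_link(t)]
```

Tweets passing this filter would then go to Crowdflower workers for the two tagging questions above.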

There are several parts of the above workflows that can (and will) be improved. I for one have already learned a lot just from the past 24 hours. But this is the subject of a future blog post as I need to get back to the work at hand.

Analyzing Disaster Tweets from Major Thai Floods

The 2011 Thai Floods was one of the country’s worst disasters in recent history. The flooding began in July and lasted until December. Over 13 million people were affected. More than 800 were killed. The World Bank estimated $45 billion in total economic damage. This new study, “The Role of Twitter during a Natural Disaster: Case Study of 2011 Thai Flood,” analyzes how Twitter was used during these major floods.

The number of tweets increased significantly in October, which is when the flooding reached parts of the Bangkok Metropolitan area. The month before (September to October) also saw a notable increase in tweets, which may “demonstrate that Thais were using Twitter to search for realtime and practical information that traditional media could not provide during the natural disaster period.”

To better understand the type of information shared on Twitter during the floods, the authors analyzed 175,551 tweets that used the hashtag #thaiflood. They removed “retweets” and duplicates, yielding a dataset of 64,582 unique tweets. Using keyword analysis and a rule-based approach, the authors automatically classified these tweets into 5 categories:

Situational Announcements and Alerts: Tweets about up-to-date situational and location-based information related to the flood such as water levels, traffic conditions and road conditions in certain areas. In addition, emergency warnings from authorities advising citizens to evacuate areas, seek shelter or take other protective measures are also included.

Support Announcements: Tweets about free parking availability, free emergency survival kits distribution and free consulting services for home repair, etc.

Requests for Assistance: Tweets requesting any type of assistance, such as food, water, medical supplies, volunteers or transportation.

Requests for Information: Tweets including general inquiries related to the flood and flood relief such as inquiries for telephone numbers of relevant authorities, regarding the current situation in specific locations and about flood damage compensation.

Other: Tweets including all other messages, such as general comments, complaints and expressions of opinions.
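The keyword-and-rule classification described above could be sketched roughly as follows; the keyword lists are invented for illustration and are not the rules actually used in the study.

```python
# Hypothetical keyword/rule-based classifier in the spirit of the study.
# Keyword lists are illustrative assumptions, not the authors' rules.
RULES = [
    ("Situational Announcements and Alerts",
     ["water level", "evacuate", "road closed", "traffic"]),
    ("Support Announcements",
     ["free parking", "survival kit", "free consulting"]),
    ("Requests for Assistance",
     ["need food", "need water", "volunteers needed", "need boats"]),
    ("Requests for Information",
     ["anyone know", "phone number", "compensation"]),
]

def classify(tweet_text):
    """Return the first category whose keywords match, else "Other"."""
    text = tweet_text.lower()
    for category, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"
```

A real rule set would of course be in Thai and far richer; the point here is only the first-match rule structure.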

The results of this analysis are shown in the figures below. The first shows the number of tweets in each category, while the second shows the distribution of these categories over time.

Messages posted during the first few weeks “included current water levels in certain areas and roads; announcements for free parking availability; requests for volunteers to make sandbags and pack emergency survival kits; announcements for evacuation in certain areas and requests for boats, food, water supplies and flood donation information. For the last few weeks when water started to recede, Tweet messages included reports on areas where water had receded, information on home cleaning and repair and guidance regarding the process to receive flood damage compensation from the government.”

To determine the credibility of tweets, the authors identify the top 10 most retweeted users during the floods. They infer that the most retweeted tweets signal that the content of said tweets is perceived as credible. “The majority of these top users are flood/disaster related government or private organizations.” Siam Arsa, one of the leading volunteer networks helping flood victims in Thailand, was one of the top users ranked by retweets. The group utilizes social media on both Facebook (www.facebook.com/siamarsa) and Twitter (@siamarsa) to share information about flooding and related volunteer work.

In conclusion, “if the government plans to implement social media as a tool for disaster response, it would be well advised to prepare some measures or protocols that help officials verify incoming information and eliminate false information. The citizens should also be educated to take caution when receiving news and information via social media, and to think carefully about the potential effect before disseminating certain content.”

Gov Twitter

My QCRI colleagues and I are collecting tweets about Typhoon Pablo, which is making landfall in the Philippines. We’re specifically tracking tweets with one or more of the following hashtags: #PabloPh, #reliefPH and #rescuePH, which the government is publicly encouraging Filipinos to use. We hope to carry out an early analysis of these tweets to determine which ones provide situational awareness. The purpose of this applied action research is to ultimately develop a real-time dashboard for humanitarian response. This explains why we launched this Library of Crisis Hashtags. For further reading, please see this post on “What Percentage of Tweets Generated During a Crisis Are Relevant for Humanitarian Response?”

To Tweet or Not To Tweet During a Disaster?

Yes, only a small percentage of tweets generated during a disaster are directly relevant and informative for disaster response. No, this doesn’t mean we should dismiss Twitter as a source for timely, disaster-related information. Why? Because our efforts ought to focus on how that small percentage of informative tweets can be increased. What incentives or policies can be put in place? The following tweets by the Filipino government may shed some light.

Gov Twitter Pablo

The above tweet was posted three days before Typhoon Bopha (designated Pablo locally) made landfall in the Philippines. In the tweet below, the government directly and publicly encourages Filipinos to use the #PabloPH hashtag and to follow the Philippine Atmospheric, Geophysical & Astronomical Services Administration (PAGASA) Twitter feed, @dost_pagasa, which has over 400,000 followers and also links to this official Facebook page.

Gov Twitter

The government’s official Twitter handle (@govph) is also retweeting tweets posted by The Presidential Communications Development and Strategic Planning Office (@PCDSPO). This office is the “chief message-crafting body of the Office of the President.” In one such retweet (below), the office encourages those on Twitter to use different hashtags for different purposes (relief vs rescue). This mimics the use of official emergency numbers for different needs, e.g., police, fire, ambulance, etc.

Twitter Pablo Gov

Given this kind of enlightened disaster response leadership, one would certainly expect that the quality of tweets received will be higher than without government endorsement. My team and I at QCRI are planning to analyze these tweets to determine whether or not this is the case. In the meantime, I expect we’ll see more examples of self-organized disaster response efforts using these hashtags, as per the earlier floods in August, which I blogged about here: Crowdsourcing Crisis Response following the Philippine Floods. This tech-savvy self-organization dynamic is important since the government itself may be unable to follow up on every tweeted request.

Using E-Mail Data to Estimate International Migration Rates

As is well known, “estimates of demographic flows are inexistent, outdated, or largely inconsistent, for most countries.” I would add costly to that list as well. So my QCRI colleague Ingmar Weber co-authored a very interesting study on the use of e-mail data to estimate international migration rates.

The study analyzes a large sample of Yahoo! emails sent by 43 million users between September 2009 and June 2011. “For each message, we know the date when it was sent and the geographic location from where it was sent. In addition, we could link the message with the person who sent it, and with the user’s demographic information (date of birth and gender), that was self reported when he or she signed up for a Yahoo! account. We estimated the geographic location from where each email message was sent using the IP address of the user.”

The authors used data on existing migration rates for a dozen countries and international statistics on Internet diffusion rates by age and gender in order to correct for selection bias. For example, “estimated number of migrants, by age group and gender, is multiplied by a correction factor to adjust for over-representation of more educated and mobile people in groups for which the Internet penetration is low.” The graphs below are estimates of age and gender-specific immigration rates for the Philippines. “The gray area represents the size of the bias correction.” This means that “without any correction for bias, the point estimates would be at the upper end of the gray area.” These methods “correct for the fact that the group of users in the sample, although very large, is not representative of the entire population.”
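A minimal sketch of the bias-correction step quoted above, with invented numbers (the study derives its actual factors from Internet diffusion statistics by age and gender): because online users over-represent more mobile people, the factors shown here are below 1 and pull the raw point estimates down from the upper end of the gray band.

```python
# Illustrative sketch: multiply raw age/gender-specific migration-rate
# estimates by correction factors for selection bias. All numbers are
# invented for demonstration; they are not the study's values.
raw_rates = {("20-29", "F"): 0.012, ("20-29", "M"): 0.010}
correction = {("20-29", "F"): 0.70, ("20-29", "M"): 0.75}

adjusted = {group: raw_rates[group] * correction[group] for group in raw_rates}
```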

The results? Ingmar and his co-author Emilio Zagheni were able to “estimate migration rates that are consistent with the ones published by those few countries that compile migration statistics. By using the same method for all geographic regions, we obtained country statistics in a consistent way, and we generated new information for those countries that do not have registration systems in place (e.g., developing countries), or that do not collect data on out-migration (e.g., the United States).” Overall, the study documented a “global trend of increasing mobility,” which is “growing at a faster pace for females than males. The rate of increase for different age groups varies across countries.”

The authors argue that this approach could also be used in the context of “natural” disasters and man-made disasters. In terms of future research, they are interested in evaluating “whether sending a high proportion of e-mail messages to a particular country (which is a proxy for having a strong social network in the country) is related to the decision of actually moving to the country.” Naturally, they are also interested in analyzing Twitter data. “In addition to mobility or migration rates, we could evaluate sentiments pro or against migration for different geographic areas. This would help us understand how sentiments change near an international border or in regions with different migration rates and economic conditions.”

I’m very excited to have Ingmar at QCRI so we can explore these ideas further and in the context of humanitarian and development challenges. I’ve been discussing similar research ideas with my colleagues at UN Global Pulse and there may be a real sweet spot for collaboration here, particularly with the recently launched Pulse Lab in Jakarta. The possibility of collaborating with my colleagues at Flowminder could also be really interesting given their important study of population movement following the Haiti Earthquake. In conclusion, I fully share the authors’ sentiment when they highlight the fact that it is “more and more important to develop models for data sharing between private companies and the academic world, that allow for both protection of users’ privacy & private companies’ interests, as well as reproducibility in scientific publishing.”

What Percentage of Tweets Generated During a Crisis Are Relevant for Humanitarian Response?

More than half-a-million tweets were generated during the first three days of Hurricane Sandy and well over 400,000 pictures were shared via Instagram. Last year, over one million tweets were generated every five minutes on the day that Japan was struck by a devastating earthquake and tsunami. Humanitarian organizations are ill-equipped to manage this volume and velocity of information. In fact, the lack of analysis of this “Big Data” has spawned all kinds of suppositions about the perceived value—or lack thereof—that social media holds for emergency response operations. So just what percentage of tweets are relevant for humanitarian response?

One of the very few rigorous and data-driven studies that addresses this question is Dr. Sarah Vieweg‘s 2012 doctoral dissertation on “Situational Awareness in Mass Emergency: Behavioral and Linguistic Analysis of Disaster Tweets.” After manually analyzing four distinct disaster datasets, Vieweg finds that only 8% to 20% of tweets generated during a crisis provide situational awareness. This implies that the vast majority of tweets generated during a crisis have zero added value vis-à-vis humanitarian response. So critics have good reason to be skeptical about the value of social media for disaster response.

At the same time, however, even if we take Vieweg’s lower bound estimate, 8%, this means that over 40,000 tweets generated during the first 72 hours of Hurricane Sandy may very well have provided increased situational awareness. In the case of Japan, more than 100,000 tweets generated every 5 minutes may have provided additional situational awareness. This volume of relevant information is much higher and more real-time than the information available to humanitarian responders via traditional channels.

Furthermore, preliminary research by QCRI’s Crisis Computing Team shows that 55.8% of 206,764 tweets generated during a major disaster last year were “Informative,” versus 22% that were “Personal” in nature. In addition, 19% of all tweets represented “Eye-Witness” accounts, 17.4% related to information about “Casualty/Damage,” 37.3% related to “Caution/Advice,” while 16.6% related to “Donations/Other Offers.” Incidentally, the tweets were automatically classified using algorithms developed by QCRI. The accuracy rate of these ranged from 75%-81% for the “Informative Classifier,” for example. A hybrid platform could then push those tweets classified with low confidence to a micro-tasking platform for manual classification, if need be.
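The hybrid human-machine idea could be sketched as follows: tweets whose automatic label falls below a confidence threshold are routed to a micro-tasking queue. The threshold, labels, and data shapes are assumptions for illustration, not QCRI's implementation.

```python
# Hypothetical routing of classifier output: confident results are
# accepted automatically, the rest go to human taggers.
CONFIDENCE_THRESHOLD = 0.78  # assumed cut-off, not a QCRI parameter

def route(classified_tweets):
    """Split (tweet, label, confidence) triples into auto-accepted
    results and tweets queued for human micro-tasking."""
    auto_accepted, needs_human = [], []
    for tweet, label, confidence in classified_tweets:
        if confidence >= CONFIDENCE_THRESHOLD:
            auto_accepted.append((tweet, label))
        else:
            needs_human.append(tweet)
    return auto_accepted, needs_human

batch = [
    ("Bridge collapsed near the river", "Informative", 0.91),
    ("so scared right now", "Personal", 0.55),
]
auto, manual = route(batch)
```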

This research at QCRI constitutes the first phase of our work to develop a Twitter Dashboard for the Humanitarian Cluster System, which you can read more about in this blog post. We are in the process of analyzing several other Twitter datasets in order to refine our automatic classifiers. I’ll be sure to share our preliminary observations and final analysis via this blog.

Crowdsourcing the Evaluation of Post-Sandy Building Damage Using Aerial Imagery

Update (Nov 2): 5,739 aerial images tagged by over 3,000 volunteers. Please keep up the outstanding work!

My colleague Schuyler Erle from Humanitarian OpenStreetMap just launched a very interesting effort in response to Hurricane Sandy. He shared the info below via CrisisMappers earlier this morning, which I’m turning into this blog post to help him recruit more volunteers.

Schuyler and team just got their hands on the Civil Air Patrol’s (CAP) super high resolution aerial imagery of the disaster affected areas. They’ve imported this imagery into their Micro-Tasking Server MapMill created by Jeff Warren and are now asking volunteers to help tag the images in terms of the damage depicted in each photo. “The 531 images on the site were taken from the air by CAP over New York, New Jersey, Rhode Island, and Massachusetts on 31 Oct 2012.”

To access this platform, simply click here: http://sandy.hotosm.org. If that link doesn’t work, please try sandy.locative.us.

“For each photo shown, please select ‘ok’ if no building or infrastructure damage is evident; please select ‘not ok’ if some damage or flooding is evident; and please select ‘bad’ if buildings etc. seem to be significantly damaged or underwater. Our *hope* is that the aggregation of the ok/not ok/bad ratings can be used to help guide FEMA resource deployment, or so was indicated might be the case during RELIEF at Camp Roberts this summer.”

A disaster response professional working in the affected areas for FEMA replied (via CrisisMappers) to Schuyler’s efforts to confirm that:

“[G]overnment agencies are working on exploiting satellite imagery for damage assessments and flood extents. The best way that you can help is to help categorize photos using the tool Schuyler provides […].  CAP imagery is critical to our decision making as they are able to work around some of the limitations with satellite imagery so that we can get an area of where the worst damage is. Due to the size of this event there is an overwhelming amount of imagery coming in, your assistance will be greatly appreciated and truly aid in response efforts.  Thank you all for your willingness to help.”

Schuyler notes that volunteers can click on the Grid link from the home page of the Micro-Tasking platform to “zoom in to the coastlines of Massachusetts or New Jersey” and see “judgements about building damages beginning to aggregate in US National Grid cells, which is what FEMA use operationally. Again, the idea and intention is that, as volunteers judge the level of damage evident in each photo, the heat map will change color and indicate at a glance where the worst damage has occurred.” See above screenshot.
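The aggregation behind that heat map could be sketched as a simple majority vote per US National Grid cell; the cell IDs and votes below are made up, and MapMill's actual aggregation logic may well differ.

```python
from collections import Counter, defaultdict

# Illustrative sketch: aggregate volunteer ratings ("ok" / "not ok" /
# "bad") per grid cell via majority vote. Cell IDs and votes invented.
votes = [
    ("18TWL8040", "bad"),
    ("18TWL8040", "bad"),
    ("18TWL8040", "not ok"),
    ("18TWL9050", "ok"),
]

by_cell = defaultdict(Counter)
for cell, rating in votes:
    by_cell[cell][rating] += 1

# The winning rating per cell would drive the heat-map color.
severity = {cell: ratings.most_common(1)[0][0]
            for cell, ratings in by_cell.items()}
```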

Even if you just spend 5 or 10 minutes tagging the imagery, this will still go a long way to supporting FEMA’s response efforts. You can also help by spreading the word and recruiting others to your cause. Thank you!

What Was Novel About Social Media Use During Hurricane Sandy?

We saw the usual spikes in Twitter activity and the typical (reactive) launch of crowdsourced crisis maps. We also saw map mashups combining user-generated content with scientific weather data. Facebook was once again used to inform our social networks: “We are ok” became the most common status update on the site. In addition, thousands of pictures were shared on Instagram (600/minute), documenting both the impending danger & resulting impact of Hurricane Sandy. But was there anything really novel about the use of social media during this latest disaster?

I’m asking not because I claim to know the answer but because I’m genuinely interested and curious. One possible “novelty” that caught my eye was this FrankenFlow experiment to “algorithmically curate” pictures shared on social media. Perhaps another “novelty” was the embedding of webcams within a number of crisis maps, such as those below launched by #HurricaneHacker and Team Rubicon respectively.

Another “novelty” that struck me was how much focus there was on debunking false information being circulated during the hurricane—particularly images. The speed of this debunking was also striking. As regular iRevolution readers will know, “information forensics” is a major interest of mine.

This Tumblr post was one of the first to emerge in response to the fake pictures (30+) of the hurricane swirling around the social media whirlwind. Snopes.com also got in on the action with this post. Within hours, The Atlantic Wire followed with this piece entitled “Think Before You Retweet: How to Spot a Fake Storm Photo.” Shortly after, Alexis Madrigal from The Atlantic published this piece on “Sorting the Real Sandy Photos from the Fakes,” like the one below.

These rapid rumor-bashing efforts led BuzzFeed’s John Herrman to claim that Twitter acted as a truth machine: “Twitter’s capacity to spread false information is more than cancelled out by its savage self-correction.” This is not the first time that journalists or researchers have highlighted Twitter’s tendency for self-correction. This peer-reviewed, data-driven study of disaster tweets generated during the 2010 Chile Earthquake reports the same finding.

What other novelties did you come across? Are there other interesting, original and creative uses of social media that ought to be documented for future disaster response efforts? I’d love to hear from you via the comments section below. Thanks!

MAQSA: Social Analytics of User Responses to News

Designed by QCRI in partnership with MIT and Al-Jazeera, MAQSA provides an interactive topic-centric dashboard that summarizes news articles and user responses (comments, tweets, etc.) to these news items. The platform thus helps editors and publishers in newsrooms like Al-Jazeera’s better “understand user engagement and audience sentiment evolution on various topics of interest.” In addition, MAQSA “helps news consumers explore public reaction on articles relevant to a topic and refine their exploration via related entities, topics, articles and tweets.” The pilot platform currently uses Al-Jazeera data such as Op-Eds from Al-Jazeera English.

Given a topic such as “The Arab Spring,” or “Oil Spill”, the platform combines time, geography and topic to “generate a detailed activity dashboard around relevant articles. The dashboard contains an annotated comment timeline and a social graph of comments. It utilizes commenters’ locations to build maps of comment sentiment and topics by region of the world. Finally, to facilitate exploration, MAQSA provides listings of related entities, articles, and tweets. It algorithmically processes large collections of articles and tweets, and enables the dynamic specification of topics and dates for exploration.”

While others have tried to develop similar dashboards in the past, these have “not taken a topic-centric approach to viewing a collection of news articles with a focus on their user comments in the way we propose.” The team at QCRI has since added a number of exciting new features for Al-Jazeera to try out as widgets on their site. I’ll be sure to blog about these and other updates when they are officially launched. Note that other media companies (e.g., UK Guardian) will also be able to use this platform and widgets once they become public.

As always with such new initiatives, my very first thought and question is: how might we apply them in a humanitarian context? For example, perhaps MAQSA could be repurposed to do social analytics of responses from local stakeholders with respect to humanitarian news articles produced by IRIN, an award-winning humanitarian news and analysis service covering the parts of the world often under-reported, misunderstood or ignored. Perhaps an SMS component could also be added to a MAQSA-IRIN platform to facilitate this. Or perhaps there’s an application for the work that Internews carries out with local journalists and consumers of information around the world. What do you think?

Could Twitris+ Be Used for Disaster Response?

I recently had the pleasure of speaking with Hemant Purohit and colleagues who have been working on an interesting semantic social web application called Twitris+. A project of the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Twitris+ uses “real-time monitoring and multi-faceted analysis of social signals to provide insights and a framework for situational awareness, in-depth event analysis and coordination, emergency response aid, reputation management etc.”

Twitris+ packs together quite an array of social computing features, integrating spatio-temporal-thematic dimensions, people-content network analysis and sentiment-emotion subjectivity analysis. The tool also aggregates a range of social data and web resources such as Twitter, online news, Wikipedia pages and other multimedia content, in addition to SMS data, for which the team was recently granted a patent.

Unlike many other social media platforms I’ve reviewed over recent months, Twitris+ geo-tags content at the tweet level rather than at the bio level. That is, many platforms simply geo-code tweets based on where a person says s/he is as per their Twitter bio. Accurately and comprehensively geo-referencing social media content is of course no trivial matter. Since many tweets do not include geographic information, colleagues at GeoIQ are seeking to infer geographic information by analyzing a given stream of tweets, for example.
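The tweet-level vs. bio-level distinction could be illustrated as follows; the field names are hypothetical, loosely modeled on Twitter API payloads rather than Twitris+'s actual code.

```python
# Hypothetical sketch: prefer the tweet's own geo-tag, fall back to the
# coarse, self-reported profile ("bio") location only when needed.
def best_location(tweet):
    """Return (location, source) with the most precise geo available."""
    if tweet.get("coordinates"):       # precise tweet-level geo-tag
        return tweet["coordinates"], "tweet"
    if tweet.get("user_location"):     # coarse bio field
        return tweet["user_location"], "bio"
    return None, "none"

loc, source = best_location(
    {"coordinates": (14.6, 121.0), "user_location": "Manila"}
)
```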

I look forward to continuing my conversations with Hemant and team. Indeed, I am particularly interested to see which emergency management organizations begin to pilot the platform to enhance their situational awareness during a crisis. Their feedback will be invaluable to Twitris+ and to many of us in the humanitarian technology space.