Tag Archives: Disaster

Zooniverse: The Answer to Big (Crisis) Data?

Both humanitarian and development organizations are completely unprepared to deal with the rise of “Big Crisis Data” & “Big Development Data.” But many still hope that Big Data is but an illusion. Not so, as I’ve already blogged here, here and here. This explains why I’m on a quest to tame the Big Data Beast. Enter Zooniverse. I’ve been a huge fan of Zooniverse for as long as I can remember, and certainly long before I first mentioned them in this post from two years ago. Zooniverse is a citizen science platform that evolved from GalaxyZoo in 2007. Today, Zooniverse “hosts more than a dozen projects which allow volunteers to participate in scientific research” (1). So, why do I have a major “techie crush” on Zooniverse?

Oh let me count the ways. Zooniverse interfaces are absolutely gorgeous, making them a real pleasure to spend time with; they really understand user-centered design and motivations. The fact that Zooniverse is conversent in multiple disciplines is incredibly attractive. Indeed, the platform has been used to produce rich scientific data across multiple fields such as astronomy, ecology and climate science. Furthermore, this citizen science beauty has a user-base of some 800,000 registered volunteers—with an average of 500 to 1,000 new volunteers joining every day! To place this into context, the Standby Volunteer Task Force (SBTF), a digital humanitarian group has about 1,000 volunteers in total. The open source Zooniverse platform also scales like there’s no tomorrow, enabling hundreds of thousands to participate on a single deployment at any given time. In short, the software supporting these pioneering citizen science projects is well tested and rapidly customizable.

At the heart of the Zooniverse magic is microtasking. If you’re new to microtasking, which I often refer to as “smart crowdsourcing,” this blog post provides a quick introduction. In brief, Microtasking takes a large task and breaks it down into smaller microtasks. Say you were a major (like really major) astro-nomy buff and wanted to tag a million galaxies based on whether they are spiral or elliptical galaxies. The good news? The kind folks at the Sloan Digital Sky Survey have already sent you a hard disk packed full of telescope images. The not-so-good news? A quick back-of-the-envelope calculation reveals it would take 3-5 years, working 24 hours/day and 7 days/week to tag a million galaxies. Ugh!

Screen Shot 2013-03-25 at 4.11.14 PM

But you’re a smart cookie and decide to give this microtasking thing a go. So you upload the pictures to a microtasking website. You then get on Facebook, Twitter, etc., and invite (nay beg) your friends (and as many strangers as you can find on the suddenly-deserted digital streets), to help you tag a million galaxies. Naturally, you provide your friends, and the surprisingly large number good digital Samaritans who’ve just show up, with a quick 2-minute video intro on what spiral and elliptical galaxies look like. You explain that each participant will be asked to tag one galaxy image at a time by simply by clicking the “Spiral” or “Elliptical” button as needed. Inevitably, someone raises their hands to ask the obvious: “Why?! Why in the world would anyone want to tag a zillion galaxies?!”

Well, only cause analyzing the resulting data could yield significant insights that may force a major rethink of cosmology and our place in the Universe. “Good enough for us,” they say. You breathe a sigh of relief and see them off, cruising towards deep space to bolding go where no one has gone before. But before you know it, they’re back on planet Earth. To your utter astonishment, you learn that they’re done with all the tagging! So you run over and check the data to see if they’re pulling your leg; but no, not only are 1 million galaxies tagged, but the tags are highly accurate as well. If you liked this little story, you’ll be glad to know that it happened in real life. GalaxyZoo, as the project was called, was the flash of brilliance that ultimately launched the entire Zooniverse series.

Screen Shot 2013-03-25 at 3.23.53 PM

No, the second Zooniverse project was not an attempt to pull an Oceans 11 in Las Vegas. One of the most attractive features of many microtasking platforms such as Zooniverse is quality control. Think of slot machines. The only way to win big is by having three matching figures such as the three yellow bells in the picture above (righthand side). Hit the jackpot and the coins will flow. Get two out three matching figures (lefthand side), and some slot machines may toss you a few coins for your efforts. Microtasking uses the same approach. Only if three participants tag the same picture of a galaxy as being a spiral galaxy does that data point count. (Of course, you could decide to change the requirement from 3 volunteers to 5 or even 20 volunteers). This important feature allows micro-tasking initiatives to ensure a high standard of data quality, which may explain why many Zooniverse projects have resulted in major scientific break-throughs over the years.

The Zooniverse team is currently running 15 projects, with several more in the works. One of the most recent Zooniverse deployments, Planet Four, received some 15,000 visitors within the first 60 seconds of being announced on BBC TV. Guess how many weeks it took for volunteers to tag over 2,000,0000 satellite images of Mars? A total of 0.286 weeks, i.e., forty-eight hours! Since then, close to 70,000 volunteers have tagged and traced well over 6 million Martian “dunes.” For their Andromeda Project, digital volunteers classified over 7,500 star clusters per hour, even though there was no media or press announce-ment—just one newsletter sent to volunteers. Zooniverse de-ployments also involve tagging earth-based pictures (in contrast to telescope imagery). Take this Serengeti Snapshot deployment, which invited volunteers to classify animals using photographs taken by 225 motion-sensor cameras in Tanzania’s Serengeti National Park. Volunteers swarmed this project to the point that there are no longer any pictures left to tag! So Zooniverse is eagerly waiting for new images to be taken in Serengeti and sent over.

Screen Shot 2013-03-23 at 7.49.56 PM

One of my favorite Zooniverse features is Talk, an online discussion tool used for all projects to provide a real-time interface for volunteers and coordinators, which also facilitates the rapid discovery of important features. This also allows for socializing, which I’ve found to be particularly important with digital humanitarian deployments (such as these). One other major advantage of citizen science platforms like Zooniverse is that they are very easy to use and therefore do not require extensive prior-training (think slot machines). Plus, participants get to learn about new fields of science in the process. So all in all, Zooniverse makes for a great date, which is why I recently reached out to the team behind this citizen science wizardry. Would they be interested in going out (on a limb) to explore some humanitarian (and development) use cases? “Why yes!” they said.

Microtasking platforms have already been used in disaster response, such as MapMill during Hurricane SandyTomnod during the Somali Crisis and CrowdCrafting during Typhoon Pablo. So teaming up with Zooniverse makes a whole lot of sense. Their microtasking software is the most scalable one I’ve come across yet, it is open source and their 800,000 volunteer user-base is simply unparalleled. If Zooniverse volunteers can classify 2 million satellite images of Mars in 48 hours, then surely they can do the same for satellite images of disaster-affected areas on Earth. Volunteers responding to Sandy created some 80,000 assessments of infrastructure damage during the first 48 hours alone. It would have taken Zooniverse just over an hour. Of course, the fact that the hurricane affected New York City and the East Coast meant that many US-based volunteers rallied to the cause, which may explain why it only took 20 minutes to tag the first batch of 400 pictures. What if the hurricane had hit a Caribbean instead? Would the surge of volunteers may have been as high? Might Zooniverse’s 800,000+ standby volunteers also be an asset in this respect?

Screen Shot 2013-03-23 at 7.42.22 PM

Clearly, there is huge potential here, and not only vis-a-vis humanitarian use-cases but development one as well. This is precisely why I’ve already organized and coordinated a number of calls with Zooniverse and various humanitarian and development organizations. As I’ve been telling my colleagues at the United Nations, World Bank and Humanitarian OpenStreetMap, Zooniverse is the Ferrari of Microtasking, so it would be such a big shame if we didn’t take it out for a spin… you know, just a quick test-drive through the rugged terrains of humanitarian response, disaster preparedness and international development. 

bio

Postscript: As some iRevolution readers may know, I am also collaborating with the outstanding team at  CrowdCrafting, who have also developed a free & open-source microtasking platform for citizen science projects (also for disaster response here). I see Zooniverse and CrowCrafting as highly syner-gistic and complementary. Because CrowdCrafting is still in early stages, they fill a very important gap found at the long tail. In contrast, Zooniverse has been already been around for half-a-decade and can caters to very high volume and high profile citizen science projects. This explains why we’ll all be getting on a call in the very near future. 

GeoFeedia: Ready for Digital Disaster Response

GeoFeedia was not originally designed to support humanitarian operations. But last year’s blog post on the potential of GeoFeedia for crisis mapping caught the interest of CEO Phil Harris. So he kindly granted the Standby Volunteer Task Force (SBTF) free access to the platform. In return, we provided his team with feedback on what features (listed here) would make GeoFeedia more useful for digital disaster response. This was back in summer 2012. I recently learned that they’ve been quite busy since. Indeed, I had the distinct pleasure of sharing the stage with Phil and his team at this superb conference on social media for emergency management. After listening to their talk, I realized it was high time to publish an update on GeoFeedia, especially since we had used the tool just two months earlier in response to Typhoon Pablo, one of the worst disasters to hit the Philippines in the past 100 years.

The 1-minute video is well worth watching if you’re new to GeoFeedia. The plat-form enables hyper local searches for information by location across multiple social media channels such as Twitter, Youtube, Flickr, Picasa & now Instagram. One of my favorite GeoFeedia features is the awesome geofeed (digital fence), which you can learn more about here. So what’s new besides Instagram? Well, the first suggestion I made last year was to provide users with the option of searching by both location and topic, rather than just location alone. And presto, this now possible, which means that digital humanitarians today can zoom into a disaster-affected area and filter by social media type, date and hashtag. This makes the geofeed feature even more compelling for crisis response, especially since geofeeds can also be saved and shared.

The vast majority of social media monitoring tools out there first filter by key-word and hashtag. Only later do they add location. As Phil points out, this mean they easily miss 70% of hyper local social media reports. Most users and org-anizations, who pay hefty licensing fees to uses these platforms, are typically unaware of this. The fact that GeoFeedia first filters by location is not an accident. This recent study (PDF) of the 2012 London Olympics showed that social media users posted close to 170,000 geo-tagged to Twitter, Instagram, Flickr, Picasa and YouTube during the games. But only 31% of these geo-tagged posts contained any Olympic-specific keywords and/or hashtags! So they decided to analyze another large event and again found the number of results drop by about 70% when not first filtering by location. Phil argues that people in a crisis situation obviously don’t wait for keywords or hashtags to form; so he expects this drop to happen for disasters as well. “Traditional keyword and hashtag search thus be complemented with a geo-graphical search in order to provide a full picture of social media content that is contextually relevant to an event.”

Screen Shot 2013-03-23 at 4.42.25 PM

One of my other main recommendations to Phil & team last year had to do with analytics. There is a strong need for an “Analytics function that produces summary statistics and trends analysis for a geofeed of interest. This is where Geofeedia could better capture temporal dynamics by including charts, graphs and simple time-series analysis to depict how events have been unfolding over the past hour vs 12 hours, 24 hours, etc.” Well sure enough, one of GeoFeedia’s major new features is a GeoAnalytics Dashboard; an interface that enables users to discover temporal trends and patterns in social media—and to do so by geofeed. This means a user can now draw a geofeed around a specific area of interest in a given disaster zone and search for pictures that capture major infrastructure damage on a specified date that contain tags or descriptions with the words “#earthquake”, “damage,” “buildings,” etc. As Phil rightly points out, this provides a “huge time advantage during a crisis to give a yet another filtered layer of intelligence; in effect, social media that is highly relevant and actionable ‘bubbling-up to the top’ of the pile.” 

Analytics Screen Shot - CES Data

I truly am a huge fan of the GeoFeedia platform. Plus, Phil & team have been very responsive to our interests in using their tool for disaster response. So I’m ex-cited to see which features they build out next. They’ve already got a “data portability” functionality that enables data export. Users can also publish content from GeoFeedia directly to their own social networks. Moreover, the filtered content produced by geofeeds can also be shared with individual who do not have a GeoFeedia account. In any event, I hope the team will take into account two items from my earlier wish list—namely Sentiment Analysis and GeoAlerts.

A Sentiment Analysis feature would capture the general mood and sentiment  expressed hyper-locally within a defined geofeed in real-time. The automated Geo-Alerts feature would make the geofeed king. A GeoAlerts functionality would enable users to trigger specific actions based on different kinds of social media traffic within a given geofeed of interest. For example, I’d like to be notified if the number of pictures posted within my geofeed that are tagged with the words “#earthquake” and “damage,” increases by more than 20% in any given hour. Similarly, one could set a geofeed’s GeoAlert for a 10% increase in the number of tweets with the words “cholera” and “diarrhea” (these need not be in English, by the way) in any given 10-minute period. Users would then receive GeoAlerts via automated emails, Tweets and/or SMS’s. This feature would in effect make the GeoFeedia more of a mobile and “hands free” platform, like Waze for example.

My first blog post on GeoFeedia was entitled “GeoFeedia: Next Generation Crisis Mapping Technology?” The answer today is a definite “Yes!” While the platform was not originally designed with disaster response in mind, the team has since been adding important features that make the tool increasingly useful for humanitarian applications. And GeoFeedia has plans for more exciting develop-ments in 2013. Their commitment to innovation and strong continued interest in supporting digital disaster response is why I’m hoping to work more closely with them in the years to come. For example, our AIDR (Artificial Intelligence for Disaster Response) platform would really add a strong Machine Learning com-ponent to GeoFeedia’s search function, in effect enabling the tool to go beyond simple keyword search.

Bio

MatchApp: Next Generation Disaster Response App?

Disaster response apps have multiplied in recent years. I’ve been  reviewing the most promising ones and have found that many cater to  professional responders and organizations. While empowering paid professionals is a must, there has been little focus on empowering the real first responders, i.e., the disaster-affected communities themselves. To this end, there is always a dramatic mismatch in demand for responder services versus supply, which is why crises are brutal audits for humanitarian organizations. Take this Red Cross survey, which found that 74% of people who post a need on social media during a disaster expect a response within an hour. But paid responders cannot be everywhere at the same time during a disaster. The response needs to be decentralized and crowdsourced.

Screen Shot 2013-02-27 at 4.08.03 PM

In contrast to paid responders, the crowd is always there. And most survivals following a disaster are thanks to local volunteers and resources, not external aid or relief. This explains why FEMA Administrator Craig Fugate has called on the public to become a member of the team. Decentralization is probably the only way for emergency response organizations to improve their disaster audits. As many seasoned humanitarian colleagues of mine have noted over the years, the majority of needs that materialize during (and after) a disaster do not require the attention of paid disaster responders with an advanced degree in humanitarian relief and 10 years of experience in Haiti. We are not all affected in the same way when disaster strikes, and those less affected are often very motivated and capable at responding to the basic needs of those around them. After all, the real first responders are—and have always been—the local communities themselves, not the Search and Rescue Teams that parachutes in 36 hours later.

In other words, local self-organized action is a natural response to disasters. Facilitated by social capital, self-organized action can accelerate both response & recovery. A resilient community is therefore one with ample capacity for self-organization. To be sure, if a neighborhood can rapidly identify local needs and quickly match these with available resources, they’ll rebound more quickly than those areas with less capacity for self-organized action. The process is a bit like building a large jigsaw puzzle, with some pieces standing for needs and others for resources. Unlike an actual jigsaw puzzle, however, there can be hundreds of thousands of pieces and very limited time to put them together correctly.

This explains why I’ve long been calling for a check-in & match.com smartphone app for local collective disaster response. The talk I gave (above) at Where 2.0 in 2011 highlights this further as do the blog posts below.

Check-In’s with a Purpose: Applications for Disaster Response
http://iRevolution.net/2011/02/16/checkins-for-disaster-response

Maps, Activism & Technology: Check-In’s with a Purpose
http://iRevolution.net/2011/02/05/check-ins-with-a-purpose

Why Geo-Fencing Will Revolutionize Crisis Mapping
http://iRevolution.net/2011/08/21/geo-fencing-crisis-mapping

How to Crowdsource Crisis Response
http://iRevolution.net/2011/09/14/crowdsource-crisis-response

The Crowd is Always There
http://iRevolution.net/2010/08/14/crowd-is-always-there

Why Crowdsourcing and Crowdfeeding may be the Answer
http://iRevolution.net/2010/12/29/crowdsourcing-crowdfeeding

Towards a Match.com for Economic Resilience
http://iRevolution.net/2012/07/04/match-com-for-economic-resilience

This “MatchApp” could rapidly match hyper local needs with resources (material & informational) available locally or regionally. Check-in’s (think Foursquare) can provide an invaluable function during disasters. We’re all familiar with the command “In case of emergency break glass,” but what if: “In case of emergency, then check-in”? Checking-in is space- and time-dependent. By checking in, I announce that I am at a given location at a specific time with a certain need (red button). This means that information relevant to my location, time, user-profile (and even vital statistics) can be customized and automatically pushed to my MatchApp in real-time. After tapping on red, MatchApp prompts the user to select what specific need s/he has. (Yes, the icons I’m using are from the MDGs and just placeholders). Note that the App we’re building is for Androids, not iPhones, so the below is for demonstration purposes only.

Screen Shot 2013-02-27 at 3.32.29 PM

But MatchApp will also enable users who are less (or not) affected by a disaster to check-in and offer help (by tapping the green button). This is where the match-making algorithm comes to play. There are various (compatible options) in this respect. The first, and simplest, is to use a greedy algorithm. This  algorithm select the very first match available (which may not be the most optimal one in terms of location). A more sophisticated approach is to optimize for the best possible match (which is a non-trivial challenge in advanced computing). As I’m a big fan of Means of Exchange, which I have blogged about here, MatchApp would also enable the exchange of goods via bartering–a mobile eBay for mutual-help during disasters.

Screen Shot 2013-02-27 at 3.34.17 PM

Once a match is made, the two individuals in question receive an automated alert notifying them about the match. By default, both users’ identities and exact locations are kept confidential while they initiate contact via the app’s instant messaging (IM) feature. Each user can decide to reveal their identity/location at any time. The IM feature thus enables  users to confirm that the match is indeed correct and/or still current. It is then up to the user requesting help to share her or his location if they feel comfortable doing so. Once the match has been responded to, the user who received help is invited to rate the individual who offered help (and vice versa, just like the Uber app, depicted on the left below).

Screen Shot 2013-02-27 at 3.49.04 PM

As a next generation disaster response app, MatchApp would include a number of additional data entry features. For example, users could upload geo-tagged pictures and video footage (often useful for damage assessments).  In terms of data consumption and user-interface design,  MatchApp would be modeled along the lines of the Waze crowdsourcing app (depicted on the right above) and thus designed to work mostly “hands-free” thanks to a voice-based interface. (It would also automatically sync up with Google Glasses).

In terms of verifying check-in’s and content submitted via MatchApp, I’m a big fan of InformaCam and would thus integrate the latter’s meta-data verification features into MatchApp: “the user’s current GPS coordinates, altitude, compass bearing, light meter readings, the signatures of neighboring devices, cell towers, and wifi networks; and serves to shed light on the exact circumstances and contexts under which the digital image was taken.” I’ve also long been interested in peer-to-peer meshed mobile communication solutions and would thus want to see an integration with the Splinternet app, perhaps. This would do away with the need for using cell phone towers should these be damaged following a disaster. Finally, MatchApp would include an agile dispatch-and-coordination feature to allow “Super Users” to connect and coordinate multiple volunteers at one time in response to one or more needs.

In conclusion, privacy and security are a central issue for all smartphone apps that share the features described above. This explains why reviewing the security solutions implemented by multiple dating websites (especially those dating services with a strong mobile component like the actual Match.com app) is paramount. In addition, reviewing  security measures taken by Couchsurfing, AirBnB and online classified adds such as Craig’s List is a must. There is also an important role for policy to play here: users who submit false misinformation to MatchApp could be held accountable and prosecuted. Finally, MatchApp would be free and open source, with a hyper-customizable, drag-and-drop front- and back-end.

bio

Update: Twitter Dashboard for Disaster Response

Project name: Artificial Intelligence for Disaster Response (AIDR). For a more recent update, please click here.

My Crisis Computing Team and I at QCRI have been working hard on the Twitter Dashboard for Disaster Response. We first announced the project on iRevolution last year. The experimental research we’ve carried out since has been particularly insightful vis-a-vis the opportunities and challenges of building such a Dashboard. We’re now using the findings from our empirical research to inform the next phase of the project—namely building the prototype for our humanitarian colleagues to experiment with so we can iterate and improve the platform as we move forward.

KnightDash

Manually processing disaster tweets is becoming increasingly difficult and unrealistic. Over 20 million tweets were posted during Hurricane Sandy, for example. This is the main problem that our Twitter Dashboard aims to solve. There are two ways to manage this challenge of Big (Crisis) Data: Advanced Computing and Human Computation. The former entails the use of machine learning algorithms to automatically tag tweets while the latter involves the use of microtasking, which I often refer to as Smart Crowdsourcing. Our Twitter Dashboard seeks to combine the best of both methodologies.

On the Advanced Computing side, we’ve developed a number of classifiers that automatically identify tweets that:

  • Contain informative content (in contrast to personal messages or information unhelpful for disaster response);
  • Are posted by eye-witnesses (as opposed to 2nd-hand reporting);
  • Include pictures, video footage, mentions from TV/radio
  • Report casualties and infrastructure damage;
  • Relate to people missing, seen and/or found;
  • Communicate caution and advice;
  • Call for help and important needs;
  • Offer help and support.

These classifiers are developed using state-of-the-art machine learning tech-niques. This simply means that we take a Twitter dataset of a disaster, say Hurricane Sandy, and develop clear definitions for “Informative Content,” “Eye-witness accounts,” etc. We use this classification system to tag a random sample of tweets from the dataset (usually 100+ tweets). We then “teach” algorithms to find these different topics in the rest of the dataset. We tweak said algorithms to make them as accurate as possible; much like training a dog new tricks like go-fetch (wink).

fetchball

We’ve found from this research that the classifiers are quite accurate but sensitive to the type of disaster being analyzed and also the country in which said disaster occurs. For example, a set of classifiers developed from tweets posted during Hurricane Sandy tend to be less accurate when applied to tweets posted for New Zealand’s earthquake. Each classifier is developed based on tweets posted during a specific disaster. In other words, while the classifiers can be highly accurate (i.e., tweets are correctly tagged as being damage-related, for example), they only tend to be accurate for the type of disaster they’ve been trained for, e.g., weather-related disasters (tornadoes), earth-related (earth-quakes) and water-related (floods).

So we’ve been busy trying to collect as many Twitter datasets of different disasters as possible, which has been particularly challenging and seriously time-consuming given Twitter’s highly restrictive Terms of Service, which prevents the direct sharing of Twitter datasets—even for humanitarian purposes. This means we’ve had to spend a considerable amount of time re-creating Twitter datasets for past disasters; datasets that other research groups and academics have already crawled and collected. Thank you, Twitter. Clearly, we can’t collect every single tweet for every disaster that has occurred over the past five years or we’ll never get to actually developing the Dashboard.

That said, some of the most interesting Twitter disaster datasets are of recent (and indeed future) disasters. Truth be told, tweets were still largely US-centric before 2010. But the international coverage has since increased, along with the number of new Twitter users, which almost doubled in 2012 alone (more neat stats here). This in part explains why more and more Twitter users actively tweet during disasters. There is also a demonstration effect. That is, the international media coverage of social media use during Hurricane Sandy, for example, is likely to prompt citizens in other countries to replicate this kind of pro-active social media use when disaster knocks on their doors.

So where does this leave us vis-a-vis the Twitter Dashboard for Disaster Response? Simply that a hybrid approach is necessary (see TEDx talk above). That is, the Dashboard we’re developing will have a number of pre-developed classifiers based on as many datasets as we can get our hands on (categorized by disaster type). In addition to that, the dashboard will also allow users to create their own classifiers on the fly by leveraging human computation. They’ll also be able to microtask the creation of new classifiers.

In other words, what they’ll do is this:

  • Enter a search query on the dashboard, e.g., #Sandy.
  • Click on “Create Classifier” for #Sandy.
  • Create a label for the new classifier, e.g., “Animal Rescue”.
  • Tag 50+ #Sandy tweets that convey content about animal rescue.
  • Click “Run Animal Rescue Classifier” on new incoming tweets.

The new classifier will then automatically tag incoming tweets. Of course, the classifier won’t get it completely right. But the beauty here is that the user can “teach” the classifier not to make the same mistakes, which means the classifier continues to learn and improve over time. On the geo-location side of things, it is indeed true that only ~3% of all tweets are geotagged by users. But this figure can be boosted to 30% using full-text geo-coding (as was done the TwitterBeat project). Some believe this figure can be doubled (towards 75%) by applying Google Translate to the full-text geo-coding. The remaining users can be queried via Twitter for their location and that of the events they are reporting.

So that’s where we’re at with the project. Ultimately, we envision these classifiers to be like individual apps that can be used/created, dragged and dropped on an intuitive widget-like dashboard with various data visualization options. As noted in my previous post, everything we’re building will be freely accessible and open source. And of course we hope to include classifiers for other languages beyond English, such as Arabic, Spanish and French. Again, however, this is purely experimental research for the time being; we want to be crystal clear about this in order to manage expectations. There is still much work to be done.

In the meantime, please feel free to get in touch if you have disaster datasets you can contribute to these efforts (we promise not to tell Twitter). If you’ve developed classifiers that you think could be used for disaster response and you’re willing to share them, please also get in touch. If you’d like to join this project and have the required skill sets, then get in touch, we may be able to hire you! Finally, if you’re an interested end-user or want to share some thoughts and suggestions as we embark on this next phase of the project, please do also get in touch. Thank you!

bio

Digital Humanitarian Response: Moving from Crowdsourcing to Microtasking

A central component of digital humanitarian response is the real-time monitor-ing, tagging and geo-location of relevant reports published on mainstream and social media. This has typically been a highly manual and time-consuming process, which explains why dozens if not hundreds of digital volunteers are often needed to power digital humanitarian response efforts. To coordinate these efforts, volunteers typically work off Google Spreadsheets which, needless to say, is hardly the most efficient, scalable or enjoyable interface to work on for digital humanitarian response.

complicated128

The challenge here is one of design. Google Spreadsheets was simply not de-signed to facilitate real-time monitoring, tagging and geo-location tasks by hundreds of digital volunteers collaborating synchronously and asynchronously across multiple time zones. The use of Google Spreadsheets not only requires up-front training of volunteers but also oversight and management. Perhaps the most problematic feature of Google Spreadsheets is the interface. Who wants to spend hours staring at cells, rows and columns? It is high time we take a more volunteer-centered design approach to digital humanitarian response. It is our responsibility to reduce the “friction” and make it as easy, pleasant and re-warding as possible for digital volunteers to share their time for the better good. While some deride the rise of “single-click activism,” we have to make it as easy as a double-click-of-the-mouse to support digital humanitarian efforts.

This explains why I have been actively collaborating with my colleagues behind the free & open-source micro-tasking platform, PyBossa. I often describe micro-tasking as “smart crowdsourcing”. Micro-tasking is simply the process of taking a large task and breaking it down into a series of smaller tasks. Take the tagging and geo-location of disaster tweets, for example. Instead of using Google Spread-sheets, tweets with designated hashtags can be imported directly into PyBossa where digital volunteers can tag and geo-locate said tweets as needed. As soon as they are processed, these tweets can be pushed to a live map or database right away for further analysis.

Screen Shot 2012-12-18 at 5.00.39 PM

The Standby Volunteer Task Force (SBTF) used PyBossa in the digital disaster response to Typhoon Pablo in the Philippines. In the above example, a volunteer goes to the PyBossa website and is presented with the next tweet. In this case: “Surigao del Sur: relief good infant needs #pabloPH [Link] #ReliefPH.” If a tweet includes location information, e.g., “Surigao del Sur,” a digital volunteer can simply copy & paste that information into the search box or  pinpoint the location in question directly on the map to generate the GPS coordinates. Click on the screenshot above to zoom in.

The PyBossa platform presents a number of important advantages when it comes to digital humanitarian response. One advantage is the user-friendly tutorial feature that introduces new volunteers to the task at hand. Furthermore, no prior experience or additional training is required and the interface itself can be made available in multiple languages. Another advantage is the built-in quality control mechanism. For example, one can very easily customize the platform such that every tweet is processed by 2 or 3 different volunteers. Why would we want to do this? To ensure consensus on what the right answers are when processing a tweet. For example, if three individual volunteers each tag a tweet as having a link that points to a picture of the damage caused by Typhoon Pablo, then we may find this to be more reliable than if only one volunteer tags a tweet as such. One additional advantage of PyBossa is that having 100 or 10,000 volunteers use the platform doesn’t require additional management and oversight—unlike the use of Google Spreadsheets.

There are many more advantages of using PyBossa, which is why my SBTF colleagues and I are collaborating with the PyBossa team with the ultimate aim of customizing a standby platform specifically for digital humanitarian response purposes. As a first step, however, we are working together to customize a PyBossa instance for the upcoming elections in Kenya since the SBTF was activated by Ushahidi to support the election monitoring efforts. The plan is to microtask the processing of reports submitted to Ushahidi in order to significantly accelerate and scale the live mapping process. Stay tuned to iRevolution for updates on this very novel initiative.

crowdflower-crowdsourcing-site

The SBTF also made use of CrowdFlower during the response to Typhoon Pablo. Like PyBossa, CrowdFlower is a micro-tasking platform but one developed by a for-profit company and hence primarily geared towards paying workers to complete tasks. While my focus vis-a-vis digital humanitarian response has chiefly been on (integrating) automated and volunteer-driven micro-tasking solutions, I believe that paid micro-tasking platforms also have a critical role to play in our evolving digital humanitarian ecosystem. Why? CrowdFlower has an unrivaled global workforce of more than 2 million contributors along with rigor-ous quality control mechanisms.

While this solution may not scale significanlty given the costs, I’m hoping that CrowdFlower will offer the Digital Humanitarian Network (DHN) generous discounts moving forward. Either way, identifying what kinds of tasks are best completed by paid workers versus motivated volunteers is a questions we must answer to improve our digital humanitarian workflows. This explains why I plan to collaborate with CrowdFlower directly to set up a standby platform for use by members of the Digital Humanitarian Network.

There’s one major catch with all microtasking platforms, however. Without well-designed gamification features, these tools are likely to have a short shelf-life. This is true of any citizen-science project and certainly relevant to digital human-itarian response as well, which explains why I’m a big, big fan of Zooniverse. If there’s a model to follow, a holy grail to seek out, then this is it. Until we master or better yet partner with the talented folks at Zooniverse, we’ll be playing catch-up for years to come. I will do my very best to make sure that doesn’t happen.

How to Create Resilience Through Big Data

Revised! I have edited this article several dozen times since posting the initial draft. I have also made a number of substantial changes to the flow of the article after discovering new connections, synergies and insights. In addition, I  have greatly benefited from reader feedback as well as the very rich conversa-tions that took place during the PopTech & Rockefeller workshop—a warm thank you to all participants for their important questions and feedback!

Introduction

I’ve been invited by PopTech and the Rockefeller Foundation to give the opening remarks at an upcoming event on interdisciplinary dimensions of resilience, which is  being hosted at Georgetown University. This event is connected to their new program focus on “Creating Resilience Through Big Data.” I’m absolutely de-lighted to be involved and am very much looking forward to the conversations. The purpose of this blog post is to summarize the presentation I intend to give and to solicit feedback from readers. So please feel free to use the comments section below to share your thoughts. My focus is primarily on disaster resilience. Why? Because understanding how to bolster resilience to extreme events will provide insights on how to also manage less extreme events, while the converse may not be true.

Big Data Resilience

terminology

One of the guiding questions for the meeting is this: “How do you understand resilience conceptually at present?” First, discourse matters.  The term resilience is important because it focuses not on us, the development and disaster response community, but rather on local at-risk communities. While “vulnerability” and “fragility” were used in past discourse, these terms focus on the negative and seem to invoke the need for external protection, overlooking the fact that many local coping mechanisms do exist. From the perspective of this top-down approach, international organizations are the rescuers and aid does not arrive until these institutions mobilize.

In contrast, the term resilience suggests radical self-sufficiency, and self-sufficiency implies a degree of autonomy; self-dependence rather than depen-dence on an external entity that may or may not arrive, that may or may not be effective, and that may or may not stay the course. The term “antifragile” just recently introduced by Nassim Taleb also appeals to me. Antifragile sys-tems thrive on disruption. But lets stick with the term resilience as anti-fragility will be the subject of a future blog post, i.e., I first need to finish reading Nassim’s book! I personally subscribe to the following definition of resilience: the capacity for self-organization; and shall expand on this shortly.

(See the Epilogue at the end of this blog post on political versus technical defini-tions of resilience and the role of the so-called “expert”. And keep in mind that poverty, cancer, terrorism etc., are also resilient systems. Hint: we have much to learn from pernicious resilience and the organizational & collective action models that render those systems so resilient. In their book on resilience, Andrew Zolli and Ann Marie Healy note the strong similarities between Al-Qaeda & tuber-culosis, one of which are the two systems’ ability to regulate their metabolism).

Hazards vs Disasters

In the meantime, I first began to study the notion of resilience from the context of complex systems and in particular the field of ecology, which defines resilience as “the capacity of an ecosystem to respond to a perturbation or disturbance by resisting damage and recovering quickly.” Now lets unpack this notion of perturbation. There is a subtle but fundamental difference between disasters (processes) and hazards (events); a distinction that Jean-Jacques Rousseau first articulated in 1755 when Portugal was shaken by an earthquake. In a letter to Voltaire one year later, Rousseau notes that, “nature had not built [process] the houses which collapsed and suggested that Lisbon’s high population density [process] contributed to the toll” (1). In other words, natural events are hazards and exogenous while disas-ters are the result of endogenous social processes. As Rousseau added in his note to Voltaire, “an earthquake occurring in wilderness would not be important to society” (2). That is, a hazard need not turn to disaster since the latter is strictly a product or calculus of social processes (structural violence).

And so, while disasters were traditionally perceived as “sudden and short lived events, there is now a tendency to look upon disasters in African countries in particular, as continuous processes of gradual deterioration and growing vulnerability,” which has important “implications on the way the response to disasters ought to be made” (3). (Strictly speaking, the technical difference between events and processes is one of scale, both temporal and spatial, but that need not distract us here). This shift towards disasters as processes is particularly profound for the creation of resilience, not least through Big Data. To under-stand why requires a basic introduction to complex systems.

complex systems

All complex systems tend to veer towards critical change. This is explained by the process of Self-Organized Criticality (SEO). Over time, non-equilibrium systems with extended degrees of freedom and a high level of nonlinearity become in-creasingly vulnerable to collapse. Social, economic and political systems certainly qualify as complex systems. As my “alma mater” the Santa Fe Institute (SFI) notes, “The archetype of a self-organized critical system is a sand pile. Sand is slowly dropped onto a surface, forming a pile. As the pile grows, avalanches occur which carry sand from the top to the bottom of the pile” (4). That is, the sand pile becomes increasingly unstable over time.

Consider an hourglass or sand clock as an illustration of self-organized criticality. Grains of sand sifting through the narrowest point of the hourglass represent individual events or natural hazards. Over time a sand pile starts to form. How this process unfolds depends on how society chooses to manage risk. A laisser-faire attitude will result in a steeper pile. And grain of sand falling on an in-creasingly steeper pile will eventually trigger an avalanche. Disaster ensues.

Why does the avalanche occur? One might ascribe the cause of the avalanche to that one grain of sand, i.e., a single event. On the other hand, a complex systems approach to resilience would associate the avalanche with the pile’s increasing slope, a historical process which renders the structure increasingly vulnerable to falling grains. From this perspective, “all disasters are slow onset when realisti-cally and locally related to conditions of susceptibility”. A hazard event might be rapid-onset, but the disaster, requiring much more than a hazard, is a long-term process, not a one-off event. The resilience of a given system is therefore not simply dependent on the outcome of future events. Resilience is the complex product of past social, political, economic and even cultural processes.

dealing with avalanches

Scholars like Thomas Homer-Dixon argue that we are becoming increasingly prone to domino effects or cascading changes across systems, thus increasing the likelihood of total synchronous failure. “A long view of human history reveals not regular change but spasmodic, catastrophic disruptions followed by long periods of reinvention and development.” We must therefore “reduce as much as we can the force of the underlying tectonic stresses in order to lower the risk of synchro-nous failure—that is, of catastrophic collapse that cascades across boundaries between technological, social and ecological systems” (5).

Unlike the clock’s lifeless grains of sand, human beings can adapt and maximize their resilience to exogenous shocks through disaster preparedness, mitigation and adaptation—which all require political will. As a colleague of mine recently noted, “I wish it were widely spread amongst society  how important being a grain of sand can be.” Individuals can “flatten” the structure of the sand pile into a less hierarchical but more resilience system, thereby distributing and diffusing the risk and size of an avalanche. Call it distributed adaptation.

operationalizing resilience

As already, the field of ecology defines  resilience as “the capacity of an ecosystem to respond to a perturbation or disturbance by resisting damage and recovering quickly.” Using this understanding of resilience, there are at least 2 ways create more resilient “social ecosystems”:

  1. Resist damage by absorbing and dampening the perturbation.
  2. Recover quickly by bouncing back or rather forward.

Resisting Damage

So how does a society resist damage from a disaster? As hinted earlier, there is no such thing as a “natural” disaster. There are natural hazards and there are social systems. If social systems are not sufficiently resilient to absorb the impact of a natural hazard such as an earthquake, then disaster unfolds. In other words, hazards are exogenous while disasters are the result of endogenous political, economic, social and cultural processes. Indeed, “it is generally accepted among environmental geographers that there is no such thing as a natural disaster. In every phase and aspect of a disaster—causes, vulnerability, preparedness, results and response, and reconstruction—the contours of disaster and the difference between who lives and dies is to a greater or lesser extent a social calculus” (6).

So how do we apply this understanding of disasters and build more resilient communities? Focusing on people-centered early warning systems is one way to do this. In 2006, the UN’s International Strategy for Disaster Reduction (ISDR) recognized that top-down early warning systems for disaster response were increasingly ineffective. They thus called for a more bottom-up approach in the form of people-centered early warning systems. The UN ISDR’s Global Survey of Early Warning Systems (PDF), defines the purpose of people-centered early warning systems as follows:

“… to empower individuals and communities threatened by hazards to act in sufficient time and in an appropriate manner so as to reduce the possibility of personal injury, loss of life, damage to property and the environment, and loss of livelihoods.”

Information plays a central role here. Acting in sufficient time requires having timely information about (1) the hazard/s, (2) our resilience and (3) how to respond. This is where information and communication technologies (ICTs), social media and Big Data play an important role. Take the latter, for example. One reason for the considerable interest in Big Data is prediction and anomaly detection. Weather and climatic sensors provide meteorologists with the copious amounts of data necessary for the timely prediction of weather patterns and  early detection of atmospheric hazards. In other words, Big Data Analytics can be used to anticipate the falling grains of sand.

Now, predictions are often not correct. But the analysis of Big Data can also help us characterize the sand pile itself, i.e., our resilience, along with the associated trends towards self-organized criticality. Recall that complex systems tend towards instability over time (think of the hourglass above). Thanks to ICTs, social media and Big Data, we now have the opportunity to better characterize in real-time the social, economic and political processes driving our sand pile. Now, this doesn’t mean that we have a perfect picture of the road to collapse; simply that our picture is clearer than ever before in human history. In other words, we can better measure our own resilience. Think of it as the Quantified Self move-ment applied to an entirely different scale, that of societies and cities. The point is that Big Data can provide us with more real-time feedback loops than ever before. And as scholars of complex systems know, feedback loops are critical for adaptation and change. Thanks to social media, these loops also include peer-to-peer feedback loops.

An example of monitoring resilience in real-time (and potentially anticipating future changes in resilience) is the UN Global Pulse’s project on food security in Indonesia. They partnered with Crimson Hexagon to forecast food prices in Indonesia by analyzing tweets referring to the price of rice. They found an inter-esting relationship between said tweets and government statistics on food price inflation. Some have described the rise of social media as a new nervous system for the planet, capturing the pulse of our social systems. My colleagues and I at QCRI are therefore in the process of appling this approach to the study of the Arabic Twittersphere. Incidentally, this is yet another critical reason why Open Data is so important (check out the work of OpenDRI, Open Data for Resilience Initiative. See also this post on Demo-cratizing ICT for Development with DIY Innovation and Open Data). More on open data and data philanthropy in the conclusion.

Finally, new technologies can also provide guidance on how to respond. Think of Foursquare but applied to disaster response. Instead of “Break Glass in Case of Emergency,” how about “Check-In in Case of Emergency”? Numerous smart-phone apps such as Waze already provide this kind of at-a-glance, real-time situational awareness. It is only a matter of time until humanitarian organiza-tions develop disaster response apps that will enable disaster-affected commu-nities to check-in for real time guidance on what to do given their current location and level of resilience. Several disaster preparedness apps already exist. Social computing and Big Data Analytics can power these apps in real-time.

Quick Recovery

As already noted, there are at least two ways create more resilient “social eco-systems”. We just discussed the first: resisting damage by absorbing and dam-pening the perturbation.  The second way to grow more resilient societies is by enabling them to rapidly recover following a disaster.

As Manyena writes, “increasing attention is now paid to the capacity of disaster-affected communities to ‘bounce back’ or to recover with little or no external assistance following a disaster.” So what factors accelerate recovery in eco-systems in general? In ecological terms, how quickly the damaged part of an ecosystem can repair itself depends on how many feedback loops it has to the non- (or less-) damaged parts of the ecosystem(s). These feedback loops are what enable adaptation and recovery. In social ecosystems, these feedback loops can be comprised of information in addition to the transfer of tangible resources.  As some scholars have argued, a disaster is first of all “a crisis in communicating within a community—that is, a difficulty for someone to get informed and to inform other people” (7).

Improving ways for local communities to communicate internally and externally is thus an important part of building more resilient societies. Indeed, as Homer-Dixon notes, “the part of the system that has been damaged recovers by drawing resources and information from undamaged parts.” Identifying needs following a disaster and matching them to available resources is an important part of the process. Indeed, accelerating the rate of (1) identification; (2) matching and, (3) allocation, are important ways to speed up overall recovery.

This explains why ICTs, social media and Big Data are central to growing more resilient societies. They can accelerate impact evaluations and needs assessments at the local level. Population displacement following disasters poses a serious public health risk. So rapidly identifying these risks can help affected populations recover more quickly. Take the work carried out by my colleagues at Flowminder, for example. They  empirically demonstrated that mobile phone data (Big Data!) can be used to predict population displacement after major disasters. Take also this study which analyzed call dynamics to demonstrate that telecommunications data could be used to rapidly assess the impact of earthquakes. A related study showed similar results when analyzing SMS’s and building damage Haiti after the 2010 earthquake.

haiti_overview_570

Resilience as Self-Organization and Emergence

Connection technologies such as mobile phones allow individual “grains of sand” in our societal “sand pile” to make necessary connections and decisions to self-organize and rapidly recover from disasters. With appropriate incentives, pre-paredness measures and policies, these local decisions can render a complex system more resilient. At the core here is behavior change and thus the importance of understanding behavior change models. Recall  also Thomas Schelling’s observation that micro-motives can lead to macro-behavior. To be sure, as Thomas Homer-Dixon rightly notes, “Resilience is an emergent property of a system—it’s not a result of any one of the system’s parts but of the synergy between all of its parts.  So as a rough and ready rule, boosting the ability of each part to take care of itself in a crisis boosts overall resilience.” (For complexity science readers, the notions of transforma-tion through phase transitions is relevant to this discussion).

In other words, “Resilience is the capacity of the affected community to self-organize, learn from and vigorously recover from adverse situations stronger than it was before” (8). This link between resilience and capacity for self-organization is very important, which explains why a recent and major evaluation of the 2010 Haiti Earthquake disaster response promotes the “attainment of self-sufficiency, rather than the ongoing dependency on standard humanitarian assistance.” Indeed, “focus groups indicated that solutions to help people help themselves were desired.”

The fact of the matter is that we are not all affected in the same way during a disaster. (Recall the distinction between hazards and disasters discussed earlier). Those of use who are less affected almost always want to help those in need. Herein lies the critical role of peer-to-peer feedback loops. To be sure, the speed at which the damaged part of an ecosystem can repair itself depends on how many feedback loops it has to the non- (or less-) damaged parts of the eco-system(s). These feedback loops are what enable adaptation and recovery.

Lastly, disaster response professionals cannot be every where at the same time. But the crowd is always there. Moreover, the vast majority of survivals following major disasters cannot be attributed to external aid. One study estimates that at most 10% of external aid contributes to saving lives. Why? Because the real first responders are the disaster-affected communities themselves, the local popula-tion. That is, the real first feedback loops are always local. This dynamic of mutual-aid facilitated by social media is certainly not new, however. My colleagues in Russia did this back in 2010 during the major forest fires that ravaged their country.

While I do have a bias towards people-centered interventions, this does not mean that I discount the importance of feedback loops to external actors such as traditional institutions and humanitarian organizations. I also don’t mean to romanticize the notion of “indigenous technical knowledge” or local coping mechanism. Some violate my own definition of human rights, for example. However, my bias stems from the fact that I am particularly interested in disaster resilience within the context of areas of limited statehood where said institutions and organizations are either absent are ineffective. But I certainly recognize the importance of scale jumping, particularly within the context of social capital and social media.

RESILIENCE THROUGH SOCIAL CAPITAL

Information-based feedback loops general social capital, and the latter has been shown to improve disaster resilience and recovery. In his recent book entitled “Building Resilience: Social Capital in Post-Disaster Recovery,” Daniel Aldrich draws on both qualitative and quantitative evidence to demonstrate that “social resources, at least as much as material ones, prove to be the foundation for resilience and recovery.” His case studies suggest that social capital is more important for disaster resilience than physical and financial capital, and more important than conventional explanations. So the question that naturally follows given our interest in resilience & technology is this: can social media (which is not restricted by geography) influence social capital?

Social Capital

Building on Daniel’s research and my own direct experience in digital humani-tarian response, I argue that social media does indeed nurture social capital during disasters. “By providing norms, information, and trust, denser social networks can implement a faster recovery.” Such norms also evolve on Twitter, as does information sharing and trust building. Indeed, “social ties can serve as informal insurance, providing victims with information, financial help and physical assistance.” This informal insurance, “or mutual assistance involves friends and neighbors providing each other with information, tools, living space, and other help.” Again, this bonding is not limited to offline dynamics but occurs also within and across online social networks. Recall the sand pile analogy. Social capital facilitates the transformation of the sand pile away (temporarily) from self-organized criticality. On a related note vis-a-vis open source software, “the least important part of open source software is the code.” Indeed, more important than the code is the fact that open source fosters social ties, networks, communities and thus social capital.

(Incidentally, social capital generated during disasters is social capital that can subsequently be used to facilitate self-organization for non-violent civil resistance and vice versa).

RESILIENCE through big data

My empirical research on tweets posted during disasters clearly shows that while many use twitter (and social media more generally) to post needs during a crisis, those who are less affected in the social ecosystem will often post offers to help. So where does Big Data fit into this particular equation? When disaster strikes, access to information is equally important as access to food and water. This link between information, disaster response and aid was officially recognized by the Secretary General of the International Federation of Red Cross & Red Crescent Societies in the World Disasters Report published in 2005. Since then, disaster-affected populations have become increasingly digital thanks to the very rapid and widespread adoption of mobile technologies. Indeed, as a result of these mobile technologies, affected populations are increasingly able to source, share and generate a vast amount of information, which is completely transforming disaster response.

In other words, disaster-affected communities are increasingly becoming the source of Big (Crisis) Data during and following major disasters. There were over 20 million tweets posted during Hurricane Sandy. And when the major earth-quake and Tsunami hit Japan in early 2011, over 5,000 tweets were being posted every secondThat is 1.5 million tweets every 5 minutes. So how can Big Data Analytics create more resilience in this respect? More specifically, how can Big Data Analytics accelerate disaster recovery? Manually monitoring millions of tweets per minute is hardly feasible. This explains why I often “joke” that we need a local Match.com for rapid disaster recovery. Thanks to social computing, artifi-cial intelligence, machine learning and Big Data Analytics, we can absolutely develop a “Match.com” for rapid recovery. In fact, I’m working on just such a project with my colleagues at QCRI. We are also developing algorithms to auto-matically identify informative and actionable information shared on Twitter, for example. (Incidentally, a by-product of developing a robust Match.com for disaster response could very well be an increase in social capital).

There are several other ways that advanced computing can create disaster resilience using Big Data. One major challenge is digital humanitarian response is the verification of crowdsourced, user-generated content. Indeed, misinforma-tion and rumors can be highly damaging. If access to information is tantamount to food access as noted by the Red Cross, then misinformation is like poisoned food. But Big Data Analytics has already shed some light on how to develop potential solutions. As it turns out, non-credible disaster information shared on Twitter propagates differently than credible information, which means that the credibility of tweets could be predicted automatically.

Conclusion

In sum, “resilience is the critical link between disaster and development; monitoring it [in real-time] will ensure that relief efforts are supporting, and not eroding […] community capabilities” (9). While the focus of this blog post has been on disaster resilience, I believe the insights provided are equally informa-tive for less extreme events.  So I’d like to end on two major points. The first has to do with data philanthropy while the second emphasizes the critical importance of failing gracefully.

Big Data is Closed and Centralized

A considerable amount of “Big Data” is Big Closed and Centralized Data. Flow-minder’s study mentioned above draws on highly proprietary telecommunica-tions data. Facebook data, which has immense potential for humanitarian response, is also closed. The same is true of Twitter data, unless you have millions of dollars to pay for access to the full Firehose, or even Decahose. While access to the Twitter API is free, the number of tweets that can be downloaded and analyzed is limited to several thousand a day. Contrast this with the 5,000 tweets per second posted after the earthquake and Tsunami in Japan. We therefore need some serious political will from the corporate sector to engage in “data philanthropy”. Data philanthropy involves companies sharing proprietary datasets for social good. Call it Corporate Social Responsibility (CRS) for digital humanitarian response. More here on how this would work.

Failing Gracefully

Lastly, on failure. As noted, complex systems tend towards instability, i.e., self-organized criticality, which is why Homer-Dixon introduces the notion of failing gracefully. “Somehow we have to find the middle ground between dangerous rigidity and catastrophic collapse.” He adds that:

“In our organizations, social and political systems, and individual lives, we need to create the possibility for what computer programmers and disaster planners call ‘graceful’ failure. When a system fails gracefully, damage is limited, and options for recovery are preserved. Also, the part of the system that has been damaged recovers by drawing resources and information from undamaged parts.” Homer-Dixon explains that “breakdown is something that human social systems must go through to adapt successfully to changing conditions over the long term. But if we want to have any control over our direction in breakdown’s aftermath, we must keep breakdown constrained. Reducing as much as we can the force of underlying tectonic stresses helps, as does making our societies more resilient. We have to do other things too, and advance planning for breakdown is undoubtedly the most important.”

As Louis Pasteur famously noted, “Chance favors the prepared mind.” Preparing for breakdown is not defeatist or passive. Quite on the contrary, it is wise and pro-active. Our hubris—including our current infatuation with Bid Data—all too often clouds our better judgment. Like Macbeth, rarely do we seriously ask our-selves what we would do “if we should fail.” The answer “then we fail” is an option. But are we truly prepared to live with the devastating consequences of total synchronous failure?

In closing, some lingering (less rhetorical) questions:

  • How can resilience can be measured? Is there a lowest common denominator? What is the “atom” of resilience?
  • What are the triggers of resilience, creative capacity, local improvisation, regenerative capacity? Can these be monitored?
  • Where do the concepts of “lived reality” and “positive deviance” enter the conversation on resilience?
  • Is resiliency a right? Do we bear a responsibility to render systems more resilient? If so, recalling that resilience is the capacity to self-organize, do local communities have the right to self-organize? And how does this differ from democratic ideals and freedoms?
  • Recent research in social-psychology has demonstrated that mindfulness is an amplifier of resilience for individuals? How can be scaled up? Do cultures and religions play a role here?
  • Collective memory influences resilience. How can this be leveraged to catalyze more regenerative social systems?

bio

Epilogue: Some colleagues have rightfully pointed out that resilience is ultima-tely political. I certainly share that view, which is why this point came up in recent conversations with my PopTech colleagues Andrew Zolli & Leetha Filderman. Readers of my post will also have noted my emphasis on distinguishing between hazards and disasters; that the latter are the product of social, economic and political processes. As noted in my blog post, there are no natural disastersTo this end, some academics rightly warn that “Resilience is a very technical, neutral, apolitical term. It was initially designed to characterize systems, and it doesn’t address power, equity or agency…  Also, strengthening resilience is not free—you can have some winners and some losers.”

As it turns out, I have a lot say about the political versus technical argument. First of all, this is hardly a new or original argument but nevertheless an important one. Amartya Senn discussed this issue within the context of famines decades ago, noting that famines do not take place in democracies. In 1997, Alex de Waal published his seminal book, “Famine Crimes: Politics and the Disaster Relief In-dustry in Africa.” As he rightly notes, “Fighting famine is both a technical and political challenge.” Unfortunately, “one universal tendency stands out: technical solutions are promoted at the expense of political ones.” There is also a tendency to overlook the politics of technical actions, muddle or cover political actions with technical ones, or worse, to use technical measures as an excuse not to undertake needed political action.

De Waal argues that the use of the term “governance” was “an attempt to avoid making the political critique too explicit, and to enable a focus on specific technical aspects of government.” In some evaluations of development and humanitarian projects, “a caveat is sometimes inserted stating that politics lies beyond the scope of this study.” To this end, “there is often a weak call for ‘political will’ to bridge the gap between knowledge of technical measures and action to implement them.” As de Waal rightly notes, “the problem is not a ‘missing link’ but rather an entire political tradition, one manifestation of which is contemporary international humanitarianism.” In sum, “technical ‘solutions’ must be seen in the political context, and politics itself in the light of the domi-nance of a technocratic approach to problems such as famine.”

From a paper I presented back in 2007: “the technological approach almost always serves those who seek control from a distance.” As a result of this technological drive for pole position, a related “concern exists due to the separation of risk evaluation and risk reduction between science and political decision” so that which is inherently politically complex becomes depoliticized and mechanized. In Toward a Rational Society (1970), the German philosopher Jürgen Habermas describes “the colonization of the public sphere through the use of instrumental technical rationality. In this sphere, complex social problems are reduced to technical questions, effectively removing the plurality of contending perspectives.”

To be sure, Western science tends to pose the question “How?” as opposed to “Why?”What happens then is that “early warning systems tend to be largely conceived as hazard-focused, linear, topdown, expert driven systems, with little or no engagement of end-users or their representatives.” As De Waal rightly notes, “the technical sophistication of early warning systems is offset by a major flaw: response cannot be enforced by the populace. The early warning information is not normally made public.”  In other words, disaster prevention requires “not merely identifying causes and testing policy instruments but building a [social and] political movement” since “the framework for response is inherently political, and the task of advocacy for such response cannot be separated from the analytical tasks of warning.”

Recall my emphasis on people-centered early warning above and the definition of resilience as capacity for self-organization. Self-organization is political. Hence my efforts to promote greater linkages between the fields of nonviolent action and early warning years ago. I have a paper (dated 2008) specifically on this topic should anyone care to read. Anyone who has read my doctoral dissertation will also know that I have long been interested in the impact of technology on the balance of power in political contexts. A relevant summary is available here. Now, why did I not include all this in the main body of my blog post? Because this updated section already runs over 1,000 words.

In closing, I disagree with the over-used criticism that resilience is reactive and about returning to initial conditions. Why would we want to be reactive or return to initial conditions if the latter state contributed to the subsequent disaster we are recovering from? When my colleague Andrew Zolli talks about resilience, he talks about “bouncing forward”, not bouncing back. This is also true of Nassim Taleb’s term antifragility, the ability to thrive on disruption. As Homer-Dixon also notes, preparing to fail gracefully is hardly reactive either.

Social Media = Social Capital = Disaster Resilience?

Do online social networks generate social capital, which, in turn, increases resilience to disasters? How might one answer this question? For example, could we analyze Twitter data to capture levels of social capital in a given country? If so, do countries with higher levels of social capital (as measured using Twitter) demonstrate greater resiliences to disasters?

Twitter Heatmap Hurricane

These causal loops are fraught with all kinds of intervening variables, daring assumptions and econometric nightmares. But the link between social capital and disaster resilience is increasingly accepted. In “Building Resilience: Social Capital in Post-Disaster Recover,” Daniel Aldrich draws on both qualitative and quantita-tive evidence to demonstrate that “social resources, at least as much as material ones, prove to be the foundation for resilience and recovery.” A concise summary of his book is available in my previous blog post.

So the question that follows is whether the link between social media, i.e., online social networks and social capital can be established. “Although real-world organizations […] have demonstrated their effectiveness at building bonds, virtual communities are the next frontier for social capital-based policies,” writes Aldrich. Before we jump into the role of online social networks, however, it is important to recognize the function of “offline” communities in disaster response and resilience.

iran-reliefs

“During the disaster and right after the crisis, neighbors and friends—not private firms, government agencies, or NGOs—provide the necessary resources for resilience.” To be sure, “the lack of systematic assistance from government and NGOs [means that] neighbors and community groups are best positioned to undertake efficient initial emergency aid after a disaster. Since ‘friends, family, or coworkers of victims and also passersby are always the first and most effective responders, “we should recognize their role on the front line of disasters.”

In sum, “social ties can serve as informal insurance, providing victims with information, financial help and physical assistance.” This informal insurance, “or mutual assistance involves friends and neighbors providing each other with information, tools, living space, and other help.” Data driven research on tweets posted during disasters reveal that many provide victims with information, help, tools, living space, assistance and other help. But this support is also provided to complete strangers since it is shared openly and publicly on Twitter. “[…] Despite—or perhaps because of—horrendous conditions after a crisis, survivors work together to solve their problems; […] the amount of (bounding) social capital seems to increase under difficult conditions.” Again, this bonding is not limited to offline dynamics but occurs also within and across online social networks. The tweet below was posted in the aftermath of Hurricane Sandy.

Sandy Tweets Mutual Aid

“By providing norms, information, and trust, denser social networks can implement a faster recovery.” Such norms also evolve on Twitter, as does information sharing and trust building. So is the degree of activity on Twitter directly proportional to the level of community resilience?

This data-driven study, “Do All Birds Tweet the Same? Characterizing Twitter Around the World,” may shed some light in this respect. The authors, Barbara Poblete, Ruth Garcia, Marcelo Mendoza and Alejandro Jaimes, analyze various aspects of social media–such as network structure–for the ten most active countries on Twitter. In total, the working dataset consisted close to 5 million users and over 5 billion tweets. The study is the largest one carried out to date on Twitter data, “and the first one that specifically examines differences across different countries.”

Screen Shot 2012-11-30 at 6.19.45 AM

The network statistics per country above reveals that Japan, Canada, Indonesia and South Korea have highest percentage of reciprocity on Twitter. This is important because according to Poblet et al., “Network reciprocity tells us about the degree of cohesion, trust and social capital in sociology.” In terms of network density, “the highest values correspond to South Korea, Netherlands and Australia.” Incidentally, the authors find that “communities which tend to be less hierarchical and more reciprocal, also displays happier language in their content updates. In this sense countries with high conversation levels (@) … display higher levels of happiness too.”

If someone is looking for a possible dissertation topic, I would recommend the following comparative case study analysis. Select two of the four countries with highest percentage of reciprocity on Twitter: Japan, Canada, Indonesia and South Korea. The two you select should have a close “twin” country. By that I mean a country that has many social, economic and political factors in common. The twin countries should also be in geographic proximity to each other since we ultimately want to assess how they weather similar disasters. The paired can-didates that come to mind are thus: Canada & US and Indonesia & Malaysia.

Next, compare the countries’ Twitter networks, particularly degrees of  recipro-city since this metric appears to be a suitable proxy for social capital. For example, Canada’s reciprocity score is 26% compared to 19% for the US. In other words, quite a difference. Next, identify recent  disasters that both countries have experienced. Do the affected cities in the respective countries weather the disasters differently? Is one community more resilient than the other? If so, do you find a notable quantitative difference in their Twitter networks and degrees of reciprocity? If so, does a subsequent comparative qualitative analysis support these findings?

As cautioned earlier, these causal loops are fraught with all kinds of intervening variables, daring assumptions and econometric nightmares. But if anyone wants to brave the perils of applied social science research, and finds the above re-search questions of interest, then please do get in touch!

Debating the Value of Tweets For Disaster Response (Intelligently)

With every new tweeted disaster comes the same old question: what is the added value of tweets for disaster response? Only a handful of data-driven studies actually bother to move the debate beyond anecdotes. It is thus high time that a meta-level empirical analysis of the existing evidence be conducted. Only then can we move towards a less superficial debate on the use of social media for disaster response and emergency management.

In her doctoral research Dr. Sarah Vieweg found that between 8% and 24% of disaster tweets she studied “contain information that provides tactical, action-able information that can aid people in making decisions, advise others on how to obtain specific information from various sources, or offer immediate post-impact help to those affected by the mass emergency.” Two of the disaster datasets that Vieweg analyzed were the Red River Floods of 2009 and 2010. The tweets from the 2010 disaster resulted in a small increase of actionable tweets (from ~8% to ~9%). Perhaps Twitter users are becoming more adept at using Twitter during crises? The lowest number of actionable tweets came from the Red River Floods of 2009, whereas the highest came from the Haiti Earthquake of 2010. Again, there is variation—this time over space.

In this separate study, over 64,000 tweets generated during Thailand’s major floods in 2011 were analyzed. The results indicate that about 39% of these tweets belonged to the “Situational Awareness and Alerts” category. “Twitter messages in this category include up-to-date situational and location-based information related to the flood such as water levels, traffic conditions and road conditions in certain areas. In addition, emergency warnings from authorities advising citizens to evacuate areas, seek shelter or take other protective measures are also included.” About 8% of all tweets (over 5,000 unique tweets) were “Requests for Assistance,” while 5% were “Requests for Information Categories.”

TwitterDistribution

In this more recent study, researchers mapped flood-related tweets and found a close match between that resulting map and the official government flood map. In the map below, tweets were normalized, such that values greater than one mean more tweets than would be expected in normal Twitter traffic. “Unlike many maps of online phenomena, careful analysis and mapping of Twitter data does NOT simply mirror population densities. Instead con-centration of twitter activity (in this case tweets containing the keyword flood) seem to closely reflect the actual locations of floods and flood alerts even when we simply look at the total counts.” This also implies that a relatively high number of flood-related tweets must have contained accurate information.

TwitterMapUKfloods

Shifting from floods to fires, this earlier research analyzed some 1,700 tweets generated during Australia’s worst bushfire in history. About 65% of the tweets had “factual details,” i.e., “more than three of every five tweets had useful infor-mation.” In addition, “Almost 22% of the tweets had geographical data thus identifying location of the incident which is critical in crisis reporting.” Around 7% of the tweets were seeking information, help or answers. Finally, close to 5% (about 80 tweets) were considered “directly actionable.”

Preliminary findings from applied research that I am carrying out with my Crisis Computing team at QCRI also reveal variation in value. In one disaster dataset we studied, up to 56% of the tweets were found to be informative. But in two other datasets, we found the number of informative tweets to be very low. Meanwhile, a recent Pew Research study found that 34% of tweets during Hurricane Sandy “involved news organizations providing content, government sources offering information, people sharing their own eyewitness accounts and still more passing along information posted by others.” In addition, “fully 25% [of tweets] involved people sharing photos and videos,” thus indicating “the degree to which visuals have become a more common element of this realm.”

Finally, this recent study analyzed over 35 million tweets posted by ~8 million users based on current trending topics. From this data, the authors identified 14 major events reflected in the tweets. These included the UK riots, Libya crisis, Virginia earthquake and Hurricane Irene, for example. The authors found that “on average, 30% of the content about an event, provides situational awareness information about the event, while 14% was spam.”

So what can we conclude from these few studies? Simply that the value of tweets for disaster response can vary considerably over time and space. The debate should thus not center around whether tweets yield added value for disaster response but rather what drives this variation in value. Identifying these drivers may enable those with influence to incentivize high-value tweets.

This interesting study, “Do All Birds Tweet the Same? Characterizing Twitter Around the World,” reveals some very interesting drivers. The social network analysis (SNA) of some 5 million users and 5 billion tweets across 10 countries reveals that “users in the US give Twitter a more informative purpose, which is reflected in more globalized communities, which are more hierarchical.” The study is available here (PDF). This American penchant for posting “informative” tweets is obviously not universal. To this end, studying network typologies on Twitter may yield further insights on how certain networks can be induced—at a structural level—to post more informative tweets following major disasters.

Twitter Pablo Gov

Regardless of network typology, however, policy still has an important role to play in incentivizing high-value tweets. To be sure, if demand for such tweets is not encouraged, why would supply follow? Take the forward-thinking approach by the Government of the Philippines, for example. The government actively en-couraged users to use specific hashtags for disaster tweets days before Typhoon Pablo made landfall. To make this kind of disaster reporting via twitter more actionable, the Government could also encourage the posting of pictures and the use of a structured reporting syntax—perhaps a simplified version of the Tweak the Tweet approach. Doing so would not only provide the government with greater situational awareness, it would also facilitate self-organized disaster response initiatives.

In closing, perhaps we ought to keep in mind that even if only, say, 0.001% of the 20 million+ tweets generated during the first five days of Hurricane Sandy were actionable and only half of these were accurate, this would still mean over a thousand informative, real-time tweets, or about 15,000 words, or 25 pages of single-space, relevant, actionable and timely disaster information.

PS. While the credibility and veracity of tweets is an important and related topic of conversation, I have already written at length about this.

Does Social Capital Drive Disaster Resilience?

The link between social capital and disaster resilience is increasingly accepted. In “Building Resilience: Social Capital in Post-Disaster Recover,” Daniel Aldrich draws on both qualitative and quantitative evidence to demonstrate that “social resources, at least as much as material ones, prove to be the foundation for re-silience and recovery.” His case studies suggest that social capital is more important for disaster resilience than physical and financial capital, and more im-portant than conventional explanations.

Screen Shot 2012-11-30 at 6.03.23 AM

Aldrich argues that social capital catalyzes increased “participation among networked members; providing information and knowledge to individuals in the group; and creating trustworthiness.” The author goes so far as using “the phrases social capital and social networks nearly interchangeably.” He finds that “higher levels of social capital work together more effectively to guide resources to where they are needed.” Surveys confirm that “after disasters, most survivors see social connections and community as critical for their recovery.” To this end, “deeper reservoirs of social capital serve as informal insurance and mutual assistance for survivors,” helping them “overcome collective action constraints.”

Capacity for self-organization is thus intimately related to resilience since “social capital can overcome obstacles to collective action that often prevent groups from accomplishing their goals.” In other words, “higher levels of social capital reduce transaction costs, increase the probability of collective action, and make cooperation among individuals more likely.” Social capital is therefore “an asset, a functioning propensity for mutually beneficial collective action […].”

In contrast, communities exhibiting “less resilience fail to mobilize collectively and often must wait for recover guidance and assistance […].”  This implies that vulnerable populations are not solely characterized in terms of age, income, etc., but in terms of “their lack of connections and embeddedness in social networks.” Put differently, “the most effective—and perhaps least expensive—way to mitigate disasters is to create stronger bonds between individuals in vulnerable populations.”

Social Capital

The author brings conceptual clarity to the notion of social capital when he unpacks the term into Bonding Capital, Bridging Capital and Linking Capital. The figure above explains how these differ but relate to each other. The way this relates and applies to digital humanitarian response is explored in this blog post.

Statistics on First Tweets to Report the #Japan Earthquake (Updated)

Update: The first (?) YouTube video of earthquake shared on Twitter.

An 7.3 magnitude earthquake just struck 300km off the eastern coast of Japan, prompting a tsunami warning for Japan’s Miyagi Prefecture. The quake struck at 5.18pm local time (3.18am New York Time). Twitter’s team in Japan have just launched this page of recommended hashtags. There are currently over 1,200 tweets per minute being posted in Tokyo, according to this site.

Screen Shot 2012-12-07 at 4.20.49 AM

Hashtags.org has the following graph on the frequency of tweets carrying the Japan #hashtag over the past 24 hours:

Screen Shot 2012-12-07 at 4.27.52 AM

The first tweets to report the earthquake on twitter using the hashtag #Japan were posted at 5.19pm local time (3.19am New York). You can click on each for the original link.

Screen Shot 2012-12-07 at 4.07.43 AM

Screen Shot 2012-12-07 at 4.08.05 AM Screen Shot 2012-12-07 at 4.08.20 AM

Screen Shot 2012-12-07 at 4.17.53 AM

Screen Shot 2012-12-07 at 4.16.11 AM

 Screen Shot 2012-12-07 at 4.10.35 AM Screen Shot 2012-12-07 at 4.10.55 AM Screen Shot 2012-12-07 at 4.11.16 AM

These tweets were each posted within 2 minutes of the earthquake. I will update this blog post when I get more relevant details.