Category Archives: Disaster Resilience

Social Media for Disaster Response – Done Right!

To say that Indonesia’s capital is prone to flooding would be an understatement. Well over 40% of Jakarta is at or below sea level. Add to this a rapidly growing population of over 10 million and you have a recipe for recurring disasters. Increasing the resilience of the city’s residents to flooding is thus imperative. Resilience is the capacity of affected individuals to self-organize effectively, which requires timely decision-making based on accurate, actionable and real-time information. But Jakarta is also flooded with information during disasters. Indeed, the Indonesian capital is the world’s most active Twitter city.

JK1

So even if relevant, actionable information on rising flood levels could somehow be gleaned from millions of tweets in real-time, these reports could be inaccurate or completely false. Besides, only 3% of tweets on average are geo-located, which means any reliable evidence of flooding reported via Twitter is typically not actionable—that is, unless local residents and responders know where waters are rising, they can’t take tactical action in a timely manner. These major challenges explain why most discount the value of social media for disaster response.

But Digital Humanitarians in Jakarta aren’t your average Digital Humanitarians. These Digital Jedis recently launched one of the most promising humanitarian technology initiatives I’ve seen in years. Code named Peta Jakarta, the project takes social media and digital humanitarian action to the next level. Whenever someone posts a tweet with the word banjir (flood), they receive an automated tweet reply from @PetaJkt inviting them to confirm whether they see signs of flooding in their area: “Flooding? Enable geo-location, tweet @petajkt #banjir and check petajakarta.org.” The user can confirm their report by turning geo-location on and simply replying with the keyword banjir or flood. The result gets added to a live, public crisis map, like the one below.

Credit: Peta Jakarta

Over the course of the 2014/2015 monsoon season, Peta Jakarta automatically sent 89,000 tweets to citizens in Jakarta as a call to action to confirm flood conditions. These automated invitation tweets served to inform the user about the project and linked to the video below (via Twitter Cards) to provide simple instructions on how to submit a confirmed report with approximate flood levels. If a Twitter user forgets to turn on the geo-location feature of their smartphone, they receive an automated tweet reminding them to enable geo-location and resubmit their tweet. Finally, the platform “generates a thank you message confirming the receipt of the user’s report and directing them to PetaJakarta.org to see their contribution to the map.” Note that the “overall aim of sending programmatic messages is not to simply solicit a high volume of replies, but to reach active, committed citizen-users willing to participate in civic co-management by sharing nontrivial data that can benefit other users and government agencies in decision-making during disaster scenarios.”

A report is considered verified when a confirmed geo-tagged tweet includes a picture of the flooding, like in the tweet below. These confirmed and verified tweets get automatically mapped and also shared with Jakarta’s Emergency Management Agency (BPBD DKI Jakarta). The latter are directly involved in this initiative since they’re “regularly faced with the difficult challenge of anticipating & responding to floods hazards and related extreme weather events in Jakarta.” This direct partnership also serves to limit the “Data Rot Syndrome” where data is gathered but not utilized. Note that Peta Jakarta is able to carry out additional verification measures by manually assessing the validity of tweets and pictures by cross-checking other Twitter reports from the same district and also by monitoring “television and internet news sites, to follow coverage of flooded areas and cross-check reports.”

Screen Shot 2015-06-29 at 2.38.54 PM

During the latest monsoon season, Peta Jakarta “received and mapped 1,119 confirmed reports of flooding. These reports were formed by 877 users, indicating an average tweet to user ratio of 1.27 tweets per user. A further 2,091 confirmed reports were received without the required geolocation metadata to be mapped, highlighting the value of the programmatic geo-location ‘reminders’ […]. With regard to unconfirmed reports, Peta Jakarta recorded and mapped a total of 25,584 over the course of the monsoon.”

The Live Crisis Maps could be viewed via two different interfaces depending on the end user. For local residents, the maps could be accessed via smartphone with the visual display designed specifically for more tactical decision-making, showing flood reports at the neighborhood level and only for the past hour.

PJ2

For institutional partners, the data is visualized in more aggregate terms for strategic decision-making based trends-analysis and data integration. “When viewed on a desktop computer, the web-application scaled the map to show a situational overview of the city.”

Credit: Peta Jakarta

Peta Jakarta has “proven the value and utility of social media as a mega-city methodology for crowdsourcing relevant situational information to aid in decision-making and response coordination during extreme weather events.” The initiative enables “autonomous users to make independent decisions on safety and navigation in response to the flood in real-time, thereby helping increase the resilience of the city’s residents to flooding and its attendant difficulties.” In addition, by “providing decision support at the various spatial and temporal scales required by the different actors within city, Peta Jakarta offers an innovative and inexpensive method for the crowdsourcing of time-critical situational information in disaster scenarios.” The resulting confirmed and verified tweets were used by BPBD DKI Jakarta to “cross-validate formal reports of flooding from traditional data sources, supporting the creation of information for flood assessment, response, and management in real-time.”


My blog post is based several conversations I had with Peta Jakarta team and on this white paper, which was just published a week ago. The report runs close to 100 pages and should absolutely be considered required reading for all Digital Humanitarians and CrisisMappers. The paper includes several dozen insights which a short blog post simply cannot do justice to. If you can’t find the time to read the report, then please see the key excerpts below. In a future blog post, I’ll describe how the Peta Jakarta team plans to leverage UAVs to complement social media reporting.

  • Extracting knowledge from the “noise” of social media requires designed engagement and filtering processes to eliminate unwanted information, reward valuable reports, and display useful data in a manner that further enables users, governments, or other agencies to make non-trivial, actionable decisions in a time-critical manner.
  • While the utility of passively-mined social media data can offer insights for offline analytics and derivative studies for future planning scenarios, the critical issue for frontline emergency responders is the organization and coordination of actionable, real-time data related to disaster situations.
  • User anonymity in the reporting process was embedded within the Peta Jakarta project. Whilst the data produced by Twitter reports of flooding is in the public domain, the objective was not to create an archive of users who submitted potentially sensitive reports about flooding events, outside of the Twitter platform. Peta Jakarta was thus designed to anonymize reports collected by separating reports from their respective users. Furthermore, the text content of tweets is only stored when the report is confirmed, that is, when the user has opted to send a message to the @petajkt account to describe their situation. Similarly, when usernames are stored, they are encrypted using a one-way hash function.
  • In developing the Peta Jakarta brand as the public face of the project, it was important to ensure that the interface and map were presented as community-owned, rather than as a government product or academic research tool. Aiming to appeal to first adopters—the young, tech-savvy Twitter-public of Jakarta—the language used in all the outreach materials (Twitter replies, the outreach video, graphics, and print advertisements) was intentionally casual and concise. Because of the repeated recurrence of flood events during the monsoon, and the continuation of daily activities around and through these flood events, the messages were intentionally designed to be more like normal twitter chatter and less like public service announcements.
  • It was important to design the user interaction with PetaJakarta.org to create a user experience that highlighted the community resource element of the project (similar to the Waze traffic app), rather than an emergency or information service. With this aim in mind, the graphics and language are casual and light in tone. In the video, auto-replies, and print advertisements, PetaJakarta.org never used alarmist or moralizing language; instead, the graphic identity is one of casual, opt-in, community participation.
  • The most frequent question directed to @petajkt on Twitter was about how to activate the geo-location function for tweets. So far, this question has been addressed manually by sending a reply tweet with a graphic instruction describing how to activate geo-location functionality.
  • Critical to the success of the project was its official public launch with, and promotion by, the Governor. This endorsement gave the platform very high visibility and increased legitimacy among other government agencies and public users; it also produced a very successful media event, which led substantial media coverage and subsequent public attention.

  • The aggregation of the tweets (designed to match the spatio-temporal structure of flood reporting in the system of the Jakarta Disaster Management Agency) was still inadequate when looking at social media because it could result in their overlooking reports that occurred in areas of especially low Twitter activity. Instead, the Agency used the @petajkt Twitter stream to direct their use of the map and to verify and cross-check information about flood-affected areas in real-time. While this use of social media was productive overall, the findings from the Joint Pilot Study have led to the proposal for the development of a more robust Risk Evaluation Matrix (REM) that would enable Peta Jakarta to serve a wider community of users & optimize the data collection process through an open API.
  • Developing a more robust integration of social media data also means leveraging other potential data sets to increase the intelligence produced by the system through hybridity; these other sources could include, but are not limited to, government, private sector, and NGO applications (‘apps’) for on- the-ground data collection, LIDAR or UAV-sourced elevation data, and fixed ground control points with various types of sensor data. The “citizen-as- sensor” paradigm for urban data collection will advance most effectively if other types of sensors and their attendant data sources are developed in concert with social media sourced information.

Social Media Generates Social Capital: Implications for City Resilience and Disaster Response

A new empirical and peer-reviewed study provides “the first evidence that online networks are able to produce social capital. In the case of bonding social capital, online ties are more effective in forming close networks than theory predicts.” Entitled, “Tweeting Alone? An Analysis of Bridging and Bonding Social Capital in Online Networks,” the study analyzes Twitter data generated during three large events: “the Occupy movement in 2011, the IF Campaign in 2013, and the Chilean Presidential Election of the same year.”

cityres

What is the relationship between social media and social capital formation? More specifically, how do connections established via social media—in this case Twitter—lead to the formation of two specific forms of social capital, bridging and bonding capital? Does the interplay between bridging and bonding capital online differ to what we see in face-to-face world interactions?

“Bonding social capital exists in the strong ties occurring within, often homogeneous, groups—families, friendship circles, work teams, choirs, criminal gangs, and bowling clubs, for example. Bonding social capital acts as a social glue, building trust and norms within groups, but also potentially increasing intolerance and distrust of out-group members. Bridging social capital exists in the ties that link otherwise separate, often heterogeneous, groups—so for example, individuals with ties to other groups, messengers, or more generically the notion of brokers. Bridging social capital allows different groups to share and exchange information, resources, and help coordinate action across diverse interests.” The authors emphasize that “these are not either/or categories, but that in well-functioning societies the two types or dimensions develop together.”

The study uses social network analysis to measure bonding and bridging social capital. More specifically, they use two associated metrics as indicators of social capital: closure and brokerage. “Closure refers to the level of connectedness between particular groups of members within a broader network and encourages the formation of trust and collaboration. Brokerage refers to the existence of structural holes within a network that are ’bridged’ by a particular member of the network. Brokerage permits the transmission of information across the entire network. Social capital, then, is comprised of the combination of these two elements, which interact over time.”

The authors thus analyze the “observed values for closure and brokerage over time and compare them with different simulations based on theoretical network models to show how they compare to what we would expect offline. From this, [they provide an evaluation of the existence and formation of social capital in online networks.”

The results demonstrate that “online networks show evidence of social capital and these networks exhibit higher levels of closure than what would be expected based on theoretical models. However, the presence of organizations and professional brokers is key to the formation of bridging social capital. Similar to traditional (offline) conditions, bridging social capital in online networks does not exist organically and requires the purposive efforts of network members to connect across different groups. Finally, the data show interaction between closure and brokerage goes in the right direction, moving and growing together.”

These conclusions suggest that the same metrics—closure and brokerage—can be used to monitor “City Resilience” before, during and after major disasters. This is of particular interest to me since my team and I at QCRI are collaborating with the Rockefeller Foundation’s 100 Resilient Cities initiative to determine whether social media can indeed help monitor (proxy indicators of) resilience. Recent studies have shown that changes in employment, economic activity and mobility—each of which is are drivers of resilience—can be gleamed from social media.

While more research is needed, the above findings are compelling enough for us to move forward with Rockefeller on our joint project. So we’ll be launching AIRS in early 2015. AIRS, which stands for “Artificial Intelligence for Resilient Societies” is a free and open source platform specifically designed to enable Rockefeller’s partners cities to monitor proxy indicators of resilience on Twitter.

Bio

See also:

  • Using Social Media to Predict Disaster Resilience [link]
  • Social Media = Social Capital = Disaster Resilience? [link]
  • Does Social Capital Drive Disaster Resilience? [link]
  • Digital Social Capital Matters for Resilience & Response [link]

Disaster Tweets Coupled With UAV Imagery Give Responders Valuable Data on Infrastructure Damage

My colleague Leysia Palen recently co-authored an important study (PDF) on tweets posted during last year’s major floods in Colorado. As Leysia et al. write, “Because the flooding was widespread, it impacted many canyons and closed off access to communities for a long duration. The continued storms also prevented airborne reconnaissance. During this event, social media and other remote sources of information were sought to obtain reconnaissance information […].”

1coloflood

The study analyzed 212,672 unique tweets generated by 57,049 unique Twitter users. Of these tweets, 2,658 were geo-tagged. The researchers combed through these geo-tagged tweets for any information on infrastructure damage. A sample of these are included below (click to enlarge). Leysia et al. were particularly interested in geo-tagged tweets with pictures of infrastructure damage.

Screen Shot 2014-09-07 at 3.17.34 AM

They overlaid these geo-tagged pictures on satellite and UAV/aerial imagery of the disaster-affected areas. The latter was captured by Falcon UAV. The satellite and aerial imagery provided the researchers with an easy way to distinguish between vegetation and water. “Most tweets appeared to fall primarily within the high flood hazard zones. Most bridges and roads that were located in the flood plains were expected to experience a high risk of damage, and the tweets and remote data confirmed this pattern.” According to Shideh Dashti, an assistant professor of civil, environmental and architectural engineering, and one of the co-authors, “we compared those tweets to the damage reported by engineering reconnaissance teams and they were well correlated.”

falcon uav flooding

To this end, by making use of real-time reporting by those affected in a region, including their posting of visual data,” Leysia and team “show that tweets may be used to directly support engineering reconnaissance by helping to digitally survey a region and navigate optimal paths for direct observation.” In sum, the results of this study demonstrate “how tweets, particularly with postings of visual data and references to location, may be used to directly support geotechnical experts by helping to digitally survey the affected region and to navigate optimal paths through the physical space in preparation for direct observation.”

Since the vast majority of tweets are not geo-tagged, GPS coordinates for potentially important pictures in these tweets are not available. The authors thus recommend looking into using natural language processing (NLP) techniques to “expose hazard-specific and site-specific terms and phrases that the layperson uses to report damage in situ.” They also suggest that a “more elaborate campaign that instructs people how to report such damage via tweets […] may help get better reporting of damage across a region.”

These findings are an important contribution to the humanitarian computing space. For us at QCRI, this research suggests we may be on the right track with MicroMappers, a crowdsourcing (technically a microtasking) platform to filter and geo-tag social media content including pictures and videos. MicroMappers was piloted last year in response to Typhoon Haiyan. We’ve since been working on improving the platform and extending it to also analyze UAV/aerial imagery. We’ll be piloting this new feature in coming weeks. Ultimately, our aim is for MicroMappers to create near real-time Crisis Maps that provide an integrated display of relevant Tweets, pictures, videos and aerial imagery during disasters.

Bio

See also:

  • Using AIDR to Automatically Collect & Analyze Disaster Tweet [link]
  • Crisis Map of UAV Videos for Disaster Response [link]
  • Humanitarians in the Sky: Using UAVs for Disaster Response [link]
  • Digital Humanitarian Response: Why Moving from Crowdsourcing to Microtasking is Important [link]

Latest Findings on Disaster Resilience: From Burma to California via the Rockefeller Foundation

I’ve long been interested in disaster resilience particularly when considered through the lens of self-organization. To be sure, the capacity to self-organize is an important feature of resilient societies. So what facilitates self-organization? There are several factors, of course, but the two I’m most interested in are social capital and communication technologies. My interest in disaster resilience also explains why one of our Social Innovation Tracks at QCRI is specifically focused on resilience. So I’m always on the lookout for new research on resilience. The purpose of this blog post is to summarize the latest insights.

Screen Shot 2014-05-12 at 4.23.33 PM

This new report (PDF) on Burma assesses the influence of social capital on disaster resilience. More specifically, the report focuses on the influence of bonding, bridging and linking social capital on disaster resilience in remote rural communities in the Ayerwaddy Region of Myanmar. Bonding capital refers to ties that are shared between individuals with common characteristics characteristics such as religion or ethnicity. Bridging capital relates to ties that connect individuals with those outside their immediate communities. These ties could be the result of shared geographical space, for example. Linking capital refers to vertical links between a community and individuals or groups outside said community. The relationship between a village and the government or a donor and recipients, for example.

As the report notes, “a balance of bonding, bridging and linking capitals is important of social and economic stability as well as resilience. It will also play a large role in a community’s ability to reduce their risk of disaster and cope with external shocks as they play a role in resource management, sustainable livelihoods and coping strategies.” In fact, “social capital can be a substitute for a lack of government intervention in disaster planning, early warning and recovery.” The study also notes that “rural communities tend to have stronger social capital due to their geographical distance from government and decision-making structures necessitating them being more self-sufficient.”

Results of the study reveal that villages in the region are “mutually supportive, have strong bonding capital and reasonably strong bridging capital […].” This mutual support “plays a part in reducing vulnerability to disasters in these communities.” Indeed, “the strong bonding capital found in the villages not only mobilizes communities to assist each other in recovering from disasters and building community coping mechanisms, but is also vital for disaster risk reduction and knowledge and information sharing. However, the linking capital of villages is “limited and this is an issue when it comes to coping with larger scale problems such as disasters.”

sfres

Meanwhile, in San Francisco, a low-income neighborhood is  building a culture of disaster preparedness founded on social capital. “No one had to die [during Hurricane Katrina]. No one had to even lose their home. It was all a cascading series of really bad decisions, bad planning, and corrupted social capital,” says Homsey, San Francisco’s director of neighborhood resiliency who spearheads the city’s Neighborhood Empowerment Network (NEN). The Network takes a different approach to disaster preparedness—it is reflective, not prescriptive. The group also works to “strengthen the relationships between governments and the community, nonprofits and other agencies [linking capital]. They make sure those relationships are full of trust and reciprocity between those that want to help and those that need help.” In short, they act as a local Match.com for disaster preparedness and response.

Providence Baptist Church of San Francisco is unusual because unlike most other American churches, this one has a line item for disaster preparedness. Hodge, who administrates the church, takes issue with the government’s disaster plan for San Francisco. “That plan is to evacuate the city. Our plan is to stay in the city. We aren’t going anywhere. We know that if we work together before a major catastrophe, we will be able to work together during a major catastrophe.” This explains why he’s teaming up with the Neighborhood Network (NEN) which will “activate immediately after an event. It will be entirely staffed and managed by the community, for the community. It will be a hyper-local, problem-solving platform where people can come with immediate issues they need collective support for,” such as “evacuations, medical care or water delivery.”

Screen Shot 2014-05-12 at 4.27.06 PM

Their early work has focused on “making plans to protect the neighborhood’s most vulnerable residents: its seniors and the disabled.” Many of these residents have thus received “kits that include a sealable plastic bag to stock with prescription medication, cash, phone numbers for family and friends. They also have door-hangers to help speed up search-and-rescue efforts (above pics).

Lastly, colleagues at the Rockefeller Foundation have just released their long-awaited City Resilience Framework after several months of extensive fieldwork, research and workshops in six cities: Cali, Columbia; Concepción, Chile; New Orleans, USA; Cape Town, South Africa; Surat, India; and Semarang, Indonesia. “The primary purpose of the fieldwork was to understand what contributes to resilience in cities, and how resilience is understood from the perspective of different city stakeholder groups in different contexts. The results are depicted in the graphic below, which figures the 12 categories identified by Rockefeller and team (in yellow).

City Resilience Framework

These 12 categories are important because “one must be able to relate resilience to other properties that one has some means of ascertaining, through observation.” The four categories that I’m most interested in observing are:

Collective identity and mutual support: this is observed as active community engagement, strong social networks and social integration. Sub-indicators include community and civic participation, social relationships and networks, local identity and culture and integrated communities.

Empowered stakeholders: this is underpinned by education for all, and relies on access to up-to-date information and knowledge to enable people and organizations to take appropriate action. Sub-indicators include risk monitoring & alerts and communication between government & citizens.

Reliable communications and mobility: this is enabled by diverse and affordable multi-modal transport systems and information and communication technology (ICT) networks, and contingency planning. Sub-indicators include emergency communication services.

Effective leadership and management: this relates to government, business and civil society and is recognizable in trusted individuals, multi-stakeholder consultation, and evidence-based decision-making. Sub-indicators include emergency capacity and coordination.

How am I interested in observing these drivers of resilience? Via social media. Why? Because that source of information is 1) available in real-time; 2) enables two-way communication; and 3) remains largely unexplored vis-a-vis disaster resilience. Whether or not social media can be used as a reliable proxy to measure resilience is still very much a  research question at this point—meaning more research is required to determine whether social media can indeed serve as a proxy for city resilience.

As noted above, one of our Social Innovation research tracks at QCRI is on resilience. So we’re currently reviewing the list of 32 cities that the Rockefeller Foundation’s 100 Resilient Cities project is partnering with to identify which have a relatively large social media footprint. We’ll then select three cities and begin to explore whether collective identity and mutual support can be captured via the social media activity in each city. In other words, we’ll be applying data science & advanced computing—specifically computational social science—to explore whether digital data can shed light on city resilience. Ultimately, we hope our research will support the Rockefeller Foundation’s next phase in their 100 Resilient Cities project: the development of a Resilient City Index.

Bio

See also:

  • How to Create Resilience Through Big Data [link]
  • Seven Principles for Big Data & Resilience Projects [link]
  • On Technology and Building Resilient Societies [link]
  • Using Social Media to Predict Disaster Resilience [link]
  • Social Media = Social Capital = Disaster Resilience? [link]
  • Does Social Capital Drive Disaster Resilience? [link]
  • Failing Gracefully in Complex Systems: A Note on Resilience [link]
  • Big Data, Lord of the Rings and Disaster Resilience [link]

The Best of iRevolution in 2013

iRevolution crossed the 1 million hits mark in 2013, so big thanks to iRevolution readers for spending time here during the past 12 months. This year also saw close to 150 new blog posts published on iRevolution. Here is a short selection of the Top 15 iRevolution posts of 2013:

How to Create Resilience Through Big Data
[Link]

Humanitarianism in the Network Age: Groundbreaking Study
[Link]

Opening Keynote Address at CrisisMappers 2013
[Link]

The Women of Crisis Mapping
[Link]

Data Protection Protocols for Crisis Mapping
[Link]

Launching: SMS Code of Conduct for Disaster Response
[Link]

MicroMappers: Microtasking for Disaster Response
[Link]

AIDR: Artificial Intelligence for Disaster Response
[Link]

Social Media, Disaster Response and the Streetlight Effect
[Link]

Why the Share Economy is Important for Disaster Response
[Link]

Automatically Identifying Fake Images on Twitter During Disasters
[Link]

Why Anonymity is Important for Truth & Trustworthiness Online
[Link]

How Crowdsourced Disaster Response Threatens Chinese Gov
[Link]

Seven Principles for Big Data and Resilience Projects
[Link]

#NoShare: A Personal Twist on Data Privacy
[Link]

I’ll be mostly offline until February 1st, 2014 to spend time with family & friends, and to get started on a new exciting & ambitious project. I’ll be making this project public in January via iRevolution, so stay tuned. In the meantime, wishing iRevolution readers a very Merry Happy Everything!

santahat

Seven Principles for Big Data and Resilience Projects

Authored by Kate Crawford, Patrick MeierClaudia PerlichAmy Luers, Gustavo Faleiros and Jer Thorp, 2013 PopTech & Rockefeller Foundation Bellagio Fellows

Update: See also “Big Data, Communities and Ethical Resilience: A Framework for Action” written by the above Fellows and available here (PDF).

Bellagio Fellows

The following is a draft “Code of Conduct” that seeks to provide guidance on best practices for resilience building projects that leverage Big Data and Advanced Computing. These seven core principles serve to guide data projects to ensure they are socially just, encourage local wealth- & skill-creation, require informed consent, and be maintainable over long timeframes. This document is a work in progress, so we very much welcome feedback. Our aim is not to enforce these principles on others but rather to hold ourselves accountable and in the process encourage others to do the same. Initial versions of this draft were written during the 2013 PopTech & Rockefeller Foundation workshop in Bellagio, August 2013.

1. Open Source Data Tools

Wherever possible, data analytics and manipulation tools should be open source, architecture independent and broadly prevalent (R, python, etc.). Open source, hackable tools are generative, and building generative capacity is an important element of resilience. Data tools that are closed prevent end-users from customizing and localizing them freely. This creates dependency on external experts which is a major point of vulnerability. Open source tools generate a large user base and typically have a wider open knowledge base. Open source solutions are also more affordable and by definition more transparent. Open Data Tools should be highly accessible and intuitive to use by non-technical users and those with limited technology access in order to maximize the number of participants who can independently use and analyze Big Data.

2. Transparent Data Infrastructure

Infrastructure for data collection and storage should operate based on transparent standards to maximize the number of users that can interact with the infrastructure. Data infrastructure should strive for built-in documentation, be extensive and provide easy access. Data is only as useful to the data scientist as her/his understanding of its collection is correct. This is critical for projects to be maintained over time, regardless of team membership, otherwise projects will collapse when key members leave. To allow for continuity, the infrastructure has to be transparent and clear to a broad set of analysts – independent of the tools they bring to bear. Solutions such as hadoop, JSON formats and the use of clouds are potentially suitable.

3. Develop and Maintain Local Skills

Make “Data Literacy” more widespread. Leverage local data labor and build on existing skills. The key and most constraint ingredient to effective data solutions remains human skill/knowledge and needs to be retained locally. In doing so, consider cultural issues and language. Catalyze the next generation of data scientists and generate new required skills in the cities where the data is being collected. Provide members of local communities with hands-on experience; people who can draw on local understanding and socio-cultural context. Longevity of Big Data for Resilience projects depends on the continuity of local data science teams that maintain an active knowledge and skills base that can be passed on to other local groups. This means hiring local researchers and data scientists and getting them to build teams of the best established talent, as well as up-and-coming developers and designers. Risks emerge when non-resident companies are asked to spearhead data programs that are connected to local communities. They bring in their own employees, do not foster local talent over the long-term, and extract value from the data and the learning algorithms that are kept by the company rather than the local community.

4. Local Data Ownership

Use Creative Commons and licenses that state that data is not to be used for commercial purposes. The community directly owns the data it generates, along with the learning algorithms (machine learning classifiers) and derivatives. Strong data protection protocols need to be in place to protect identities and personally identifying information. Only the “Principle of Do No Harm” can trump consent, as explicitly stated by the International Committee of the Red Cross’s Data Protection Protocols (ICRC 2013). While the ICRC’s data protection standards are geared towards humanitarian professionals, their core protocols are equally applicable to the use of Big Data in resilience projects. Time limits on how long the data can be used for should be transparently stated. Shorter frameworks should always be preferred, unless there are compelling reasons to do otherwise. People can give consent for how their data might be used in the short to medium term, but after that, the possibilities for data analytics, predictive modelling and de-anonymization will have advanced to a state that cannot at this stage be predicted, let alone consented to.

5. Ethical Data Sharing

Adopt existing data sharing protocols like the ICRC’s (2013). Permission for sharing is essential. How the data will be used should be clearly articulated. An opt in approach should be the preference wherever possible, and the ability for individuals to remove themselves from a data set after it has been collected must always be an option. Projects should always explicitly state which third parties will get access to data, if any, so that it is clear who will be able to access and use the data. Sharing with NGOs, academics and humanitarian agencies should be carefully negotiated, and only shared with for-profit companies when there are clear and urgent reasons to do so. In that case, clear data protection policies must be in place that will bind those third parties in the same way as the initial data gatherers. Transparency here is key: communities should be able to see where their data goes, and a complete list of who has access to it and why.

6. Right Not To Be Sensed

Local communities have a right not to be sensed. Large scale city sensing projects must have a clear framework for how people are able to be involved or choose not to participate. All too often, sensing projects are established without any ethical framework or any commitment to informed consent. It is essential that the collection of any sensitive data, from social and mobile data to video and photographic records of houses, streets and individuals, is done with full public knowledge, community discussion, and the ability to opt out. One proposal is the #NoShare tag. In essence, this principle seeks to place “Data Philanthropy” in the hands of local communities and in particular individuals. Creating clear informed consent mechanisms is a requisite for data philanthropy.

7. Learning from Mistakes

Big Data and Resilience projects need to be open to face, report, and discuss failures. Big Data technology is still very much in a learning phase. Failure and the learning and insights resulting from it should be accepted and appreciated. Without admitting what does not work we are not learning effectively as a community. Quality control and assessment for data-driven solutions is notably harder than comparable efforts in other technology fields. The uncertainty about quality of the solution is created by the uncertainty inherent in data. Even good data scientist are struggling to assess the upside potential of incremental efforts on the quality of a solution. The correct analogy is more one a craft rather a science. Similar to traditional crafts, the most effective way is to excellence is to learn from ones mistakes under the guidance of a mentor with a collective knowledge of experiences of both failure and success.

Big Data, Disaster Resilience and Lord of the Rings

The Shire is a local community of Hobbits seemingly disconnected from the systemic changes taking place in Middle Earth. They are a quiet, self-sufficient community with high levels of social capital. Hobbits are not interested in “Big Data”; their world is populated by “Small Data” and gentle action. This doesn’t stop the “Eye of Sauron” from sensing this small harmless hamlet, however. During Gandalf’s visit, the Hobbits learn that all is not well in the world outside the Shire. The changing climate, deforestation and land degradation is wholly unnatural and ultimately threatens their own way of life.

shire2

Gandalf leads a small band of Hobbits (bonding social capital) out of the Shire to join forces with other peoples of Middle Earth (bridging social capital) in what he calls “The Fellowship of the Ring” (resilience in diversity). Together, they must overcome personal & collective adversity and travel to Mordor to destroy the one ring that rules them all. Only then will Sauron’s “All Seeing Eye” cease sensing and oppressing the world of Middle Earth.

fellowship

I’m definitely no expert on J. R. R Tolken or The Lord of the Rings, but I’ve found that literature and indeed mythology often hold up important mirrors to our modern societies and remind us that the perils we face may not be entirely new. This implies that cautionary tales of the past may still bear some relevance today. The hero’s journey speaks to the human condition, and mythology serves as a evidence of human resilience. These narratives carry deep truths about the human condition, our shortcomings and redeeming qualities. Mythologies, analogies and metaphors help us make sense of our world; we ignore them at our own risk.

This is why I’ve employed the metaphor of the Shire (local communities) and Big Data (Eye of Sauron) during recent conversations on Big Data and Community Resilience. There’s been push-back of late against Big Data, with many promoting the notion of Small Data. “For many problems and questions, small data in itself is enough” (1). Yes, for specific problems: locally disconnected problems. But we live in an increasingly interdependent and connected world with coupled systems that run the risk of experiencing synchronous failure and collapse. Our sensors cannot be purely local since the resilience of our communities is no longer mostly place-based. This is where the rings come in.

eye_of_sauron

Frodo’s ring allows him to sense change well beyond the Shire and at the same time mask his local presence. But using the ring allows him to be sensed and hunted by Sauron. The same is true of Google and social media platforms like Facebook. We have no ways to opt out from being sensed if we wish to use these platforms. Community-generated content, our digital fingerprints, belong to the Great Eye, not to the Shire. This excellent piece on the Political Economy of Twitter clearly demonstrates that an elite few control user-generated content. The true owners of social media data are the platform providers, not the end users. In sum, “only corporate actors and regulators—who possess both the intellectual and financial resources to succeed in this race—can afford to participate,” which means “that the emerging data market will be shaped according to their interests.” Of course, the scandal surrounding PRISM makes Sauron’s “All Seeing Eye” even more palpable.

So when we say that we have more data than ever before in human history, it behooves us to ask “Who is we? And to what end?” Does the Shire have access to greater data than ever before thanks to Sauron? Hardly. Is this data used by Sauron to support community resilience? Fat chance. Local communities are excluded; they are observers, unwilling participants in a centralized system that ultimately undermines trust and their own resilience. Hobbits deserve the right not to be sensed. This should be a non-negotiable. They also deserve the right to own and manage their own “Small Data” themselves; that is, data generated by the community, for the community. We need respectful, people-centered data protection protocols like those developed by Open Paths. Community resilience ought to be ethical community resilience.

To be sure, we need to place individual data-sharing decisions in the hands of individuals rather than external parties. In addition to Open Paths, Creative Commons is an excellent example of what is possible. Why not extend that framework to personal and social media data? Why not include a temporal element to these licenses, as hinted in this blog post last year. That is, something like SnapChat where the user decides for herself how long the data should be accessible and usable. Well it turns out that these discussions and related conversations are taking place thanks to my fellow PopTech and Rockefeller Foundation Fellows. Stay tuned for updates. The ideas presented above are the result of our joint brainstorming sessions, and certainly not my ideas alone (but I take full blame for The Lord of the Rings analogy given my limited knowledge of said books!).

In closing, a final reference to The Lord of the Rings: Gandalf (who is a translational leader) didn’t empower the Hobbits, he took them on a journey that built on their existing capacities for resilience. That is, we cannot empower others, we can only provide them with the means to empower themselves. In sum, “Not all those who wander are lost.”

bio

ps. I’m hoping my talented fellows Kate Crawford, Gustavo Faleiros, Amy Luers, Claudia Perlich and Jer Thorp will chime in, improve my Lord of the Rings analogy and post comments in full Elvish script.