Category Archives: Crowdsourcing

Digital Humanitarians and The Theory of Crowd Capital

An iRevolution reader very kindly pointed me to this excellent conceptual study: “The Theory of Crowd Capital”. The authors’ observations and insights resonate with me deeply given my experience in crowdsourcing digital humanitarian response. Over two years ago, I published this blog post in which I wrote that, “The value of Crisis Mapping may at times have less to do with the actual map and more with the conversations and new collaborative networks catalyzed by launching a Crisis Mapping project. Indeed, this in part explains why the Standby Volunteer Task Force (SBTF) exists in the first place.” I was not very familiar with the concept of social capital at the time, but that’s precisely what I was describing. I’ve since written extensively about the very important role that social capital plays in disaster resilience and digital humanitarian response. But I hadn’t taken the obvious next step: “Crowd Capital.”


John Prpić and Prashant Shukla, the authors of “The Theory of Crowd Capital,” find inspiration in F. A. Hayek, who in 1945 wrote a seminal work titled “The Use of Knowledge in Society.” In this work, Hayek describes dispersed knowledge as:

“The knowledge of the circumstances of which we must make use never exists in concentrated or integrated form but solely as the dispersed bits of incomplete and frequently contradictory knowledge which all the separate individuals possess. […] Every individual has some advantage over all others because he possesses unique information of which beneficial use might be made, but of which use can be made only if the decisions depending on it are left to him or are made with his active cooperation.”

“Crowd Capability,” according to John and Prashant, is what enables an organization to tap this dispersed knowledge from individuals. More formally, they define Crowd Capability as an “organizational level capability that is defined by the structure, content, and process of an organization’s engagement with the dispersed knowledge of individuals—the Crowd.” From their perspective, “it is this engagement of dispersed knowledge through Crowd Capability efforts that endows organizations with data, information, and knowledge previously unavailable to them; and the internal processing of this, in turn, results in the generation of Crowd Capital within the organization.”

In other words, “when an organization defines the structure, content, and processes of its engagement with the dispersed knowledge of individuals, it has created a Crowd Capability, which in turn, serves to generate Crowd Capital.” And so, the authors contend, a Crowd Capable organization “puts in place the structure, content, and processes to access Hayek’s dispersed knowledge from individuals, each of whom has some informational advantage over the other, and thus forming a Crowd for the organization.” Note that a crowd can “exist inside of an organization, exist external to the organization, or a combination of the latter and the former.”


The “Structure” component of Crowd Capability connotes “the geographical divisions and functional units within an organization, and the technological means that they employ to engage a Crowd population for the organization.” The structure component of Crowd Capability is always an Information-Systems-mediated phenomenon. The “Content” of Crowd Capability constitutes “the knowledge, information or data goals that the organization seeks from the population,” while the “Processes” of Crowd Capability are defined as “the internal procedures that the organization will use to organize, filter, and integrate the incoming knowledge, information, and/or data.” The authors observe that in each Crowd Capital case they’ve analyzed, “an organization creates the structure, content, and/or process to engage the knowledge of dispersed individuals through Information Systems.”

Like the other forms of capital, “Crowd Capital requires investments (for example in Crowd Capability), and potentially pays literal or figurative dividends, and hence, is endowed with typical ‘capital-like’ qualities.” But the authors are meticulous when they distinguish Crowd Capital from Intellectual Capital, Human Capital, Social Capital, Political Capital, etc. The main distinguishing factor is that Crowd Capability is strictly an Information-Systems-mediated phenomenon. “This is not to say that Crowd Capability could not be leveraged to create Social Capital for an organization. It likely could, however, Crowd Capability does not require Social Capital to function.”

That said, I would opine that Crowd Capability can function better thanks to Social Capital. Indeed, Social Capital can influence the “structure”, “content” and “processes” integral to Crowd Capability. And so, while the authors argue that “Crowd Capital can be accrued without such relationship and network concerns” as are typical of Social Capital, I would counter that the presence of Social Capital certainly does not take away from Crowd Capability but, quite on the contrary, builds greater capability. Otherwise, Crowd Capability is little more than the cultivation of cognitive surplus in which crowd workers can never unite. The Matrix comes to mind. So this is where my experience in crowdsourcing digital humanitarian response makes me diverge from the authors’ conceptualization of “Crowd Capital.” Take the Blue Pill to stay in the disenfranchised version of Crowd Capital; or take the Red Pill if you want to build the social capital required to hack the system.


To be sure, the authors of Crowd Capital Theory point to Google’s ReCaptcha system for book digitization to demonstrate that “Crowd Capability does not require a network of relationships for the accrual of Crowd Capital.” While I understand the return on investment to society both in the form of less spam and more digitized books, this mediated information system is authoritarian. One does not have a choice but to comply, unless you’re a hacker, perhaps. This is why I share the point Jonathan Zittrain makes in “The Future of the Internet and How To Stop It.” Zittrain promotes the notion of “Generative Technologies,” which he defines as having the ability “to produce unprompted, user-driven change.”

Krisztina Holly makes a related argument in her piece on crowdscaling. “Like crowdsourcing, crowdscaling taps into the energy of people around the world that want to contribute. But while crowdsourcing pulls in ideas and content from outside the organization, crowdscaling grows and scales its impact outward by empowering the success of others.” Crowdscaling is possible when Crowd Capability generates Crowd Capital by the crowd, for the crowd. In contrast, said crowd cannot hack or change a ReCaptcha requirement if they wish to proceed to the page they’re looking for. In The Matrix, Crowd Capital accrues most directly to The Matrix rather than to the human cocoons being farmed for their metrics. In the same vein, Crowd Capital generated by ReCaptcha accrues most directly to Google Inc. In short, ReCaptcha doesn’t even ask the question: “Blue Pill or Red Pill?” So is it only a matter of time until the users who generate the Crowd Capital unite and revolt, as seems to be the case with the lawsuit against CrowdFlower?

I realize that the authors may have intended to take the conversation on Crowd Capital in a different direction. But they do conclude with a number of interesting, open-ended questions that suggest various “flavors” of Crowd Capital are possible, and not just the dark one I’ve just described. I for one will absolutely make use of the term Crowd Capital, but will flavor it based on my experience with digital humanitarians, which suggests a different formula: Social Capital + Social Media + Crowdsourcing = Crowd Capital. In short, I choose the Red Pill.


Automatically Extracting Disaster-Relevant Information from Social Media

Latest update on AIDR available here

My team and I at QCRI have just had this paper (PDF) accepted at the Social Web for Disaster Management workshop at the World Wide Web (WWW 2013) conference in Rio next month. The paper relates directly to our Artificial Intelligence for Disaster Response (AIDR) project. One of our main missions at QCRI is to develop open source and freely available next generation humanitarian technologies to better manage Big (Crisis) Data. Over 20 million tweets and half-a-million Instagram pictures were posted during Hurricane Sandy, for example. In Japan, more than 2,000 tweets were posted every second the day after the devastating earthquake and tsunami struck the Eastern Coast. Recent empirical studies have shown that an important percentage of tweets posted during disasters are informative and even actionable. The challenge before us is how to find those proverbial needles in the haystack, and how to do so in as close to real-time as possible.


So we analyzed disaster tweets posted during Hurricane Sandy (2012) and the Joplin Tornado (2011). We demonstrate that disaster-relevant information can be automatically extracted from these datasets. The results indicate that 40% to 80% of tweets that contain disaster-related information can be automatically detected. We also demonstrate that we can correctly identify the type of disaster information 80% to 90% of the time. This means, for example, that once we identify a disaster tweet, we can automatically correctly determine whether that tweet was written by an eyewitness 80%-90% of the time. Because these classifiers are developed using machine learning, they get more accurate with more data. This explains why we are building AIDR. Our aim is not to replace human involvement and oversight but to take much of the weight off the shoulders of humans.
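For readers who want a feel for how such classifiers are built: the paper itself is the authoritative source, but a minimal sketch of the general approach—supervised text classification over labeled tweets—might look like the following. The tweets, labels, and model choice below are illustrative assumptions, not the actual AIDR implementation.

```python
# Minimal sketch of a supervised tweet classifier. Training data, labels,
# and the model are illustrative placeholders, not the AIDR system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled tweets: 1 = disaster-relevant, 0 = not relevant.
tweets = [
    "Bridge collapsed on 5th street, cars trapped #sandy",
    "At least 39 dead; millions without power in Sandy's aftermath",
    "Can't wait for the weekend!",
    "New album dropping Friday",
]
labels = [1, 1, 0, 0]

# TF-IDF features over word unigrams/bigrams feeding a linear classifier.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(),
)
model.fit(tweets, labels)

# Score a new tweet; in practice the decision threshold trades off
# Detection Rate (recall) against Hit Ratio (precision) -- see below.
print(model.predict_proba(["Power lines down near the hospital"])[0][1])
```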

The classifiers we’ve developed automatically identify tweets that are personal in nature and those that are informative—that is, tweets that are of interest to others beyond the author’s immediate circle. We also created classifiers to differentiate between informative content shared by eyewitnesses versus content that is simply recycled from other sources such as the media. What’s more, we also created classifiers to distinguish between various types of informative content. In addition to classifying tweets, we extract key phrases from each one. A key phrase summarizes the essential message of a tweet in a few words, allowing for better visualization/aggregation of content. Below, we list real-world examples of tweets in each class. The underlined text is what the extraction system finds to be the key phrase of each tweet:

Caution and Advice: message conveys/reports information about some warning or a piece of advice about a possible hazard.

  • .@NYGovCuomo orders closing of NYC bridges. Only Staten Island bridges unaffected at this time. Bridges must close by 7pm. #Sandy

Casualties and Damage: message mentions casualties or infrastructure damage related to the disaster.

  • At least 39 dead; millions without power in Sandy’s aftermath. http//[Link].

Donations and Offers:  message speaks about goods or services offered or needed by the victims of an incident.

  • 400 Volunteers are needed for areas that #Sandy destroyed.
  • I want to volunteer to help the hurricane Sandy victims. If anyone knows how I can get involved please let me know!

People Missing, Found, or Seen: message reports about a missing or found person affected by an incident, or reports reaction or visit of a celebrity.

  • rt @911buff: public help needed: 2 boys 2 & 4 missing nearly 24 hours after they got separated from their mom when car submerged in si. #sandy #911buff

Information Sources: message points to information sources, photos, videos; or mentions a website, TV or radio station providing extensive coverage.

  • RT @NBCNewsPictures: Photos of the unbelievable scenes left in #Hurricane #Sandy’s wake http//[Link] #NYC #NJ
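The post does not spell out how the extraction system picks these key phrases. As a purely illustrative stand-in, here is one toy heuristic: score every candidate phrase in a tweet by how rare its words are relative to a background vocabulary, and keep the highest-scoring one. Nothing below is from the paper; the background word list and scoring are invented for illustration.

```python
# Toy key-phrase extractor: score each candidate n-gram in a tweet by the
# rarity of its words in a small background vocabulary, and return the best.
# An illustrative heuristic, not the extraction system from the paper.
import math
from collections import Counter

background = Counter(
    "the a of to and in is are at for on rt please help".split()
)
total = sum(background.values()) or 1

def idf(word):
    # Words unseen in the background corpus count as maximally informative.
    return math.log(total / (1 + background[word.lower()]))

def key_phrase(tweet, n=3):
    words = [w for w in tweet.split() if w.isalpha()]
    candidates = [words[i:i + n] for i in range(len(words) - n + 1)]
    if not candidates:
        return " ".join(words)
    best = max(candidates, key=lambda c: sum(idf(w) for w in c))
    return " ".join(best)

print(key_phrase("400 Volunteers are needed for areas that Sandy destroyed"))
```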


The two metrics used to assess the results of our analysis are: “Detection Rate” and “Hit Ratio”. The best way to explain these metrics is by way of analogy. The Detection Rate measures how good your fishing net is. If you know (thanks to sonar) that there are 10 fish in the pond and your net is good enough to catch all 10, then your Detection Rate is 100%. If you catch 8 out of 10, your rate is 80%. In other words, the Detection Rate is a measure of sensitivity. Now say you’ve designed the world’s first ever “Smart Net” which only catches salmon and thus leaves all other fish in the same pond alone. Now say you caught 5 fish and that you wanted salmon. If all 5 are salmon, your Hit Ratio is 100%. If only 2 of them are salmon, then your Hit Ratio is 40%. In other words, Hit Ratio is a measure of accuracy.
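In standard information-retrieval terms, the Detection Rate is recall and the Hit Ratio is precision. The fishing-net numbers above reduce to two one-line ratios:

```python
# Detection Rate is recall; Hit Ratio is precision.
def detection_rate(caught_relevant, total_relevant):
    """Share of all relevant items (fish in the pond) that the net caught."""
    return caught_relevant / total_relevant

def hit_ratio(caught_relevant, total_caught):
    """Share of caught items that were actually the ones wanted (salmon)."""
    return caught_relevant / total_caught

print(detection_rate(8, 10))  # net catches 8 of 10 fish -> 0.8
print(hit_ratio(2, 5))        # 2 of 5 caught fish are salmon -> 0.4
```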

Turning to our results, the Detection Rate was higher for Joplin (78%) than for Sandy (41%). The Hit Ratio was also higher for Joplin (90%) than for Sandy (78%). In other words, our classifiers find the Sandy dataset more challenging to decode. That said, the Hit Ratio is rather high in both cases, indicating that when our system extracts some part of the tweet, it is often the correct part. In sum, our approach can detect from 40% to 80% of the tweets containing disaster-related information and can correctly identify the specific type of disaster information 80% to 90% of the time. As noted above, our aim with AIDR is not to replace human involvement and oversight but to significantly lessen the load on humans.

This tweet-level extraction is key to extracting more reliable high-level information. Observing, for instance, that a large number of tweets in similar locations report the same infrastructure as being damaged may be a strong indicator that this is indeed the case. So we are very much continuing our research and working hard to increase both Detection Rates and Hit Ratios.


Using Crowdsourcing to Counter the Spread of False Rumors on Social Media During Crises

My new colleague Professor Yasuaki Sakamoto at the Stevens Institute of Technology (SIT) has been carrying out intriguing research on the spread of rumors via social media, particularly on Twitter and during crises. In his latest research, “Toward a Social-Technological System that Inactivates False Rumors through the Critical Thinking of Crowds,” Yasu uses behavioral psychology to understand why exposure to public criticism changes rumor-spreading behavior on Twitter during disasters. This fascinating research builds very nicely on the excellent work carried out by my QCRI colleague ChaTo, who used this “criticism dynamic” to show that the credibility of tweets can be predicted (by topic) without analyzing their content. Yasu’s study also seeks to find the psychological basis for Twitter’s self-correcting behavior identified by ChaTo and also by John Herman, who described Twitter as a “Truth Machine” during Hurricane Sandy.


Twitter is still a relatively new platform, but the existence and spread of false rumors is certainly not. In fact, a very interesting study from 1950 found that “in the past 1,000 years the same types of rumors related to earthquakes appear again and again in different locations.” Early academic studies on the spread of rumors revealed “that psychological factors, such as accuracy, anxiety, and importance of rumors, affect rumor transmission.” One such study proposed that the spread of a rumor “will vary with the importance of the subject to the individuals concerned times the ambiguity of the evidence pertaining to the topic at issue.” Later studies added “anxiety as another key element in rumormongering,” since “the likelihood of sharing a rumor was related to how anxious the rumor made people feel.” At the same time, however, the literature also reveals that countermeasures do exist. Critical thinking, for example, decreases the spread of rumors. The literature defines critical thinking as “reasonable reflective thinking focused on deciding what to believe or do.”

“Given the growing use and participatory nature of social media, critical thinking is considered an important element of media literacy that individuals in a society should possess.” Indeed, while social media can “help people make sense of their situation during a disaster, social media can also become a rumor mill and create social problems.” As discussed above, psychological factors can influence rumor spreading, particularly when experiencing stress and mental pressure following a disaster. Recent studies have also corroborated this finding, confirming that “differences in people’s critical thinking ability […] contributed to the rumor behavior.” So Yasu and his team ask the following interesting question: can critical thinking be crowdsourced?


“Not everyone needs to be a critical thinker all the time,” write Yasu et al. As long as some individuals are good critical thinkers in a specific domain, their timely criticisms can result in an emergent critical thinking social system that can mitigate the spread of false information. This goes to the heart of the self-correcting behavior often observed on social media and Twitter in particular. Yasu’s insight also provides a basis for a bounded crowdsourcing approach to disaster response. More on this here, here and here.

“Related to critical thinking, a number of studies have paid attention to the role of denial or rebuttal messages in impeding the transmission of rumor.” This is the more “visible” dynamic behind the self-correcting behavior observed on Twitter during disasters. So while some may spread false rumors, others often try to counter this spread by posting tweets criticizing rumor-tweets directly. The following questions thus naturally arise: “Are criticisms on Twitter effective in mitigating the spread of false rumors? Can exposure to criticisms minimize the spread of rumors?”

Yasu and his colleagues set out to test the following hypotheses: exposure to criticisms reduces people’s intent to spread rumors, which means that exposure to criticisms lowers the perceived accuracy, anxiety, and importance of rumors. They tested these hypotheses on 87 Japanese undergraduate and graduate students by using 20 rumor-tweets related to the 2011 Japan Earthquake and 10 criticism-tweets that criticized the corresponding rumor-tweets. For example:

Rumor-tweet: “Air drop of supplies is not allowed in Japan! I though it has already been done by the Self- Defense Forces. Without it, the isolated people will die! I’m trembling with anger. Please retweet!”

Criticism-tweet: “Air drop of supplies is not prohibited by the law. Please don’t spread rumor. Please see 4-(1)-4-.”

The researchers found that “exposing people to criticisms can reduce their intent to spread rumors that are associated with the criticisms, providing support for the system.” In fact, “Exposure to criticisms increased the proportion of people who stop the spread of rumor-tweets approximately 1.5 times [150%]. This result indicates that whether a receiver is exposed to rumor or criticism first makes a difference in her decision to spread the rumor. Another interpretation of the result is that, even if a receiver is exposed to a number of criticisms, she will benefit less from this exposure when she sees rumors first than when she sees criticisms before rumors.”


Findings also revealed three psychological factors that were related to the differences in the spread of rumor-tweets: one’s own perception of the tweet’s accuracy, the anxiety caused by the tweet, and the tweet’s perceived importance. The results also indicate that “exposure to criticisms reduces the perceived accuracy of the succeeding rumor-tweets, paralleling the findings by previous research that refutations or denials decrease the degree of belief in rumor.” In addition, the perceived accuracy of criticism-tweets by those exposed to rumors first was significantly higher than in the criticism-first group. The results were similar vis-à-vis anxiety. “Seeing criticisms before rumors reduced anxiety associated with rumor-tweets relative to seeing rumors first. This result is also consistent with previous research findings that denial messages reduce anxiety about rumors. Participants in the criticism-first group also perceived rumor-tweets to be less important than those in the rumor-first group.” The same was true vis-à-vis the perceived importance of a tweet. That said, “When the rumor-tweets are perceived as more accurate, the intent to spread the rumor-tweets is stronger; when rumor-tweets cause more anxiety, the intent to spread the rumor-tweets is stronger; when the rumor-tweets are perceived as more important, the intent to spread the rumor-tweets is also stronger.”

So how do we use these findings to enhance the critical thinking of crowds and design crowdsourced verification platforms such as Verily? Ideally, such a platform would connect rumor-tweets with criticism-tweets directly. “By this design, information system itself can enhance the critical thinking of the crowds.” That said, the findings clearly show that sequencing matters—that is, being exposed to rumor-tweets first vs criticism-tweets first makes a big difference vis-à-vis rumor contagion. The purpose of a platform like Verily is to act as a repository for crowdsourced criticisms and rebuttals; that is, crowdsourced critical thinking. Thus, the majority of Verily users would first be exposed to questions about rumors, such as: “Has the Vincent Thomas Bridge in Los Angeles been destroyed by the Earthquake?” Users would then be exposed to the crowdsourced criticisms and rebuttals.

In conclusion, the spread of false rumors during disasters will never go away. “It is human nature to transmit rumors under uncertainty.” But social-technological platforms like Verily can provide a repository of critical thinking and educate users on critical thinking processes themselves. In this way, we may be able to enhance the critical thinking of crowds.



See also:

  • Wiki on Truthiness resources (Link)
  • How to Verify and Counter Rumors in Social Media (Link)
  • Social Media and Life Cycle of Rumors during Crises (Link)
  • How to Verify Crowdsourced Information from Social Media (Link)
  • Analyzing the Veracity of Tweets During a Crisis (Link)
  • Crowdsourcing for Human Rights: Challenges and Opportunities for Information Collection & Verification (Link)
  • The Crowdsourcing Detective: Crisis, Deception and Intrigue in the Twittersphere (Link)

GDACSmobile: Disaster Responders Turn to Bounded Crowdsourcing

GDACS, the Global Disaster Alert and Coordination System, sparked my interest in technology and disaster response when it was first launched back in 2004, which is why I’ve referred to GDACS in multiple blog posts since. This near real-time, multi-hazard monitoring platform is a joint initiative between the UN’s Office for the Coordination of Humanitarian Affairs (OCHA) and the European Commission (EC). GDACS serves to consolidate and improve the dissemination of crisis-related information including rapid mathematical analyses of expected disaster impact. The resulting risk information is distributed via Web and automated email, fax and SMS alerts.


I recently had the pleasure of connecting with two new colleagues, Daniel Link and Adam Widera, who are researchers at the University of Muenster’s European Research Center for Information Systems (ERCIS). Daniel and Adam have been working on GDACSmobile, a smartphone app that was initially developed to extend the reach of the GDACS portal. This project originates from a student project supervised by Daniel and Adam along with Bernd Hellingrath, Chair of the Center, in cooperation with both Tom de Groeve from the Joint Research Center (JRC) and Minu Kumar Limbu, who is now with UNICEF Kenya.

GDACSmobile is intended for use by disaster responders and the general public, allowing for a combined crowdsourcing and “bounded crowdsourcing” approach to data collection and curation. This bounded approach was a deliberate design feature for GDACSmobile from the outset. I coined the term “bounded crowdsourcing” four years ago (see this blog post from 2009). The “bounded crowdsourcing” approach uses “snowball sampling” to grow a crowd of trusted reporters for the collection of crisis information. For example, one invites 5 (or more) trusted local reporters to collect relevant information and subsequently asks each of these to invite 5 additional reporters whom they fully trust; and so on, and so forth. I’m thrilled to see this term applied in practical applications such as GDACSmobile. For more on this approach, please see these blog posts.
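To make the arithmetic of snowball sampling concrete, here is a tiny sketch of how quickly a bounded crowd grows. The seed size, invite count, and number of rounds are the illustrative figures from the paragraph above, not parameters of GDACSmobile itself.

```python
# Sketch of bounded crowdsourcing via snowball sampling: each trusted
# reporter invites a fixed number of reporters they fully trust.
def snowball(seed_reporters=5, invites_per_reporter=5, rounds=3):
    total = seed_reporters
    frontier = seed_reporters
    for _ in range(rounds):
        frontier *= invites_per_reporter   # the newest cohort invites the next
        total += frontier
    return total

# 5 seeds, each wave inviting 5 more, grows quickly while every member
# remains vouched for by someone already in the network.
print(snowball())  # 5 + 25 + 125 + 625 = 780 reporters after 3 rounds
```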


GDACSmobile, which operates on all major mobile smartphones, uses a deliberately minimalist approach to situation reporting and can be used to collect information (via text & image) while offline. The collected data is then automatically transmitted when a connection becomes available. Users can also view & filter data via map view and in list form. Daniel and Adam are considering the addition of an icon-based data-entry interface instead of text-based data-entry since the latter is more cumbersome & time-consuming.


Meanwhile, the server side of GDACSmobile facilitates administrative tasks such as the curation of data submitted by app users and shared on Twitter. Other social media platforms may be added in the future, such as Flickr, to retrieve relevant pictures from disaster-affected areas (similar to GeoFeedia). The server-side moderation feature is used to ensure high data quality standards. But the ERCIS researchers are also open to computational solutions, which is one reason GDACSmobile is not a ‘data island’ and why other systems for computational analysis, microtasking etc., can be used to process the same dataset. The server also “offers a variety of JSON services to allow ‘foreign’ systems to access the data. […] SQL queries can also be used with admin access to the server, and it would be very possible to export tables to spreadsheets […].” 
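Since the post mentions JSON services for “foreign” systems but does not document them, here is a purely hypothetical sketch of what consuming such a service could look like. The endpoint URL, query parameter, and field names are all invented for illustration.

```python
# Hypothetical example of a 'foreign' system pulling GDACSmobile reports
# via its JSON services. The endpoint path, parameters, and field names
# below are assumptions; the post does not document the actual API.
import json
from urllib.request import urlopen

URL = "https://example.org/gdacsmobile/api/reports?disaster=pablo"  # hypothetical

def fetch_reports(url=URL):
    with urlopen(url) as response:
        return json.load(response)

# Offline stand-in for what such a service might return:
sample = json.loads('[{"timestamp": "2012-12-04T08:15:00Z", '
                    '"text": "Bridge washed out near Cateel", '
                    '"lat": 7.79, "lon": 126.45}]')

# Downstream processing (microtasking, classification, ...) can then
# operate on the shared dataset rather than a data island.
for report in sample:
    print(report["timestamp"], report["text"])
```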

I very much look forward to following GDACSmobile’s progress. Since Daniel and Adam have designed their app to be open and are also themselves open to considering computational solutions, I have already begun to discuss with them our Artificial Intelligence for Disaster Response (AIDR) project at the Qatar Computing Research Institute (QCRI). I believe that making AIDR and GDACSmobile interoperable would make a whole lot of sense. Until then, if you’re going to this year’s International Conference on Information Systems for Crisis Response and Management (ISCRAM 2013) in May, then be sure to participate in the workshop (PDF) that Daniel and Adam are running there. The side-event will present the state of the art and future trends of rapid assessment tools to stimulate a conversation on current solutions and developments in mobile technologies for post-disaster data analytics and situational awareness. My colleague Dr. Imran Muhammad from QCRI will also be there to present findings from our crisis computing research, so I highly recommend connecting with him.


Zooniverse: The Answer to Big (Crisis) Data?

Both humanitarian and development organizations are completely unprepared to deal with the rise of “Big Crisis Data” & “Big Development Data.” But many still hope that Big Data is but an illusion. Not so, as I’ve already blogged here, here and here. This explains why I’m on a quest to tame the Big Data Beast. Enter Zooniverse. I’ve been a huge fan of Zooniverse for as long as I can remember, and certainly long before I first mentioned them in this post from two years ago. Zooniverse is a citizen science platform that evolved from GalaxyZoo in 2007. Today, Zooniverse “hosts more than a dozen projects which allow volunteers to participate in scientific research” (1). So, why do I have a major “techie crush” on Zooniverse?

Oh let me count the ways. Zooniverse interfaces are absolutely gorgeous, making them a real pleasure to spend time with; they really understand user-centered design and motivations. The fact that Zooniverse is conversant in multiple disciplines is incredibly attractive. Indeed, the platform has been used to produce rich scientific data across multiple fields such as astronomy, ecology and climate science. Furthermore, this citizen science beauty has a user-base of some 800,000 registered volunteers—with an average of 500 to 1,000 new volunteers joining every day! To place this into context, the Standby Volunteer Task Force (SBTF), a digital humanitarian group, has about 1,000 volunteers in total. The open source Zooniverse platform also scales like there’s no tomorrow, enabling hundreds of thousands to participate in a single deployment at any given time. In short, the software supporting these pioneering citizen science projects is well tested and rapidly customizable.

At the heart of the Zooniverse magic is microtasking. If you’re new to microtasking, which I often refer to as “smart crowdsourcing,” this blog post provides a quick introduction. In brief, microtasking takes a large task and breaks it down into smaller microtasks. Say you were a major (like really major) astronomy buff and wanted to tag a million galaxies based on whether they are spiral or elliptical galaxies. The good news? The kind folks at the Sloan Digital Sky Survey have already sent you a hard disk packed full of telescope images. The not-so-good news? A quick back-of-the-envelope calculation reveals it would take 3-5 years, working 24 hours/day and 7 days/week, to tag a million galaxies. Ugh!


But you’re a smart cookie and decide to give this microtasking thing a go. So you upload the pictures to a microtasking website. You then get on Facebook, Twitter, etc., and invite (nay beg) your friends (and as many strangers as you can find on the suddenly-deserted digital streets) to help you tag a million galaxies. Naturally, you provide your friends, and the surprisingly large number of good digital Samaritans who’ve just shown up, with a quick 2-minute video intro on what spiral and elliptical galaxies look like. You explain that each participant will be asked to tag one galaxy image at a time, simply by clicking the “Spiral” or “Elliptical” button as needed. Inevitably, someone raises their hand to ask the obvious: “Why?! Why in the world would anyone want to tag a zillion galaxies?!”

Well, only because analyzing the resulting data could yield significant insights that may force a major rethink of cosmology and our place in the Universe. “Good enough for us,” they say. You breathe a sigh of relief and see them off, cruising towards deep space to boldly go where no one has gone before. But before you know it, they’re back on planet Earth. To your utter astonishment, you learn that they’re done with all the tagging! So you run over and check the data to see if they’re pulling your leg; but no, not only are 1 million galaxies tagged, but the tags are highly accurate as well. If you liked this little story, you’ll be glad to know that it happened in real life. GalaxyZoo, as the project was called, was the flash of brilliance that ultimately launched the entire Zooniverse series.


No, the second Zooniverse project was not an attempt to pull an Ocean’s Eleven in Las Vegas. One of the most attractive features of many microtasking platforms such as Zooniverse is quality control. Think of slot machines. The only way to win big is by having three matching figures, such as three yellow bells lined up. Hit the jackpot and the coins will flow. Get two out of three matching figures, and some slot machines may toss you a few coins for your efforts. Microtasking uses the same approach. Only if three participants tag the same picture of a galaxy as being a spiral galaxy does that data point count. (Of course, you could decide to change the requirement from 3 volunteers to 5 or even 20 volunteers.) This important feature allows microtasking initiatives to ensure a high standard of data quality, which may explain why many Zooniverse projects have resulted in major scientific breakthroughs over the years.
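The agreement rule is simple enough to sketch in a few lines. The three-volunteer threshold mirrors the slot-machine example above and, as the paragraph notes, is configurable; the code is an illustration, not Zooniverse’s actual implementation.

```python
# Quality control by redundancy: a tag only counts once enough volunteers
# agree, like matching symbols on a slot machine. Threshold is configurable.
from collections import Counter

def consensus(tags, required=3):
    """Return the agreed label if any tag was given by >= `required`
    volunteers, else None (the image stays in the queue)."""
    label, count = Counter(tags).most_common(1)[0]
    return label if count >= required else None

print(consensus(["spiral", "spiral", "spiral", "elliptical"]))  # 'spiral'
print(consensus(["spiral", "elliptical", "spiral"]))            # None (only 2 agree)
```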

The Zooniverse team is currently running 15 projects, with several more in the works. One of the most recent Zooniverse deployments, Planet Four, received some 15,000 visitors within the first 60 seconds of being announced on BBC TV. Guess how many weeks it took for volunteers to tag over 2,000,000 satellite images of Mars? A total of 0.286 weeks, i.e., forty-eight hours! Since then, close to 70,000 volunteers have tagged and traced well over 6 million Martian “dunes.” For their Andromeda Project, digital volunteers classified over 7,500 star clusters per hour, even though there was no media or press announcement—just one newsletter sent to volunteers. Zooniverse deployments also involve tagging earth-based pictures (in contrast to telescope imagery). Take the Snapshot Serengeti deployment, which invited volunteers to classify animals using photographs taken by 225 motion-sensor cameras in Tanzania’s Serengeti National Park. Volunteers swarmed this project to the point that there are no longer any pictures left to tag! So Zooniverse is eagerly waiting for new images to be taken in Serengeti and sent over.


One of my favorite Zooniverse features is Talk, an online discussion tool used for all projects to provide a real-time interface for volunteers and coordinators, which also facilitates the rapid discovery of important features. This also allows for socializing, which I’ve found to be particularly important with digital humanitarian deployments (such as these). One other major advantage of citizen science platforms like Zooniverse is that they are very easy to use and therefore do not require extensive prior training (think slot machines). Plus, participants get to learn about new fields of science in the process. So all in all, Zooniverse makes for a great date, which is why I recently reached out to the team behind this citizen science wizardry. Would they be interested in going out (on a limb) to explore some humanitarian (and development) use cases? “Why yes!” they said.

Microtasking platforms have already been used in disaster response, such as MapMill during Hurricane Sandy, Tomnod during the Somali Crisis and CrowdCrafting during Typhoon Pablo. So teaming up with Zooniverse makes a whole lot of sense. Their microtasking software is the most scalable one I’ve come across yet, it is open source, and their 800,000-volunteer user-base is simply unparalleled. If Zooniverse volunteers can classify 2 million satellite images of Mars in 48 hours, then surely they can do the same for satellite images of disaster-affected areas on Earth. Volunteers responding to Sandy created some 80,000 assessments of infrastructure damage during the first 48 hours alone. At the Mars tagging rate, it would have taken Zooniverse just under two hours. Of course, the fact that the hurricane affected New York City and the East Coast meant that many US-based volunteers rallied to the cause, which may explain why it only took 20 minutes to tag the first batch of 400 pictures. What if the hurricane had hit a Caribbean island instead? Would the surge of volunteers have been as high? Might Zooniverse’s 800,000+ standby volunteers also be an asset in this respect?


Clearly, there is huge potential here, and not only vis-a-vis humanitarian use-cases but development ones as well. This is precisely why I’ve already organized and coordinated a number of calls with Zooniverse and various humanitarian and development organizations. As I’ve been telling my colleagues at the United Nations, World Bank and Humanitarian OpenStreetMap, Zooniverse is the Ferrari of Microtasking, so it would be such a big shame if we didn’t take it out for a spin… you know, just a quick test-drive through the rugged terrains of humanitarian response, disaster preparedness and international development.


Postscript: As some iRevolution readers may know, I am also collaborating with the outstanding team at CrowdCrafting, who have also developed a free & open-source microtasking platform for citizen science projects (also for disaster response here). I see Zooniverse and CrowdCrafting as highly synergistic and complementary. Because CrowdCrafting is still in its early stages, it fills a very important gap found at the long tail. In contrast, Zooniverse has already been around for half a decade and caters to very high volume and high profile citizen science projects. This explains why we’ll all be getting on a call in the very near future.

GeoFeedia: Ready for Digital Disaster Response

GeoFeedia was not originally designed to support humanitarian operations. But last year’s blog post on the potential of GeoFeedia for crisis mapping caught the interest of CEO Phil Harris. So he kindly granted the Standby Volunteer Task Force (SBTF) free access to the platform. In return, we provided his team with feedback on what features (listed here) would make GeoFeedia more useful for digital disaster response. This was back in summer 2012. I recently learned that they’ve been quite busy since. Indeed, I had the distinct pleasure of sharing the stage with Phil and his team at this superb conference on social media for emergency management. After listening to their talk, I realized it was high time to publish an update on GeoFeedia, especially since we had used the tool just two months earlier in response to Typhoon Pablo, one of the worst disasters to hit the Philippines in the past 100 years.

The 1-minute video is well worth watching if you’re new to GeoFeedia. The platform enables hyper local searches for information by location across multiple social media channels such as Twitter, YouTube, Flickr, Picasa & now Instagram. One of my favorite GeoFeedia features is the awesome geofeed (digital fence), which you can learn more about here. So what’s new besides Instagram? Well, the first suggestion I made last year was to provide users with the option of searching by both location and topic, rather than just location alone. And presto, this is now possible, which means that digital humanitarians today can zoom into a disaster-affected area and filter by social media type, date and hashtag. This makes the geofeed feature even more compelling for crisis response, especially since geofeeds can also be saved and shared.

The vast majority of social media monitoring tools out there first filter by keyword and hashtag. Only later do they add location. As Phil points out, this means they can easily miss 70% of hyper local social media reports. Most users and organizations, who pay hefty licensing fees to use these platforms, are typically unaware of this. The fact that GeoFeedia first filters by location is not an accident. This recent study (PDF) of the 2012 London Olympics showed that social media users posted close to 170,000 geo-tagged posts to Twitter, Instagram, Flickr, Picasa and YouTube during the games. But only 31% of these geo-tagged posts contained any Olympic-specific keywords and/or hashtags! So they decided to analyze another large event and again found the number of results drop by about 70% when not first filtering by location. Phil argues that people in a crisis situation obviously don’t wait for keywords or hashtags to form; so he expects this drop to happen for disasters as well. “Traditional keyword and hashtag search must thus be complemented with a geographical search in order to provide a full picture of social media content that is contextually relevant to an event.”
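The location-first logic is easy to illustrate. The sketch below (invented data and field names, not GeoFeedia’s API) filters a set of posts by bounding box first and only then, optionally, by hashtag—showing how a keyword-first pipeline would silently drop geo-tagged posts that carry no hashtag.

```python
# Sketch of location-first filtering (a "geofeed"): select posts inside a
# bounding box first, then optionally narrow by hashtag. Filtering by
# keyword first would discard geo-tagged posts that carry no hashtag.
def geofeed(posts, south, west, north, east, hashtag=None):
    inside = [
        p for p in posts
        if south <= p["lat"] <= north and west <= p["lon"] <= east
    ]
    if hashtag is None:
        return inside
    return [p for p in inside if hashtag in p["text"].lower()]

posts = [
    {"lat": 14.6, "lon": 121.0, "text": "Flooding on our street #pablo"},
    {"lat": 14.6, "lon": 121.1, "text": "Water rising fast, need help"},
    {"lat": 48.9, "lon": 2.35, "text": "Lovely evening in Paris"},
]
print(len(geofeed(posts, 14.0, 120.0, 15.0, 122.0)))            # 2 (location only)
print(len(geofeed(posts, 14.0, 120.0, 15.0, 122.0, "#pablo")))  # 1 (hashtag drops one)
```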


One of my other main recommendations to Phil & team last year had to do with analytics. There is a strong need for an “Analytics function that produces summary statistics and trends analysis for a geofeed of interest. This is where Geofeedia could better capture temporal dynamics by including charts, graphs and simple time-series analysis to depict how events have been unfolding over the past hour vs 12 hours, 24 hours, etc.” Well sure enough, one of GeoFeedia’s major new features is a GeoAnalytics Dashboard: an interface that enables users to discover temporal trends and patterns in social media—and to do so by geofeed. This means a user can now draw a geofeed around a specific area of interest in a given disaster zone and search for pictures that capture major infrastructure damage on a specified date that contain tags or descriptions with the words “#earthquake”, “damage,” “buildings,” etc. As Phil rightly points out, this provides a “huge time advantage during a crisis to give a yet another filtered layer of intelligence; in effect, social media that is highly relevant and actionable ‘bubbling-up to the top’ of the pile.”


I truly am a huge fan of the GeoFeedia platform. Plus, Phil & team have been very responsive to our interests in using their tool for disaster response. So I’m excited to see which features they build out next. They’ve already got a “data portability” functionality that enables data export. Users can also publish content from GeoFeedia directly to their own social networks. Moreover, the filtered content produced by geofeeds can also be shared with individuals who do not have a GeoFeedia account. In any event, I hope the team will take into account two items from my earlier wish list—namely Sentiment Analysis and GeoAlerts.

A Sentiment Analysis feature would capture the general mood and sentiment expressed hyper-locally within a defined geofeed in real-time. The automated GeoAlerts feature would make the geofeed king. A GeoAlerts functionality would enable users to trigger specific actions based on different kinds of social media traffic within a given geofeed of interest. For example, I’d like to be notified if the number of pictures posted within my geofeed that are tagged with the words “#earthquake” and “damage” increases by more than 20% in any given hour. Similarly, one could set a geofeed’s GeoAlert for a 10% increase in the number of tweets with the words “cholera” and “diarrhea” (these need not be in English, by the way) in any given 10-minute period. Users would then receive GeoAlerts via automated emails, Tweets and/or SMS’s. This feature would in effect make GeoFeedia more of a mobile and “hands free” platform, like Waze for example.
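Since GeoAlerts are a wish-list feature rather than something GeoFeedia ships, here is a minimal sketch of the trigger logic just described; the window size and the 20% threshold simply mirror the example above.

```python
# Sketch of the proposed GeoAlert trigger: fire when matching posts within
# a geofeed rise by more than a set percentage over the previous window.
def geoalert(previous_count, current_count, threshold_pct=20):
    """True if matching traffic grew more than threshold_pct vs last window."""
    if previous_count == 0:
        return current_count > 0
    growth = 100 * (current_count - previous_count) / previous_count
    return growth > threshold_pct

# 50 tagged photos last hour vs 65 this hour -> +30% -> alert fires.
print(geoalert(50, 65))   # True
print(geoalert(50, 55))   # False (+10%)
```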

My first blog post on GeoFeedia was entitled “GeoFeedia: Next Generation Crisis Mapping Technology?” The answer today is a definite “Yes!” While the platform was not originally designed with disaster response in mind, the team has since been adding important features that make the tool increasingly useful for humanitarian applications. And GeoFeedia has plans for more exciting developments in 2013. Their commitment to innovation and strong continued interest in supporting digital disaster response is why I’m hoping to work more closely with them in the years to come. For example, our AIDR (Artificial Intelligence for Disaster Response) platform would really add a strong Machine Learning component to GeoFeedia’s search function, in effect enabling the tool to go beyond simple keyword search.


A Research Framework for Next Generation Humanitarian Technology and Innovation

Humanitarian donors and organizations are increasingly championing innovation and the use of new technologies for humanitarian response. DfID, for example, is committed to using “innovative techniques and technologies more routinely in humanitarian response” (2011). In a more recent strategy paper, DfID confirmed that it would “continue to invest in new technologies” (2012). ALNAP’s important report on “The State of the Humanitarian System” documents the shift towards greater innovation, “with new funds and mechanisms designed to study and support innovation in humanitarian programming” (2012). A forthcoming landmark study by OCHA makes the strongest case yet for the use and early adoption of new technologies for humanitarian response (2013).


These strategic policy documents are game-changers and pivotal to ushering in the next wave of humanitarian technology and innovation. That said, the reports are limited by the very fact that the authors are humanitarian professionals and thus not necessarily familiar with the field of advanced computing. The purpose of this post is therefore to set out a more detailed research framework for next generation humanitarian technology and innovation—one with a strong focus on information systems for crisis response and management.

In 2010, I wrote this piece on “The Humanitarian-Technology Divide and What To Do About It.” This divide became increasingly clear to me when I co-founded and co-directed the Harvard Humanitarian Initiative’s (HHI) Program on Crisis Mapping & Early Warning (2007-2009). So I co-founded the annual International CrisisMappers Conference series in 2009 and have continued to co-organize this unique, cross-disciplinary forum on humanitarian technology. The CrisisMappers Network also plays an important role in bridging the humanitarian and technology divide. My decision to join Ushahidi as Director of Crisis Mapping (2009-2012) was a strategic move to continue bridging the divide—and to do so from the technology side this time.

The same is true of my move to the Qatar Computing Research Institute (QCRI) at the Qatar Foundation. My experience at Ushahidi made me realize that serious expertise in Data Science is required to tackle the major challenges appearing on the horizon of humanitarian technology. Indeed, the key words missing from the DfID, ALNAP and OCHA innovation reports include: Data Science, Big Data Analytics, Artificial Intelligence, Machine Learning, Machine Translation and Human Computing. This current divide between the humanitarian and data science space needs to be bridged, which is precisely why I joined the Qatar Computing Research Institute as Director of Innovation: to develop and prototype the next generation of humanitarian technologies by working directly with experts in Data Science and Advanced Computing.


My efforts to bridge these communities also explains why I am co-organizing this year’s Workshop on “Social Web for Disaster Management” at the 2013 World Wide Web conference (WWW13). The WWW event series is one of the most prestigious conferences in the field of Advanced Computing. I have found that experts in this field are very interested and highly motivated to work on humanitarian technology challenges and crisis computing problems. As one of them recently told me: “We simply don’t know what projects or questions to prioritize or work on. We want questions, preferably hard questions, please!”

Yet the humanitarian innovation and technology reports cited above overlook the field of advanced computing. Their policy recommendations vis-a-vis future information systems for crisis response and management are vague at best. And one of the major challenges that the humanitarian sector faces is the rise of Big (Crisis) Data. I have already discussed this here, here and here, for example. The humanitarian community is woefully unprepared to deal with this tidal wave of user-generated crisis information. There are already more mobile phone subscriptions than people in 100+ countries. And fully 50% of the world’s population in developing countries will be using the Internet within the next 20 months—the current figure is 24%. Meanwhile, close to 250 million people were affected by disasters in 2010 alone. Since then, the number of new mobile phone subscriptions has increased by well over one billion, which means that disaster-affected communities today are increasingly likely to be digital communities as well.

In the Philippines, a country highly prone to “natural” disasters, 92% of Filipinos who access the web use Facebook. In early 2012, Filipinos sent an average of 2 billion text messages every day. When disaster strikes, some of these messages will contain information critical for situational awareness & rapid needs assessment. The innovation reports by DfID, ALNAP and OCHA emphasize time and time again that listening to local communities is a humanitarian imperative. As DfID notes, “there is a strong need to systematically involve beneficiaries in the collection and use of data to inform decision making. Currently the people directly affected by crises do not routinely have a voice, which makes it difficult for their needs be effectively addressed” (2012). But how exactly should we listen to millions of voices at once, let alone manage, verify and respond to these voices with potentially life-saving information? Over 20 million tweets were posted during Hurricane Sandy. In Japan, over half-a-million new users joined Twitter the day after the 2011 Earthquake. More than 177 million tweets about the disaster were posted that same day, i.e., 2,000 tweets per second on average.


Of course, the volume and velocity of crisis information will vary from country to country and disaster to disaster. But the majority of humanitarian organizations do not have the technologies in place to handle smaller tidal waves either. Take the case of the recent Typhoon in the Philippines, for example. OCHA activated the Digital Humanitarian Network (DHN) to ask them to carry out a rapid damage assessment by analyzing the 20,000 tweets posted during the first 48 hours of Typhoon Pablo. In fact, one of the main reasons digital volunteer networks like the DHN and the Standby Volunteer Task Force (SBTF) exist is to provide humanitarian organizations with this kind of skilled surge capacity. But analyzing 20,000 tweets in 12 hours (mostly manually) is one thing; analyzing 20 million requires more than a few hundred dedicated volunteers. What’s more, we do not have the luxury of having months to carry out this analysis. Access to information is as important as access to food; and like food, information has a sell-by date.

We clearly need a research agenda to guide the development of next generation humanitarian technology. One such framework is proposed here. The Big (Crisis) Data challenge is composed of (at least) two major problems: (1) finding the needle in the haystack; (2) assessing the accuracy of that needle. In other words, identifying the signal in the noise and determining whether that signal is accurate. Both of these challenges are exacerbated by serious time constraints. There are (at least) two ways to manage the Big Data challenge in real or near real-time: Human Computing and Artificial Intelligence. We know about these solutions because they have already been developed and used by other sectors and disciplines for several years now. In other words, our information problems are hardly as unique as we might think. Hence the importance of bridging the humanitarian and data science communities.

In sum, the Big Crisis Data challenge can be addressed using Human Computing (HC) and/or Artificial Intelligence (AI). Human Computing includes crowdsourcing and microtasking. AI includes natural language processing and machine learning. A framework for next generation humanitarian technology and innovation must thus promote Research and Development (R&D) that applies these methodologies to humanitarian response. For example, Verily is a project that leverages HC for the verification of crowdsourced social media content generated during crises. In contrast, this here is an example of an AI approach to verification. The Standby Volunteer Task Force (SBTF) has used HC (microtasking) to analyze satellite imagery (Big Data) for humanitarian response. Another novel HC approach to managing Big Data is the use of gaming, something called Playsourcing. AI for Disaster Response (AIDR) is an example of AI applied to humanitarian response. In many ways, though, AIDR combines AI with Human Computing, as does MatchApp. Such hybrid solutions should also be promoted as part of the R&D framework on next generation humanitarian technology.
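To make the hybrid HC+AI idea concrete, here is a minimal sketch of a triage loop in which a classifier handles the messages it is confident about and routes the uncertain remainder to a human microtasking queue. The confidence threshold and the stand-in functions are illustrative assumptions, not AIDR’s actual design.

```python
# Sketch of a hybrid Human Computing + AI triage loop: the classifier
# labels messages it is confident about and routes uncertain ones to a
# microtasking queue; human answers feed back as new training data.
def triage(messages, classify, human_queue, threshold=0.9):
    labeled, training_examples = [], []
    for msg in messages:
        label, confidence = classify(msg)
        if confidence >= threshold:
            labeled.append((msg, label))                  # AI handles the easy bulk
        else:
            human_label = human_queue(msg)                # humans handle the hard tail
            labeled.append((msg, human_label))
            training_examples.append((msg, human_label))  # used to retrain the model
    return labeled, training_examples

# Toy stand-ins: a fake classifier and a "human" who always answers.
demo = triage(
    ["bridge down near port", "lol random tweet"],
    classify=lambda m: ("damage", 0.95) if "bridge" in m else ("unknown", 0.4),
    human_queue=lambda m: "not_relevant",
)
print(demo)
```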

There is of course more to humanitarian technology than information management alone. Related is the topic of Data Visualization, for example. There are also exciting innovations and developments in the use of drones or Unmanned Aerial Vehicles (UAVs), meshed mobile communication networks, hyper low-cost satellites, etc. I am particularly interested in each of these areas and will continue to blog about them. In the meantime, I very much welcome feedback on this post’s proposed research framework for humanitarian technology and innovation.


Crisis Mapping, Neogeography and the Delusion of Democratization

Professor Muki Haklay kindly shared with me this superb new study in which he questions the alleged democratization effects of Neogeography. As my colleague Andrew Turner explained in 2006, “Neogeography means ‘new geography’ and consists of a set of techniques and tools that fall outside the realm of traditional GIS, Geographic Information Systems. […] Essentially, Neogeography is about people using and creating their own maps, on their own terms and by combining elements of an existing toolset. Neogeography is about sharing location information with friends & visitors, helping shape context, and conveying understanding through knowledge of place.” To this end, as Muki writes, “it is routinely argued that the process of producing and using geographical information has been fundamentally democratized.” For example, as my colleague Nigel Snoad argued in 2011, “[…] Google, Microsoft and OpenStreetMap have really democratized mapping.” Other CrisisMappers, including myself, have made similar arguments over the years.


Muki explores this assertion by delving into the various meanings of democratization. He adopts the specific notion of democratization that “evokes ideas about participation, equality, the right to influence decision making, support to individual and group rights, access to resources and opportunities, etc.” With this definition in hand, Muki argues that “using this stronger interpretation of democratization reveals the limitation of current neogeographic practices and opens up the possibility of considering alternative development of technologies that can, indeed, be considered democratizing.” To explore this further, he turns to Andrew Feenberg‘s critical philosophy of technology. Feenberg identifies “four main streams of thought on the essence of technology and its linkage to society: instrumentalism, determinism, substantivism & critical theory.”


Feenberg’s own view is constructivist, “emphasizing that technology development is humanly controlled and encapsulates values and politics; it should thus be open to democratic control and intervention.” In other words, “technology can and should be seen as a result of political negotiations that lead to its production and use. In too many cases, the complexities of technological systems are used to concentrate power within small groups of technological, financial, and political elites and to prevent the wider body of citizens from meaningful participation in shaping it and deciding what role it should have in the everyday.” Furthermore, “Feenberg highlights that technology encapsulates an ambivalence between the ‘conservation of hierarchy’, which most technologies promote and reproduce—hence the continuity in power structures in advanced capitalist societies despite technological upheaval—and ‘democratic rationalisation’, which are the aspects of new technologies that undermine existing power structures and allow new opportunities for marginalized or ignored groups to assert themselves.”

To this end, Feenberg calls for a “deep democratization” of technology as an alternative to technocracy. “Instead of popular agency appearing as an anomaly and an interference, it would be normalized and incorporated into the standard procedures of technical design.” In other words, deep democratization is about empowerment: “providing the tools that will allow increased control over the technology by those in disadvantaged and marginalized positions in society.” Muki contrasts this with neogeography, which is “mostly represented in a decontextualised way—as the citation in the introduction from Turner’s (2006) Introduction to Neogeography demonstrates: it does not discuss who the people are who benefit and whether there is a deeper purpose, beyond fun, for their engagement in neogeography.” And so, as neogeographers would have it, since “there is nothing that prevents anyone, anytime, and anywhere, and for any purpose from using the system, democratization has been achieved.” Or maybe not. Enter the Digital Divides.

digidivide

Yes, there are multiple digital divides. Differential access to computers & communication technology is just one. “Beyond this, there is secondary digital exclusion, which relates to the skills and abilities of people to participate in online activities beyond rudimentary browsing.” Related to this divide is the one between the “Data Haves” and the “Data Have Nots”. There is also an important divide in speed: as anyone who has worked in, say, Liberia will have experienced, it takes a lot longer to upload, download and transfer content there than in Luxembourg. “In summary, the social, economic, structural, and technical evidence should be enough to qualify and possibly withdraw the democratization claims that are attached to neogeographic practices.”

That said, the praxis of neogeography still has democratic potential. “To address the potential of democratization within neogeographic tools, we need to return to Feenberg’s idea of deep democratization and the ability of ordinary citizens to direct technical codes and influence them so that they can include alternative meanings and values. By doing so, we can explore the potential of neogeographic practices to support democratisation in its fuller sense. At the very least, citizens should be able to reuse existing technology and adapt it so that it can be used to their own goals and to represent their own values.” So Muki adds a “Hierarchy of Hacking” to Feenberg’s conceptual framework, i.e., the triangle below.

Screen Shot 2013-03-16 at 7.03.49 PM

While the vast majority can participate in a conversation about what to map (Meaning), only a “small technical elite within society” can contribute to “Deep Technical Hacking,” which “requires very significant technical knowledge in creating new geographic data collection tools, setting up servers, and configuring database management systems.” Muki points to Map Kibera as an example of Deep Technical Hacking. I would add that “Meaning Hacking” is often hijacked by “Deep Technical Hackers,” who tend to be the ones introducing and controlling local neogeography projects despite their “best” intentions. But the fact is this: Deep Tech Hackers typically have little to no actual experience in community development and are often under pressure to hype up blockbuster-like successes at fancy tech conferences in the US. This may explain why most take full ownership over all decisions having to do with Meaning- and Use-Hacking right from the start of a project. See this blog post’s epilogue for more on this dynamic.

One success story, however, is Liberia’s Innovation Lab (iLab). My field visit to Monrovia in 2011 made me realize just how many completely wrong assumptions I had about the use of neogeography platforms in developing countries. Instead of parachuting in and out, the co-founders of iLab became intimately familiar with the country by spending a considerable amount of time in Monrovia and outside the capital city to understand the social, political and historical context in which they were introducing neogeography. And so, while they initially expected to provide extensive training on neogeography platforms right off the bat, they quickly realized that this was entirely the wrong approach for several reasons. As Muki observes, “Because of the reduced barriers, neogeography does offer some increased level of democratization but, to fulfill this potential, it requires careful implementation that takes into account social and political aspects,” which is precisely what the team at the iLab have done and continue to do impressively well. Note that one of the co-founders is a development expert, not a technology hacker. And while the other is a hacker, he spent several years working in Liberia. (Another equally impressive success story is this one from Brazil’s Maré shantytown.)


I thus fully subscribe to Muki’s hacking approach and made a very similar argument in this 2011 blog post: “Democratizing ICT for Development with DIY Innovation and Open Data.” I directly challenged the “participatory” nature of these supposedly democratizing technologies and in effect questioned whether Deep Technical Hackers really do let go of control vis-a-vis the hacking of “Meaning” and “Use”. While I used Ushahidi as an example of a DIY platform, it is clear from Muki’s study that Ushahidi, like other neogeography platforms, also falls way short of deep democratization and hackability. That said, as I wrote then, “it is worth remembering that the motivations driving this shift [towards neogeography] are more important than any one technology. For example, recall the principles behind the genesis of the Ushahidi platform: Democratizing information flows and access; promoting Open Data and Do it Yourself (DIY) Innovation with free, highly hackable (i.e., open source) technology; letting go of control.” In other words, the democratizing potential should not be dismissed outright even if we’re not quite there yet (or ever).

As I noted in 2011, hackable and democratizing technologies ought to be like a “choose your own adventure game. The readers, not the authors, finish the story. They are the main characters who bring the role playing games and stories to life.” This explains why I introduced the notion of a “Fisher-Price Theory of Technology” five years ago at this meeting with Andrew Turner and other colleagues. As argued then, “What our colleagues in the tech-world need to keep in mind is that the vast majority of our partners in the field have never taken a computer science or software engineering course. […] The onus thus falls on the techies to produce the most simple, self-explanatory, intuitive interfaces.”

I thus argued that neogeography platforms ought to be as easy to use (and yes, hack) as simple computer games, which is why I was excited to see the latest user interface (UI) developments for OpenStreetMap (image below). Of course, as Muki has ably demonstrated, UI design is just the tip of the iceberg vis-a-vis democratization effects. But democratization is both relative and a process, and neogeography platforms are unlikely to become less democratizing over time. While some platforms still have a long road ahead with respect to reaching their perceived potential (if ever), a few instances may already have made inroads in terms of their local political effects, as argued here and in my doctoral dissertation.

OSMneogeo

Truly hackable technology, however, needs to go beyond the adventure story and Fisher-Price analogies described above. The readers should have the choice of becoming authors before they even have a story in mind, while gamers should have the option of creating their own games in the first place. In other words, as Muki argues, “the artful alteration of technology beyond the goals of its original design or intent” enables “Deep Democratization.” To this end, “Freely providing the hackable building blocks for DIY Innovation is one way to let go of control and democratize [neogeography platforms],” not least if the creators can make a business out of what they build.

Muki concludes by noting that, “the main error in the core argument of those who promote [neogeography] as a democratic force is the assumption that, by increasing the number of people who utilise geographic information in different ways and gain access to geographic technology, these users have been empowered and gained more political and social control. As demonstrated in this paper, neogeography has merely opened up the collection and use of this information to a larger section of the affluent, educated, and powerful part of society.” What’s more, “The control over the information is kept, by and large, by major corporations and the participant’s labor is enrolled in the service of these corporations, leaving the issue of payback for this effort a moot point. Significantly, the primary intention of the providers of the tools is not to empower communities or to include marginalized groups, as they do not represent a major source of revenue.” I argued this exact point here a year ago.


Analyzing Tweets Posted During Mumbai Terrorist Attacks

Over 1 million unique users posted more than 2.7 million tweets in just 3 days following the triple bomb blasts that struck Mumbai on July 13, 2011. Out of these, over 68,000 were “original tweets” (in contrast to retweets) that related to the bombings. An analysis of these tweets yielded some interesting patterns. (Note that the Ushahidi Map of the bombings captured ~150 reports; more here.)

One unique aspect of this study (PDF) is the methodology used to assess the quality of the Twitter dataset. The number of tweets per user was graphed in order to test for a power law distribution. The graph below shows the log distribution of the number of tweets per user. The straight line suggests power law behavior. This finding is in line with previous research done on Twitter. So the authors conclude that the quality of the dataset is comparable to the quality of Twitter datasets used in other peer-reviewed studies.
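
As a quick illustration of this methodology, here is a minimal sketch in Python (my own illustration, not the study’s code): count the tweets per user, then plot the distribution on log-log axes, where power law behavior shows up as a straight line. The tweet_authors list below is a hypothetical stand-in for the real dataset.

    from collections import Counter
    import matplotlib.pyplot as plt

    # Hypothetical input: the author of each tweet in the dataset.
    tweet_authors = ["@userA", "@userB", "@userA", "@userC", "@userA", "@userB"]

    tweets_per_user = Counter(tweet_authors)    # user -> number of tweets
    freq = Counter(tweets_per_user.values())    # tweet count -> number of users

    x = sorted(freq)                            # tweets-per-user values
    y = [freq[k] for k in x]                    # number of users at each value

    plt.loglog(x, y, "o")                       # a straight line suggests a power law
    plt.xlabel("tweets per user")
    plt.ylabel("number of users")
    plt.show()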

I find this approach intriguing because Professor Michael Spagat, Dr. Ryan Woodard and I carried out related research on conflict data back in 2006. One fascinating research question that emerges from all this, and which could be applied to Twitter datasets, is whether the slope of the power law says anything about the type of conflict/disaster being tweeted about, the expected number of casualties or even the propagation of rumors. If you’re interested in pursuing this research question (and have worked with power laws before), please do get in touch. In the meantime, I challenge the authors’ suggestion that a power law distribution necessarily says anything about the quality or reliability of the underlying data. Using the casualty data from SyriaTracker (which is also used by USAID in their official crisis maps), my colleague Dr. Ryan Woodard showed that this dataset does not follow a power law distribution—even though it is one of the most reliable on Syria.

Syria_PL
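
For anyone who does want to compare slopes across datasets, a standard starting point is the maximum-likelihood estimator of the power law exponent popularized by Clauset, Shalizi & Newman (2009): alpha = 1 + n / sum(ln(x_i / x_min)). The sketch below is my own illustration under that assumption; the sample counts and the choice of x_min are invented.

    import math

    def powerlaw_alpha(values, x_min):
        """Continuous MLE of the power law exponent for values >= x_min."""
        tail = [v for v in values if v >= x_min]
        return 1.0 + len(tail) / sum(math.log(v / x_min) for v in tail)

    # Invented tweets-per-user counts from two hypothetical crisis datasets:
    dataset_a = [1, 1, 2, 3, 5, 8, 21, 40]
    dataset_b = [1, 2, 2, 4, 9, 15, 33, 70]

    print(powerlaw_alpha(dataset_a, x_min=2))   # exponent for dataset A
    print(powerlaw_alpha(dataset_b, x_min=2))   # exponent for dataset B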

Moving on to the content analysis of the Mumbai blast tweets: “The number of URLs and @-mentions in tweets increase during the time of the crisis in comparison to what researchers have exhibited for normal circumstances.” The table below lists the top 10 URLs shared on Twitter. Interestingly, a link to a Google Spreadsheet was amongst the most shared resources. Created by Twitter user Nitin Sagar, the spreadsheet was used to “coordinate relief operation among people. Within hours hundreds of people registered on the sheet via Twitter. People asked for or offered help on that spreadsheet for many hours.”
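
The sketch below, again my own illustration rather than the study’s code, shows how such counts can be produced: URLs and @-mentions are extracted from the tweet text with simple regular expressions and tallied. The sample tweets are invented.

    import re
    from collections import Counter

    URL_RE = re.compile(r"https?://\S+")
    MENTION_RE = re.compile(r"@\w+")

    tweets = [
        "Relief coordination spreadsheet: http://example.org/sheet via @someuser",
        "RT @ndtv: Mumbai police: don't believe rumours of more bombs.",
    ]

    url_counts = Counter(u for t in tweets for u in URL_RE.findall(t))
    mention_counts = Counter(m for t in tweets for m in MENTION_RE.findall(t))

    print(url_counts.most_common(10))       # the ten most-shared links
    print(mention_counts.most_common(10))   # the ten most-mentioned accounts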

The analysis also reveals that “the number of tweets or updates by authority users (those with large number of followers) are very less, i.e., majority of content generated on Twitter during the crisis comes from non authority users.” In addition, tweets generated by authority users have a high level of retweets. The results also indicate that “the number of tweets generated by people with large follower base (who are generally like government owned accounts, celebrities, media companies) were very few. Thus, the majority of content generated at the time of crisis was from unknown users. It was also observed that, though the number of posts were less by users with large number of followers, these posts registered high numbers of retweets.”

Rumors related to the blasts also spread through Twitter. For example, rumors began to circulate about a fourth bomb going off. “Some tweets even specified locations of 4th blast as Lemington street, Colaba and Charni. Around 500+ tweets and retweets were posted about this.” False rumors about hospital blood banks needing donations were also propagated via Twitter. “They were initiated by a user, @KapoorChetan and around 2,000 tweets and retweets were made regarding this by Twitter users.” The authors of the study believe that such false rumors can be prevented if credible sources like mainstream media companies and the government post updates on social media more frequently.

I did a bit of research on this and found that NDTV did use their Twitter feed (which has over half-a-million followers) to counter these rumors. For example, “RT @ndtv: Mumbai police: Don’t believe rumours of more bombs. False rumours being spread deliberately.” Journalist Sonal Kalra also acted to counter rumors: “RT @sonalkalra: BBMs about bombs found in Delhi are FALSE. Pls pls don’t spread rumours. #mumbaiblasts.”

In conclusion, the study considers the “privacy threats during the Twitter activity after the blasts. People openly tweeted their phone numbers on social media websites like Twitter, since at such moment of crisis people wished to reach out to help others. But, long after the crisis was over, such posts still remained publicly available on the Internet.” In addition, “people also openly posted their blood group, home address, etc. on Twitter to offer help to victims of the blasts.” The Ushahidi Map also includes personal information. These data privacy and security issues continue to pose major challenges vis-a-vis the use of social media for crisis response.
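
One practical implication is that crisis maps and tweet archives could flag obviously personal details before republishing them. Below is a very crude sketch of that idea, entirely my own illustration and nothing like a production-grade scrubber; the regular expressions are simplistic assumptions.

    import re

    PHONE_RE = re.compile(r"\+?\d[\d\- ]{8,}\d")        # crude phone number pattern
    BLOOD_RE = re.compile(r"\b(?:AB|A|B|O)[+-](?!\w)")  # e.g. "O+", "AB-"

    def flags_personal_info(text):
        """Return True if the text appears to contain a phone number or a
        blood group -- a candidate for review before publication."""
        return bool(PHONE_RE.search(text) or BLOOD_RE.search(text))

    print(flags_personal_info("Need O+ donors, call +91 98765 43210"))  # True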


See also: Did Terrorists Use Twitter to Increase Situational Awareness? [Link]

Humanitarian Technology and the Japan Earthquake (Updated)

My Internews colleagues have just released this important report on the role of communications in the 2011 Japan Earthquake. Independent reports like this one are absolutely key to building the much-needed evidence base of humanitarian technology. Internews should thus be applauded for investing in this important study. The purpose of my blog post is to highlight findings that I found most interesting and to fill some of the gaps in the report’s coverage.

sinsai_info

I’ll start with the gaps since there are far fewer of these. While the report does reference the Sinsai Crisis Map, it overlooks a number of key points that were quickly identified in an email reply just 61 minutes after Internews posted the study on the CrisisMappers listserv. These points were made by my Fletcher colleague Jeffrey Reynolds who spearheaded some of the digital response efforts from The Fletcher School in Boston:

“As one of the members who initiated crisis mapping effort in the aftermath of the Great East Japan Earthquake, I’d like to set the record straight on 4 points:

  • The crisis mapping effort started at the Fletcher School with students from Tufts, Harvard, MIT, and BU within a couple hours of the earthquake. We took initial feeds from the SAVE JAPAN! website and put them into the existing OpenStreetMap (OSM) for Japan. This point is not to take credit, but to underscore that small efforts, distant from a catastrophe, can generate momentum – especially when the infrastructure in the area/country in question is compromised.
  • Anecdotally, crisis mappers in Boston who have since returned to Japan told me that at least 3 people were saved because of the map.
  • Although crisis mapping efforts may not have been well known by victims of the quake and tsunami, the embassy community in Tokyo leveraged the crisis map to identify their citizens in the Tohoku region. As the proliferation of crisis map-like platforms continues, e.g., Waze, victims in future crises will probably gravitate to social media faster than they did in Japan. Social media, specifically crisis mapping, has revolutionized the role of victim in disasters–from consumer of services, to consumer of relief AND supplier of information.
  • The crisis mapping community would be wise to work with Twitter and other suppliers of information to develop algorithms that minimise noise and duplication of information.

Thank you for telling this important story about the March 11 earthquake. May it lead to the reduction of suffering in current crises and those to come.” Someone else on CrisisMappers noted that “the first OSM mappers of satellite imagery from Japan were the mappers from Haiti who we trained after their own string of catastrophes.” I believe Jeffrey is spot on and would only add the following point: According to my colleague Hal Seki, the crisis map received over one million unique views in the weeks and months that followed the tsunami. The vast majority of these were apparently from inside Japan. So let’s assume that 700,000 users accessed the crisis map but that only 1% of them found the map useful for their purposes. This means that 7,000 unique users found the map informative and of consequence. Unless a random sample of these 7,000 users were surveyed, I find it rather myopic to claim so confidently that the map had no impact. Just because impact is difficult to measure doesn’t imply there was none to measure in the first place.

In any event, Internews’s reply to this feedback was exemplary and far more constructive than the brouhaha that occurred over the Disaster 2.0 Report. So I applaud the team for how positive, pro-active and engaged they have been with our feedback. Thank you very much.

Screen Shot 2013-03-10 at 3.25.24 PM

That said, the gaps should not distract from what is an excellent and important report on the use of technology in response to the Japan Earthquake. As my colleague Hal Seki (who spearheaded the Sinsai Crisis Map) noted on CrisisMappers, “the report was accurate and covered important on-going issues in Japan.” So I want to thank him again, and his entire team (including Sora, pictured above, the youngest volunteer behind the crisis mapping efforts), and Jeffrey & team at Fletcher for all their efforts during those difficult weeks and months following the devastating disaster.

Below are multiple short excerpts from the 56-page Internews report that I found most interesting. So if you don’t have time to read the entire report, then simply glance through the list below.

  • Average tweets-per-minute in Japan before earthquake = 3,000
  • Average tweets-per-minute in Japan after earthquake = 11,000
  • DMs per minute from Japan to the world before earthquake = 200
  • DMs per minute from Japan to the world after earthquake = 1,000
  • Twitter’s global network facilitated search & rescue missions for survivors stranded by the tsunami. Within 3 days the Government of Japan had also set up its first disaster-related Twitter account.
  • Safecast, a volunteer-led project to collect and share radiation measurements, was created within a week of the disaster and generated over 3.5 million readings by December 2012.
  • If there is no information after a disaster, people become even more stressed and anxious. Old media works best in emergencies.
  • Community radio, local newspapers, newsletters–in some instances, hand written newsletters–and word of mouth played a key role in providing lifesaving information for communities. Radio was consistently ranked the most useful source of information by disaster-affected communities, from the day of the disaster right through until the end of the first week.
  • The second challenge involved humanitarian responders’ lack of awareness about the valuable information resources being generated by one very significant, albeit volunteer, community: the volunteer technical and crisis mapping communities.
  • The OpenStreetMap volunteer community, for instance, created a map of over 500,000 roads in disaster-affected areas while volunteers working with another crisis map, Sinsai.info, verified, categorised and mapped 12,000 tweets and emails from the affected regions for over three months. These platforms had the potential to close information gaps hampering the response and recovery operation, but it is unclear to what degree they were used by professional responders.
  • The “last mile” needs to be connected in even the most technologically advanced societies.
  • Still, due to the problems at the Fukushima nuclear plant and the scale of the devastation, there was the issue of “mismatching” – where mainstream media coverage focused on the nuclear crisis and didn’t provide the information that people in evacuation centres needed most.
  • The JMA use a Short Message Service Cell Broadcast (SMS-CB) system to send mass alerts to mobile phone users in specific geographic locations. Earthquakes affect areas in different ways, so alerting phone users based on location enables region-specific alerts to be sent. The system does not need to know specific phone numbers so privacy is protected and the risk of counterfeit emergency alerts is reduced.
  • A smartphone application such as Yurekuru Call, meaning “Earthquake Coming”, can also be downloaded and it will send warnings before an earthquake, details of potential magnitude and arrival times depending on the location.
  • This started with a 14-year-old junior high school student who made a brave but risky decision to live stream NHK on Ustream using his iPhone camera [which is illegal]. This was done within 17 minutes of the earthquake happening on March 11.
  • So for most disaster- affected communities, local initiatives such as community radios, community (or hyper-local) newspapers and word of mouth provided information evacuees wanted the most, including information on the safety of friends and family and other essential information.
  • It is worth noting that it was not only professional reporters who committed themselves to providing information, but also community volunteers and other actors – and that is despite the fact that they too were often victims of the disaster.
  • And after the disaster, while the general level of public trust in media and in social media increased, radio gained the most trust from locals. It was also cited as being a more personable source of information – and it may even have been the most suitable after events as traumatic as these because distressing images couldn’t be seen.
  • Newspapers were also information lifelines in Ishinomaki, 90km from the epicentre of the earthquake. The local radio station was temporarily unable to broadcast due to a gasoline shortage so for a short period of time, the only information source in the city was a handwritten local newspaper, the Hibi Shimbun. This basic, low-cost, community initiative delivered essential information to people there.
  • Newsletters also proved to be a cost-efficient and effective way to inform communities living in evacuation centres, temporary shelters and in their homes.
  • Social networks such as Twitter, Mixi and Facebook provided a way for survivors to locate friends and family and let people know that they had survived.
  • Audio-visual content sharing platforms like YouTube and Ustream were used not only by established organisations and broadcasters, but also by survivors in the disaster-affected areas to share their experiences. There were also a number of volunteer initiatives, such as the crowdsourced disaster map, Sinsai.info, established to support the affected communities.
  • With approx 35 million account holders in Japan, Twitter is the most popular social networking site in that country. This gives Japan the third largest Twitter user base in the world behind the USA and Brazil.
  • The most popular hash tags included: #anpi (for finding people) and #hinan (for evacuation centre information) as well as #jishin (earthquake information).
  • The Japanese site, Mixi, was cited as the most used social media in the affected Tohoku region and that should not be underestimated. In areas where there was limited network connectivity, Mixi users could easily check the last time fellow users had logged in by viewing their profile page; this was a way to confirm whether that user was safe. On March 16, 2011, Mixi released a new application that enabled users to view friends’ login history.
  • Geiger counter radiation readings were streamed by dozens, if not hundreds, of individuals based in the area.
  • Ustream also allowed live chats between viewers using their Twitter, Facebook and Instant Messenger accounts; this service was called “Social Stream”.
  • Local officials and NGOs commented that the content of the tweets or Facebook messages requesting assistance were often not relevant because many of the messages were based on secondary information or were simply being re-tweeted.
  • The JRC received some direct messages requesting help, but after checking the situation on the ground, it became clear that many of these messages were, for instance, re-tweets of aid requests or were no longer relevant, some being over a week old.
  • “Ultimately the opportunities (of social media) outweigh the risks. Social media is here to stay and non-engagement is simply not an option.”
  • The JRC also had direct experience of false information going viral; the organisation became the subject of a rumour falsely accusing it of deducting administration fees from cash donations. The rumour originated online and quickly spread across social networks, causing the JRC to invest in a nationwide advertising campaign confirming that 100 percent of the donations went to the affected people.
  • In February 2012 Facebook tested their Disaster Message Board, where users mark themselves and friends as “safe” after a major disaster. The service will only be activated after major emergencies.
  • Most page views [of Sinsai.info] came from the disaster-affected city of Sendai where internet penetration is higher than in surrounding rural areas. […] None of the survivors interviewed during field research in Miyagi and Iwate were aware of this crisis map.
  • The major mobile phone providers in Japan created emergency messaging services known as “disaster message boards” for people to type, or record messages, on their phones for relatives and friends to access. This involved two types of message boards. One was text based, where people could input a message on the provider’s website that would be stored online or automatically forwarded to pre-registered email addresses. The other was a voice recording that could be emailed to a recipient just like an answer phone message.
  • The various disaster message boards were used 14 million times after the earthquake and they significantly reduced congestion on the network – especially compared to the congestion that the same number of direct calls would have caused.
  • Information & communication are a form of aid – although unfortunately, historically, the aid sector has not always recognised this. Getting information to people on the wrong side of the digital divide, where there is no internet, may help them survive in times of crisis and help communities rebuild after immediate danger has passed.
  • Timely and accurate information for disaster- affected people as well as effective communication between local populations and those who provide aid also improve humanitarian responses to disasters. Using local media – such as community radio or print media – is one way to achieve this and it is an approach that should be embraced by humanitarian organisations.
  • With plans for a US$50 smartphone in the pipeline, the international humanitarian community needs to prepare for a transformation in the way that information flows in disaster zones.
  • This report’s clear message is that the more channels of communication available during a disaster the better. In times of emergency it is simply not possible to rely on only one, or even three or four kinds, of communication. Both low tech and high tech methods of communication have proven themselves equally important in a crisis.
