
Resilience = Anarchism = Resilience?

Resilience is often defined as the capacity for self-organization, which in essence is cooperation without hierarchy. In turn, such cooperation implies mutuality: reciprocation and mutual dependence. This is what the French politician, philosopher, economist and socialist “Pierre-Joseph Proudhon had in mind when he first used the term ‘anarchism,’ namely, mutuality, or cooperation without hierarchy or state rule” (1).


As renowned Yale Professor James Scott explains in his latest book Two Cheers for Anarchism, “Forms of informal cooperation, coordination, and action that embody mutuality without hierarchy are the quotidian experience of most people.” To be sure, “most villages and neighborhoods function precisely because of the informal, transient networks of coordination that do not require formal organization, let alone hierarchy. In other words, the experience of anarchistic mutuality is ubiquitous. The existence, power and reach of the nation-state over the centuries may have undermined the self-organizing capacity (and hence resilience) of individuals and small communities.” Indeed, “so many functions that were once accomplished by mutuality among equals and informal coordination are now state organized or state supervised.” In other words, “the state, arguably, destroys the natural initiative and responsibility that arise from voluntary cooperation.”

This goes to the heart of what James Scott argues in his new book, and he does so in a very compelling manner. Says Scott: “I am suggesting that two centuries of a strong state and liberal economies may have socialized us so that we have largely lost the habits of mutuality and are in danger now of becoming precisely the dangerous predators that Hobbes thought populated the state of nature. Leviathan may have given birth to its own justification.” And yet, we also see a very different picture of reality, one in which solidarity thrives and mutual aid remains the norm: we see this reality surface over & over during major disasters—a reality facilitated by mobile technology and social media networks.

Recall Jürgen Habermas’s treatise that “those who take on the tools of open expression become a public, and the presence of a synchronized public increasingly constrains undemocratic rulers while expanding the right of that public.” One of the main instruments for synchronization is what the military refers to as “shared awareness.” As my colleague Clay Shirky notes in his excellent piece on The Political Power of Social Media, “shared awareness is the ability of each member of a group to not only understand the situation at hand but also understand that everyone else does, too. Social media increase shared awareness by propagating messages through social networks.” Moreover, while “Opinions are first transmitted by the media,” they are then “echoed by friends, family members, and colleagues. It is in this second, social step that political opinions are formed. This is the step in which the Internet in general, and social media in particular, can make a difference.”


In 1990, James Scott published Domination and the Arts of Resistance: Hidden Transcripts, in which he distinguishes between public and hidden transcripts. The former describes the open, public interactions that take place between dominators and oppressed, while hidden transcripts relate to the critique of power that “goes on offstage” and which the power elites cannot decode. This hidden transcript comprises the second step described above, i.e., the social conversations that ultimately change political behavior. Scott writes that when the oppressed classes publicize this “hidden transcript,” they become conscious of its common status. Borrowing from Habermas, the oppressed thereby become a public and, more importantly, a synchronized public. Social media is the metronome that can synchronize the collective publication of the hidden transcript, yielding greater shared awareness that feeds on itself, thereby threatening the balance of power between Leviathan and now-empowered and self-organized mutual-aid communities.

I have previously argued that social media and online social networks can and do foster social capital, which increases the capacity for self-organization and renders local communities more resilient & independent, thus sowing the seeds for future social movements. In other words, habits of mutuality are not all lost and the Leviathan may still face some surprises. As Peter Kropotkin observed well over 100 years ago in his exhaustive study, Mutual Aid: A Factor of Evolution, cooperation and mutual aid are the most important factors in the evolution of species and their ability to survive. “There is an immense amount of warfare and extermination going on amidst various species; there is, at the same time, as much, or perhaps even more, of mutual support, mutual aid, and mutual defense… Sociability is as much a law of nature as mutual struggle.”

Sociability is the tendency or property of being social, of interacting with others. Social media, meanwhile, has become the medium for mass social interaction, enabling greater volumes of interactions than at any other time in human history. By definition, these mass social interactions radically increase the probability of mutuality and self-organization. And so, as James Scott puts it best: Two Cheers for Anarchism.


Social Network Analysis of Tweets During Australia Floods

This study (PDF) analyzes the community of Twitter users who disseminated information during the crisis caused by the Australian floods in 2010-2011. “In times of mass emergencies, a phenomenon known as collective behavior becomes apparent. It consists of socio-behaviors that include intensified information search and information contagion.” The purpose of the Australian floods analysis is to reveal interesting patterns and features of this online community using social network analysis (SNA).

The authors analyzed 7,500 flood-related tweets to understand which users did the tweeting and retweeting. This was done to create nodes and links for SNA, which was able to “identify influential members of the online communities that emerged during the Queensland, NSW and Victorian floods as well as identify important resources being referred to. The most active community was in Queensland, possibly induced by the fact that the floods were orders of magnitude greater than in NSW and Victoria.”
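To make the methodology concrete, here is a minimal sketch (not the authors’ code) of how a retweet network can be constructed and influential members ranked; the toy tweet tuples and the use of Python’s networkx library are my assumptions for illustration:

```python
import networkx as nx

# Hypothetical minimal tweet records: (author, user_they_retweeted or None).
tweets = [
    ("alice", "qpsmedia"), ("bob", "qpsmedia"),
    ("carol", "alice"), ("dave", None),
]

# Build a directed graph: an edge points from retweeter to original source.
G = nx.DiGraph()
for author, retweeted in tweets:
    G.add_node(author)
    if retweeted:
        G.add_edge(author, retweeted)

# In-degree centrality is one simple proxy for influence:
# users whose tweets are widely retweeted score highest.
influence = nx.in_degree_centrality(G)
for user, score in sorted(influence.items(), key=lambda kv: -kv[1]):
    print(user, round(score, 3))
```

On real data one would likely weight edges by retweet counts and use richer measures (betweenness, PageRank), but the node-and-link construction is the same.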

The analysis also confirmed “the active part taken by local authorities, namely Queensland Police, government officials and volunteers. On the other hand, there was not much activity from local authorities in the NSW and Victorian floods prompting for the greater use of social media by the authorities concerned. As far as the online resources suggested by users are concerned, no sensible conclusion can be drawn as important ones identified were more of a general nature rather than critical information. This might be comprehensible as it was past the impact stage in the Queensland floods and participation was at much lower levels in the NSW and Victorian floods.”

Social Network Analysis is an under-utilized methodology for the analysis of communication flows during humanitarian crises. Understanding the topology of a social network is key to information diffusion. Think of this as a virus infecting a network. If we want to “infect” a social network with important crisis information as quickly and fully as possible, then understanding the network’s topology is a requirement, as is, therefore, social network analysis.
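To make the virus metaphor concrete, here is a toy sketch of an independent-cascade simulation; the graph model, seed choices and propagation probability are illustrative assumptions, not parameters from any study:

```python
import random
import networkx as nx

def simulate_cascade(G, seeds, p=0.2, rng=random.Random(42)):
    """Toy independent-cascade model: each newly informed node gets
    one chance to pass the message to each neighbor with probability p.
    Returns the set of informed nodes."""
    informed, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for node in frontier:
            for neighbor in G.neighbors(node):
                if neighbor not in informed and rng.random() < p:
                    informed.add(neighbor)
                    nxt.append(neighbor)
        frontier = nxt
    return informed

# Seeding the message at a well-connected hub typically reaches far more
# of the network than seeding at the periphery -- topology matters.
G = nx.barabasi_albert_graph(200, 2, seed=1)
hub = max(G.degree, key=lambda kv: kv[1])[0]
print("seeded at hub:", len(simulate_cascade(G, [hub])))
print("seeded at periphery:", len(simulate_cascade(G, [199])))
```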

Social Media = Social Capital = Disaster Resilience?

Do online social networks generate social capital, which, in turn, increases resilience to disasters? How might one answer this question? For example, could we analyze Twitter data to capture levels of social capital in a given country? If so, do countries with higher levels of social capital (as measured using Twitter) demonstrate greater resilience to disasters?

[Figure: Twitter heatmap during a hurricane]

These causal loops are fraught with all kinds of intervening variables, daring assumptions and econometric nightmares. But the link between social capital and disaster resilience is increasingly accepted. In “Building Resilience: Social Capital in Post-Disaster Recovery,” Daniel Aldrich draws on both qualitative and quantitative evidence to demonstrate that “social resources, at least as much as material ones, prove to be the foundation for resilience and recovery.” A concise summary of his book is available in my previous blog post.

So the question that follows is whether a link between social media (i.e., online social networks) and social capital can be established. “Although real-world organizations […] have demonstrated their effectiveness at building bonds, virtual communities are the next frontier for social capital-based policies,” writes Aldrich. Before we jump into the role of online social networks, however, it is important to recognize the function of “offline” communities in disaster response and resilience.


“During the disaster and right after the crisis, neighbors and friends—not private firms, government agencies, or NGOs—provide the necessary resources for resilience.” To be sure, “the lack of systematic assistance from government and NGOs [means that] neighbors and community groups are best positioned to undertake efficient initial emergency aid after a disaster.” Since “friends, family, or coworkers of victims and also passersby are always the first and most effective responders,” we should recognize their role on the front line of disasters.

In sum, “social ties can serve as informal insurance, providing victims with information, financial help and physical assistance.” This informal insurance, “or mutual assistance involves friends and neighbors providing each other with information, tools, living space, and other help.” Data-driven research on tweets posted during disasters reveals that many do exactly this: they provide victims with information, tools, living space and other assistance. But this support is also provided to complete strangers, since it is shared openly and publicly on Twitter. “[…] Despite—or perhaps because of—horrendous conditions after a crisis, survivors work together to solve their problems; […] the amount of (bonding) social capital seems to increase under difficult conditions.” Again, this bonding is not limited to offline dynamics but occurs also within and across online social networks. The tweet below was posted in the aftermath of Hurricane Sandy.

[Image: tweet offering mutual aid in the aftermath of Hurricane Sandy]

“By providing norms, information, and trust, denser social networks can implement a faster recovery.” Such norms also evolve on Twitter, as does information sharing and trust building. So is the degree of activity on Twitter directly proportional to the level of community resilience?

This data-driven study, “Do All Birds Tweet the Same? Characterizing Twitter Around the World,” may shed some light in this respect. The authors, Barbara Poblete, Ruth Garcia, Marcelo Mendoza and Alejandro Jaimes, analyze various aspects of social media–such as network structure–for the ten most active countries on Twitter. In total, the working dataset consisted of close to 5 million users and over 5 billion tweets. The study is the largest one carried out to date on Twitter data, “and the first one that specifically examines differences across different countries.”

[Figure: Twitter network statistics per country]

The network statistics per country above reveal that Japan, Canada, Indonesia and South Korea have the highest percentage of reciprocity on Twitter. This is important because, according to Poblete et al., “Network reciprocity tells us about the degree of cohesion, trust and social capital in sociology.” In terms of network density, “the highest values correspond to South Korea, Netherlands and Australia.” Incidentally, the authors find that “communities which tend to be less hierarchical and more reciprocal, also display happier language in their content updates. In this sense countries with high conversation levels (@) … display higher levels of happiness too.”

If someone is looking for a possible dissertation topic, I would recommend the following comparative case study analysis. Select two of the four countries with the highest percentage of reciprocity on Twitter: Japan, Canada, Indonesia and South Korea. The two you select should each have a close “twin” country. By that I mean a country that has many social, economic and political factors in common. The twin countries should also be in geographic proximity to each other, since we ultimately want to assess how they weather similar disasters. The paired candidates that come to mind are thus: Canada & US and Indonesia & Malaysia.

Next, compare the countries’ Twitter networks, particularly degrees of reciprocity, since this metric appears to be a suitable proxy for social capital. For example, Canada’s reciprocity score is 26% compared to 19% for the US. In other words, quite a difference. Next, identify recent disasters that both countries have experienced. Do the affected cities in the respective countries weather the disasters differently? Is one community more resilient than the other? If so, do you find a notable quantitative difference in their Twitter networks and degrees of reciprocity? If so, does a subsequent comparative qualitative analysis support these findings?
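For what it’s worth, the reciprocity metric itself is straightforward to compute: it is the share of directed follow edges whose reverse edge also exists. A minimal sketch with a toy graph (the edge list is invented):

```python
import networkx as nx

# Toy directed "who follows whom" graph.
G = nx.DiGraph([
    ("a", "b"), ("b", "a"),   # mutual pair
    ("a", "c"),               # one-way follow
    ("c", "d"), ("d", "c"),   # mutual pair
])

# Fraction of directed edges whose reverse edge also exists.
mutual = sum(1 for u, v in G.edges if G.has_edge(v, u))
print("reciprocity:", mutual / G.number_of_edges())  # 4/5 = 0.8

# networkx also provides this measure directly.
print(nx.overall_reciprocity(G))  # 0.8
```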

As cautioned earlier, these causal loops are fraught with all kinds of intervening variables, daring assumptions and econometric nightmares. But if anyone wants to brave the perils of applied social science research, and finds the above research questions of interest, then please do get in touch!

Does Social Capital Drive Disaster Resilience?

The link between social capital and disaster resilience is increasingly accepted. In “Building Resilience: Social Capital in Post-Disaster Recovery,” Daniel Aldrich draws on both qualitative and quantitative evidence to demonstrate that “social resources, at least as much as material ones, prove to be the foundation for resilience and recovery.” His case studies suggest that social capital is more important for disaster resilience than physical and financial capital, and more important than conventional explanations.


Aldrich argues that social capital catalyzes increased “participation among networked members; providing information and knowledge to individuals in the group; and creating trustworthiness.” The author goes so far as using “the phrases social capital and social networks nearly interchangeably.” He finds that communities with “higher levels of social capital work together more effectively to guide resources to where they are needed.” Surveys confirm that “after disasters, most survivors see social connections and community as critical for their recovery.” To this end, “deeper reservoirs of social capital serve as informal insurance and mutual assistance for survivors,” helping them “overcome collective action constraints.”

Capacity for self-organization is thus intimately related to resilience since “social capital can overcome obstacles to collective action that often prevent groups from accomplishing their goals.” In other words, “higher levels of social capital reduce transaction costs, increase the probability of collective action, and make cooperation among individuals more likely.” Social capital is therefore “an asset, a functioning propensity for mutually beneficial collective action […].”

In contrast, communities exhibiting “less resilience fail to mobilize collectively and often must wait for recovery guidance and assistance […].” This implies that vulnerable populations are not solely characterized in terms of age, income, etc., but in terms of “their lack of connections and embeddedness in social networks.” Put differently, “the most effective—and perhaps least expensive—way to mitigate disasters is to create stronger bonds between individuals in vulnerable populations.”

[Figure: Bonding, Bridging and Linking Capital]

The author brings conceptual clarity to the notion of social capital when he unpacks the term into Bonding Capital, Bridging Capital and Linking Capital. The figure above explains how these forms of capital differ from, yet relate to, each other. The way this relates and applies to digital humanitarian response is explored in this blog post.

Big Data Philanthropy for Humanitarian Response

My colleague Robert Kirkpatrick from Global Pulse has been actively promoting the concept of “data philanthropy” within the context of development. Data philanthropy involves companies sharing proprietary datasets for social good. I believe we urgently need big (social) data philanthropy for humanitarian response as well. Disaster-affected communities are increasingly the source of big data, which they generate and share via social media platforms like Twitter. Processing this data manually, however, is very time consuming and resource intensive. Indeed, large numbers of digital humanitarian volunteers are often needed to monitor and process user-generated content from disaster-affected communities in near real-time.

Meanwhile, companies like Crimson Hexagon, Geofeedia, NetBase, Netvibes, RecordedFuture and Social Flow are defining the cutting edge of automated methods for media monitoring and analysis. So why not set up a Big Data Philanthropy group for humanitarian response in partnership with the Digital Humanitarian Network? Call it Corporate Social Responsibility (CSR) for digital humanitarian response. These companies would benefit from the publicity of supporting such positive and highly visible efforts. They would also receive expert feedback on their tools.

This “Emergency Access Initiative” could be modeled along the lines of the International Charter, whereby certain criteria vis-a-vis the disaster would need to be met before an activation request could be made to the Big Data Philanthropy group for humanitarian response. These companies would then provide a dedicated account to the Digital Humanitarian Network (DHNet). These accounts would be available for 72 hours only and also be monitored by said companies to ensure they aren’t being abused. We would simply need to have relevant members of the DHNet trained on these platforms and draft the appropriate protocols, data privacy measures and MoUs.
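Purely as a hypothetical illustration of what such a protocol might look like in code, here is a sketch of an activation request with the proposed 72-hour window; all names and fields below are invented, and only the 72-hour limit and the criteria-based trigger come from the proposal above:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ActivationRequest:
    """Hypothetical DHNet activation request, loosely modeled on the
    International Charter's criteria-based triggers."""
    disaster_type: str      # must meet agreed activation criteria
    requesting_org: str
    activated_at: datetime

    def account_expired(self, now: datetime) -> bool:
        # Dedicated company accounts are valid for 72 hours only.
        return now > self.activated_at + timedelta(hours=72)

req = ActivationRequest("flood", "UN OCHA", datetime(2012, 6, 1, 8, 0))
print(req.account_expired(datetime(2012, 6, 4, 9, 0)))  # True: past 72 hours
```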

I’ve had preliminary conversations with humanitarian colleagues from the United Nations and DHNet who confirm that “this type of collaboration would be [seen] very positively from the coordination area within the traditional humanitarian sector.” On the business development end, this setup would enable companies to get their foot in the door of the humanitarian sector—a multi-billion dollar industry. Members of the DHNet are early adopters of humanitarian technology and are ideally placed to demonstrate the added value of these platforms since they regularly partner with large humanitarian organizations. Indeed, DHNet operates as a partnership model. This would enable humanitarian professionals to learn about new Big Data tools, see them in action and, possibly, purchase full licenses for their organizations. In sum, data philanthropy is good for business.

I have colleagues at most of the companies listed above and thus plan to actively pursue this idea further. In the meantime, I’d be very grateful for any feedback and suggestions, particularly on the suggested protocols and MoUs, so I’ve set up this open and editable Google Doc to collect them.

Big thanks to the team at the Disaster Information Management Research Center (DIMRC) for planting the seeds of this idea during our recent meeting. Check out their very neat Emergency Access Initiative.

Trails of Trustworthiness in Real-Time Streams

Real-time information channels like Twitter, Facebook and Google have created cascades of information that are becoming increasingly challenging to navigate. “Smart-filters” alone are not the solution since they won’t necessarily help us determine the quality and trustworthiness of the information we receive. I’ve been studying this challenge ever since the idea behind SwiftRiver first emerged several years ago.

I was thus thrilled to come across a short paper on “Trails of Trustworthiness in Real-Time Streams” which describes a start-up project that aims to provide users with a “system that can maintain trails of trustworthiness propagated through real-time information channels,” which will “enable its educated users to evaluate its provenance, its credibility and the independence of the multiple sources that may provide this information.” The authors, Panagiotis Metaxas and Eni Mustafaraj, kindly cite my paper on “Information Forensics” and also reference SwiftRiver in their conclusion.

The paper argues that studying the tactics that propagandists employ in real life can provide insights and even predict the tricks employed by Web spammers.

“To prove the strength of this relationship between propagandistic and spamming techniques, […] we show that one can, in fact, use anti-propagandistic techniques to discover Web spamming networks. In particular, we demonstrate that when starting from an initial untrustworthy site, backwards propagation of distrust (looking at the graph defined by links pointing to an untrustworthy site) is a successful approach to finding clusters of spamming, untrustworthy sites. This approach was inspired by the social behavior associated with distrust: in society, recognition of an untrustworthy entity (person, institution, idea, etc) is reason to question the trustworthiness of those who recommend it. Other entities that are found to strongly support untrustworthy entities become less trustworthy themselves. As in society, distrust is also propagated backwards on the Web graph.”
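The quoted idea translates almost directly into a graph traversal. Here is a minimal sketch of backwards propagation of distrust, assuming we already have a map of in-links; it illustrates the concept and is not the authors’ implementation:

```python
from collections import deque

# Toy in-link map: site -> sites that link *to* it.
in_links = {
    "spam-seed.example": ["booster1.example", "booster2.example"],
    "booster1.example": ["booster2.example", "supporter.example"],
    "booster2.example": ["booster1.example"],
    "supporter.example": [],
}

def propagate_distrust(seed, in_links, max_depth=2):
    """BFS backwards along links pointing at distrusted sites:
    sites that link to distrusted sites become suspect themselves."""
    suspect = {seed: 0}
    queue = deque([seed])
    while queue:
        site = queue.popleft()
        depth = suspect[site]
        if depth == max_depth:
            continue
        for linker in in_links.get(site, []):
            if linker not in suspect:
                suspect[linker] = depth + 1
                queue.append(linker)
    return suspect

# Maps each site to its distance from the untrustworthy seed.
print(propagate_distrust("spam-seed.example", in_links))
```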

The authors document that today’s Web spammers are using increasingly sophisticated tricks.

“In cases where there are high stakes, Web spammers’ influence may have important consequences for a whole country. For example, in the 2006 Congressional elections, activists using Google bombs orchestrated an effort to game search engines so that they present information in the search results that was unfavorable to 50 targeted candidates. While this was an operation conducted in the open, spammers prefer to work in secrecy so that their actions are not revealed. So, [we] revealed and documented the first Twitter bomb, which tried to influence the Massachusetts special elections, showing how an Iowa-based political group, hiding its affiliation and profile, was able to serve misinformation a day before the election to more than 60,000 Twitter users that were following the elections. Very recently we saw an increase in political cybersquatting, a phenomenon we reported in [28]. And even more recently, […] we discovered the existence of Pre-fabricated Twitter factories, an effort to provide collaborators pre-compiled tweets that will attack members of the Media while avoiding detection of automatic spam algorithms from Twitter.”

The theoretical foundations for a trustworthiness system:

“Our concept of trustworthiness comes from the epistemology of knowledge. When we believe that some piece of information is trustworthy (e.g., true, or mostly true), we do so for intrinsic and/or extrinsic reasons. Intrinsic reasons are those that we acknowledge because they agree with our own prior experience or belief. Extrinsic reasons are those that we accept because we trust the conveyor of the information. If we have limited information about the conveyor of information, we look for a combination of independent sources that may support the information we receive (e.g., we employ “triangulation” of the information paths). In the design of our system we aim to automatize as much as possible the process of determining the reasons that support the information we receive.”

“We define as trustworthy, information that is deemed reliable enough (i.e., with some probability) to justify action by the receiver in the future. In other words, trustworthiness is observable through actions.”

“The overall trustworthiness of the information we receive is determined by a linear combination of (a) the reputation RZ of the original sender Z, (b) the credibility we associate with the contents of the message itself C(m), and (c) characteristics of the path that the message used to reach us.”
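In notation (mine, not the authors’: the weights and the path score are labels I introduce only to make the description concrete), this linear combination might be written as:

```latex
T(m) \;=\; \alpha \, R_Z \;+\; \beta \, C(m) \;+\; \gamma \, P(\text{path}),
\qquad \alpha, \beta, \gamma \ge 0, \quad \alpha + \beta + \gamma = 1
```

where R_Z is the reputation of the original sender Z, C(m) the credibility of the message contents, and P(path) some score for the path the message took to reach the receiver.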

“To compute the trustworthiness of each message from scratch is clearly a huge task. But the research that has been done so far justifies optimism in creating a semi-automatic, personalized tool that will help its users make sense of the information they receive. Clearly, no such system exists right now, but components of our system do exist in some of the popular [real-time information channels]. For a testing and evaluation of our system we plan to use primarily Twitter, but also real-time Google results and Facebook.”

In order to provide trails of trustworthiness in real-time streams, the authors plan to address the following challenges:

•  “Establishment of new metrics that will help evaluate the trustworthiness of information people receive, especially from real-time sources, which may demand immediate attention and action. […] we show that coverage of a wider range of opinions, along with independence of results’ provenance, can enhance the quality of organic search results. We plan to extend this work in the area of real-time information so that it does not rely on post-processing procedures that evaluate quality, but on real-time algorithms that maintain a trail of trustworthiness for every piece of information the user receives.”

• “Monitor the evolving ways in which information reaches users, in particular citizens near election time.”

•  “Establish a personalizable model that captures the parameters involved in the determination of trustworthiness of information in real-time information channels, such as Twitter, extending the work of measuring quality in more static information channels, and by applying machine learning and data mining algorithms. To implement this task, we will design online algorithms that support the determination of quality via the maintenance of trails of trustworthiness that each piece of information carries with it, either explicitly or implicitly. Of particular importance, is that these algorithms should help maintain privacy for the user’s trusting network.”

• “Design algorithms that can detect attacks on [real-time information channels]. For example we can automatically detect bursts of activity related to a subject, source, or non-independent sources. We have already made progress in this area. Recently, we advised and provided data to a group of researchers at Indiana University to help them implement “truthy”, a site that monitors bursty activity on Twitter. We plan to advance, fine-tune and automate this process. In particular, we will develop algorithms that calculate the trust in an information trail based on a score that is affected by the influence and trustworthiness of the informants.”
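The burst detection mentioned in the last bullet can be illustrated with a toy example: flag any interval where activity on a subject jumps several standard deviations above its trailing mean. The window size and threshold below are arbitrary assumptions:

```python
from statistics import mean, stdev

def detect_bursts(counts, window=6, z=3.0):
    """Flag indices where a count exceeds the trailing mean
    by more than z trailing standard deviations."""
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and counts[i] > mu + z * sigma:
            bursts.append(i)
    return bursts

# Hourly tweet counts for some subject; hour 8 is a coordinated spike.
counts = [12, 9, 11, 10, 13, 11, 12, 10, 95, 14]
print(detect_bursts(counts))  # [8]
```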

In conclusion, the authors “mention that in a month from this writing, Ushahidi […] plans to release SwiftRiver, a platform that ‘enables the filtering and verification of real-time data from channels like Twitter, SMS, Email and RSS feeds’. Several of the features of Swift River seem similar to what we propose, though a major difference appears to be that our design is personalization at the individual user level.”

Indeed, having been involved in SwiftRiver research since early 2009, and currently testing the private beta, I can confirm there are important similarities and some differences. One such difference, however, is not personalization: Swift does allow full personalization at the individual user level.

Another is that we’re hoping to go beyond just text-based information with Swift, i.e., we hope to pull in pictures and video footage (in addition to Tweets, RSS feeds, email, SMS, etc) in order to cross-validate information across media, which we expect will make the falsification of crowdsourced information more challenging, as I argue here. In any case, I very much hope that the system being developed by the authors will be free and open source so that integration might be possible.

A copy of the paper is available here (PDF). I hope to meet the authors at the Berkman Center’s “Truth in Digital Media Symposium” and highly recommend the wiki they’ve put together with additional resources. I’ve added the majority of my research on verification of crowdsourced information to that wiki, such as my 20-page study on “Information Forensics: Five Case Studies on How to Verify Crowdsourced Information from Social Media.”

Information Forensics: Five Case Studies on How to Verify Crowdsourced Information from Social Media

My 20+ page study on verifying crowdsourced information is now publicly available here as a PDF and here as an open Google Doc for comments. I very much welcome constructive feedback from iRevolution readers so I can improve the piece before it gets published in an edited book next year.

Abstract

False information can cost lives. But no information can also cost lives, especially in a crisis zone. Indeed, information is perishable so the potential value of information must be weighed against the urgency of the situation. Correct information that arrives too late is useless. Crowdsourced information can provide rapid situational awareness, especially when added to a live crisis map. But information in the social media space may not be reliable or immediately verifiable. This may explain why humanitarian (and news) organizations are often reluctant to leverage crowdsourced crisis maps. Many believe that verifying crowdsourced information is either too challenging or impossible. The purpose of this paper is to demonstrate that concrete strategies do exist for the verification of geo-referenced crowdsourced social media information. The study first provides a brief introduction to crisis mapping and argues that crowdsourcing is simply non-probability sampling. Next, five case studies comprising various efforts to verify social media are analyzed to demonstrate how different verification strategies work. The five case studies are: Andy Carvin and Twitter; Kyrgyzstan and Skype; BBC’s User-Generated Content Hub; the Standby Volunteer Task Force (SBTF); and U-Shahid in Egypt. The final section concludes the study with specific recommendations.

Update: See also this link and my other posts on Information Forensics.

Time-Critical Crowdsourcing for Social Mobilization and Crowd-Solving

My good friend Riley Crane just co-authored a very interesting study entitled “Time-Critical Social Mobilization” in the peer-reviewed journal Science. Riley spearheaded the team at MIT that won the DARPA Red Balloon competition last year. His team found the locations of all 10 weather balloons hidden around the continental US in under 9 hours. While we were already discussing alternative approaches to crowdsourcing for social impact before the competition, the approach he designed to win certainly gave us a whole lot more to talk about, given my work on crowdsourcing crisis information and near real-time crisis mapping.

Crowd-solving non-trivial problems in quasi real-time poses two important challenges. A very large number of participants is typically required, coupled with extremely fast execution. Another common challenge is the need for some sort of search process. “For example, search may be conducted by members of the mobilized community for survivors after a natural disaster.” Recruiting large numbers of participants, however, requires that individuals be motivated to actually conduct the search and participate in the information diffusion. Clearly, “providing appropriate incentives is a key challenge in social mobilization.”

This explains the rationale behind DARPA’s decision to launch their Red Balloon Challenge: “to explore the roles the Internet and social networking play in the timely communication, wide-area team-building, and urgent mobilization required to solve broad-scope, time-critical problems.” So 10 red weather balloons were placed at undisclosed locations across the continental US. A senior analyst at the National Geospatial-Intelligence Agency is said to have characterized the challenge as impossible for conventional intelligence-gathering methods. Riley’s team found all 10 balloons in 8 hours and 36 minutes. How did they do it?

Some 36 hours before the start of the challenge, the team at MIT had already recruited over 4,000 participants using a “recursive incentive mechanism.” They used the $40,000 prize money that would be awarded to the winners of the challenge as a “financial incentive structure rewarding not only the people who correctly located the balloons but also those connecting the finder [back to the MIT team].” If Riley and colleagues won:

we would allocate $4000 in prize money to each of the 10 balloons. We promised $2000 per balloon to the first person to send in the correct balloon coordinates. We promised $1000 to the person who invited that balloon finder onto the team, $500 to whoever invited the inviter, $250 to whoever invited that person, and so on. The underlying structure of the “recursive incentive” was that whenever a person received prize money for any reason, the person who invited them would also receive money equal to half that awarded to their invitee.

In other words, the reward offered by Team MIT “scales with the size of the entire recruitment tree (because larger trees are more likely to succeed), rather than depending solely on the immediate recruited friends.” What is stunning about Riley et al.’s approach is that their “attrition rate” was almost half the rate of other comparable social network experiments. In other words, participants in the MIT recruitment tree were about twice as likely to “play the game” so-to-speak rather than give up. In addition, the number recruited by each individual followed a power law distribution, which suggests a possible tipping point dynamic.
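The payout rule itself is easy to state in code. A small sketch for a single balloon, assuming a recruitment chain running from the finder back up to the team (the names are invented):

```python
def recursive_payouts(chain, finder_reward=2000.0):
    """Each person up the recruitment chain receives half of what
    their invitee received: $2000, $1000, $500, $250, ...
    `chain` lists people from the balloon finder up to the root."""
    payouts, reward = {}, finder_reward
    for person in chain:
        payouts[person] = reward
        reward /= 2
    return payouts

# Dave found the balloon; Carol invited Dave; Bob invited Carol; etc.
print(recursive_payouts(["dave", "carol", "bob", "alice"]))
# {'dave': 2000.0, 'carol': 1000.0, 'bob': 500.0, 'alice': 250.0}
```

Note that the geometric halving means the total payout per balloon stays below the $4,000 allocated, no matter how long the chain.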

In conclusion, the mechanism devised by the winning team “simultaneously provides incentives for participation and for recruiting more individuals to the cause.” So what insights does this study provide vis-a-vis live crisis mapping initiatives that are volunteer-based, like those spearheaded by the Standby Volunteer Task Force (SBTF) and the Humanitarian OpenStreetMap Team (HOT) communities? While these networks don’t have any funding to pay volunteers (this would go against the spirit of volunteerism in any case), I think a number of insights can nevertheless be drawn.

In the volunteer sector, the “currency of exchange” is credit. That is, the knowledge and acknowledgement that I participated in the Libya Crisis Map to support the UN’s humanitarian operations, for example. I recently introduced SBTF “deployment badges” to serve in part as a public acknowledgment incentive. SBTF volunteers can now add badges for deployments they were engaged in, e.g., “Sudan 2011”, “New Zealand 2011”, etc.

What about using a recursive credit mechanism? For example, it would be ideal if volunteers could find out how a given report they worked on was ultimately used by a humanitarian colleague monitoring a live map. Using the Red Balloon analogy, the person who finds the balloon should be able to reward all those in her “recruitment tree” or, in our case, “SBTF network”. Let’s say Helena works for the UN and used the Libya Crisis Map whilst in Tripoli. She finds an important report on the map and shares this with her colleagues on the Tunisian border who decide to take some kind of action as a result. Now let’s say this report came from a tweet that Chrissy in the Media Monitoring Team found while volunteering on the deployment. She shared the tweet with Jess in the GPS Team who found the coordinates for the location referred to in that tweet. Melissa then added this to the live map being monitored by the UN. Wouldn’t it be ideal if each could be sent an email letting them know about Helena’s response? I realize this isn’t trivial to implement, but what would have to be in place to make something like this actually happen? Any thoughts?
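As a conversation starter, here is a purely hypothetical sketch of such a credit trail; every name, class and function below is invented for illustration and does not describe any existing SBTF or Ushahidi feature:

```python
from dataclasses import dataclass, field

@dataclass
class Report:
    text: str
    # Volunteers who handled this report, in order (the "credit trail").
    trail: list = field(default_factory=list)

def notify_trail(report, action_by, send_email=print):
    """When an end user acts on a report, thank everyone in its trail."""
    for volunteer in report.trail:
        send_email(f"To {volunteer}: your work on '{report.text}' "
                   f"contributed to action taken by {action_by}.")

r = Report("urgent report near Tripoli")
for volunteer in ["Chrissy (Media Monitoring)", "Jess (GPS)", "Melissa (Mapping)"]:
    r.trail.append(volunteer)

notify_trail(r, action_by="Helena (UN)")
```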

On the recruitment side, we haven’t really done anything explicitly to incentivize current volunteers to recruit additional volunteers. Could we incentivize this beyond giving credit? Perhaps we could design a game-like point system? Or a fun ranking system with different titles assigned according to the number of volunteers recruited? Another thought would be to simply ask existing volunteers to recruit one or two additional volunteers every year. We currently have about 700 volunteers in the SBTF, so this might be one way to grow substantially in size.

I’m not sure what type of mechanism we could devise to simultaneously provide incentives for participation and recruitment. Perhaps those incentives already exist, in the sense that the SBTF responds to international crises, which perhaps serves as a sufficient draw. I’d love to hear what iRevolution readers think, especially if you have good ideas that we could realistically implement!

How to Verify Social Media Content: Some Tips and Tricks on Information Forensics

Update: I have authored a 20+ page paper on verifying social media content based on 5 case studies. Please see this blog post for a copy.

I get this question all the time: “How do you verify social media data?” This question drives many of the conversations on crowdsourcing and crisis mapping these days. It’s high time that we start compiling our tips and tricks into an online how-to guide so that we don’t have to start from square one every time the question comes up. We need to build and accumulate our shared knowledge in information forensics. So here is the Google Doc version of this blog post; please feel free to add your best practices, ask others to contribute, and add links to other studies on verifying social media content.

If every source we monitored in the social media space were known and trusted, then the need for verification would not be as pronounced. In other words, it is the plethora and virtual anonymity of sources that makes us skeptical of the content they deliver. Verifying social media data thus requires a two-step process: the authentication of the source as reliable and the triangulation of the content as valid. If we can authenticate the source and find it trustworthy, this may be sufficient to trust the content and mark it as verified, depending on context. If source authentication is difficult to ascertain, then we need to triangulate the content itself.

Let’s unpack these two processes—authentication and triangulation—and apply them to Twitter, since the most pressing challenges regarding social media verification have to do with eyewitness, user-generated content. The first step is to try and determine whether the source is trustworthy. Here are some tips on how to do this:

  • Bio on Twitter: Does the source provide a name, picture, bio and any links to their own blog, identity, professional occupation, etc., on their page? If there’s a name, does searching for this name on Google provide any further clues to the person’s identity? Perhaps a Facebook page, a professional email address, a LinkedIn profile?
  • Number of Tweets: Is this a new Twitter handle with only a few tweets? If so, this makes authentication more difficult. Arasmus notes that “the more recent, the less reliable and the more likely it is to be an account intended to spread disinformation.” In general, the longer the Twitter handle has been around and the more Tweets linked to this handle, the better. This gives a digital trace, a history of prior evidence that can be scrutinized for evidence of political bias, misinformation, etc. Arasmus specifies: “What are the tweets like? Does the person qualify his/her reports? Are they intelligible? Is the person given to exaggeration and inconsistencies?”
  • Number of followers: Does the source have a large following? If there are only a few, are any of the followers known and credible sources? Also, how many lists has this Twitter handle been added to?
  • Number following: How many Twitter users does the Twitter handle follow? Are these known and credible sources?
  • Retweets: What type of content does the Twitter handle retweet? Does the Twitter handle in question get retweeted by known and credible sources?
  • Location: Can the source’s geographic location be ascertained? If so, are they near the unfolding events? One way to try and find out by proxy is to examine during which periods of the day/night the source tweets the most. This may provide an indication as to the person’s time zone.
  • Timing: Does the source appear to be tweeting in near real-time? Or are there considerable delays? Does anything appear unusual about the timing of the person’s tweets?
  • Social authentication: If you’re still unsure about the source’s reliability, use your own social network–Twitter, Facebook, LinkedIn–to find out if anyone in your network knows about the source’s reliability.
  • Media authentication: Is the source quoted by trusted media outlets, whether in the mainstream or social media space?
  • Engage the source: Tweet them back and ask them for further information. NPR’s Andy Carvin has employed this technique particularly well. For example, you can tweet back and ask for the source of the report and for any available pictures, videos, etc. Place the burden of proof on the source.

These are some of the tips that come to mind for source authentication. For more thoughts on this process, see my previous blog post “Passing the I’m-Not-Gaddafi-Test: Authenticating Identity During Crisis Mapping Operations.” If you have some tips of your own not listed here, please do add them to the Google Doc—they don’t need to be limited to Twitter either.
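Since no single check above is decisive, in practice they get weighed together. Here is a hypothetical sketch of a crude scoring heuristic over the checklist; the features, weights and threshold are all invented for illustration and would need validation:

```python
def source_score(account):
    """Toy credibility score from a few of the checklist signals.
    `account` is a dict of hypothetical, pre-extracted features."""
    score = 0
    score += 2 if account.get("has_bio_and_links") else 0
    score += 2 if account.get("account_age_days", 0) > 180 else 0
    score += 1 if account.get("tweet_count", 0) > 500 else 0
    score += 2 if account.get("credible_followers", 0) >= 3 else 0
    score += 1 if account.get("listed_count", 0) > 10 else 0
    return score  # 0..8; higher = more evidence of reliability

account = {"has_bio_and_links": True, "account_age_days": 400,
           "tweet_count": 2300, "credible_followers": 5, "listed_count": 12}
print(source_score(account))  # 8 -> authentication looks plausible
```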

Now, let’s say that we’ve gone through the list above and find the evidence inconclusive. We thus move to try and triangulate the content. Here are some tips on how to do this:

  • Triangulation: Are other sources on Twitter or elsewhere reporting on the event you are investigating? As Arasmus notes, “remain skeptical about the reports that you receive. Look for multiple reports from different unconnected sources.” The more independent witnesses you can get information from, the better, and the less critical the need for identity authentication.
  • Origins: If the user reporting an event is not necessarily the original source, can the original source be identified and authenticated? In particular, if the original source is found, does the time/date of the original report make sense given the situation?
  • Social authentication: Ask members of your own social network whether the tweet you are investigating is being reported by other sources. Ask them how unusual the event reporting is to get a sense of how likely it is to have happened in the first place. Andy Carvin’s followers, for example, “help him translate, triangulate, and track down key information. They enable remarkable acts of crowdsourced verification […] but he must always tell himself to check and challenge what he is told.”
  • Language: Andy Carvin notes that tweets that sound too official, using official language like “breaking news”, “urgent”, “confirmed” etc. need to be scrutinized. “When he sees these terms used, Carvin often replies and asks for additional details, for pictures and video. Or he will quote the tweet and add a simple one word question to the front of the message: Source?” The BBC’s UGC (user-generated content) Hub in London also verifies whether the vocabulary, slang, accents are correct for the location that a source might claim to be reporting from.
  • Pictures: If the Twitter handle shares photographic “evidence”, does the photo provide any clues about the location where it was taken based on buildings, signs, cars, etc., in the background? The BBC’s UGC Hub checks weaponry against those known for the given country and also looks for shadows to determine the possible time of day that a picture was taken. In addition, they examine weather reports to “confirm that the conditions shown fit with the claimed date and time.” These same tips can be applied to Tweets that share video footage.
  • Follow up: If you have contacts in the geographic area of interest, then you could ask them to follow up directly/in-person to confirm the validity of the report. Obviously this is not always possible, particularly in conflict zones. Still, there is increasing anecdotal evidence that this strategy is being used by various media organizations and human rights groups. One particularly striking example comes from Kyrgyzstan, where a Skype group with hundreds of users across the country was able to disprove and counter rumors at a breathtaking pace. For more details, see my blog post on “How to Use Technology to Counter Rumors During Crises: Anecdotes from Kyrgyzstan.”

These are just a handful of tips and tricks that come to mind. The number of bullet points above clearly shows that we are not completely powerless when verifying social media data. There are several strategies available. The main challenge, as the BBC points out, is that this type of information forensics “can take anything from seconds […] to hours, as we hunt for clues and confirmation.” See for example my earlier post on “The Crowdsourcing Detective: Crisis, Deception and Intrigue in the Twittersphere,” which highlights some challenges but also new opportunities.
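The triangulation step in particular lends itself to partial automation. Following the “multiple reports from different unconnected sources” heuristic quoted above, here is a toy sketch that counts reporters with no follow link to any other reporter; the data is invented:

```python
def independent_sources(reporters, follows):
    """Return reporters with no follow link (in either direction)
    to any other reporter -- a crude independence check."""
    independent = []
    for r in reporters:
        linked = any(
            other in follows.get(r, set()) or r in follows.get(other, set())
            for other in reporters if other != r
        )
        if not linked:
            independent.append(r)
    return independent

# 'a' follows 'b', so neither counts as independent of the group.
follows = {"a": {"b"}}
reporters = ["a", "b", "c", "d"]  # all reporting the same event
print(independent_sources(reporters, follows))  # ['c', 'd']
```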

One of Storyful‘s comparative strengths when it comes to real-time news curation is the growing list of authenticated users it follows. This represents more of a bounded (but certainly not static) approach. As noted in my previous blog post on “Seeking the Trustworthy Tweet,” following a bounded model presents some obvious advantages. This explains why the BBC recommends “maintaining lists of previously verified material [and sources] to act as a reference for colleagues covering the stories.” This strategy is also employed by the Verification Team of the Standby Volunteer Task Force (SBTF).

In sum, I still stand by my earlier blog post entitled “Wag the Dog: How Falsifying Crowdsourced Data Can Be a Pain.” I also continue to stand by my opinion that some data—even if not immediately verifiable—is better than no data. Also, it’s important to recognize that we have on some occasions seen social media prove to be self-correcting, as I blogged about here. Finally, we know that information is often perishable in times of crises. By this I mean that crisis data often has a “use-by date” after which it no longer matters whether said information is true or not. So speed is often vital. This is why semi-automated platforms like SwiftRiver that aim to filter and triangulate social media content can be helpful.