Crisis Mapping for Disaster Preparedness, Mitigation and Resilience

Crisis mapping for disaster preparedness is nothing new. In 2004, my colleague Suha Ulgen spearheaded an innovative project in Istanbul that combined public participation and mobile geospatial technologies for the purposes of disaster mitigation. Suha subsequently published an excellent overview of the project entitled “Public Participation Geographic Information Sharing Systems for Community Based Urban Disaster Mitigation,” available in this edited book on Geo-Information for Disaster Management. I have referred to this project in countless conversations since 2007, so it is high time I blog about it as well.

Suha’s project included a novel “Neighborhood Geographic Information Sharing System,” which “provided volunteers with skills and tools for identification of seismic risks and response assets in their neighborhoods. Field data collection volunteers used low-cost hand-held computers and data compiled was fed into a geospatial database accessible over the Internet. Interactive thematic maps enabled discussion of mitigation measures and action alternatives. This pilot evolved into a proposal for sustained implementation with local fire stations.” Below is a screenshot of the web-based system that enabled data entry and query.

There’s no reason why a similar approach could not be taken today, one that uses a dedicated smartphone app with integrated gamification and social networking features. The idea would be to make community mapping fun and rewarding; a way to foster a more active and connected community, which would in turn build more social capital. In the event of a disaster, this same smartphone app would allow users to simply “check in” to receive information on the nearest shelter areas (response assets) as well as danger zones such as overpasses. This is why geo-fencing is so important for crisis mapping.
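To make the geo-fencing point concrete, here is a minimal sketch of what the “check in” logic could look like: given a user's location, return the nearest shelter and flag any danger zones within a set radius. The shelter and danger-zone names and coordinates are invented placeholders, not real data, and a production app would use a proper geospatial index rather than a linear scan.

```python
# Minimal geofencing sketch: on "check in", return the nearest shelter and
# flag any danger zones within a given radius. All names and coordinates
# are illustrative placeholders, not real data.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

SHELTERS = [("Park A", 41.015, 28.979), ("School B", 41.040, 29.000)]   # response assets
DANGER_ZONES = [("Overpass C", 41.020, 28.985, 0.5)]                    # name, lat, lon, radius_km

def check_in(lat, lon):
    nearest = min(SHELTERS, key=lambda s: haversine_km(lat, lon, s[1], s[2]))
    alerts = [z[0] for z in DANGER_ZONES if haversine_km(lat, lon, z[1], z[2]) <= z[3]]
    return {"nearest_shelter": nearest[0], "danger_alerts": alerts}

print(check_in(41.018, 28.982))
```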

(Incidentally, Suha’s project also included a “School Commute Contingency Pilot” designed to track school-bus routes in Istanbul and thus “stimulate contingency planning for commute-time emergencies when 400,000 students travel an average of 45 minutes each way on 20,000 service buses. [GPS] data loggers were used to determine service bus routes displayed on printed maps highlighting nearest schools along the route.” Suha proposed that “bus-drivers, parents and school managers be issued route maps with nearest schools that could serve as both meeting places and shelters”).

Fast forward to 2012 and the Humanitarian OpenStreetMap Team’s (HOT) novel project “Community Mapping for Exposure in Indonesia,” which resulted in the mapping of over 160,000 buildings and numerous village-level maps in under ten months. The team also organized a university competition to create incentives for the mapping of urban areas. “The students were not only tasked to digitize buildings, but to also collect building information such as building structure, wall type, roof type and the number of floors.” This contributed to the mapping and codification of some 30,000 buildings.

As Suha rightly noted almost 10 years ago, “for disaster mitigation measures to be effective they need to be developed in recognition of the local differences and adopted by the active participation of each community.” OSM’s work in Indonesia fully embodies the importance of mapping local differences and provides important insights on how to catalyze community participation. The buildup of social capital is another important outcome of these efforts. Social capital facilitates collective action and increases local capacity for self-organization, resulting in greater social resilience. In sum, these novel projects demonstrate that technologies used for crisis mapping can be used for disaster preparedness, mitigation and resilience.

Crowdsourcing Crisis Response Following Philippine Floods

Widespread and heavy rains resulting from Typhoon Haikui have flooded the Philippine capital Manila. Over 800,000 people have been affected by the flooding and some 250,000 have been relocated to evacuation centers. Given the gravity of the situation, “some resourceful Filipinos put up an online spreadsheet where concerned citizens can list down places where help is most urgently needed” (1). Meanwhile, Google’s Crisis Response Team has launched this resource page, which includes links to News updates, Emergency contact information, Person Finder and this shelter map.

Filipino volunteers are using an open (but not editable) Google Spreadsheet and this Google Form to crowdsource urgent reports on needs. The spreadsheet (please click the screenshot below to enlarge) includes the time of the incident, the location (physical address), a description of the alert (many include personal names and phone numbers) and the name of the person who reported it. Additional fields capture the status of the alert, its urgency and whether action has been taken. The latter is also color coded.

“The spreadsheet can easily be referenced by any rescue group that can access the web, and is constantly updated by volunteers real-time” (2). This reminds me a lot of the Google Spreadsheets we used following the Haiti Earthquake of 2010. The Standby Volunteer Task Force (SBTF) continues to use Google Spreadsheets in similar ways, but for the purposes of media monitoring, and these are typically not made public. What is noteworthy about these important volunteer efforts in the Philippines is that the spreadsheet was made completely public in order to crowdsource the response.

As I’ve noted before, emergency management professionals cannot be everywhere at the same time, but the crowd is always there. The tradeoff with the use of open data to crowdsource crisis response is obviously privacy and data protection. Volunteers may therefore want to let those filling out the Google Form know that any information they provide will or may be made public. I would also recommend that they create an “About Us” or “Who We Are” link to cultivate a sense of trust with the initiative. Finally, crowdsourcing offers of help may facilitate the “matchmaking” of needs and available resources.
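To illustrate the matchmaking idea, here is a toy sketch that pairs crowdsourced needs with offers of help by category and rough proximity. The records and field names are hypothetical, not the actual spreadsheet schema, and a real deployment would use geodesic distances and keep a human in the loop for every match.

```python
# Toy sketch: match crowdsourced needs with offers of help by category and
# rough proximity. Records and fields are illustrative, not the actual
# spreadsheet schema.
needs = [{"id": 1, "category": "rescue", "loc": (14.65, 121.03)},
         {"id": 2, "category": "food",   "loc": (14.55, 121.00)}]
offers = [{"id": "A", "category": "rescue", "loc": (14.66, 121.05)},
          {"id": "B", "category": "food",   "loc": (14.60, 121.02)}]

def distance(a, b):
    # Crude planar distance in degrees; good enough to rank nearby points.
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def match(needs, offers):
    pairs = []
    for need in needs:
        candidates = [o for o in offers if o["category"] == need["category"]]
        if candidates:
            best = min(candidates, key=lambda o: distance(need["loc"], o["loc"]))
            pairs.append({"need": need["id"], "offer": best["id"]})
    return pairs

print(match(needs, offers))   # [{'need': 1, 'offer': 'A'}, {'need': 2, 'offer': 'B'}]
```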

I would give the same advice to volunteers who recently set up this Crowdmap of the floods. I would also suggest they set up their own Standby Volunteer Task Force (SBTF) in order to deploy again in the future. In the meantime, reports on flood levels can be submitted to the crisis map via webform, email and SMS.

Traditional vs. Crowdsourced Election Monitoring: Which Has More Impact?

Max Grömping makes a significant contribution to the theory and discourse of crowdsourced election monitoring in his excellent study: “Many Eyes of Any Kind? Comparing Traditional and Crowdsourced Monitoring and their Contribution to Democracy” (PDF). This 25-page study is definitely a must-read for anyone interested in this topic. That said, Max argues against a straw man when he writes: “It is believed that this new methodology almost magically improves the quality of elections […].” Perhaps tellingly, he does not reveal who exactly believes in this false magic. Nor does he cite who subscribes to the view that “[…] crowdsourced citizen reporting is expected to have significant added value for election observation—and by extension for democracy.”

My doctoral dissertation focused on the topic of crowdsourced election observation in countries under repressive rule. At no point in my research or during interviews with activists did I come across this kind of superficial mindset or opinion. In fact, my comparative analysis of crowdsourced election observation showed that the impact of these initiatives was at best minimal vis-a-vis electoral accountability—particularly in the Sudan. That said, my conclusions do align with Max’s principal findings: “the added value of crowdsourcing lies mainly in the strengthening of civil society via a widened public sphere and the accumulation of social capital with less clear effects on vertical and horizontal accountability.”

This is huge! Traditional monitoring campaigns don’t strengthen civil society or the public sphere. Traditional monitoring teams are typically composed of international observers and thus do not build social capital domestically. At times, traditional election monitoring programs may even lead to more violence, as this recent study revealed. But the point is not to polarize the debate. This is not an either/or argument but rather a both/and issue. Traditional and crowdsourced election observation efforts can absolutely complement each other precisely because they each have a different comparative advantage. Max concurs: “If the crowdsourced project is integrated with traditional monitoring from the very beginning and thus serves as an additional component within the established methodology of an Election Monitoring Organization, the effect on incentive structures of political parties and governments should be amplified. It would then include the best of both worlds: timeliness, visualization and wisdom of the crowd as well as a vetted methodology and legitimacy.”

Recall Jürgen Habermas and his treatise that “those who take on the tools of open expression become a public, and the presence of a synchronized public increasingly constrains un-democratic rulers while expanding the right of that public.” Why is this important? Because crowdsourced election observation projects can potentially bolster this public sphere and create local ownership. Furthermore, these efforts can help synchronize shared awareness, an important catalyzing factor of social movements, according to Habermas. Moreover, my colleague Phil Howard has convincingly demonstrated that a large active online civil society is a key causal factor vis-a-vis political transitions towards more democratic rule. This is key because the use of crowdsourcing and crowdmapping technologies often requires some technical training, which can expand the online civil society that Phil describes and render that society more active (as occurred in Egypt during the 2010 Parliamentary Elections—see dissertation).

The problem? There is very little empirical research on crowdsourced election observation projects, let alone assessments of their impact. Then again, these efforts at crowdsourcing are only a few years old and many doers in this space are still learning how to be more effective through trial and error. Incidentally, it is worth noting that there has also been very little empirical analysis on the impact of traditional monitoring efforts: “Further quantitative testing of the outlined mechanisms is definitely necessary to establish a convincing argument that election monitoring has positive effects on democracy.”

In the second half of his important study, Max does an excellent job articulating the advantages and disadvantages of crowdsourced election observation. For example, he observes that many crowdsourced initiatives appear to be spontaneous rather than planned. Therein lies part of the problem. As demonstrated in my dissertation, spontaneous crowdsourced election observation projects are highly unlikely to strengthen civil society, let alone build any kind of social capital. Furthermore, in order to solicit a maximum number of citizen-generated election reports, a considerable amount of upfront effort on election awareness raising and education is needed, in addition to partnership outreach and a highly effective media strategy.

All of this requires deliberate, calculated planning and preparation (key to an effective civil society), which explains why Egyptian activists were relatively more successful in their crowdsourced election observation efforts compared to their counterparts in the Sudan (see dissertation). This is why I’m particularly skeptical of Max’s language on the “spontaneous mechanism of protection against electoral fraud or other abuses.” That said, he does emphasize that “all this is of course contingent on citizens being informed about the project and also the project’s relevance in the eyes of the media.”

I don’t think that being informed is enough, however. An effective campaign not only seeks to inform but to catalyze behavior change, no small task. Still, Max is right to point out that a crowdsourced election observation project can “encourage citizens to actively engage with this information, to either dispute it, confirm it, or at least register its existence.” To this end, recall that political change is a two-step process, with the second—social step—being where political opinions are formed (Katz and Lazarsfeld 1955). “This is the step in which the Internet in general, and social media in particular, can make a difference” (Shirky 2010). In sum, Max argues that “the public sphere widens because this engagement, which takes place in the context of the local all over the country, is now taken to a wider audience by the means of mapping and real-time reporting.” And so, “even if crowdsourced reports are not acted upon, the very engagement of citizens in the endeavor to directly make their voices heard and hold their leaders accountable widens the public sphere considerably.”

Crowdsourcing efforts are fraught with important and very real challenges, as is already well known: the reliability of crowdsourced information, the risk of hate speech spread via uncontrolled reports, limited evidence of impact, and concerns over the security and privacy of citizen reporters. That said, it is important to note that this “field” is evolving and many in this space are actively looking for solutions to these challenges. During the 2010 Parliamentary Elections in Egypt, the U-Shahid project was able to verify over 90% of the crowdsourced reports. The “field” of information forensics is becoming more sophisticated, and variants of crowdsourcing such as bounded crowdsourcing and crowdseeding are not only being proposed but actually implemented.

The concern over unconfirmed reports going viral has little to do with crowdsourcing. Moreover, the vast majority of crowdsourced election observation initiatives I have studied moderate all content before publication. Concerns over security and privacy are not limited to crowdsourced election observation and speak to a broader challenge. There are already several key initiatives underway in the humanitarian and crisis mapping community to address these important challenges. And lest we forget, there are few empirical studies that demonstrate the impact of traditional monitoring efforts in the first place.

In conclusion, traditional monitors are sometimes barred from observing an election. In the past, there have been few to no alternatives to this predicament. Today, crowdsourced efforts are sure to spring up. Furthermore, in the event that traditional monitors conclude that an election was stolen, there’s little they can do to catalyze a local social movement to place pressure on the thieves. This is where crowdsourced election observation efforts could make an important contribution. To quote Max: “instead of being fearful of the ‘uncontrollable crowd’ and criticizing the drawbacks of crowdsourcing, […] governments would be well-advised to embrace new social media. Citizens […] will use new technologies and new channels for information-sharing anyway, whether endorsed by their governments or not. So, governments might as well engage with ICTs and crowdsourcing proactively.”

Big thanks to Max for this very valuable contribution to the discourse and to my colleague Tiago Peixoto for flagging this important study.

Launching a Library of Crisis Hashtags on Twitter

I recently posted the following question on the CrisisMappers list-serve: “Does anyone know whether a list of crisis hashtags exists?”

There are several reasons why such a hashtag list would be of added value to the CrisisMappers community and beyond. First, an analysis of Twitter hashtags used during crises over the past few years could be quite insightful; interesting new patterns may be evolving. Second, the resulting analysis could be used as a guide to find (and create) new hashtags when future crises unfold. Third, a library of hashtags would make it easier to collect historical datasets of crisis information shared on Twitter for the purposes of analysis & social computing research. To be sure, without this data, developing more sophisticated machine learning platforms like the Twitter Dashboard for the Humanitarian Cluster System would be a serious challenge indeed.
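As an illustration of that third point, here is a small sketch of how such a hashtag library could be used to pull crisis-related tweets out of an existing archive and group them by hashtag. The hashtags and tweet records below are made up; actually building historical datasets would of course go through Twitter's search or streaming APIs.

```python
# Minimal sketch: given a library of crisis hashtags, filter an existing
# archive of tweets and group them by hashtag. Hashtags and records are
# illustrative; real collection would use Twitter's search/streaming APIs.
from collections import defaultdict

CRISIS_HASHTAGS = {"#rescueph", "#floodsph", "#haiti", "#sandy"}  # sample entries

archive = [
    {"id": 1, "text": "Family stranded on roof, need boat #rescuePH", "created_at": "2012-08-07"},
    {"id": 2, "text": "Road to the airport is clear again #floodsPH", "created_at": "2012-08-08"},
    {"id": 3, "text": "Nothing to see here", "created_at": "2012-08-08"},
]

def group_by_hashtag(tweets, hashtags):
    groups = defaultdict(list)
    for tw in tweets:
        tokens = {t.lower() for t in tw["text"].split() if t.startswith("#")}
        for tag in tokens & hashtags:
            groups[tag].append(tw["id"])
    return dict(groups)

print(group_by_hashtag(archive, CRISIS_HASHTAGS))
# {'#rescueph': [1], '#floodsph': [2]}
```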

After posting my question on CrisisMappers and Twitter, it was clear that no such library existed. So my colleague Sara Farmer launched a Google Spreadsheet to crowdsource an initial list. Since I was working on a similar list, I’ve created a combined spreadsheet which is available and editable here. Please do add any other crisis hashtags you may know about so we can make this the most comprehensive and up-to-date resource available to everyone. Thank you!

Whilst doing this research, I came across two potentially interesting and helpful hashtag websites: Hashonomy.com and Hashtags.org.

Crowdsourcing a Crisis Map of the Beijing Floods: Volunteers vs Government

Flash floods in Beijing have killed over 70 people and forced the evacuation of more than 50,000 after destroying over 8,000 homes and causing $1.6 billion in damages. In total, some 1.5 million people have been affected by the floods after Beijing recorded the heaviest rainfall the city has seen in more than 60 years.

The heavy rains began on July 21. Within hours, users of the Guokr.com social network launched a campaign to create a live crisis map of the flood’s impact using Google Maps. According to TechPresident, “the result was not only more accurate than the government output—it was available almost a day earlier. According to People’s Daily Online, these crowd-sourced maps were widely circulated on Weibo [China’s version of Twitter] the Monday and Tuesday after the flooding.” The crowdsourced, citizen-generated flood map of Beijing is available here and looks like this:

One advantage of working with Google is that the crisis map can also be viewed via Google Earth. That said, the government does block a number of Google services in China, which puts the regime at a handicap during disasters.

This is an excellent example of crowdsourced crisis mapping. My one recommendation to Chinese volunteers would be to crowdsource solutions in addition to problems. In other words, map offers of help and turn the crisis map into a local self-help map, i.e., a Match.com for citizen-based humanitarian response. In short, use the map as a platform for self-organization and crowdsource the response by matching calls for help with corresponding offers of help. I would also recommend they create their own Standby Volunteer Task Force (SBTF) for crisis mapping to build social capital and repeat these efforts in future disasters.

Several days after Chinese volunteers first launched their crisis map, the Beijing Water Authority released its own map, which looks like a classic example of James Scott’s “Seeing Like a State.” The map is difficult to read and it is unclear whether it is even dynamic or interactive, or live for that matter. It appears static and cryptic. One wonders whether these adjectives also describe the government’s response.

Meanwhile, there is growing anger over the state’s botched response to the floods. According to People’s Daily, “Chinese netizens have criticised the municipal authority for failing to update the city’s run-down drainage system or to pre-warn residents about the impending disaster.” In other cities, Guangdong Mobile (the local division of China Mobile) sent out 30 million SMS about the storm in cooperation with the provincial government. “Mobile users in Shenzhen, Zhongshan, Zhuhai, Jiangmen, and Yunfu received reminders to be careful from the telecom company because those five cities were forecast to be most affected by the storm.”

All disasters are political. They test the government’s capacity. The latter’s inability to respond swiftly and effectively has repercussions on citizens’ perception of governance and statehood. The more digital volunteers engage in crisis mapping, the more they highlight the local capacity and agency of ordinary citizens to create shared awareness and help themselves—with or without the state. In doing so, volunteers build social capital, which facilitates future collective action both on and offline. If government officials are not worried about their own failures in disaster management, they should be. This failure will continue to have political consequences, in China and elsewhere.

CrisisTracker: Collaborative Social Media Analysis For Disaster Response

I just had the pleasure of speaking with my new colleague Jakob Rogstadius from the Madeira Interactive Technologies Institute (M-ITI). Jakob is working on CrisisTracker, a very interesting platform designed to facilitate collaborative social media analysis for disaster response. The rationale for CrisisTracker is the same one behind Ushahidi’s SwiftRiver project, and it could be hugely helpful for crisis mapping projects carried out by the Standby Volunteer Task Force (SBTF).

From the CrisisTracker website:

“During large-scale complex crises such as the Haiti earthquake, the Indian Ocean tsunami and the Arab Spring, social media has emerged as a source of timely and detailed reports regarding important events. However, individual disaster responders, government officials or citizens who wish to access this vast knowledge base are met with a torrent of information that quickly results in information overload. Without a way to organize and navigate the reports, important details are easily overlooked and it is challenging to use the data to get an overview of the situation as a whole.”

“We (Madeira University, University of Oulu and IBM Research) believe that volunteers around the world would be willing to assist hard-pressed decision makers with information management, if the tools were available. With this vision in mind, we have developed CrisisTracker.”

Like SwiftRiver, CrisisTracker combines some automated clustering of content with the crowdsourced curation of said content for further filtering. “Any user of the system can directly contribute tags that make it easier for other users to retrieve information and explore stories by similarity. In addition, users of the system can influence how tweets are grouped into stories.” Stories can be filtered by Report Category, Keywords, Named Entities, Time and Location. CrisisTracker also allows for simple geo-fencing to capture and list only those Tweets displayed on a given map.

Geolocation, Report Categories and Named Entities are all generated manually. The clustering of reports into stories is done automatically using keyword frequencies. So if keyword dictionaries exist for other languages, the platform could be used in these other languages as well. The result is a list of clustered Tweets displayed below the map, with the most popular cluster at the top.
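As a rough illustration of keyword-frequency clustering (and emphatically not CrisisTracker's actual algorithm), the sketch below greedily groups tweets into "stories" by bag-of-words cosine similarity; the threshold and tokenization are arbitrary choices.

```python
# Simplified illustration of grouping tweets into "stories" by keyword
# overlap (cosine similarity on word counts). Not CrisisTracker's actual
# algorithm; threshold and tokenization are arbitrary.
from collections import Counter
from math import sqrt

def vectorize(text):
    return Counter(w.lower().strip("#.,!?") for w in text.split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster(tweets, threshold=0.4):
    stories = []                                    # each story is a list of tweets
    for tw in tweets:
        v = vectorize(tw)
        for story in stories:
            if cosine(v, vectorize(story[0])) >= threshold:
                story.append(tw)                    # similar enough: join the story
                break
        else:
            stories.append([tw])                    # otherwise start a new story
    return stories

tweets = ["Explosion reported in central Damascus",
          "Large explosion heard in Damascus city centre",
          "Flooding closes main road in Homs"]
for story in cluster(tweets):
    print(story)
```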

Clicking on an entry like the row in red above opens up a new page, like the one below. This page lists a group of tweets that all discuss the same specific event, in this case an explosion in Syria’s capital.

What is particularly helpful about this setup is the meta-data displayed for each story or event: the number of people who tweeted about the story, the number of tweets about the story, and the first day/time the story was shared on Twitter. In addition, the first tweet to report the story is listed as well, which is very helpful. The list of stories can be ranked according to “Size,” a figure that reflects the minimum of the number of original tweets and the number of Twitter users who shared these tweets. This is a particularly useful metric (and a way to deal with spammers). Users also have the option of listing the first 50 tweets that referenced the story.

As you may be able to tell from the “Hide Story” and “Remove” buttons on the right-hand side of the display above, each clustered story, and indeed each tweet, can be hidden or removed if not relevant. This is where crowdsourced curation comes in. In addition, CrisisTracker enables users to geo-tag and categorize each tweet according to report type (e.g., Violence, Deaths, Request/Need, etc.), general keywords (e.g., #assad, #blasts, etc.) and named entities. Note that the keywords can be removed and more high-quality tags can be added or crowdsourced by users as well (see below).

CrisisTracker also suggests related stories that may be of interest to the user based on the initial clustering and filtering—assisted manual clustering. In addition, the platform’s API means that the data can be exported in XML using a simple parser. So interoperability with platforms like Ushahidi’s would be possible. After our call, Jakob added a link on each story page in the system (a small XML icon below the related stories) to get the story in XML format. Any other system can now take this URL and parse the story into its own native format. Jakob is also looking to build a number of extensions to CrisisTracker, and a “Share with Ushahidi” button may be one such future extension. CrisisTracker is basically Jakob’s core PhD project, which is very cool, so he’ll be working on this for at least one more year.
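For anyone wanting to experiment with that XML export, a parsing sketch along the following lines should work using only the standard library. The element names (story, title, location, tweet) and the URL are assumptions for illustration only, since the real export schema isn't documented here.

```python
# Hypothetical sketch of consuming a CrisisTracker story exported as XML.
# The element names and URL are assumptions for illustration; the real
# export schema may differ.
import urllib.request
import xml.etree.ElementTree as ET

def fetch_story(url):
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    return {
        "title": root.findtext("title"),
        "location": root.findtext("location"),
        "tweets": [t.text for t in root.findall("tweets/tweet")],
    }

# story = fetch_story("http://example.org/crisistracker/story/123.xml")
# print(story["title"], len(story["tweets"]))
```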

In sum, this could very well be the platform that many of us in the crisis mapping space have been waiting for. As I wrote in February 2012, turning the Twittersphere “into real-time shared awareness will require that our filtering and curation platforms become more automated and collaborative. I believe the key is thus to combine automated solutions with real-time collaborative crowdsourcing tools—that is, platforms that enable crowds to collaboratively filter and curate real-time information, in real-time. Right now, when we comb through Twitter, for example, we do so on our own, sitting behind our laptop, isolated from others who may be seeking to filter the exact same type of content. We need to develop free and open source platforms that allow for the distributed-but-networked, crowdsourced filtering and curation of information in order to democratize the sense-making of the firehose.”

Actually, I’ve been advocating for this approach since early 2009. So I’m really excited about Jakob’s project. We’ll be partnering with him and the Standby Volunteer Task Force (SBTF) in September 2012 to test the platform and provide him with expert feedback on how to further streamline the tool for collaborative social media analysis and crisis mapping. Jakob is also looking for domain experts to help on this study. I’ve also invited Jakob to present CrisisTracker at the 2012 CrisisMappers Conference in Washington DC and very much hope he can join us to demo his tool in person. In the meantime, the video above provides an excellent overview of CrisisTracker, as does the project website. Finally, the project is also open source and available on Github here.

Epilogue: The main problem with CrisisTracker is that it is still too manual; it does not include any machine learning or artificial intelligence features; and it has only focused on Syria. This may explain why it has not gained traction in the humanitarian space so far.

Towards a Twitter Dashboard for the Humanitarian Cluster System

One of the principal Research and Development (R&D) projects I’m spearheading with colleagues at the Qatar Computing Research Institute (QCRI) has been getting a great response from several key contacts at the UN’s Office for the Coordination of Humanitarian Affairs (OCHA). In fact, their input has been instrumental in laying the foundations for our early R&D efforts. I therefore highlighted the initiative during my recent talk at the UN’s ECOSOC panel in New York, which was moderated by OCHA Under-Secretary General Valerie Amos. The response there was also very positive. So what’s the idea? To develop the foundations for a Twitter Dashboard for the Humanitarian Cluster System.

The purpose of the Twitter Dashboard for Humanitarian Clusters is to extract relevant information from Twitter and aggregate this information according to Cluster for analytical purposes. As the above graphic shows, clusters focus on core humanitarian issues including Protection, Shelter, Education, etc. Our plan is to go beyond standard keyword search and simple Natural Language Processing (NLP) approaches to more advanced Machine Learning (ML) techniques and social computing methods. We’ve spent the past month asking various contacts whether anyone has developed such a dashboard but thus far have not come across any pre-existing efforts. We’ve also spent this time getting input from key colleagues at OCHA to ensure that what we’re developing will be useful to them.
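To make the idea more tangible, here is a toy sketch of the kind of supervised classification this could involve: a classifier that assigns tweets to humanitarian clusters. The training examples and labels are fabricated, and this is not the actual QCRI pipeline, just one plausible baseline built with off-the-shelf tools.

```python
# Toy sketch of supervised tweet classification by humanitarian cluster
# (Shelter, Health, Protection, Education). Training data is fabricated
# for illustration; this is not the actual QCRI dashboard pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "Families sleeping outside, tents urgently needed",
    "Clinic out of antibiotics and clean water",
    "Reports of looting and attacks on displaced women",
    "School buildings destroyed, classes suspended",
]
train_labels = ["Shelter", "Health", "Protection", "Education"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)

new_tweets = ["Tents needed at the evacuation centre"]
print(model.predict(new_tweets))   # e.g. ['Shelter']
```

In practice the training set would be orders of magnitude larger and labelled by people familiar with the cluster system, which is precisely where the collaboration with OCHA colleagues matters.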

It is important to emphasize that the project is purely experimental for now. This is one of the big advantages of being part of an institute for advanced computing R&D; we get to experiment and carry out applied research on next-generation humanitarian technology solutions. We realize full well the many challenges and limitations of using Twitter as an information source, so I won’t repeat these here. The point is not to suggest that a would-be Twitter Dashboard should be used instead of existing information management platforms. As United Nations colleagues themselves have noted, such a dashboard would simply be another dial on their own dashboards, which may at times prove useful, especially when compared or integrated with other sources of information.

Furthermore, if we’re serious about communicating with disaster affected communities, and the latter at times share crisis information on Twitter, then we may want to listen to what they are saying. This includes Diasporas as well. The point, quite simply, is to make full use of Twitter by at least extracting all relevant and meaningful information that contributes to situational awareness. The plan, therefore, is to have the Twitter Dashboard for Humanitarian Clusters aggregate information relevant to each specific cluster and then provide key analytics for this content in order to reveal potentially interesting trends and outliers within each cluster.
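A minimal sketch of that analytics layer might look as follows: count classified tweets per cluster per day and flag days that sit well above the cluster's average. The data and threshold are illustrative only; a real system would use more robust anomaly detection.

```python
# Minimal sketch of per-cluster analytics: count classified tweets per day
# and flag days well above the cluster's average (a crude spike detector;
# the threshold is arbitrary).
from collections import defaultdict
from statistics import mean, pstdev

# (cluster, day) pairs as they might come out of a classifier
labelled = [("Shelter", "10-01"), ("Shelter", "10-01"), ("Shelter", "10-02"),
            ("Health", "10-01"), ("Shelter", "10-03"), ("Shelter", "10-03"),
            ("Shelter", "10-03"), ("Shelter", "10-03")]

counts = defaultdict(lambda: defaultdict(int))
for cluster, day in labelled:
    counts[cluster][day] += 1

for cluster, per_day in counts.items():
    values = list(per_day.values())
    mu, sigma = mean(values), pstdev(values)
    spikes = [d for d, c in per_day.items() if sigma and c > mu + sigma]
    print(cluster, dict(per_day), "spikes:", spikes)
```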

Depending on how the R&D goes, we envision adding “credibility computing” to the Dashboard and expect to collaborate with our Arabic Language Technology Center to add Arabic tweets as well. Other languages could also be added in the future depending on initial results. Also, while we’re presently referring to this platform as a “Twitter” Dashboard, adding SMS, RSS feeds, etc., could be part of a subsequent phase. The focus would remain specifically on the Humanitarian Cluster system and the clusters’ underlying minimum essential indicators for decision-making.

The software and crisis ontologies we are developing as part of these R&D efforts will all be open source. Hopefully, we’ll have some initial results worth sharing by the time the International Conference of Crisis Mappers (ICCM 2012) rolls around in mid-October. In the meantime, we continue collaborating with OCHA and other colleagues and as always welcome any constructive feedback from iRevolution readers.

Introducing GeoXray for Crisis Mapping

My colleague Joel Myhre recently pointed me to Geosemble’s GeoXray platform, which “automatically filters content to your geographic area of interest and to your keywords of interest to provide you with timely, relevant information that enables you and your organization to make better decisions faster.” While I haven’t tested the platform, it seems similar to what Geofeedia offers.

Perhaps the main difference, beyond user-interface and maybe ease-of-use, is that GeoXray pulls in both external public content (from Twitter, Facebook, Blogs, News, PDFs, etc.) and internal sources such as private databases, documents etc. The platform allows users to search content by keyword, location and time. GeoXray also works off the Google Earth Engine, which enables visualization from different angles. The tool can also pull in content from Wikimapia and allows users to tag mapped content according to perceived veracity. One of the strengths of the platform appears to be the tool’s automated geo-location feature. For more on GeoXray:

Truth in the Age of Social Media: A Social Computing and Big Data Challenge

I have been writing and blogging about “information forensics” for a while now and thus relished Nieman Report’s must-read study on “Truth in the Age of Social Media.” My applied research has specifically been on the use of social media to support humanitarian crisis response (see the multiple links at the end of this blog post). More specifically, my focus has been on crowdsourcing and automating ways to quantify veracity in the social media space. One of the Research & Development projects I am spearheading at the Qatar Computing Research Institute (QCRI) specifically focuses on this hybrid approach. I plan to blog about this research in the near future but for now wanted to share some of the gems in this superb 72-page Nieman Report.

In the opening piece of the report, Craig Silverman writes that “never before in the history of journalism—or society—have more people and organizations been engaged in fact checking and verification. Never has it been so easy to expose an error, check a fact, crowdsource and bring technology to bear in service of verification.” While social media is new, traditional journalistic skills and values are still highly relevant to verification challenges in the social media space. In fact, some argue that “the business of verifying and debunking content from the public relies far more on journalistic hunches than snazzy technology.”

I disagree. This is not an either/or challenge. Social computing can help everyone, not just journalists, develop and test hunches. Indeed, it is imperative that these tools be in the reach of the general public since a “public with the ability to spot a hoax website, verify a tweet, detect a faked photo, and evaluate sources of information is a more informed public. A public more resistant to untruths and so-called rumor bombs.” This public resistance to untruths can itself be monitored and modeled to quantify veracity, as this study shows.

David Turner from the BBC writes that “while some call this new specialization in journalism ‘information forensics,’ one does not need to be an IT expert or have special equipment to ask and answer the fundamental questions used to judge whether a scene is staged or not.” No doubt, but as Craig rightly points out, “the complexity of verifying content from myriad sources in various mediums and in real time is one of the great new challenges for the profession.” This is fundamentally a Social Computing, Crowd Computing and Big Data problem. Rumors and falsehoods are treated as bugs or patterns of interference rather than as a feature. The key here is to operate at the aggregate level for statistical purposes and to move beyond the notion of true/false as a dichotomy and towards probabilities (think statistical physics). Clustering social media across different media and cross-triangulation using statistical models is one area I find particularly promising.
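As a toy illustration of treating veracity as a probability rather than a verdict, the sketch below combines a few invented signals (independent sources, cross-media corroboration, open disputes) with a logistic function. The features and weights are placeholders, not a validated model; the point is simply the shape of the output, a score between 0 and 1 rather than true/false.

```python
# Toy sketch of scoring veracity as a probability rather than true/false.
# Features and weights are illustrative, not a validated model.
from math import exp

def veracity_probability(report):
    # Illustrative signals: more independent sources and cross-media
    # corroboration push the score up; open disputes push it down.
    score = (0.8 * report["independent_sources"]
             + 1.2 * report["corroborating_media"]      # photos, video, news
             - 1.5 * report["disputed_by_users"]
             - 1.0)                                      # prior: skeptical by default
    return 1 / (1 + exp(-score))                         # logistic squashing to [0, 1]

report = {"independent_sources": 3, "corroborating_media": 1, "disputed_by_users": 0}
print(round(veracity_probability(report), 2))            # ≈ 0.93
```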

Furthermore, the fundamental questions used to judge whether or not a scene is staged can be codified. “Old values and skills are still at the core of the discipline.” Indeed, and heuristics based on decades of rich experience in the field of journalism can be coded into social computing algorithms and big data analytics platforms. This doesn’t mean that a fully automated solution should be the goal. The hunch of the expert when combined with the wisdom of the crowd and advanced social computing techniques is far more likely to be effective. As CNN’s Lila King writes, technology may not always be able to “prove if a story is reliable but offers helpful clues.” The quicker we can find those clues, the better.

It is true, as Craig notes, that repressive regimes “create fake videos and images and upload them to YouTube and other websites in the hope that news organizations and the public will find them and take them for real.” It is also true that civil society actors can debunk these falsifications, as I’ve often noted in my research. While the report focuses on social media, we must not forget that offline follow-up and investigation is often an option. During the 2010 Egyptian Parliamentary Elections, civil society groups were able to verify 91% of crowdsourced information in near real time thanks to hyper-local follow up and phone calls. (Incidentally, they worked with a seasoned journalist from Thomson Reuters to design their verification strategies). A similar verification strategy was employed vis-a-vis the atrocities committed in Kyrgyzstan two years ago.

In his chapter on “Detecting Truth in Photos”, Santiago Lyon from the Associated Press (AP) describes the mounting challenges of identifying false or doctored images. “Like other news organizations, we try to verify as best we can that the images portray what they claim to portray. We look for elements that can support authenticity: Does the weather report say that it was sunny at the location that day? Do the shadows fall the right way considering the source of light? Is clothing consistent with what people wear in that region? If we cannot communicate with the videographer or photographer, we will add a disclaimer that says the AP “is unable to independently verify the authenticity, content, location or date of this handout photo/video.”

Santiago and his colleagues are also exploring more automated solutions and believe that “manipulation-detection software will become more sophisticated and useful in the future. This technology, along with robust training and clear guidelines about what is acceptable, will enable media organizations to hold the line against willful image manipulation, thus maintaining their credibility and reputation as purveyors of the truth.”

David Turner’s piece on the BBC’s User-Generated Content (UGC) Hub is also full of gems. “The golden rule, say Hub veterans, is to get on the phone with whoever has posted the material. Even the process of setting up the conversation can speak volumes about the source’s credibility: unless sources are activists living in a dictatorship who must remain anonymous.” This was one of the strategies used by Egyptians during the 2010 Parliamentary Elections. Interestingly, many of the anecdotes that David and Santiago share involve members of the “crowd” letting them know that certain information they’ve posted is in fact wrong. Technology could facilitate this process by distributing the challenge of collective debunking in a far more agile and rapid way using machine learning.

This may explain why David expects the field of “information forensics” to become industrialized. “By that, he means that some procedures are likely to be carried out simultaneously at the click of an icon. He also expects that technological improvements will make the automated checking of photos more effective. Useful online tools for this are Google’s advanced picture search or TinEye, which look for images similar to the photo copied into the search function.” In addition, the BBC’s UGC Hub uses Google Earth to “confirm that the features of the alleged location match the photo.” But these new technologies should not and won’t be limited to verifying content in only one medium but rather across media. Multi-media verification is the way to go.

Journalists like David Turner often (and rightly) note that “being right is more important than being first.” But in humanitarian crises, information is the most perishable of commodities, and being last vis-a-vis information sharing can actually do harm. Indeed, bad information can have far-reaching negative consequences, but so can no information. This tradeoff must be weighed carefully in the context of verifying crowdsourced crisis information.

Mark Little’s chapter on “Finding the Wisdom in the Crowd” describes the approach that Storyful takes to verification. “At Storyful, we think a combination of automation and human skills provides the broadest solution.” Amen. Mark and his team use the phrase “human algorithm” to describe their approach (I use the term Crowd Computing). In an age when every news event creates a community, “authority has been replaced by authenticity as the currency of social journalism.” Many of Storyful’s tactics for vetting authenticity are the same ones we use in crisis mapping when we seek to validate crowdsourced crisis information. These combine the common sense of an investigative journalist with advanced digital literacy.

In her chapter, “Taking on the Rumor Mill,” Katherine Lee writes that a “disaster is ready-made for social media tools, which provide the immediacy needed for reporting breaking news.” She describes the use of these tools during and after the tornado that hit Alabama in April 2011. What I found particularly interesting was her news team’s decision to “blog to probe some of the more persistent rumors, tracking where they might have originated and talking with officials to get the facts. The format fit the nature of the story well. Tracking the rumors, with their ever-changing details, in print would have been slow and awkward, and the blog allowed us to update quickly.” In addition, the blog format “gave readers a space to weigh in with their own evidence, which proved very useful.”

The remaining chapters in the Nieman Report are equally interesting but do not focus on “information forensics” per se. I look forward to sharing more on QCRI’s project on quantifying veracity in the near future as our objective is to learn from experts such as those cited above and codify their experience so we can leverage the latest breakthroughs in social computing and big data analytics to facilitate the verification and validation of crowdsourced social media content. It is worth emphasizing that these codified heuristics cannot and must not remain static, nor can the underlying algorithms become hardwired. More on this in a future post. In the meantime, the following links may be of interest:

  • Information Forensics: Five Case Studies on How to Verify Crowdsourced Information from Social Media (Link)
  • How to Verify and Counter Rumors in Social Media (Link)
  • Data Mining to Verify Crowdsourced Information in Syria (Link)
  • Analyzing the Veracity of Tweets During a Crisis (Link)
  • Crowdsourcing for Human Rights: Challenges and Opportunities for Information Collection & Verification (Link)
  • Truthiness as Probability: Moving Beyond the True or False Dichotomy when Verifying Social Media (Link)
  • The Crowdsourcing Detective: Crisis, Deception and Intrigue in the Twittersphere (Link)
  • Crowdsourcing Versus Putin (Link)
  • Wiki on Truthiness resources (Link)
  • My TEDx Talk: From Photosynth to ALLsynth (Link)
  • Social Media and Life Cycle of Rumors during Crises (Link)
  • Wag the Dog, or How Falsifying Crowdsourced Data Can Be a Pain (Link)

Evaluating the Impact of SMS on Behavior Change

The purpose of PeaceTXT is to use mobile messaging (SMS) to catalyze behavior change vis-a-vis peace and conflict issues for the purposes of violence prevention. You can read more about our pilot project in Kenya here and here. We’re hoping to go live next month with some initial trials. In the meantime, we’ve been busy doing research to develop an appropriate monitoring and evaluation strategy. As is often the case with new, innovative initiatives, we have to look to other fields for insights, which is why my colleague Peter van der Windt recently shared this peer-reviewed study entitled: “Mobile Phone Technologies Improve Adherence to Antiretroviral Treatment in a Resource-Limited Setting: A Randomized Controlled Trial of Text Message Reminders.”

The objective of the study was to test the “efficacy of short message service (SMS) reminders on adherence to Antiretroviral Treatment (ART) among patients attending a rural clinic in Kenya.” The authors used a Randomized Control Trial (RCT) of “four SMS reminders interventions with 48 weeks of follow-up.” Over four hundred patients were enrolled in the trial and “randomly assigned to a control group or one of the four intervention groups. Participants in the intervention groups received SMS reminders that were either short or long and sent at a daily or weekly frequency.”

The four different text message interventions were “chosen to address different barriers to adherence such as forgetfulness and lack of social support. Short messages were meant to serve as a simple reminder to take medications, whereas long messages were meant to provide additional support. Daily messages were close to the frequency of medication usage, whereas weekly messages were meant to avoid the possibility that very frequent text messages would be habituating.” The SMS content was developed after extensive consultation with clinic staff and the messages were “sent at 12 p.m., rather than twice daily (during dosing times) to avoid excess reliance on the accuracy of the SMS software.”

The results of the subsequent statistical analysis reveal that “53% of participants receiving weekly SMS reminders achieved adherence of at least 90% during the 48 weeks of the study, compared with 40% of participants in the control group. Participants in groups receiving weekly reminders were also significantly less likely to experience treatment interruptions exceeding 48 hours during the 48-week follow-up period than participants in the control group.” Interestingly, “adding words of encouragement in the longer text message reminders was not more effective than either a short reminder or no reminder.” Furthermore, it is worth noting that “weekly reminders improved adherence, whereas daily reminders did not. Habituation, or the diminishing of a response to a frequently repeated stimulus, may explain this finding. Daily messages might also have been considered intrusive.”
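For readers who want to sanity-check the headline comparison (53% adherence with weekly reminders versus 40% in the control group), a two-proportion z-test along the following lines will do. The arm sizes below are assumed for illustration only, since the passage quoted here says just that over 400 patients were enrolled across the five study arms.

```python
# Rough two-proportion z-test for the headline result (53% vs 40% adherence).
# Arm sizes are assumed for illustration; the study enrolled over 400
# patients across five arms.
from math import sqrt
from scipy.stats import norm

n_weekly, n_control = 140, 140                    # assumed arm sizes
x_weekly = round(0.53 * n_weekly)                 # adherent participants, weekly arm
x_control = round(0.40 * n_control)               # adherent participants, control arm

p_pool = (x_weekly + x_control) / (n_weekly + n_control)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_weekly + 1 / n_control))
z = (x_weekly / n_weekly - x_control / n_control) / se
p_value = 2 * norm.sf(abs(z))

print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```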

In sum, “despite SMS outages, phone loss, and a rural population, these results suggest that simple SMS interventions could be an important strategy to sustaining optimal ART response.” In other words, SMS reminders can serve as an important tool to catalyze positive behavior change in resource-limited settings. Several insights from this study are going to be important for us to consider in our PeaceTXT project. So if you know of any other relevant studies we should be paying attention to, then please let us know. Thank you!