Category Archives: Information Forensics

QED – Goodbye Doha, Hello Adventure!

Quod Erat Demonstrandum (QED) is Latin for “that which had to be proven.” This abbreviation was traditionally used at the end of mathematical proofs to signal their completion. I joined the Qatar Computing Research Institute (QCRI) well over 3 years ago with a very specific mission and mandate: to develop and deploy next generation humanitarian technologies. So I built the Institute’s Social Innovation Program from the ground up and recruited the majority of the full-time experts (scientists, engineers, research assistants, interns & project manager) who have become integral to the Program’s success. During these 3+ years, my team and I partnered directly with humanitarian and development organizations to empirically prove that methods from advanced computing can be used to make sense of Big (Crisis) Data. The time has thus come to add “QED” to the end of that proof and move on to new adventures. But first, a reflection.

Over the past 3.5 years, my team and I at QCRI developed free and open source solutions powered by crowdsourcing and artificial intelligence to make sense of Tweets, text messages, pictures, videos, satellite and aerial imagery for a wide range of humanitarian and development projects. We co-developed and co-deployed these platforms (AIDR and MicroMappers) with the United Nations and the World Bank in response to major disasters such as Typhoons Haiyan and Ruby, Cyclone Pam, and both the Nepal & Chile Earthquakes. In addition, we carried out peer-reviewed, scientific research on these deployments to better understand how to meet the information needs of our humanitarian partners. We also tackled the information reliability question, experimenting with crowdsourcing (Verily) and machine learning (TweetCred) to assess the credibility of information generated during disasters. All of these initiatives were firsts in the humanitarian technology space.
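For readers curious about the mechanics, the core idea behind this kind of hybrid crowdsourcing-plus-AI pipeline is simple: digital volunteers label a small sample of crisis tweets by information type, and those labels train a supervised classifier that then tags the remaining flood of messages automatically. The sketch below is a minimal, hypothetical illustration of that idea in Python using scikit-learn; the categories and example tweets are invented, and this is not the actual AIDR codebase (AIDR itself is free and open source).

```python
# Minimal sketch: crowd labels -> automatic tweet classification (AIDR-style).
# Categories and tweets below are invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Step 1: digital volunteers tag a small sample of tweets by information type.
crowd_labeled_tweets = [
    ("Bridge on highway 7 has collapsed, cars stranded", "infrastructure_damage"),
    ("We urgently need drinking water at the evacuation center", "urgent_needs"),
    ("Sending prayers to everyone affected by the typhoon", "not_informative"),
    ("Red Cross distributing blankets at the stadium tonight", "response_efforts"),
    ("School roof torn off by the winds, photos attached", "infrastructure_damage"),
    ("Does anyone know where to donate food for survivors?", "response_efforts"),
]
texts, labels = zip(*crowd_labeled_tweets)

# Step 2: train a simple supervised classifier on the crowd's labels.
classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(texts, labels)

# Step 3: auto-classify the incoming stream of unlabeled tweets.
incoming = ["Main hospital flooded, generators failing", "Thinking of you all, stay safe"]
for tweet, predicted in zip(incoming, classifier.predict(incoming)):
    print(f"{predicted:>22} | {tweet}")
```

In practice, systems of this kind typically keep asking volunteers to label the messages the classifier is least confident about, so accuracy improves as a deployment unfolds.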

We later developed AIDR-SMS to auto-classify text messages, a platform that UNICEF successfully tested in Zambia and which the World Food Program (WFP) and the International Federation of the Red Cross (IFRC) now plan to pilot. AIDR was also used to monitor a recent election, and our partners are now looking to use AIDR again for upcoming election monitoring efforts. In terms of MicroMappers, we extended the platform considerably in order to crowdsource the analysis of oblique aerial imagery captured via small UAVs, which was another first in the humanitarian space. We also teamed up with excellent research partners to crowdsource the analysis of aerial video footage and to develop automated feature-detection algorithms for oblique imagery analysis based on crowdsourced results derived from MicroMappers. We developed these Big Data solutions to support damage assessment efforts, food security projects and even this wildlife protection initiative.

In addition to the above accomplishments, we launched the Internet Response League (IRL) to explore the possibility of leveraging massive multiplayer online games to process Big Crisis Data. Along similar lines, we developed the first ever spam filter to make sense of Big Crisis Data. Furthermore, we got directly engaged in the field of robotics by launching the Humanitarian UAV Network (UAViators), yet another first in the humanitarian space. In the process, we created the largest repository of aerial imagery and videos of disaster damage, which is ripe for cutting-edge computer vision research. We also spearheaded the World Bank’s UAV response to Category 5 Cyclone Pam in Vanuatu and directed a unique disaster recovery UAV mission in Nepal after the devastating earthquakes. (I took time off from QCRI to carry out both of these missions and also took holiday time to support UN relief efforts in the Philippines following Typhoon Haiyan in 2013.) Lastly, on the robotics front, we championed the development of international guidelines to inform the safe, ethical & responsible use of this new technology in both humanitarian and development settings. To be sure, innovation is not just about the technology but also about crafting appropriate processes to leverage this technology. Hence the rationale behind the Humanitarian UAV Experts Meetings that we’ve held at the United Nations Secretariat, the Rockefeller Foundation and MIT.

All of the above pioneering and experimental projects have resulted in extensive media coverage, which has placed QCRI squarely on the radar of international humanitarian and development groups. This media coverage has included the New York Times, Washington Post, Wall Street Journal, CNN, BBC News, UK Guardian, The Economist, Forbes, Time Magazine, the New Yorker, NPR, Wired, Mashable, TechCrunch, Fast Company, Nature, New Scientist, Scientific American and more. In addition, our good work and applied research have been featured in numerous international conference presentations and keynotes. In sum, I know of no other institute for advanced computing research that has contributed this much to the international humanitarian space in terms of thought-leadership, strategic partnerships, applied research and operational expertise through real-world co-deployments during and after major disasters.

There is, of course, a lot more to be done in the humanitarian technology space. But what we have accomplished over the past 3 years clearly demonstrates that techniques from advanced computing can indeed provide part of the solution to the pressing Big Data challenge that humanitarian & development organizations face. At the same time, as I wrote in the concluding chapter of my new book, Digital Humanitarians, solving the Big Data challenge does not, alas, imply that international aid organizations will actually make use of the resulting filtered data, or any other data for that matter—even if they ask for this data in the first place. So until humanitarian organizations truly shift towards both strategic and tactical evidence-based analysis & data-driven decision-making, this disconnect will surely continue unabated for many more years to come.

Reflecting on the past 3.5 years at QCRI, it is crystal clear to me that the most important lesson I (re)learned is that you can do anything if you have an outstanding, super-smart and highly dedicated team that continually goes way above and beyond the call of duty. It is one thing for me to have had the vision for AIDR, MicroMappers, IRL, UAViators, etc., but vision alone does not amount to much. Implementing said vision is what delivers results and learning. And I simply couldn’t have asked for a more talented & stellar team to translate these visions into reality over the past 3+ years. You each know who you are, partners included; it has truly been a privilege and an honor working with you. I can’t wait to see what you do next at/with QCRI. Thank you for trusting me; thank you for sharing my vision; thank you for your sense of humor; and thank you for your dedication and loyalty to science and social innovation.

So what’s next for me? I’ll be lining up independent consulting work with several organizations (likely including QCRI). In short, I’ll be open for business. I’m also planning to work on a new project that I’m very excited about, so stay tuned for updates; I’ll be sure to blog about this new adventure when the time is right. For now, I’m busy wrapping up my work as Director of Social Innovation at QCRI and working with the best team there is. QED.

How to Become a Digital Sherlock Holmes and Support Relief Efforts

Humanitarian organizations need both timely and accurate information when responding to disasters. Where is the most damage located? Who needs the most help? What other threats exist? Respectable news organizations also need timely and accurate information during crisis events to responsibly inform the public. Alas, both humanitarian & mainstream news organizations are often confronted with countless rumors and unconfirmed reports. Investigative journalists and others have thus developed a number of clever strategies to rapidly verify such reports—as detailed in the excellent Verification Handbook. There’s just one glitch: Journalists and humanitarians alike are increasingly overwhelmed by the “Big Data” generated during crises, particularly information posted on social media. They rarely have enough time or enough staff to verify the majority of unconfirmed reports. This is where Verily comes in, a new type of Detective Agency for a new type of detective: The Virtual Digital Detective.


The purpose of Verily is to rapidly crowdsource the verification of unconfirmed reports during major disasters. The way it works is simple. If a humanitarian or news organization has a verification request, they simply submit this request online at Verily. This request must be phrased in the form of a Yes-or-No question, such as: “Has the Brooklyn Bridge been destroyed by the Hurricane?”; “Is this Instagram picture really showing current flooding in Indonesia?”; “Is this new YouTube video of the Chile earthquake fake?”; “Is it true that the bush fires in South Australia are getting worse?” and so on.

Verily helps humanitarian & news organizations find answers to these questions by rapidly crowdsourcing the collection of clues that can help answer said questions. Verification questions are communicated widely across the world via Verily’s own email list of Digital Detectives and also via social media. This new breed of Digital Detectives then scours the web for clues that can help answer the verification questions. Anyone can become a Digital Detective at Verily. Indeed, Verily provides a menu of mini-verification guides for new detectives. These guides were written by some of the best Digital Detectives on the planet: the authors of the Verification Handbook. Verily Detectives post the clues they find directly to Verily and briefly explain why these clues help answer the verification question. That’s all there is to it.
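To make that workflow a bit more concrete, here is a minimal sketch of how a verification request and its crowdsourced clues could be represented in code. The class and field names are hypothetical and are not taken from Verily’s actual implementation; the point is simply that every request is a Yes-or-No question and every clue must carry a source and a brief explanation.

```python
# Hypothetical sketch of a Verily-style verification request and its clues.
# Field names are illustrative only, not the platform's actual schema.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Clue:
    detective: str      # who found the clue
    supports_yes: bool  # does the clue point toward "Yes" or "No"?
    source_url: str     # where the evidence lives (photo, article, post)
    explanation: str    # why this clue helps answer the question


@dataclass
class VerificationRequest:
    question: str   # must be phrased as a Yes-or-No question
    requester: str  # e.g. a humanitarian or news organization
    clues: List[Clue] = field(default_factory=list)

    def add_clue(self, clue: Clue) -> None:
        if not clue.explanation.strip():
            raise ValueError("A clue must explain why it helps answer the question.")
        self.clues.append(clue)

    def tally(self) -> dict:
        """Summarize the evidence collected so far (not a final verdict)."""
        yes = sum(c.supports_yes for c in self.clues)
        return {"yes": yes, "no": len(self.clues) - yes}


request = VerificationRequest(
    question="Is this picture really showing current flooding in Indonesia?",
    requester="Example relief NGO",
)
request.add_clue(Clue(
    detective="volunteer_42",
    supports_yes=False,
    source_url="https://example.org/original-photo",
    explanation="Reverse image search shows the same photo was posted back in 2012.",
))
print(request.tally())  # -> {'yes': 0, 'no': 1}
```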


If you’re familiar with Reddit, you may be thinking “Hold on, doesn’t Reddit do this already?” In part, yes, but Reddit is not necessarily designed to crowdsource critical thinking or to create skilled Digital Detectives. Recall this fiasco during the Boston Marathon Bombings, which fueled disastrous “witch hunts.” Said disaster would not have happened on Verily because Verily is deliberately designed to focus on the process of careful detective work while providing new detectives with the skills they need to avoid precisely the kind of disaster that happened on Reddit. This is in no way a criticism of Reddit! One single platform cannot be designed to solve every problem under the sun. Deliberate, intentional design is absolutely key.

In sum, our goal at Verily is to crowdsource Sherlock Holmes. Why do we think this will work? For several reasons. First, the authors of the Verification Handbook have already demonstrated that individuals working alone can, and do, verify unconfirmed reports during crises. We believe that creating a community that can work together to verify rumors will be even more powerful given the Big Data challenge. Second, each one of us with a mobile phone is a human sensor, a potential digital witness. We believe that Verily can help crowdsource the search for eyewitnesses, or rather the search for the digital content that these eyewitnesses post on the Web. Third, the Red Balloon Challenge was completed in a matter of hours even though it focused on crowdsourcing the search for clues across an entire continent (3 million square miles). Disasters, in contrast, are far narrower in geographic coverage. In other words, the proverbial haystack is smaller and thus the needles easier to find. More on Verily here & here.

So there’s reason to be optimistic that Verily can succeed, given the above and recent real-world deployments. Of course, Verily is still very much in an early, experimental phase. But both humanitarian organizations and high-profile news organizations have expressed a strong interest in field-testing this new Digital Detective Agency. To find out more about Verily and to engage with experts in verification, please join us on Tuesday, March 3rd at 10:00am (New York time) for this Google Hangout with the Verily Team and our colleague Craig Silverman, the Co-Editor of the Verification Handbook. Click here for the Event Page and here to follow on YouTube. You can also join the conversations on Twitter and pose questions or comments using the hashtag #VerilyLive.

Video: Digital Humanitarians & Next Generation Humanitarian Technology

How do international humanitarian organizations make sense of the “Big Data” generated during major disasters? They turn to Digital Humanitarians who craft and leverage ingenious crowdsourcing solutions with trail-blazing insights from artificial intelligence to make sense of vast volumes of social media, satellite imagery and even UAV/aerial imagery. They also use these “Big Data” solutions to verify user-generated content and counter rumors during disasters. The talk below explains how Digital Humanitarians do this and how their next generation humanitarian technologies work.

Many thanks to TTI/Vanguard for having invited me to speak. Lots more on Digital Humanitarians in my new book of the same title.


Videos of my TEDx talks and the talks I’ve given at the White House, PopTech, Where 2.0, National Geographic, etc., are all available here.

Reflections on Digital Humanitarians – The Book

In January 2014, I wrote this blog post announcing my intention to write a book on Digital Humanitarians. Well, it’s done! And it launches this week. The book has already been endorsed by scholars at Harvard, MIT, Stanford, Oxford, etc.; by practitioners at the United Nations, World Bank, Red Cross, USAID, DfID, etc.; and by others including Twitter and National Geographic. These and many more endorsements are available here. Brief summaries of each book chapter are available here, and the short video below provides an excellent overview of the topics covered in the book. Together, these overviews make it clear that this book is directly relevant to many other fields including journalism, human rights, development, activism, business management, computing, ethics, social science, data science, etc. In short, the lessons that digital humanitarians have learned (often the hard way) over the years and the important insights they have gained are directly applicable to fields well beyond the humanitarian space. To this end, Digital Humanitarians is written in a “narrative and conversational style” rather than with dense, technical language.

The story of digital humanitarians is a multifaceted one. Theirs is not just a story about using new technologies to make sense of “Big Data”. For the most part, digital humanitarians are volunteers; volunteers from all walks of life who occupy every time zone. Many are very tech-savvy and pull all-nighters, but most simply want to make a difference using the few minutes they have with the digital technologies already at their fingertips. Digital humanitarians also include pro-democracy activists who live in countries ruled by tyrants. This story is thus also about hope and humanity; about how technology can extend our humanity during crises. To be sure, if no one cared, if no one felt compelled to help others in need or to change the status quo, then no one would even bother to use these new, next generation humanitarian technologies in the first place.

I believe this explains why Professor Leysia Palen included the following in her very kind review of my book: “I dare you to read this book and not have both your heart and mind opened.” As I reflected to my editor while in the midst of writing the book, an alternative tag line for the title could very well be “How Big Data and Big Hearts are Changing the Face of Humanitarian Response.” It is personally and deeply important to me that the media, would-be volunteers and others also understand that the digital humanitarians story is not a romanticized story about a few “lone heroes” who accomplish the impossible thanks to their superhuman technical powers. There are thousands upon thousands of largely anonymous digital volunteers from all around the world who make this story possible. And while we may not know all their names, we certainly do know about their tireless collective action efforts—they mobilize online from all corners of our Blue Planet to support humanitarian efforts. My book explains how these digital volunteers do this, and yes, how you can too.

Digital humanitarians also include a small (but growing) number of forward-thinking professionals from large and well-known humanitarian organizations. After the tragic, nightmarish earthquake that struck Haiti in January 2010, these seasoned and open-minded humanitarians quickly realized that making sense of “Big Data” during future disasters would require new thinking, new risk-taking, new partnerships, and next generation humanitarian technologies. This story thus includes the invaluable contributions of those change-agents and explains how these few individuals are enabling innovation within the large bureaucracies they work in. The story would be incomplete without them; without their appetite for risk-taking and their strategic understanding of how to change (and at times circumvent) established systems from the inside to keep their organizations relevant in a hyper-connected world. This may explain why Tarun Sarwal of the International Committee of the Red Cross (ICRC) in Geneva included these words (of warning) in his kind review: “For anyone in the Humanitarian sector — ignore this book at your peril.”


Today, this growing, cross-disciplinary community of digital humanitarians is crafting and leveraging ingenious crowdsourcing solutions with trail-blazing insights from advanced computing and artificial intelligence in order to make sense of “Big Data” generated during disasters. In near real-time, these new solutions (many still in early prototype stages) enable digital volunteers to make sense of vast volumes of social media, SMS and imagery captured from satellites & UAVs to support relief efforts worldwide.

All of this obviously comes with a great many challenges. I certainly don’t shy away from these in the book (despite my being an eternal optimist : ). As Ethan Zuckerman from MIT very kindly wrote in his review of the book,

“[Patrick] is also a careful scholar who thinks deeply about the limits and potential dangers of data-centric approaches. His book offers both inspiration for those around the world who want to improve our disaster response and a set of fertile challenges to ensure we use data wisely and ethically.”

Digital humanitarians are not perfect; they’re human, they make mistakes, they fail. Innovation, after all, takes experimenting, risk-taking and failing. But most importantly, these digital pioneers learn, innovate and, over time, make fewer mistakes. In sum, this book charts the sudden and spectacular rise of these digital humanitarians and their next generation technologies by sharing their remarkable, real-life stories, the many lessons they have learned, and the hurdles both cleared & still standing. In essence, this book highlights how their humanity, coupled with innovative solutions to “Big Data”, is changing humanitarian response forever. Digital Humanitarians will make you think differently about what it means to be humanitarian and will invite you to join the journey online. And that is what it’s ultimately all about—action, responsible & effective action.

Why did I write this book? The main reason may perhaps come as a surprise—one word: hope. In a world seemingly overrun by heart-wrenching headlines and daily reminders from the news and social media about all the ugly and cruel ways that technologies are being used to spy on entire populations, to harass, oppress, target and kill each other, I felt a pressing need to share a different narrative; a narrative about how selfless volunteers from all walks of life, of all ages, nationalities and creeds, use digital technologies to help complete strangers on the other side of the planet. I’ve had the privilege of witnessing this digital goodwill first hand, repeatedly, over the years. This goodwill is what continues to restore my faith in humanity and what gives me hope, even when things are tough and not going well. And so, I wrote Digital Humanitarians first and foremost to share this hope more widely. We each have agency, and we can change the world for the better. I’ve seen this and witnessed the impact first hand. So if readers come away with a renewed sense of hope and agency after reading the book, I will have achieved my main objective.

For updates on events, talks, trainings, webinars, etc, please click here. I’ll be organizing a Google Hangout on March 5th for readers who wish to discuss the book in more depth and/or follow up with any questions or ideas. If you’d like additional information on this and future Hangouts, please click on the previous link. If you wish to join ongoing conversations online, feel free to do so with the FB & Twitter hashtag #DigitalJedis. If you’d like to set up a book talk and/or co-organize a training at your organization, university, school, etc., then do get in touch. If you wish to give a talk on the book yourself, then let me know and I’d be happy to share my slides. And if you come across interesting examples of digital humanitarians in action, then please consider sharing these with other readers and myself by using the #DigitalJedis hashtag and/or by sending me an email so I can include your observation in my monthly newsletter and future blog posts. I also welcome guest blog posts on iRevolutions.

Naturally, this book would never have existed were it not for the digital humanitarians volunteering their time—day and night—during major disasters across the world. This book would also not have seen the light of day without the thoughtful guidance and support I received from these mentors, colleagues, friends and my family. I am thus deeply and profoundly grateful for their spirit, inspiration and friendship. Onwards!

Digital Jedis: There Has Been An Awakening…

Live: Crowdsourced Verification Platform for Disaster Response

Earlier this year, Malaysia Airlines Flight 370 suddenly vanished, which set in motion the largest search and rescue operation in history—both on the ground and online. Colleagues at DigitalGlobe uploaded high resolution satellite imagery to the web and crowdsourced the digital search for signs of Flight 370. An astounding 8 million volunteers rallied online, searching through 775 million images spanning 1,000,000 square kilometers; all this in just 4 days. What if, in addition to mass crowd-searching, we could also mass crowd-verify information during humanitarian disasters? Rumors and unconfirmed reports tend to spread rather quickly on social media during major crises. But what if the crowd were also part of the solution? This is where our new Verily platform comes in.


Verily was inspired by the Red Balloon Challenge, in which competing teams vied for a $40,000 prize by searching for ten weather balloons secretly placed across some 8,000,000 square kilometers (the continental United States). Talk about a needle-in-the-haystack problem. The winning team, from MIT, found all 10 balloons within 8 hours. How? They used social media to crowdsource the search. The team later noted that the balloons would’ve been found even more quickly had competing teams not posted pictures of fake balloons on social media. Point being, all ten balloons were found astonishingly quickly even with the disinformation campaign.

Verily takes the exact same approach and methodology used by MIT to rapidly crowd-verify information during humanitarian disasters. Why is verification important? Because humanitarians have repeatedly noted that their inability to verify social media content is one of the main reasons why they aren’t making wider use of this medium. So, to test the viability of our proposed solution to this problem, we decided to pilot the Verily platform by running a Verification Challenge. The Verily Team includes researchers from the University of Southampton, the Masdar Institute and QCRI.

During the Challenge, verification questions of varying difficulty were posted on Verily. Users were invited to collect and post evidence justifying their answers to these “Yes or No” verification questions. The photograph below, for example, was posted with the following question:

[Verily screenshot: the photograph posted along with its Yes-or-No verification question]

Unbeknownst to participants, the photograph was actually of an Italian town in Sicily called Caltagirone. The question was answered correctly within 4 hours by a user who submitted another picture of the same street. The results of the new Verily experiment are promising. Answers to our questions were coming in so rapidly that we could barely keep up with posting new questions, and users drew on a variety of techniques to collect their evidence and answer the questions we posted.

Verily was designed with the goal of tapping into collective critical thinking; that is, with the goal of encouraging people to think about the question rather than rely on their gut feeling alone. In other words, the purpose of Verily is not simply to crowdsource the collection of evidence but also to crowdsource critical thinking. This explains why a user can’t simply submit a “Yes” or “No” to answer a verification question. Instead, they have to justify their answer by providing evidence, either in the form of an image/video or as text. In addition, Verily does not make use of Like buttons or up/down votes to answer questions. While such tools are great for identifying and sharing content on sites like Reddit, they are not the right tools for verification, which requires searching for evidence rather than liking or retweeting.
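As a rough illustration of this design choice, the sketch below shows what an answer-submission check might look like if you built one yourself: a bare “Yes” or “No” is rejected unless it comes with evidence or a written justification, and there is deliberately no upvote or like action to call. The function and field names are hypothetical; this is not Verily’s actual code.

```python
# Hypothetical sketch: require evidence plus reasoning rather than votes.
# Names are illustrative; this is not Verily's actual implementation.

def submit_answer(answer: str, evidence_url: str = "", justification: str = "") -> dict:
    """Accept an answer to a verification question only if it is justified."""
    if answer.strip().lower() not in {"yes", "no"}:
        raise ValueError("Verification questions are phrased so they can be answered Yes or No.")
    if not evidence_url and not justification.strip():
        # A gut-feeling vote carries no verification value; require supporting material.
        raise ValueError("Please attach evidence (image, video, link) or explain your reasoning.")
    return {"answer": answer.strip().lower(), "evidence": evidence_url, "why": justification}

# Note what is deliberately absent: there is no upvote(), like() or retweet() action at all.
print(submit_answer(
    "no",
    evidence_url="https://example.org/reverse-image-search-result",
    justification="The same photo appears in a news article from 2012.",
))
```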

Our Verification Challenge confirmed the feasibility of the Verily platform for time-critical, crowdsourced evidence collection and verification. The next step is to deploy Verily during an actual humanitarian disaster. To this end, we invite both news and humanitarian organizations to pilot the Verily platform with us during the next natural disaster. Simply contact me to submit a verification question. In the future, once Verily is fully developed, organizations will be able to post their questions directly.


See Also:

  • Verily: Crowdsourced Verification for Disaster Response [link]
  • Crowdsourcing Critical Thinking to Verify Social Media [link]
  • Six Degrees of Separation: Implications for Verifying Social Media [link]

Got TweetCred? Use it To Automatically Identify Credible Tweets (Updated)

Update: Users have created an astounding one million+ tags over the past few weeks, which will help increase the accuracy of TweetCred in coming months as we use these tags to further train our machine learning classifiers. We will be releasing our Firefox plugin in the next few days. In the meantime, we have just released our paper on TweetCred which describes our methodology & classifiers in more detail.

What if there were a way to automatically identify credible tweets during major events like disasters? Sounds rather far-fetched, right? Think again.

The new field of Digital Information Forensics is increasingly making use of Big Data analytics and techniques from artificial intelligence, like machine learning, to automatically verify social media. This is how my QCRI colleague ChaTo et al. were already able to predict both credible and non-credible tweets generated after the Chile Earthquake (with an accuracy of 86%). Meanwhile, my colleagues Aditi et al. from IIIT Delhi also used machine learning to automatically rank the credibility of some 35 million tweets generated during a dozen major international events such as the UK Riots and the Libya Crisis. So we teamed up with Aditi et al. to turn those academic findings into TweetCred, a free app that identifies credible tweets automatically.
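At a very high level, this line of research treats credibility assessment as a supervised learning problem: each tweet is converted into a vector of content and account features, and a model trained on human credibility judgments maps that vector to a score. The sketch below is a deliberately simplified, hypothetical illustration of that idea; the features and training examples are invented and are far cruder than the feature sets described in the published papers.

```python
# Simplified, hypothetical sketch of feature-based tweet credibility scoring.
# Real systems use far richer feature sets and much larger training data.
from sklearn.ensemble import RandomForestRegressor

def tweet_features(tweet: dict) -> list:
    """Turn a tweet into a small vector of content and account signals."""
    text = tweet["text"]
    return [
        len(text),                           # message length
        text.count("!"),                     # exclamation marks
        int("http" in text),                 # contains a link
        int(tweet.get("has_image", False)),  # includes a photo or video
        tweet.get("followers", 0),           # author popularity
        int(tweet.get("verified", False)),   # verified account
        tweet.get("account_age_days", 0),    # crude reputation proxy
    ]

# Invented training data: tweets paired with human credibility ratings (1-7).
training = [
    ({"text": "Official evacuation order issued for coastal districts http://gov.example",
      "followers": 250000, "verified": True, "account_age_days": 3000}, 7),
    ({"text": "OMG sharks swimming in the mall!!!", "has_image": True,
      "followers": 40, "account_age_days": 10}, 2),
    ({"text": "Power out across the east side, repair crews on scene http://news.example",
      "has_image": True, "followers": 9000, "account_age_days": 1500}, 6),
    ({"text": "heard the airport collapsed, can anyone confirm??",
      "followers": 120, "account_age_days": 200}, 3),
]
X = [tweet_features(t) for t, _ in training]
y = [score for _, score in training]

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

new_tweet = {"text": "Bridge closure confirmed by city engineers http://city.example",
             "has_image": True, "followers": 52000, "verified": True,
             "account_age_days": 2500}
print(round(model.predict([tweet_features(new_tweet)])[0], 1), "out of 7")
```

Published approaches in this space also draw on additional signals, such as how a tweet propagates through retweets and replies, which are omitted here for brevity.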


We’ve just launched the very first version of TweetCred—key word being first. This means that our new app is still experimental. On the plus side, since TweetCred is powered by machine learning, it will become increasingly accurate over time as more users make use of the app and “teach” it the difference between credible and non-credible tweets. Teaching TweetCred is as simple as a click of the mouse. Take the tweet below, for example.

[Screenshot: a tweet from the American Red Cross scored with three blue dots by TweetCred]

TweetCred scores each tweet on a 7-point system: the higher the number of blue dots, the more credible the content of the tweet is likely to be. Note that a TweetCred score also takes into account any pictures or videos included in a tweet, along with the reputation and popularity of the Twitter user. Naturally, TweetCred won’t always get it right, which is where the teaching and machine learning come in. The above tweet from the American Red Cross is more credible than three dots would suggest. So you simply hover your mouse over the blue dots and click on the “thumbs down” icon to tell TweetCred it got that tweet wrong. The app will then ask you to tag the correct level of credibility for that tweet.
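Conceptually, this “teaching” step is just the online collection of corrected labels: whenever a user thumbs-down a score and supplies a better one, that tweet and its corrected rating join the pool of examples used the next time the classifiers are retrained. Below is a hypothetical sketch of that feedback loop; the function names and example tweet are illustrative only, not TweetCred’s actual code.

```python
# Hypothetical sketch of a TweetCred-style user feedback loop.
# Function names and example data are illustrative, not the app's actual code.
feedback_pool = []  # (tweet_text, corrected_score) pairs supplied by users

def record_feedback(tweet_text: str, predicted_score: int, corrected_score: int) -> None:
    """Store a user's correction so the next retraining round can learn from it."""
    if not 1 <= corrected_score <= 7:
        raise ValueError("Credibility scores run from 1 to 7.")
    if corrected_score != predicted_score:
        feedback_pool.append((tweet_text, corrected_score))

def retrain_when_ready(min_new_labels: int = 1000) -> None:
    """Rebuild the credibility classifier once enough new tags have accumulated."""
    if len(feedback_pool) >= min_new_labels:
        # In a real system this is where the model would be retrained on the
        # original training data plus the newly collected user tags.
        print(f"Retraining on {len(feedback_pool)} new user-supplied tags...")
        feedback_pool.clear()

# Example: a shelter-information tweet scored 3 dots, but a user re-tags it as 6.
record_feedback("Shelter locations for tonight are posted on our website", 3, 6)
retrain_when_ready(min_new_labels=1)
```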


That’s all there is to it. As noted above, this is just the first version of TweetCred. The more all of us use (and teach) the app, the more accurate it will be. So please try it out and spread the word. You can download the Chrome Extension for TweetCred here. If you don’t use Chrome, you can still use the browser version here, although the latter has less functionality. We very much welcome any feedback you may have, so simply post it in the comments section below. Keep in mind that TweetCred is specifically designed to rate the credibility of disaster/crisis related tweets rather than any random topic on Twitter.

As I note in my book Digital Humanitarians (forthcoming), empirical studies have shown that we’re less likely to spread rumors on Twitter if false tweets are publicly identified by Twitter users as being non-credible. In fact, these studies show that such public exposure increases the number of Twitter users who then seek to stop the spread of said rumor-related tweets by 150%. But it makes a big difference whether one sees the rumors first or the tweets dismissing said rumors first. So my hope is that TweetCred will help accelerate Twitter’s self-correcting behavior by automatically identifying credible tweets while countering rumor-related tweets in real-time.

This project is a joint collaboration between IIIT and QCRI. Big thanks to Aditi and team for their heavy lifting on the coding of TweetCred. If the experiments go well, my QCRI colleagues and I may integrate TweetCred within our AIDR (Artificial Intelligence for Disaster Response) and Verily platforms.


See also:

  • New Insights on How to Verify Social Media [link]
  • Predicting the Credibility of Disaster Tweets Automatically [link]
  • Auto-Ranking Credibility of Tweets During Major Events [link]
  • Auto-Identifying Fake Images on Twitter During Disasters [link]
  • Truth in the Age of Social Media: A Big Data Challenge [link]
  • Analyzing Fake Content on Twitter During Boston Bombings [link]
  • How to Verify Crowdsourced Information from Social Media [link]
  • Crowdsourcing Critical Thinking to Verify Social Media [link]
  • Tweets, Crises and Behavioral Psychology: On Credibility and Information Sharing [link]