We’ve all seen prompts like this:
More than 100 million of these ReCAPTCHAs get filled out every day on sites like Facebook, Twitter and CNN. Google uses them to simultaneously filter out spam and digitize Google Books and archives of the New York Times. For example:
So what’s the connection to disaster response? In early 2010, I blogged about using massive multiplayer games to tag crisis information and asked: What is the game equivalent of reCAPTCHA for tagging crisis information? (Big thanks to friend and colleague Albert Lin for reminding me of this recently). Well, the game equivalent is perhaps the Internet Response League (IRL). But what if we simply used ReCAPTCHA itself for disaster response?
Humanitarian organizations like the American Red Cross regularly monitor Twitter for disaster-related information. But they are often overwhelmed with millions of tweets during major events. While my team and I at QCRI are developing automated solutions to manage this Big (Crisis) Data, we could also use the ReCAPTCHA methodology. For example, our automated classifiers can tell us with a certain level of accuracy whether a tweet is disaster-related, whether it refers to infrastructure damage, urgent needs, etc. If the classifier is not sure (say the tweet is scored as having a 50% chance of being related to infrastructure damage), then we could automatically post it to our version of ReCAPTCHA (see below). Perhaps a list of 3 tweets could be posted, with the user prompted to tag which one of the 3 is damage-related. (The other two tweets could come from a separate database of random tweets).
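To make the routing step concrete, here is a minimal sketch in Python of how an uncertain tweet might be turned into a 3-tweet prompt. Everything named here is hypothetical: the `ScoredTweet` type, the 0.4 to 0.6 "uncertain" band, and the helper functions are illustrative assumptions, not our actual classifier or the ReCAPTCHA API.

```python
import random
from dataclasses import dataclass


@dataclass
class ScoredTweet:
    text: str
    damage_score: float  # hypothetical classifier output, between 0.0 and 1.0


def is_uncertain(score: float, low: float = 0.4, high: float = 0.6) -> bool:
    """True when the score falls inside the assumed 'not sure' band."""
    return low <= score <= high


def build_challenge(candidate: ScoredTweet, filler_pool: list[str],
                    rng: random.Random, n_fillers: int = 2) -> list[str]:
    """Mix the uncertain tweet with random filler tweets in shuffled order."""
    options = [candidate.text] + rng.sample(filler_pool, n_fillers)
    rng.shuffle(options)
    return options


# Illustrative data only; these are made-up tweets.
fillers = ["Great coffee this morning", "Watching the game tonight",
           "New phone arrived today", "Traffic is slow on my street"]
tweet = ScoredTweet("Bridge on the main road looks badly cracked", 0.5)

if is_uncertain(tweet.damage_score):
    for option in build_challenge(tweet, fillers, random.Random(0)):
        print("-", option)
```

Tweets scored well outside the band would skip the prompt entirely, so only the ambiguous subset ever reaches a human.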
There are reportedly 44,000 United Nations employees around the globe. World Vision also employs over 40,000, the International Committee of the Red Cross (ICRC) has more than 12,000 employees while Oxfam has about 7,000. That’s over 100,000 people right there who probably log onto their work emails at least once a day. Why not insert a ReCAPTCHA when they log in? We could also add ReCAPTCHAs to these organizations’ Intranets & portals like Virtual OSOCC. On a related note, Google recently added images from Google Street View to ReCAPTCHAs. So we could automatically collect images shared on social media during disasters and post them to our own disaster response ReCAPTCHAs:
In sum, as humanitarians log into their emails multiple times a day, they’d be asked to tag which tweets and/or pictures relate to an ongoing disaster. Last year, we tagged tweets and images in support of the UN’s disaster response efforts in the Philippines following Typhoon Pablo. Adding a customized ReCAPTCHA for disaster response would help us tap a much wider audience of “volunteers”, which would mean an even more rapid turnaround time for damage assessments following major disasters.
Very interesting concept Patrick! Good concrete way to insert some micro-tasking work into our everyday (even for non-disaster times!) Be happy to chat about ways to insert a project like this into UN-OCHA (and perhaps wider).
And, I could have given you a better picture to use 🙂
Well I wanted to save the better picture for the real project :b
But seriously, would be great to talk about how to make this happen. Seems easy enough, simply need to figure out a few workflows.
ps. thanks for reading!
reCAPTCHA works by using two words. The first was already validated by some other number of users, the system knows what it is, and serves as the bot filter/control word. The second word doesn’t affect the user’s ability to submit the form. Instead, that’s the word that the system doesn’t know yet. After a set number of users submit the same transcription of it, the word becomes verified and can be used as a control word.
To crowd-source disaster information, a similar system could be used. If two tweets or images are displayed, the first one would be known disaster information (to filter out bots), and the second would come from your “maybe” list. After 2/3 people agree on whether or not it’s disaster-related, the information can be considered verified both for emergency public information purposes and for filtering out bots.
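The control-plus-unknown scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions: "2/3 people" is read here as two matching votes (out of up to three), and the `UnknownItem` class and `record_response` function are hypothetical names, not part of reCAPTCHA.

```python
from collections import Counter
from typing import Optional


class UnknownItem:
    """A 'maybe' tweet or image awaiting crowd agreement."""

    def __init__(self, required_agreement: int = 2):
        # Assumption: two matching votes are enough to verify the item.
        self.required_agreement = required_agreement
        self.votes: Counter = Counter()

    def add_vote(self, is_disaster_related: bool) -> None:
        self.votes[is_disaster_related] += 1

    def verdict(self) -> Optional[bool]:
        """Return the agreed label, or None while agreement is pending."""
        for label, count in self.votes.items():
            if count >= self.required_agreement:
                return label
        return None


def record_response(item: UnknownItem, control_answer: bool,
                    control_truth: bool, unknown_answer: bool) -> bool:
    """Count the vote on the unknown item only if the control was answered correctly."""
    passed = control_answer == control_truth
    if passed:
        item.add_vote(unknown_answer)
    return passed


item = UnknownItem()
record_response(item, True, True, True)    # passes control, vote counted
record_response(item, False, True, False)  # fails control (likely a bot), ignored
record_response(item, True, True, True)    # second matching vote
print(item.verdict())
```

Once an item has a verdict, it could be moved into the pool of known items and reused as a control, mirroring how a verified word becomes a control word.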
Thanks Ian, yes exactly.
Interesting idea as usual Patrick. If you would like to talk more about testing it in World Vision, I’d be happy to do what I could to see what I could do.
Wonderful! Thanks for reading and offering, Chris. A colleague of mine has a good contact at ReCAPTCHA so hoping to get an introduction. If we could just customize their plugin, then this may be relatively straightforward. So will definitely keep you posted.
It’s an interesting idea, but would it be able to create enough data fast enough to use for a disaster? It might be better suited to complex emergencies and longer term humanitarian emergencies.
Thanks for reading, and for your question. The idea is not to use this approach in isolation, hence my work on projects like MicroMappers, IRL, etc. Am looking for an ecosystem approach. Also, keep in mind that there would be pre-filtering on the data input using those automated classifiers, so the volume of tweets/images would be a small subset. Plus, if ReCAPTCHA can be used for disaster response, then we can always look to scale beyond humanitarian organizations (ie, have more than 100,000 users).
Cool idea! When used like Ian described, I can see this would be a useful approach for pre-filtering data by the crowd.
The second example (screenshot) with 3 images doesn’t do much justice to the idea though (anything looks like a disaster compared to a kitten – and they have to pick 1 image).
Thanks for reading. Re image example, the point is *not* to make it hard for the human! Wikis and online banking interfaces use simple images. In terms of 3 images, that was just an example, it is completely arbitrary, it can be 8 images (just like some wikis).