Update: We’re developing & launching MicroFilters to improve MicroMappers.
About 47 hours ago, the UN Office for the Coordination of Humanitarian Affairs (OCHA) activated the Digital Humanitarian Network (DHN) in response to the Pakistan Earthquake. The activation request was for 48 hours, so the deployment will soon phase out. As already described here, the Standby Volunteer Task Force (SBTF) teamed up with QCRI to carry out an early test of MicroMappers, which was not set to launch until next month. This post shares some initial thoughts on how the test went along with preliminary results.
Over roughly 40 hours, 109 volunteers from the SBTF and the public tagged just over 30,000 tweets that were posted during the first 36 hours or so after the quake. We were able to collect these tweets automatically thanks to our partnership with GNIP, filtering for them using half a dozen hashtags. Given the large volume of tweets collected, we did not require that each tweet be tagged at least three times by different volunteers, as we would normally do for data quality control. Out of these 30,000+ tweets, volunteers tagged a total of 177 as noting needs or infrastructure damage. A review of these tweets by the SBTF concluded that none were actually informative or actionable.
Just over 350 pictures were tweeted in the aftermath of the earthquake. These were uploaded to the ImageClicker for tagging purposes. However, none of the pictures captured evidence of infrastructure damage. In fact, the vast majority were unrelated to the earthquake. This was also true of pictures published in news articles. Indeed, we used an automated algorithm to identify all tweets with links to news articles; this algorithm then crawled those articles for embedded images. We found that the vast majority of these automatically extracted pictures were related to politics rather than infrastructure damage.
A few preliminary thoughts and reflections from this first test of MicroMappers. First, however, a big, huge, gigantic thanks to my awesome QCRI team: Ji Lucas, Imran Muhammad and Kiran Garimella; to my outstanding colleagues on the SBTF Core Team including but certainly not limited to Jus Mackinnon, Melissa Elliott, Anahi A. Iaccuci, Per Aarvik & Brendan O’Hanrahan (bios here); to the amazing SBTF volunteers and members of the general public who rallied to tag tweets and images—in particular our top 5 taggers: Christina KR, Leah H, Lubna A, Deborah B and Joyce M! Also bravo to volunteers in the Netherlands, UK, US and Germany for being the most active MicroMappers; and last but certainly not least, big, huge and gigantic thanks to Andrew Ilyas for developing the algorithms to automatically identify pictures and videos posted to Twitter.
So what did we learn over the past 48 hours? First, the disaster-affected region is a remote area of south-western Pakistan with a very light social media footprint, so there was practically no user-generated content directly relevant to needs and damage posted on Twitter during the first 36 hours. In other words, there were no needles to be found in the haystack of information. This is in stark contrast to our experience when we carried out a very similar operation following Typhoon Pablo in the Philippines. Obviously, if there’s little to no social media footprint in a disaster-affected area, then monitoring social media is of no use at all to anyone. Note, however, that MicroMappers could also be used to tag 30,000+ text messages (SMS). (Incidentally, since the earthquake struck around 12 noon local time, there was only about 18 hours of daylight during the 36-hour period for which we collected the tweets.)
Second, while the point of this exercise was not to test our pre-processing filters, it was clear that the single biggest problem was ultimately with the filtering. Our goal was to upload as many tweets as possible to the Clickers and stress-test the apps. So we only filtered tweets using a few general hashtags such as #Pakistan. Furthermore, we did not filter out any retweets, which probably accounted for two-thirds of the data, nor did we filter by geography to ensure that we were only collecting, and thus tagging, tweets from users based in Pakistan. This was a major mistake on our end. We were so preoccupied with testing the actual Clickers that we simply did not pay attention to the pre-processing of tweets. This was equally true of the images uploaded to the ImageClicker.
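To make this concrete, here is a minimal sketch (in Python) of the kind of pre-processing we skipped: dropping retweets, requiring a relevant hashtag, and checking for geotags inside Pakistan. The field names follow the standard Twitter API JSON, but the hashtags and bounding box below are illustrative assumptions, not the exact values we used.

```python
# Illustrative pre-processing filters for collected tweets.
# Tweet dicts follow the standard Twitter API JSON layout;
# hashtag list and bounding box are assumptions for this sketch.

PAKISTAN_BBOX = (60.9, 23.7, 77.8, 37.1)  # (west, south, east, north), approximate

def is_retweet(tweet):
    # Retweets carry a "retweeted_status" field, or start with "RT @".
    return "retweeted_status" in tweet or tweet.get("text", "").startswith("RT @")

def has_relevant_hashtag(tweet, hashtags={"pakistanquake", "awaranquake"}):
    tags = {h["text"].lower() for h in tweet.get("entities", {}).get("hashtags", [])}
    return bool(tags & hashtags)

def in_pakistan(tweet):
    coords = tweet.get("coordinates")
    if not coords:
        # Most tweets lack geotags; a real filter would also check profile location.
        return False
    lon, lat = coords["coordinates"]
    west, south, east, north = PAKISTAN_BBOX
    return west <= lon <= east and south <= lat <= north

def keep(tweet):
    # Keep only original (non-retweet) tweets with at least one relevant hashtag.
    return (not is_retweet(tweet)) and has_relevant_hashtag(tweet)
```

Had we run even this crude filter before uploading, the bulk of the retweets and off-topic #Pakistan chatter would never have reached the volunteers.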
So where do we go from here? Well, we have pages and pages worth of feedback to go through and integrate into the next version of the Clickers. For me, one of the top priorities is to optimize our pre-processing algorithms and ensure that the resulting output can be automatically uploaded to the Clickers. We have to refine our algorithms and make damned sure that we only upload unique tweets and images to our Clickers. Volunteers should see the same tweet or image at most three times, for verification purposes. We should be more careful with our hashtag filtering and also consider filtering by geography. Incidentally, when our free & open source AIDR platform becomes operational in November, we’ll also have the ability to automatically identify tweets referring to needs, reports of damage, and much, much more.
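For the uniqueness requirement, one simple approach is to normalize each tweet’s text (stripping the RT prefix, @mentions, and links) and keep only the first tweet for each normalized form. A rough sketch, assuming plain tweet text as input:

```python
import re

def normalize(text):
    # Collapse retweets and near-duplicates onto the same key by removing
    # the RT prefix, @mentions, URLs, and whitespace differences.
    text = re.sub(r"^RT\s+@\w+:\s*", "", text)
    text = re.sub(r"@\w+|https?://\S+", "", text)
    return re.sub(r"\s+", " ", text).strip().lower()

def unique_tweets(tweets):
    seen, out = set(), []
    for t in tweets:
        key = normalize(t)
        if key and key not in seen:
            seen.add(key)
            out.append(t)
    return out
```

This would have collapsed the roughly two-thirds of the dataset that were retweets down to their originals before anything reached the Clickers.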
In fact, AIDR was also tested for the very first time. SBTF volunteers tagged about 1,000 tweets, and just over 130 of those tags enabled us to create a classifier that can automatically identify whether a tweet is relevant for disaster response efforts specifically in Pakistan, with 80% accuracy. We didn’t apply this classifier to incoming tweets, however, because AIDR works on streaming Twitter data, not the static, archived data we had (in the form of CSV files). We also tried to create classifiers for needs and infrastructure damage but did not get enough tags to make these sufficiently accurate. Typically, we need a minimum of 20 or so tags (i.e., examples of actual tweets referring to needs or damage); the more tags, the more accurate the classifier.
The reason there were so few tags, however, is that there were few to no informative tweets referring to needs or infrastructure damage during the first 36 hours. In any event, I believe this was the very first time that a machine learning classifier was crowdsourced for disaster response purposes. In the future, we may want to first crowdsource a machine learning classifier for disaster-relevant tweets and then upload the results to MicroMappers; this would reduce the number of unrelated tweets displayed on a TweetClicker.
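For readers curious what such a classifier looks like under the hood, here is a toy multinomial Naive Bayes over word counts, the same family of model that can power this kind of relevance classifier. To be clear, this is not AIDR’s actual code, and the labels and training tweets below are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train(labeled_tweets):
    # labeled_tweets: list of (text, label) pairs, e.g. from volunteer tags.
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()
    for text, label in labeled_tweets:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def classify(model, text):
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Log prior plus log likelihood of each word under this label.
        score = math.log(label_counts[label] / total)
        n = sum(word_counts[label].values())
        for w in tokenize(text):
            # Laplace smoothing so unseen words don't zero out the score.
            score += math.log((word_counts[label][w] + 1) / (n + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Each volunteer tag becomes one labeled example, which is why a few dozen tags per category are the minimum before the classifier's guesses become useful.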
As expected, we have also received a lot of feedback vis-à-vis the user experience and user interface of the Clickers. Speed is at the top of the list: making sure that once I’ve clicked on a tweet/image, the next tweet/image appears automatically. At times, I had to wait more than 20 seconds for the next item to load. We also need to add more progress indicators, such as a countdown of the number of tweets or images that remain to be tagged. I could go on and on, frankly, but hopefully these early reflections are informative and useful to others developing next-generation humanitarian technologies. In sum, there is a lot of work to be done still. Onwards!
Were you only filtering in English or in Urdu and Balochi, as well?
Very good question; I totally overlooked this in my blog post, so thanks for raising the language issue. We collected all tweets in all languages. But our volunteers were mostly English-speaking, so they tagged any tweets that were not in English as “Not English”. I just looked at those tagged tweets and I’d say maybe 1,000 were not in English, but most of these (70%-80%) were retweets. In the future, we’re planning to develop a TranslateClicker to facilitate this. AIDR will also be able to collect tweets according to language, so that will make the process easier as well. Thanks for reading, and thanks again for your question.
Wouldn’t the accuracy of MicroMappers’ depicted disaster be confined to limited areas compared to the real disaster impact, since there are more remote areas without a media footprint?
Thanks for your note, Arshad. Not sure if I fully understand your question, but said accuracy has nothing to do with MicroMappers and everything to do with accuracy/coverage of social media. Hence my noting that MicroMappers could also pull in SMS.
Where is the source code for AIDR? I’d like to show it to my students.
Thanks for reading and for your interest, Neal. As the blog post notes, the platform is under development and will launch in November. The source code will be made available at that time.
Interesting article and evaluation. It seems that we need to spread the word globally that if you are impacted by a disaster, you should do your best to take a photo of infrastructure damage, or describe your situation, and tweet it. Of course this may not be possible in many circumstances, but it might be doable in some. I think that many people do not know about MicroMappers-style initiatives that could possibly help them. “Spreading the word” would be a preparedness measure that would hopefully improve the quality of tweet data – just some thoughts!
Thanks for reading, Gerald, and thanks for your comment. Yes, I agree with you, there is an important role for policy here. To your point, you may find the following of interest:
Thanks Patrick for your response and links. Glad to see that research is being done in this area and that the importance of citizen reporting via Twitter is being recognized.
Thank you, Patrick, for the innovative work. My question is: why does it have to be geared to disasters only?! My understanding, if I am right, is that MicroMappers is capable of handling any major collective event, such as the Hajj to Makkah/Mecca (pilgrimage), which is approaching soon.
Best wishes and regards.
Hi Ramze, thanks for reading and commenting. Indeed, microtasking can be used for any number of things. MicroMappers, however, is specifically for humanitarian disasters because this addresses a major need for digital humanitarian response, the field that I work in. You can create your own microtasking apps using CrowdCrafting since the platform is free and open source.