Recent scientific research has shown that aerial imagery captured during a single 20-minute UAV flight can take more than half a day to analyze. We flew several dozen flights during the World Bank’s humanitarian UAV mission in response to Cyclone Pam earlier this year. The imagery we captured would’ve taken a single expert analyst a minimum of 20 full-time workdays to make sense of. In other words, aerial imagery is already a Big Data problem. So my team and I are using human computing (crowdsourcing), machine computing (artificial intelligence) and computer vision to make sense of this new Big Data source.
For example, we recently teamed up with the University of Southampton and EPFL to analyze aerial imagery of the devastation caused by Cyclone Pam in Vanuatu. The purpose of this research is to generate timely answers. Aid groups want more than high-resolution aerial images of disaster-affected areas; they want answers, such as the number and location of damaged buildings, the number and location of displaced people, and which roads are still usable for the delivery of aid. Simply handing over the imagery is not good enough. As demonstrated in my new book, Digital Humanitarians, both aid and development organizations are already overwhelmed by the vast volume and velocity of Big Data generated during and after disasters. Adding yet another source, Big Aerial Data, may be pointless since these organizations may simply not have the time or capacity to make sense of this new data, let alone integrate the results with their other datasets.
We therefore analyzed the crowdsourced results from the deployment of our MicroMappers platform following Cyclone Pam to determine whether those results could be used to train algorithms to automatically detect disaster damage in future disasters in Vanuatu. During this MicroMappers deployment, digital volunteers analyzed over 3,000 high-resolution oblique aerial images, tracing houses that were fully destroyed, partially damaged and largely intact. My colleague Ferda Ofli and I teamed up with Nicolas Rey (a graduate student from EPFL who interned with us over the summer) to explore whether these traces could be used to train our algorithms. The results below were written with Ferda and Nicolas. Our research is not just an academic exercise. Vanuatu is the most disaster-prone country in the world. What’s more, this year’s El Niño is expected to be one of the strongest in half-a-century.
According to the crowdsourced results, 1,145 of the high-resolution images did not contain any buildings. Above is a simple histogram depicting the number of buildings per image. The aerial images of Vanuatu are very heterogeneous, and vary not only in the diversity of features they exhibit but also in the angle of view and the altitude at which the pictures were taken. While the vast majority of the images are oblique, some are almost nadir images, and some were taken very close to the ground or even before takeoff.
The heterogeneity of our dataset of images makes the automated analysis of this imagery a lot more difficult. Furthermore, buildings that are under construction, of which there are many in our dataset, represent a major difficulty because they look very similar to damaged buildings. Our first task thus focused on training our algorithms to determine whether or not any given aerial image shows some kind of building. This is an important task given that more than 30% of the images in our dataset do not contain buildings. As such, if we can develop an accurate algorithm to automatically filter out these irrelevant images (like the “noise” below), this will allow us to focus the crowdsourced analysis on relevant images only.
While our results are purely preliminary, we are still pleased with our findings thus far. We’ve been able to train our algorithms to determine whether or not an aerial image includes a building with just over 90% accuracy at the tile level. More specifically, our algorithms were able to recognize and filter out 60% of the images that do not contain any buildings (recall rate), and only 10% of the images that contain buildings were mistakenly discarded (precision rate of 90%). The image below shows an example. There are still quite a number of major challenges, however, so we want to be sure not to over-promise anything at this stage. In terms of next steps, we would like to explore whether our computer vision algorithms can distinguish between destroyed and intact buildings.
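To make the two percentages above concrete, here is a minimal sketch of how such shares can be computed once every image has a crowd label and a filter decision. The function name `filter_metrics` and the toy labels are hypothetical and purely illustrative; they are not the code or data from this study.

```python
def filter_metrics(has_building, discarded):
    """Summarise a 'no building' filter (illustrative helper, not our pipeline).

    has_building[i] -- True if image i contains a building (crowd label)
    discarded[i]    -- True if the filter threw image i away
    """
    pairs = list(zip(has_building, discarded))
    empty_total = sum(1 for b, _ in pairs if not b)
    building_total = sum(1 for b, _ in pairs if b)
    if empty_total == 0 or building_total == 0:
        raise ValueError("need at least one image of each kind")

    # Empty images the filter correctly removed.
    empty_discarded = sum(1 for b, d in pairs if d and not b)
    # Images with buildings the filter removed by mistake.
    building_discarded = sum(1 for b, d in pairs if d and b)

    return {
        "empty_filtered_share": empty_discarded / empty_total,
        "buildings_lost_share": building_discarded / building_total,
    }


# Toy data (assumed): 10 empty images followed by 10 images with buildings.
has_building = [False] * 10 + [True] * 10
discarded = [True] * 6 + [False] * 4 + [True] * 1 + [False] * 9

metrics = filter_metrics(has_building, discarded)
print(metrics)
```

With these toy labels, 6 of the 10 empty images are filtered out and 1 of the 10 images with buildings is lost, which mirrors the 60% and 10% figures reported above.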
The UAVs we were flying in Vanuatu required that we land them in order to get access to the collected imagery. Increasingly, newer UAVs offer the option of broadcasting the aerial images and videos back to base in real time. DJI’s new Phantom 3 UAV (pictured below), for example, allows you to broadcast your live aerial video feed directly to YouTube (assuming you have connectivity). There’s absolutely no doubt that this is where the UAV industry is headed: towards real-time data collection and analysis. In terms of humanitarian applications, and search and rescue, having the data analysis carried out in real time is preferable.
This explains why my team and I recently teamed up with Elliot Salisbury & Sarvapali Ramchurn from the University of Southampton to crowdsource the analysis of live aerial video footage of disaster zones and to combine this crowdsourcing with (hopefully) near real-time machine learning and automated feature detection. In other words, as digital volunteers are busy tagging disaster damage in video footage, we want our algorithms to learn from these volunteers in real-time. That is, we’d like the algorithms to learn what disaster damage looks like so they can automatically identify any remaining disaster damage in a given aerial video.
So we recently carried out a MicroMappers test-deployment using aerial videos from the humanitarian UAV mission to Vanuatu. Close to 100 digital volunteers participated in this deployment. Their task? To click on any parts of the videos that show disaster damage. And whenever 80% or more of these volunteers clicked on the same areas, we would automatically highlight these areas to provide near real-time feedback to the UAV pilot and humanitarian teams.
At one point during the simulations, we had some 30 digital volunteers clicking on aerial videos at the same time, resulting in an average of 12 clicks per second for more than 5 minutes. In fact, we collectively clicked on the videos a total of 49,706 times! This provided more than enough real-time data for MicroMappers to act as a human-intelligence sensor for disaster damage assessments. In terms of accuracy, we had about 87% accuracy with the collective clicks. Here’s what the simulations looked like to the UAV pilots as we were all clicking away:
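The 80% agreement rule can be sketched in a few lines. This is only an illustration under stated assumptions: the grid-cell binning, the `cell_size` of 50 pixels, the function name `consensus_cells` and the sample clicks are all hypothetical, and the actual MicroMappers logic may well define "the same areas" differently.

```python
from collections import defaultdict


def consensus_cells(clicks, cell_size=50, threshold=0.8):
    """Return grid cells clicked by at least `threshold` of the volunteers.

    clicks -- iterable of (volunteer_id, x, y) pixel coordinates collected
              over one short stretch of video (an assumption of this sketch).
    """
    voters_by_cell = defaultdict(set)
    volunteers = set()
    for volunteer_id, x, y in clicks:
        volunteers.add(volunteer_id)
        # Snap each click to a coarse grid cell so nearby clicks agree.
        cell = (int(x // cell_size), int(y // cell_size))
        voters_by_cell[cell].add(volunteer_id)

    if not volunteers:
        return set()

    # Count distinct volunteers per cell, so repeat clicks do not inflate votes.
    return {
        cell
        for cell, voters in voters_by_cell.items()
        if len(voters) / len(volunteers) >= threshold
    }


# Toy clicks (assumed): four of five volunteers agree on one area.
clicks = [
    ("v1", 120, 80), ("v2", 130, 95), ("v3", 110, 60), ("v4", 140, 70),
    ("v5", 400, 300),
    ("v1", 410, 310),
]
highlighted = consensus_cells(clicks)
print(highlighted)
```

Counting distinct volunteers rather than raw clicks is a deliberate choice in this sketch: one enthusiastic volunteer clicking many times on the same spot should not be enough to trigger a highlight.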
Thanks to all this clicking, we can export only the most important and relevant parts of the video footage while the UAV is still flying. These snippets, such as this one and this one, can then be pushed to MicroMappers for additional verification. These animations are small and quick, and reduce a long aerial video down to just the most important footage. We’re now analyzing the areas that were tagged in order to determine whether we can use this data to train our algorithms accordingly. Again, this is far more than just an academic curiosity. If we can develop robust algorithms during the next few months, we’ll be ready to use them effectively during the next Typhoon season in the Pacific.
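One simple way to picture the snippet export is to look for stretches of video where clicks pile up. The sketch below is an assumption-laden illustration, not our export code: `busy_segments`, the one-second window, the `min_clicks` cutoff and the padding are all hypothetical choices.

```python
def busy_segments(click_times, window=1.0, min_clicks=5, padding=1.0):
    """Return merged (start, end) spans in seconds where clicks are dense.

    click_times -- video timestamps (seconds) at which volunteers clicked.
    """
    # Count clicks in fixed-size time buckets.
    counts = {}
    for t in click_times:
        bucket = int(t // window)
        counts[bucket] = counts.get(bucket, 0) + 1

    segments = []
    for bucket in sorted(b for b, n in counts.items() if n >= min_clicks):
        # Pad each busy bucket so the snippet has a little context.
        start = max(0.0, bucket * window - padding)
        end = (bucket + 1) * window + padding
        if segments and start <= segments[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return segments


# Toy timestamps (assumed): two bursts of clicks and one stray click.
click_times = [
    10.1, 10.2, 10.4, 10.6, 10.9,
    11.0, 11.1, 11.3, 11.5, 11.8,
    30.0,
    42.2, 42.3, 42.5, 42.7, 42.9,
]
snippets = busy_segments(click_times)
print(snippets)
```

The resulting spans could then be handed to any video tool to cut the corresponding clips; the stray click at 30 seconds is ignored because it never reaches the cutoff.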
In closing, big thanks to my team at QCRI for translating my vision of MicroMappers into reality and for trusting me well over a year ago when I said we needed to extend our work to aerial imagery. All of the above research would simply not have been possible without MicroMappers. Big thanks as well to our excellent partners at EPFL and Southampton for sharing our vision and for their hard work on our joint projects. Last but certainly not least, sincerest thanks to digital volunteers from SBTF and beyond for participating in these digital humanitarian deployments.