I had the pleasure of participating in two Internews sponsored meetings in New York today. Fellow participants included OCHA, Oxfram, Red Cross, Save the Children, World Vision, BBC World Service Trust, Thomson Reuters Foundation, Humanitarian Media Foundation, International Media Support and several others.
The first meeting was a three-hour brainstorming session on “Improving Humanitarian Information for Affected Communities” organized in preparation for the second meeting on “The Unmet Need for Communication in Humanitarian Response,” which was held at the UN General Assembly.
The meetings presented an ideal opportunity for participants to share information on current initiatives that focus on communications with crisis-affected populations. Ushahidi naturally came to mind so I introduced the concept of crowdsourcing crisis information. I should have expected the immediate push back on the issue of data validation.
Crowdsourcing and Data Validation
While I have already blogged about overcoming some of the challenges of data validation in the context of crowdsourcing here, there is clearly more to add since the demand for “fully accurate information” a.k.a. “facts and only facts” was echoed during the second meeting in the General Assembly. I’m hoping this blog post will help move the discourse beyond the black and white concepts that characterize current discussions on data accuracy.
Having worked in the field of conflict early warning and rapid response for the past seven years, I fully understand the critical importance of accurate information. Indeed, a substantial component of my consulting work on CEWARN in the Horn of Africa specifically focused on the data validation process.
To be sure, no one in the humanitarian and human rights community is asking for inaccurate information. We all subscribe to the notion of “Do No Harm.”
Does Time Matter?
What was completely missing from today’s meetings, however, was a reference to time. Nobody noted the importance of timely information during crises, which is rather ironic since both meetings focused on sudden onset emergencies. I suspect that our demand (and partial Western obsession) for fully accurate information has clouded some of our thinking on this issue.
This is particularly ironic given that evidence-based policy-making and data-driven analysis are still the exception rather than the rule in the humanitarian community. Field-based organizations frequently make decisions on coordination, humanitarian relief and logistics without complete and fully accurate, real-time information, especially right after a crisis strikes.
So why is this same community holding crowdsourcing to a higher standard?
Time versus Accuracy
Timely information when a crisis strikes is a critical element for many of us in the humanitarian and human rights communities. Surely then we must recognize the tradeoff between accuracy and timeliness of information. Crisis information is perishable!
The more we demand fully accurate information, the longer the data validation process typically takes and thus the more likely the information will be become useless. Our public health colleagues who work in emergency medicine know this only too well.
The figure below represents the perishable nature of crisis information. Data validation makes sense during time-periods A and B. Continuing to carry out data validation beyond time B may be beneficial to us, but hardly to crisis affected communities. We may very well have the luxury of time. Not so for at-risk communities.
This point often gets overlooked when anxieties around inaccurate information surface. Of course we need to insure that information we produce or relay is as accurate as possible. Of course we want to prevent dangerous rumors from spreading. To this end, the Thomson Reuters Foundation clearly spelled out that their new Emergency Information Service (EIS) would only focus on disseminating facts and only facts. (See my previous post on EIS here).
Yes, we can focus all our efforts on disseminating facts, but are those facts communicated after time-period B above really useful to crisis-affected communities? (Incidentally, since EIS will be based on verifiable facts, their approach may well be liked to Wikipedia’s rules for corrective editing. In any event, I wonder how EIS might define the term “fact”).
Ushahidi was created within days of the Kenyan elections in 2007 because both the government and national media were seriously under-reporting widespread human rights violations. I was in Nairobi visiting my parents at the time and it was also frustrating to see the majority of international and national NGOs on the ground suffering from “data hugging disorder,” i.e., they had no interest whatsoever to share information with each other or the public for that matter.
This left the Ushahidi team with few options, which is why they decided to develop a transparent platform that would allow Kenyans to report directly, thereby circumventing the government, media and NGOs, who were working against transparency.
Note that the Ushahidi team is only comprised of tech-experts. Here’s a question: why didn’t the human rights or humanitarian community set up a platform like Ushahidi? Why were a few tech-savvy Kenyans without a humanitarian background able to set up and deploy the platform within a week and not the humanitarian community? Where were we? Shouldn’t we be the ones pushing for better information collection and sharing?
In a recent study for the Harvard Humanitarian Initiative (HHI), I mapped and time-stamped reports on the post-election violence reported by the mainstream media, citizen journalists and Ushahidi. I then created a Google Earth layer of this data and animated the reports over time and space. I recommend reading the conclusions.
Accuracy is a Luxury
Having worked in humanitarian settings, we all know that accuracy is more often luxury than reality, particularly right after a crisis strikes. Accuracy is not black and white, yes or no. Rather, we need to start thinking in terms of likelihood, i.e., how likely is this piece of information to be accurate? All of us already do this everyday albeit subjectively. Why not think of ways to complement or triangulate our personal subjectivities to determine the accuracy of information?
At CEWARN, we included “Source of Information” for each incident report. A field reporter could select from several choices: (1) direct observation; (2) media, and (3) rumor. This gave us a three-point weighted-scale that could be used in subsequent analysis.
At Ushahidi, we are working on Swift River, a platform that applies human crowdsourcing and machine analysis (natural language parsing) to filter crisis information produced in real time, i.e., during time-periods A and B above. Colleagues at WikiMapAid are developing similar solutions for data on disease outbreaks. See my recent post on WikiMapAid and data validation here.
In sum, there are various ways to rate the likelihood that a reported event is true. But again, we are not looking to develop a platform that insures 100% reliability. If full accuracy were the gold standard of humanitarian response (or military action for that matter), the entire enterprise would come to a grinding halt. The intelligence community has also recognized this as I have blogged about here.
The purpose of today’s meetings was for us to think more concretely about communication in crises from the perspective of at-risk communities. Yet, as soon as I mentioned crowdsourcing the discussion became about our own demand for fully accurate information with no concerns raised about the importance of timely information for crisis-affected communities.
Ironic, isn’t it?
Patrick Philippe Meier