I’ve long been interested in sentiment analysis. See my previous post on the use of sentiment analysis for early warning here. So when we began receiving thousands of text messages from Haiti, I asked my colleagues at the EC’s Joint Research Center (JRC) whether they could run some of their sentiment analysis software on the incoming SMS’s.
The 4636 SMS initiative in Haiti was a collaboration between many organizations and was coordinated by Josh Nesbit of FrontlineSMS. The system allowed individuals in Haiti to text in their location and urgent needs. These would then be shared with some of the humanitarian actors on the ground and also mapped on the Ushahidi-Haiti platform, which was used by first responders such as the Marine Corps.
Here’s how the JRC, in partnership with the University of Alicante, carried out its analysis of the incoming SMS’s:
Because many individual words are ambiguous (e.g. the word ‘help’ mostly indicates a negative situation, but it may also be positive, as in “help has finally arrived”), they looked at the most frequent word groups, or word n-grams (2 to 5 words long). Out of these, they identified about 100 n-grams that they judged to be (highly) negative or (highly) positive. These were added to the sentiment analysis tool.
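To make the idea concrete, here is a minimal sketch in Python of counting word n-grams of 2 to 5 words and scoring a message against a hand-labelled list. This is not the JRC tool: the function names, the tokenizer, the lexicon entries and the weights (+2/+1/-1/-2) are all my own assumptions for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a crude tokenizer)."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, n_min=2, n_max=5):
    """Yield every word n-gram of length n_min..n_max as a space-joined string."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def frequent_ngrams(messages, top_k=10):
    """Return the top_k most frequent n-grams across all messages."""
    counts = Counter()
    for message in messages:
        counts.update(ngrams(tokenize(message)))
    return counts.most_common(top_k)


# Hypothetical hand-labelled lexicon: these phrases and weights are invented
# for illustration and are NOT the roughly 100 n-grams chosen by the JRC.
LEXICON = {
    "help has finally arrived": 2,   # highly positive
    "we need help": -1,              # negative
    "no food": -1,                   # negative
    "trapped under": -2,             # highly negative
}


def score(message, lexicon=LEXICON):
    """Sum the weights of every lexicon n-gram occurring in the message."""
    return sum(lexicon.get(gram, 0) for gram in ngrams(tokenize(message)))
```

In this sketch, `frequent_ngrams` would produce the candidate list that a person then reviews and labels by hand, and `score` applies the labelled list to each new message. Matching whole phrases is what lets “help has finally arrived” count as positive even though ‘help’ on its own usually is not.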
The graph below depicts the changing sentiment reflected in the SMS data between January 17th and February 5th.
There is, of course, no way to tell whether the incoming text messages reflect the general feeling of the population. It is also important to emphasize that the number of individuals sending in SMS’s increased during this time period. Still, it would be interesting to go through the sentiment analysis data and identify what may have contributed to the peaks and troughs of the above graph.
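Because the volume of messages changed over the period, a daily average is easier to compare across days than a daily sum. The sketch below shows one way to compute it. It assumes each message has already been given a numeric score, and I don’t know whether the JRC graph was aggregated this way; the function name and the sample values are made up for illustration.

```python
from collections import defaultdict
from datetime import date


def daily_mean_sentiment(scored_messages):
    """Map each day to (mean score, message count).

    scored_messages is an iterable of (day, score) pairs.
    """
    totals = defaultdict(lambda: [0.0, 0])  # day -> [sum of scores, count]
    for day, message_score in scored_messages:
        totals[day][0] += message_score
        totals[day][1] += 1
    return {day: (total / n, n) for day, (total, n) in sorted(totals.items())}


# Made-up scores, purely to show the shape of the input and output.
sample = [
    (date(2010, 1, 21), -2),
    (date(2010, 1, 21), -1),
    (date(2010, 1, 30), 1),
]
daily = daily_mean_sentiment(sample)
```

Keeping the message count next to each mean makes it easy to spot days where a peak or trough rests on only a handful of messages.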
Incidentally, the lowest point on this graph falls on January 21. The data reveal that a major aftershock took place that day. It was followed by reports of trauma, food/water shortages, casualties, need for medication, etc., which drove the sentiment analysis scores down.
Update 1: My colleague Ralf Steinberger and the Ushahidi-Haiti group are looking into the reasons behind the spike around January 30th. Ralf notes the following:
I checked the news a bit, using the calendar function in EMM NewsExplorer (http://emm.newsexplorer.eu/). I checked both the English and the French news for the day. One certainly positive news item accessible to Haitians on that day was that Haiti leaders pointed to progress. Another (French) positive news item is that the WFP (PAM) put in place a structured food aid system aiming at feeding up to 2 million people via women only. People were given food coupons (25kg of rice per family), starting Saturday 30.1.
Ralf also found that many of the original SMS’s received on that day had not been translated into English, so we’re looking into why that might have been. Hopefully we can get them translated retroactively for the purposes of this analysis.
Update 2: Josef Steinberger from the JRC has produced a revised sentiment analysis graph that runs through mid-March.
This kind of sentiment analysis can be done in real time. In future deployments where SMS becomes the principal channel for communicating with disaster-affected populations, this kind of approach may eventually provide an overall score for how the humanitarian community is doing.