Part 6: Mobile Technologies and Collaborative Analytics

This is Part 6 of 7 of the highlights from “Illuminating the Path: The Research and Development Agenda for Visual Analytics.” Please see this post for an introduction to the study and access to the other 6 parts.

Mobile Technologies

The National Visualization and Analytics Center (NVAC) study recognizes that “mobile technologies will play a role in visual analytics, especially to users at the front line of homeland security.” To this end, researchers must “devise new methods to best employ these technologies and provide a means to allow data to scale between high-resolution displays in command and control centers to field-deployable displays.”

Collaborative Analytics

While collaborative platforms from wikis to Google Docs allow many individuals to work together, these functionalities rarely feature in crisis mapping platforms. And yet humanitarian crises (just like homeland security challenges) are so complex that they cannot be addressed by individuals working in silos.

On the contrary, crisis analysis, civilian protection and humanitarian response efforts are “sufficiently large scale and important that they must be addressed through the coordinated action of multiple groups of people, often with different backgrounds working in disparate locations with differing information.”

In other words, “the issue of human scalability plays a critical role, as systems must support the communications needs of these groups of people working together across space and time, in high-stress and time-sensitive environments, to make critical decisions.”

Patrick Philippe Meier

Updated: Humanitarian Situation Risk Index (HSRI)

The Humanitarian Situation Risk Index (HSRI) is a tool created by UN OCHA in Colombia. The objective of HSRI is to determine the probability that a humanitarian situation occurs in each of the country’s municipalities in relation to the ongoing complex emergency. HSRI’s overall purpose is to serve as a “complementary analytical tool in decision-making allowing for humanitarian assistance prioritization in different regions as needed.”

UPDATE: I actually got in touch with the HSRI group back in February 2009 to let them know about Ushahidi and they have since “been running some beta-testing on Ushahidi, and may as of next week start up a pilot effort to organize a large number of actors in northeastern Colombia to feed data into [their] on-line information system.” In addition, they “plan to move from a logit model calculating probability of a displacement situation for each of the 1,120 Colombian municipalities, to cluster analysis, and have been running the identical model on data [they] have for confined communities.”

[Figure: HSRI risk map of Colombia]

HSRI uses statistical tools (principal component analysis and the Logit model) to estimate the risk indexes. The indexes range from 0 to 1, where 0 is no risk and 1 is maximum risk. The team behind the project clearly state that the tool does not indicate the current situation in each municipality given that the data is not collected in real-time. Nor does the tool quantify the precise number of persons at risk.
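To make the logit piece concrete, here is a minimal sketch of how a logit model maps a municipality’s indicator values to a risk index bounded between 0 and 1. The indicators, weights and values below are entirely hypothetical; the actual HSRI model is estimated from official data, not hand-set coefficients.

```python
import math

def logit_risk(indicators, weights, intercept):
    """Minimal logit model: maps a municipality's indicator values
    to a risk index between 0 (no risk) and 1 (maximum risk)."""
    z = intercept + sum(w * x for w, x in zip(weights, indicators))
    return 1.0 / (1.0 + math.exp(-z))  # logistic (inverse-logit) function

# Hypothetical indicator values for two municipalities, e.g.
# (prior displacement events, armed-group presence).
low_risk = logit_risk([0.1, 0.0], weights=[1.5, 2.0], intercept=-3.0)
high_risk = logit_risk([2.0, 1.5], weights=[1.5, 2.0], intercept=-3.0)

print(round(low_risk, 2), round(high_risk, 2))  # both fall within [0, 1]
```

Whatever the indicators, the logistic function guarantees the 0-to-1 bound the HSRI team describes.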

The data used to estimate the Humanitarian Situation Risk Index “mostly comes from official sources, due to the fact that the vast majority of data collected and processed are from State entities, and in the remaining cases the data is from non-governmental or multilateral institutions.” The following table depicts the data collected.

[Table: data sources used to estimate the HSRI]

I’d be interested to know whether the project will move towards any temporal analysis of the data. This would enable trend analysis, which could more directly inform decision-making than a static map representing static data. One other thought might be to complement this “baseline” type data with event-data by using mobile phones and a “bounded crowdsourcing” approach à la Ushahidi.

Patrick Philippe Meier

Part 5: Data Visualization and Interactive Interface Design

This is Part 5 of 7 of the highlights from “Illuminating the Path: The Research and Development Agenda for Visual Analytics.” Please see this post for an introduction to the study and access to the other 6 parts.

Data Visualization

The visualization of information “amplifies human cognitive capabilities in six basic ways” by:

  • Increasing cognitive resources, such as by using a visual resource to expand human working memory;
  • Reducing search, such as by representing a large amount of data in a small place;
  • Enhancing the recognition of patterns, such as when information is organized in space by its time relationships;
  • Supporting the easy perceptual inference of relationships that are otherwise more difficult to induce;
  • Enabling perceptual monitoring of a large number of potential events;
  • Providing a manipulable medium that, unlike static diagrams, enables the exploration of a space of parameter values.

The table below provides additional information on how visualization amplifies cognition:

[Table: how visualization amplifies cognition]

Clearly, “these capabilities of information visualization, combined with computational data analysis, can be applied to analytic reasoning to support the sense-making process.” The National Visualization and Analytics Center (NVAC) thus recommends developing “visually based methods to support the entire analytic reasoning process, including the analysis of data as well as structured reasoning techniques such as the construction of arguments, convergent-divergent investigation, and evaluation of alternatives.”

Since “well-crafted visual representations can play a critical role in making information clear […], the visual representations and interactions we develop must readily support users of varying backgrounds and expertise.” To be sure, “visual representations and interactions must be developed with the full range of users in mind, from the experienced user to the novice working under intense pressure […].”

As NVAC notes, “visual representations are the equivalent of power tools for analytical reasoning.” But just like real power tools, they can cause harm if used carelessly. Indeed, it is important to note that “poorly designed visualizations may lead to an incorrect decision and great harm. A famous example is the poor visualization of the O-ring data produced before the disastrous launch of the Challenger space shuttle […].”

Effective Depictions

This is why we need some basic principles for developing effective depictions, such as the following:

  • Appropriateness Principle: the visual representation should provide neither more nor less information than that needed for the task at hand. Additional information may be distracting and makes the task more difficult.
  • Naturalness Principle: experiential cognition is most effective when the properties of the visual representation most closely match the information being represented. This principle supports the idea that new visual metaphors are only useful for representing information when they match the user’s cognitive model of the information. Purely artificial visual metaphors can actually hinder understanding.
  • Matching Principle: representations of information are most effective when they match the task to be performed by the user. Effective visual representations should present affordances suggestive of the appropriate action.
  • Congruence Principle: the structure and content of a visualization should correspond to the structure and content of the desired mental representation.
  • Apprehension Principle: the structure and content of a visualization should be readily and accurately perceived and comprehended.

Further research is needed to understand “how best to combine time and space in visual representation.” For example, in the flow map, “spatial information is primary” in that it defines the coordinate system, but “why is this the case, and are there visual representations where time is foregrounded that could also be used to support analytical tasks?”

In sum, we must deepen our understanding of temporal reasoning and “create task-appropriate methods for integrating spatial and temporal dimensions of data into visual representations.”

Interactive Interface Design

It is important in the visual analytics process that researchers focus on visual representations of data and interaction design in equal measure. “We need to develop a ‘science of interaction’ rooted in a deep understanding of the different forms of interaction and their respective benefits.”

For example, one promising approach for simplifying interactions is to use 3D graphical user interfaces. Another is to move beyond single modality (or human sense) interaction techniques.

Indeed, recent research suggests that “multi-modal interfaces can overcome problems that any one modality may have. For example, voice and deictic (e.g., pointing) gestures can complement each other and make it easier for the user to accomplish certain tasks.” In fact, studies suggest that “users prefer combined voice and gestural communication over either modality alone when attempting graphics manipulation.”

Patrick Philippe Meier

Part 4: Automated Analysis and Uncertainty Visualized

This is Part 4 of 7 of the highlights from “Illuminating the Path: The Research and Development Agenda for Visual Analytics.” Please see this post for an introduction to the study and access to the other 6 parts.

As data flooding increases, the human eye may have difficulty focusing on patterns. To this end, VA systems should have “semi-automated analytic engines and user-driven interfaces.” Indeed, “an ideal environment for analysis would have a seamless integration of computational and visual techniques.”

For example, “the visual overview may be based on some preliminary data transformations […]. Interactive focusing, selecting, and filtering could be used to isolate data associated with a hypothesis, which could then be passed to an analysis engine with informed parameter settings. Results could be superimposed on the original information to show the difference between the raw data and the computed model, with errors highlighted visually.”
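That loop can be sketched in a few lines of code: fit a simple model to the data selected for a hypothesis, then highlight the points where the raw data departs from the computed model. The data, model and threshold below are made up purely for illustration.

```python
# Illustrative pipeline: fit a model to hypothesis-relevant data, then
# flag points where the raw data diverges from the computed model
# (the "errors highlighted visually" step from the quote above).

def fit_line(points):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    b = (sum((x - mx) * (y - my) for x, y in points)
         / sum((x - mx) ** 2 for x, _ in points))
    return my - b * mx, b

def flag_outliers(points, threshold=1.0):
    """Return points whose residual from the fitted line exceeds threshold."""
    a, b = fit_line(points)
    return [(x, y) for x, y in points if abs(y - (a + b * x)) > threshold]

data = [(0, 0.1), (1, 1.0), (2, 2.1), (3, 6.0), (4, 5.0)]
print(flag_outliers(data))  # only the (3, 6.0) point is flagged
```

A visual analytics system would superimpose those flagged points on the original view rather than print them, but the division of labor is the same.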

Yet current mathematical techniques “for representing pattern and structure, as well as visualizing correlations, time patterns, metadata relationships, and networks of linked information,” do not work well for more complex reasoning tasks: “particularly [for] temporal reasoning and combined time and space reasoning […], much work remains to be done.” Furthermore, “existing techniques also fail when faced with the massive scale, rapidly changing data, and variety of information types we expect for visual analytics tasks.”

In addition, “the complexity of this problem will require algorithmic advances to address the establishment and maintenance of uncertainty measures at varying levels of data abstraction.” There is presently “no accepted methodology to represent potentially erroneous information, such as varying precision, error, conflicting evidence, or incomplete information.”

To this end, “interactive visualization methods are needed that allow users to see what is missing, what is known, what is unknown, and what is conjectured, so that they may infer possible alternative explanations.”

In sum, “uncertainty must be displayed if it is to be reasoned with and incorporated into the visual analytics process. In existing visualizations, much of the information is displayed as if it were true.”
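As a small illustration of what carrying uncertainty through to the display might look like, the sketch below aggregates conflicting field reports while keeping the spread and a confidence score alongside the estimate, rather than presenting a single number as if it were true. The data and the 0-1 confidence scale are hypothetical.

```python
# Toy example of displaying a value together with its uncertainty.
# Each report is (value, confidence); the aggregate keeps the estimate,
# the spread between conflicting reports, and the mean confidence.

def aggregate(reports):
    """Return (mean estimate, (min, max) spread, mean confidence)."""
    values = [v for v, _ in reports]
    confidences = [c for _, c in reports]
    estimate = sum(values) / len(values)
    return estimate, (min(values), max(values)), sum(confidences) / len(confidences)

# Conflicting reports of displaced households in one area (hypothetical).
reports = [(120, 0.9), (150, 0.6), (90, 0.3)]
estimate, (low, high), confidence = aggregate(reports)
label = f"{estimate:.0f} households (range {low}-{high}, confidence {confidence:.2f})"
print(label)  # "120 households (range 90-150, confidence 0.60)"
```

The point is simply that the label exposes what is known, what conflicts, and how much to trust it, instead of a bare “120.”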

Patrick Philippe Meier

Part 3: Data Tetris and Information Synthesis

This is Part 3 of 7 of the highlights from “Illuminating the Path: The Research and Development Agenda for Visual Analytics.” Please see this post for an introduction to the study and access to the other 6 parts.

Visual Analytics (VA) tools need to integrate and visualize different data types. But the integration of this data needs to be “based on their meaning rather than the original data type” in order to “facilitate knowledge discovery through information synthesis.” However, “many existing visual analytics systems are data-type-centric. That is, they focus on a particular type of data […].”

We know that different types of data are regularly required to conduct solid analysis, so developing a data synthesis capability is particularly important. This means the ability to “bring data of different types together in a single environment […] to concentrate on the meaning of the data rather than on the form in which it was originally packaged.”

To be sure, information synthesis needs to “extend beyond the current data-type modes of analysis to permit the analyst to consider dynamic information of all types in [a] seamless environment.” So we need to “eliminate the artificial constraints imposed by data type so that we can aid the analyst in reaching deeper analytical insight.”

To this end, we need breakthroughs in “automatic or semi-automatic approaches for identifying [and coding] content of imagery and video data.” A semi-automatic approach could draw on crowdsourcing, much like Ushahidi’s Swift River.

In other words, we need to develop visual analytical tools that do not force the analyst to “perceptually and cognitively integrate multiple elements. […] Systems that force a user to view sequence after sequence of information are time-consuming and error-prone.” New techniques are also needed to do away with the separation of ‘what I want and the act of doing it.'”

Patrick Philippe Meier

Armed Conflict Location and Event Dataset (ACLED)

I joined the Peace Research Institute Oslo (PRIO) as a researcher in 2006 to do some data development work on a conflict dataset and to work with Norway’s former Secretary of State on assessing the impact of armed conflict on women’s health for the Ministry of Foreign Affairs (MFA).

I quickly became interested in a related PRIO project that had recently begun, called the “Armed Conflict Location and Event Dataset,” or ACLED. Having worked with conflict event-datasets as part of operational conflict early warning systems in the Horn of Africa, I immediately took interest in the project.

While I have referred to ACLED in a number of previous blog posts, two of my main criticisms (until recently) were (1) the lack of data on recent conflicts; and (2) the lack of an interactive interface for geospatial analysis, or at least a more compelling visualization platform.

Introducing SpatialKey

Independently, I came across Universal Mind back in November of last year when Andrew Turner at GeoCommons made a reference to the group’s work in his presentation at an Ushahidi meeting. I featured one of the group’s products, SpatialKey, in my recent video primer on crisis mapping.

As it turns out, ACLED is now using SpatialKey to visualize and analyze some of its data. So the team has definitely come a long way from using ArcGIS and Google Earth, which is great. The screenshot below, for example, depicts the ACLED data on Kenya’s post-election violence using SpatialKey.

[Figure: ACLED data on Kenya’s post-election violence visualized in SpatialKey]

If the Kenya data is not drawn from Ushahidi, then this could be an exciting research opportunity to compare both datasets using visual analysis and applied geo-statistics. I write “if” because PRIO, somewhat surprisingly, has not made the Kenya data available. They are usually very transparent, so I will follow up with them and hope to get the data. Anyone interested in co-authoring this study?

Academics Get Up to Speed

It’s great to see ACLED developing conflict data for more recent conflicts. Data on Chad, Sudan and the Central African Republic (CAR) is also depicted using SpatialKey, but regrettably the underlying spreadsheet data again does not appear to be available. If the data were public, then the UN’s Threat and Risk Mapping Analysis (TRMA) project might well have much to gain from using the data operationally.

[Figure: ACLED data on Chad, Sudan and CAR visualized in SpatialKey]

Data Hugging Disorder

I’ll close with just one (perhaps unwarranted) concern, since I still haven’t heard back from ACLED about accessing their data. Academics are becoming increasingly interested in applying geospatial analysis to recent or even current conflicts by developing their own datasets, which is certainly a very positive move. But will these academics keep their data to themselves until they have published an article in a peer-reviewed journal, which can often take a year or more?

To this end, I share the concern that my colleague Ed Jezierski from InSTEDD articulated in his excellent blog post yesterday: “Academic projects that collect data with preference towards information that will help to publish a paper rather than the information that will be the most actionable or help community health the most.” Worse still, however, would be academics collecting data highly relevant to the humanitarian or human rights community and not sharing it until their academic papers are officially published.

I don’t think there needs to be competition between scholars and like-minded practitioners. There are increasingly more scholar-practitioners who recognize that they can contribute their research and skills to the benefit of the humanitarian and human rights communities. At the same time, the currency of academia remains the number of peer-reviewed publications. But humanitarian practitioners could simply sign an agreement such that anyone using the data for humanitarian purposes cannot publish any analysis of said data in a peer-reviewed forum.

Thoughts?

Patrick Philippe Meier

Part 2: Data Flooding and Platform Scarcity

This is Part 2 of 7 of the highlights from “Illuminating the Path: The Research and Development Agenda for Visual Analytics.” Please see this post for an introduction to the study and access to the other 6 parts.

Data Flooding

Data flooding is a term I use to illustrate the fact that “our ability to collect data is increasing at a faster rate than our ability to analyze it.” To this end, I completely agree with the recommendation that new methods are required to “allow the analyst to examine this massive, multi-dimensional, multi-source, time-varying information stream to make decisions in a time-critical manner.”

We don’t want less, but rather more information since “large data volumes allow analysts to discover more complete information about a situation.” To be sure, “scale brings opportunities as well.” As a result, for example, “analysts may be able to determine more easily when expected information is missing,” which sometimes “offers important clues […].”

However, while computer processing power and memory density have changed radically over the decades, “basic human skills and abilities do not change significantly over time.” Technological advances can certainly leverage our skills “but there are fundamental limits that we are asymptotically approaching,” hence the notion of information glut.

In other words, “human skills and abilities do not scale.” That said, the number of humans involved in analytical problem-solving does scale. Unfortunately, however, “most published techniques for supporting analysis are targeted for a single user at a time.” This means that new techniques that “gracefully scale from a single user to a collaborative (multi-user) environment” need to be developed.

Platform Scarcity

However, current technologies and platforms being used in the humanitarian and human rights communities do not address the needs for handling ever-changing volumes of information. “Furthermore, current tools provide very little in the way of support for the complex tasks of the analysis and discovery process.” There clearly is a platform scarcity.

Admittedly, “creating effective visual representations is a labor-intensive process that requires a solid understanding of the visualization pipeline, characteristics of the data to be displayed, and the tasks to be performed.”

However, as is clear from the crisis mapping projects I have consulted on, “most visualization software is written with incomplete knowledge of at least some of this information.” Indeed, it is rarely possible for “the analyst, who has the best understanding of the data and task, to construct new tools.”

The NVAC study thus recommends that “research is needed to create software that supports the most complex and time-consuming portions of the analytical process, so that analysts can respond to increasingly more complex questions.” To be sure, “we need real-time analytical monitoring that can alert first responders to unusual situations in advance.”
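As a toy illustration of such real-time analytical monitoring, the sketch below compares each incoming count against a rolling baseline and flags sharp deviations. The window size and alert factor are arbitrary choices for the sketch, not values from the NVAC study.

```python
# A toy monitor that flags incoming counts which deviate sharply from a
# rolling baseline, i.e. an "alert on unusual situations" primitive.
from collections import deque

def monitor(stream, window=5, factor=3.0):
    """Yield any value exceeding `factor` times the rolling average."""
    recent = deque(maxlen=window)
    for value in stream:
        if len(recent) == window and value > factor * (sum(recent) / window):
            yield value
        recent.append(value)

# Daily incident counts with one sudden spike.
counts = [2, 3, 2, 4, 3, 2, 3, 14, 3, 2]
print(list(monitor(counts)))  # [14]
```

A real system would of course use richer statistics and stream the alerts to a visual display, but the shape of the problem is the same.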

Patrick Philippe Meier

Part 1: Visual Analytics

This is Part 1 of 7 of the highlights from “Illuminating the Path: The Research and Development Agenda for Visual Analytics.” Please see this post for an introduction to the study and access to the other 6 parts.

NVAC defines Visual Analytics (VA) as “the science of analytical reasoning facilitated by interactive visual interfaces. People use VA tools and techniques to synthesize information and derive insights from massive, dynamic, ambiguous, and often conflicting data; detect the expected and discover the unexpected; provide timely, defensible, and understandable assessments; and communicate assessment effectively for action.”

The field of VA is necessarily multidisciplinary and combines “techniques from information visualization with techniques from computational transformation and analysis of data.” VA includes the following focus areas:

  • Analytical reasoning techniques, “that enable users to obtain deep insights that directly support assessment, planning and decision-making”;
  • Visual representations and interaction techniques, “that take advantage of the human eye’s broad bandwidth pathway into the mind to allow users to see, explore, and understand large amounts of information at once”;
  • Data representation and transformations, “that convert all types of conflicting and dynamic data in ways that support visualization and analysis”;
  • Production, presentation and dissemination techniques, “to communicate information in the appropriate context to a variety of audiences.”

As is well known, “the human mind can understand complex information received through visual channels.” The goal of VA is thus to facilitate the analytical reasoning process “through the creation of software that maximizes human capacity to perceive, understand, and reason about complex and dynamic situations.”

In sum, “the goal is to facilitate high-quality human judgment with a limited investment of the analysts’ time.” This means in part to “expose all relevant data in a way that facilitates the reasoning process to enable action.” To be sure, solving a problem often means representing it so that the solution is more obvious (adapted from Herbert Simon). Sometimes, “the simple act of placing information on a timeline or a map can generate clarity and profound insight.” Indeed, both “temporal relationships and spatial patterns can be revealed through timelines and maps.”

VA also reduces the costs associated with sense-making in two primary ways, by:

  1. Transforming information into forms that allow humans to offload cognition onto easier perceptual processes;
  2. Allowing software agents to do some of the filtering, representation translation, interpretation, and even reasoning.

That said, we should keep in mind that “human-designed visualizations are still much better than those created by our information visualization systems.” That is, there are more “highly evolved and widely used metaphors created by human information designers” than there are “successful new computer-mediated visual representations.”

Patrick Philippe Meier

Research Agenda for Visual Analytics

I just finished reading “Illuminating the Path: The Research and Development Agenda for Visual Analytics.” The National Visualization and Analytics Center (NVAC) published the 200-page book in 2005, and the volume is absolutely one of the best treatises I’ve come across on the topic yet. The purpose of the series of posts that follows is to share some highlights and excerpts relevant for crisis mapping.

[Image: cover of “Illuminating the Path”]

Co-edited by James Thomas and Kristin Cook, the book focuses specifically on homeland security, but there are numerous insights to be gained on how visual analytics can also illuminate the path for crisis mapping analytics. Recall that the field of conflict early warning originated in part from World War II and the lack of warning before the attack on Pearl Harbor.

Several coordinated systems for the early detection of a Soviet bomber attack on North America were set up in the early days of the Cold War. The Distant Early Warning Line, or DEW Line, was the most sophisticated of these. The point to keep in mind is that the national security establishment is often in the lead when it comes to initiatives that can also be applied for humanitarian purposes.

The motivation behind the launch of NVAC and this study was 9/11. In my opinion, this volume goes a long way toward validating the field of crisis mapping. I highly recommend it to colleagues in both the humanitarian and human rights communities. In fact, the book is directly relevant to my current consulting work with the UN’s Threat and Risk Mapping Analysis (TRMA) project in the Sudan.

So this week, iRevolution will be dedicated to sharing daily highlights from the NVAC study. Taken together, these posts will provide a good summary of the rich and in-depth 200-page study. So check back on this post for live links to the NVAC highlights:

Part 1: Visual Analytics

Part 2: Data Flooding and Platform Scarcity

Part 3: Data Tetris and Information Synthesis

Part 4: Automated Analysis and Uncertainty Visualized

Part 5: Data Visualization and Interactive Interface Design

Part 6: Mobile Technologies and Collaborative Analytics

Part 7: Towards a Taxonomy of Visual Analytics

Note that the sequence above does not correspond to specific individual chapters in the NVAC study. This structure for the summary is what made most sense.

Patrick Philippe Meier

UN Sudan Information Management Working Group

I’m back in the Sudan to continue my work with the UNDP’s Threat and Risk Mapping Analysis (TRMA) project. UN agencies typically suffer from what a colleague calls “Data Hugging Disorder (DHD),” i.e., they rarely share data. This is generally the rule, not the exception.

UN Exception

There is an exception, however: the UN’s recently established Information Management Working Group (IMWG) in the Sudan. The general goal of the IMWG is to “facilitate the development of a coherent information management approach for the UN Agencies and INGOs in Sudan in close cooperation with local authorities and institutions.”

More specifically, the IMWG seeks to:

  1. Support and advise the UNDAF Technical Working Groups and Work Plan sectors in the accessing and utilization of available data for improved development planning and programming;
  2. Develop, or advise on the development of, a Sudan-specific tool, or set of tools, to support decentralized information-sharing and common GIS mapping, in such a way that it will be consistent with the DevInfo system development, and can eventually be adopted/integrated as a standard plug-in for the same.

To accomplish these goals, the IMWG will collectively assume a number of responsibilities including the following:

  • Agree on information sharing protocols, including modalities of shared information update;
  • Review current information management mechanisms to ensure a coherent approach.

The core members of the working group include: IOM, WHO, FAO, UNICEF, UNHCR, UNFPA, WFP, OCHA and UNDP.

Information Sharing Protocol

These members recently signed and endorsed an “Information Sharing Protocol”. The protocol sets out the preconditions, the responsibilities and the rights of the IMWG members for sharing, updating and accessing the data of the information providers.

With this protocol, each member commits to sharing specific datasets, in specific formats and at specific intervals. The data provided is classified as either public access or classified access. The latter is further disaggregated into three categories:

  1. UN partners only;
  2. IMWG members only;
  3. [Agency/group] only.

There is also a restricted access category, which is granted on a case-by-case basis only.
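To illustrate how these access categories might be enforced in software, here is a minimal sketch. The membership sets, the non-IMWG partner “UNMIS”, and the function itself are my own invention for illustration; they are not part of the actual protocol.

```python
# Hypothetical enforcement of the IMWG access categories described
# above. Sets and category names are illustrative only.

IMWG_MEMBERS = {"IOM", "WHO", "FAO", "UNICEF", "UNHCR",
                "UNFPA", "WFP", "OCHA", "UNDP"}
UN_PARTNERS = IMWG_MEMBERS | {"UNMIS"}  # hypothetical wider UN set

def may_access(requester, classification, owner=None):
    """Decide whether `requester` may see a dataset with the given
    access classification; `owner` applies to agency-only data."""
    if classification == "public":
        return True
    if classification == "un_partners_only":
        return requester in UN_PARTNERS
    if classification == "imwg_members_only":
        return requester in IMWG_MEMBERS
    if classification == "agency_only":
        return requester == owner
    return False  # restricted access: case-by-case, deny by default

print(may_access("UNMIS", "un_partners_only"),   # True
      may_access("UNMIS", "imwg_members_only"))  # False
```

Note the deny-by-default for anything outside the named categories, which mirrors the case-by-case handling of restricted access.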

UNDP/TRMA’s Role

UNDP’s role (via TRMA) in the IMWG is to technically support the administration of information-sharing between IMWG members. More specifically, UNDP will provide ongoing technical support for the development and upgrading of the IMWG database tool in accordance with the needs of the Working Group.

In addition, UNDP’s role is to receive data updates, to update the IMWG tool and to circulate data according to the classification of access determined by individual contributing agencies. Might a more seamless information-sharing approach work, one in which UNDP does not have to be the repository of the data, let alone manually update the information?

In any case, the very existence of a UN Information Management Working Group in the Sudan suggests that Data Hugging Disorders (DHDs) can be cured.

Patrick Philippe Meier