For those working in the field of conflict prevention and humanitarian assistance, reliable real-time data plays a critical role in staging a successful intervention.  As a recent discussion at the U.S. Institute of Peace with Dr. Steven Livingston made clear, the humanitarian policy world is dealing with an environment where data gathering technology is advancing at an exceptional rate.  The conversation then addressed the challenge created by all this technology: the sheer volume of incoming data can overwhelm policy makers and field-based practitioners.

Where does the conversation go from here?  We know that the unfiltered mass of data can be confounding to a policy maker, and useless to an under-resourced soldier or aid worker.  But if data coming from mobile phones, remote sensors, and Droid devices can be filtered effectively, it could provide fine-grained, highly reliable information for humanitarian aid workers and peacekeepers.

To start the discussion of a working methodology for data management, let’s think in terms of filters.  The first filter can be called “time frame”.  How quickly does an agency or person need to respond in order to accomplish their task in a humanitarian crisis setting?  We will call our second filter “lag”.  How much time has passed between when an event was reported and when the data was received by an aid agency?  Together, these two filters create a framework for sorting through the mass of data, which lets us do two things:

  1. In an operational setting, we need to determine our operating “time frame”: is my job measured in hours, days, or weeks?
  2. Once we’ve got the “time frame” filter in place, we can then look at the “lag” filter.  If my work is measured in hours, then data older than a few hours might not be helpful.  If I work with time frames measured in weeks or months, then I would be interested in data that could be weeks or even months old.
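To make the two filters concrete, here is a minimal sketch in Python.  Everything in it is an assumption for illustration: the `Report` record, its field names, and the `filter_by_lag` helper are hypothetical, and using the time frame itself as the cutoff for acceptable lag is just one simple reading of the idea above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Report:
    """A single incoming data point (hypothetical structure)."""
    source: str             # e.g. "sms", "sensor"
    reported_at: datetime   # when the event was reported
    received_at: datetime   # when the data reached the agency

    @property
    def lag(self) -> timedelta:
        # "Lag" filter: time between the report and its receipt.
        return self.received_at - self.reported_at


def filter_by_lag(reports, time_frame):
    """Keep only reports whose lag fits inside the operating time frame.

    Assumption: the time frame doubles as the maximum acceptable lag.
    """
    return [r for r in reports if r.lag <= time_frame]


# Usage: the same two reports, seen by actors with different time frames.
now = datetime(2011, 1, 1, 12, 0)
reports = [
    Report("sms", now - timedelta(hours=1), now),
    Report("sensor", now - timedelta(days=3), now),
]
hours_scale = filter_by_lag(reports, timedelta(hours=6))   # work measured in hours
weeks_scale = filter_by_lag(reports, timedelta(weeks=2))   # work measured in weeks
```

Someone whose job is measured in hours ends up with only the recent SMS report, while someone working on a scale of weeks keeps both.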

To put this into perspective, I’ll put the above methodology to use for two hypothetical actors in an emergency situation.  The first actor is a trauma surgeon who has been deployed to a disaster area; they have a reliable vehicle and their medical instruments.  If they have some sort of mobile data device, they can receive geo-synchronized pings for help from people with cellular phones.  That’s all they really need; they are operating in a response mode and are not concerned with forecasting where the next event will occur, so the need for precision is relatively low.

The second actor is a military logistics officer deployed to the same disaster area.  While initial responders like the doctor are able to do their jobs in the meantime, the logistics officer must determine where to move limited resources so that they can be deployed as efficiently as possible.  Because of the volume of stuff being moved, and the time this kind of movement takes, the logistics officer wants data that can give her some notion of where to place stores so that they are in the most accessible area relative to upcoming events.  To accomplish this, a dataset of event locations must be fairly precise.  A data collector will want coordinates of previous events, and will probably go through a process of cleaning the data to produce a statistical analysis of the most efficient place to store incoming goods and resources.  Precision matters more to the logistics officer than to the highly mobile doctor because moving large amounts of stuff is time consuming, and in an emergency time is of the essence.
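As a rough illustration of what that kind of analysis might look like, the sketch below picks, from a short list of candidate storage sites, the one with the smallest average great-circle distance to previously recorded events.  This is only one simple approach and not a method described in the discussion above; the function names and all coordinates are made up for the example, and it assumes the event data has already been cleaned into latitude and longitude pairs.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def best_site(candidates, events):
    """Return the candidate site with the lowest mean distance to past events.

    Assumes `events` is a non-empty list of cleaned (lat, lon) pairs.
    """
    def mean_distance(site):
        return sum(haversine_km(site, e) for e in events) / len(events)

    return min(candidates, key=mean_distance)


# Usage with made-up coordinates: three past events and two possible depots.
events = [(18.54, -72.34), (18.51, -72.29), (18.58, -72.31)]
candidates = [(18.55, -72.30), (19.70, -72.20)]
depot = best_site(candidates, events)
```

The more precise the event coordinates, the more this kind of comparison can be trusted, which is why the logistics officer cares about precision in a way the surgeon does not.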

Bridging the gap between the pace of technological innovation and managing the volume of data is a core challenge to realizing the value of emerging technologies in humanitarian and peacekeeping operations.  I plan to lay out a couple of concepts for data analysis in later posts, focusing on methods that can take advantage of both the “real time” nature of the data and the fact that much of it reflects geographic location too.  For now, I hope this post can get some conversation going on filtering methodology and improving the process for collecting new data.

Charles Martin-Shields is currently a training and research consultant with the U.S. Institute of Peace.  His work focuses on conflict analysis, international development and analytic methodology.  He is currently completing a long-term study of educational development in conflict-affected contexts to be published in March 2011 through the University of Toronto, and consults on project development and risk analysis in post-conflict settings.  He can be reached at



  • Victoria

    It's good to read a thoughtful response to the deluge of data that to me often seems overwhelming and meaningless. I welcome an ongoing discussion of a working methodology for data management that non-data managers can understand, and Charles Martin-Shields has laid a good foundation for that discussion. I look forward to learning more "what" and "how" regarding data filters for international humanitarian aid workers and peacekeepers. And I hope he'll consider how those in domestic human services might identify and "corral" the data available in crisis stabilization and evaluation – i.e., those with non-medical jobs in substance abuse and addiction that are measured in days.

    • One tool is being developed to help manage what you correctly identify as a deluge of data; it allows you to eliminate redundant messages and quickly sort incoming data by who is sending it.