To start the discussion of a working methodology for data management, let’s think in terms of filters. The first filter can be called “time frame”: how quickly does an agency or person need to respond in order to accomplish their task in a humanitarian crisis setting? The second filter is “lag”: how much time has passed between when an event was reported and when the data was received by an aid agency? Together, these two filters create a framework for sifting the mass of data, which lets us do two things:
- In an operational setting, we first determine our operating “time frame”: is my job measured in hours, days, or weeks?
- Once the “time frame” filter is in place, we can apply the “lag” filter. If my work is measured in hours, then data older than a few hours is probably of little use to me. If I work with time frames measured in weeks or months, then I would be interested in data that could be weeks or even months old.
To put this into perspective, I’ll put the above methodology to use for two hypothetical actors in an emergency situation. The first actor is a trauma surgeon who has been deployed to a disaster area; they have a reliable vehicle and their medical instruments. If they have some sort of mobile data device, they can receive geo-synchronized pings for help from people with cellular phones. That’s all they really need; they are operating in response mode and are not concerned with forecasting where the next event will occur, so their need for precision is relatively low.
The second actor is a military logistics officer deployed to the same disaster area. While initial responders like the doctor are able to do their jobs in the meantime, the logistics officer must determine where to move limited resources so that they can be deployed as efficiently as possible. Because of the volume of material being moved, and the time this kind of movement takes, the logistics officer wants data that can give her some notion of where to place stores so they are most accessible relative to upcoming events. To accomplish this, the dataset of event locations must be fairly precise. A data collector will want coordinates of previous events, and will probably go through a process of cleaning the data to produce a statistical analysis of the most efficient place to store incoming goods and resources. Unlike the doctor, who is highly mobile, the logistics officer cares about precision because moving large amounts of material is time-consuming, and in an emergency time is of the essence.
Bridging the gap between the pace of technological innovation and managing the volume of data is a core challenge to realizing the value of emerging technologies in humanitarian and peacekeeping operations. I plan to lay out a couple of concepts for data analysis in later posts, focusing on methods that can take advantage of both the “real time” nature of the data and the fact that much of it reflects geographic location as well. For now, I hope this post can get some conversation going on filtering methodology and on improving the process for collecting new data.
Charles Martin-Shields is currently a training and research consultant with the U.S. Institute of Peace. His work focuses on conflict analysis, international development, and analytic methodology. He is currently completing a long-term study of educational development in conflict-affected contexts to be published in March 2011 through the University of Toronto, and consults on project development and risk analysis in post-conflict settings. He can be reached at email@example.com.