Non Profit Organization Advancing Peace

Overview

The Human Rights Data Analysis Group (HRDAG) is a non-profit organization that applies rigorous science to the analysis of human rights violations. Between 2020 and 2022, HRDAG collaborated with the Colombian Truth Commission (CEV, by its Spanish acronym) and the Special Jurisdiction for Peace (JEP, by its Spanish acronym) on the largest human rights data science project to date. The goal was to produce official statistical information about patterns of violence during the Colombian armed conflict. We used 112 datasets collected by 44 state institutions, victims’ organizations, and civil society organizations to analyze homicides, kidnappings, enforced disappearances, recruitment of child soldiers, and forced displacement. The results were included in the CEV’s Final Report and continue to be used by the JEP.

Even with access to multiple databases, some human rights violations are never documented. Missing data is therefore a central challenge: an analysis based only on documented cases can give a biased understanding of violence. Statistical methods can help account for missing data and bring us closer to the truth.

Our analyses consisted of three main components. First, we used semi-supervised machine learning to link and deduplicate the records. Second, we used statistical imputation to probabilistically fill in missing information in observed records, with neural networks incorporating relevant information from unstructured text fields. Third, we used multiple systems estimation to estimate the total universe of victims, including those who were never documented. We leveraged cloud computing solutions to run estimates for over 350,000 strata.
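
To give a sense of the idea behind the third component, here is a minimal sketch of capture-recapture estimation in its simplest form: two lists and Chapman's bias-corrected estimator. It is an illustration under strong assumptions, not the model used in the project, which combined many more sources. The victim IDs and both lists are hypothetical, and the sketch assumes the records are already linked and deduplicated and that the two sources document victims independently.

```python
def chapman_estimate(n_a: int, n_b: int, n_both: int) -> float:
    """Chapman's bias-corrected two-list capture-recapture estimate."""
    if not 0 <= n_both <= min(n_a, n_b):
        raise ValueError("overlap must be between 0 and the smaller list size")
    return (n_a + 1) * (n_b + 1) / (n_both + 1) - 1


# Hypothetical, already-deduplicated victim IDs from two independent sources.
list_a = {"v01", "v02", "v03", "v04", "v05", "v06"}
list_b = {"v04", "v05", "v06", "v07", "v08"}

documented = len(list_a | list_b)  # victims recorded by at least one source
overlap = len(list_a & list_b)     # victims recorded by both sources
estimated_total = chapman_estimate(len(list_a), len(list_b), overlap)

print(f"documented: {documented}, estimated total: {estimated_total}")
```

The smaller the overlap between the lists relative to their sizes, the larger the estimated number of victims that neither source recorded.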

In 2023, we published analysis-ready data along with the R package ‘verdata’ to help researchers and practitioners replicate the results and answer new questions about patterns of violence in Colombia.

What is the potential of your work for widespread impact? How do you meaningfully improve the lives of people?

Our work provides a way to acknowledge and account for unidentified victims who are missing from existing documentation efforts. For example, there are 374,567 documented victims of homicide between 1985 and 2018, but we estimate that the true number could be as high as 852,756. Without considering the impact of missing data, history could be excluding more than 400,000 victims who lost their lives in the Colombian armed conflict.
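
The arithmetic behind these figures can be checked in a few lines. The sketch below uses only the two numbers quoted above; the variable names are illustrative, and the result describes the upper end of the estimate, not a single point value.

```python
documented = 374_567       # documented homicide victims, 1985 to 2018
upper_estimate = 852_756   # upper end of the estimated total

# Victims who would be missing from the record if the upper estimate holds.
undocumented = upper_estimate - documented

# Share of the estimated total that documentation would have captured.
documented_share = round(100 * documented / upper_estimate, 1)

print(f"undocumented: {undocumented:,}")          # undocumented: 478,189
print(f"documented share: {documented_share}%")   # documented share: 43.9%
```

At the upper end of the estimate, documented records would therefore cover less than half of all homicide victims.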

Additionally, our open-source package ‘verdata’ can promote a deeper understanding of the armed conflict. It empowers any interested person to conduct further analyses and answer questions that are meaningful to them.

Lastly, our project can be a leading example of the application of rigorous science to human rights questions in other contexts. It can be used by other truth commissions, civil society, or States wishing to use quantitative methods to examine patterns of violence while acknowledging the limitations of documented violence.

How does your project support peacebuilding and/or conflict resolution efforts in the context of a humanitarian crisis or developmental context?

Since collecting information about human rights violations is dangerous and difficult, every database is likely to be incomplete. By using statistical methods to appropriately account for missing data, we can gain a better understanding of the magnitude of the Colombian armed conflict and contribute to the right to truth.

Because some victims were never recorded, we cannot be certain of a single number of victims. Without statistical estimation, there would be no way to calculate how far the true levels of violence differed from what was documented. Estimation makes it possible to reduce the uncertainty about the number of undocumented victims to a measurable and interpretable range. It also allows us to acknowledge the lives of people who were lost, even if we cannot name them.

Our project provides a consistent quantitative foundation for transitional justice mechanisms (such as truth commissions) to acknowledge and overcome missing data in registries of human rights violations.

In what ways does your project contribute to the existing PeaceTech ecosystem and research efforts in a compelling way?

First, we trained local researchers from the CEV and the JEP in principled data processing, computer coding, and statistical analysis. This strengthened these institutions’ internal capacity to conduct data-driven advocacy work. The results can be seen not only in the CEV’s Final Report but also in the JEP, whose data scientists continue to use these tools.

Second, the publication of the Final Report allows individuals from different backgrounds to approach data science and understand its uses for peacebuilding. Our technical annex is rigorous but also accessible to any reader, with no prior expertise in the field required.

Third, with the publication of ‘verdata’ we provide open access to the data and software necessary to not only replicate the published results but also answer new questions.

Fourth, we are conducting outreach to promote the project and the ‘verdata’ package. In this way, we are training the next generation of human rights data scientists to use technology for peacebuilding.

With the award funds, how would you expand the scope and applicability of your project or research beyond its initial pilot?

We would expand our efforts to raise local, regional, and international awareness of the challenges and limitations of observed data and the power of statistical tools to account for missing data. We believe that using statistics is key to acknowledging the suffering of the most vulnerable victims, who are likely the ones who are not included in any record.

We would like to give workshops to civil society, State organizations, and victims’ collectives about the methods that we used and the ‘verdata’ package. In addition, we would love to give those affected by violence the skills to do their own data-driven advocacy.

We would also like to carry out this work at an international level. Other countries continue to suffer violence, and we believe that the Colombian case can inspire the use of statistics elsewhere to answer difficult questions about human rights violations.

How does your work leverage collaborations and partnerships to unlock new opportunities and maximize impact?

First, it enables collaboration between data scientists from different parts of the world. In the Colombian case, approximately 20 data scientists from the US and Colombia worked together for over two years. The lessons from this exchange can be used in other projects, such as the active research projects from the JEP. In those, the Colombian data scientists are training other researchers in principles of data processing and statistical analysis.

Second, it opens conversations between social scientists and data scientists. Both kinds of expertise are necessary for a project about violence to succeed: data scientists for the methods and social scientists for the assumptions and interpretations. By creating bridges between these two worlds, we can contribute to the right to truth. 

Third, by serving as an example of good practice in using data science for peacebuilding, our project can unlock new opportunities and promote the application of these methods in other contexts of violence.