Could Big Data Be the Cure for the Largest Ebola Outbreak Ever?

October 24, 2014
Until March 2014, the largest Ebola outbreak on record was the one in Uganda in 2000-2001, which caused 425 reported human cases and 224 deaths. Unfortunately, things have changed since March, and we are now in the middle of the largest Ebola outbreak ever. The number of people infected with Ebola has reached almost 10,000 in West Africa, and the death toll now stands at almost 4,900. The three countries at the epicentre of this outbreak, Guinea, Liberia and Sierra Leone, are among the poorest countries in the world. With no vaccine available yet, stopping this outbreak is extremely difficult, but could we turn to Big Data to stop the disease from spreading?

Tracking diseases, and especially infectious diseases, is already a common use case for Big Data. A well-known example is Google Flu Trends, which used to be better at predicting a flu outbreak in a given location than, for example, the US Centers for Disease Control and Prevention. Although Google Flu Trends has lost some of its predictive power, it shows the potential Big Data offers for disease tracking.

Traditionally, infectious diseases are mapped by recording disease occurrences obtained from literature, web reports and GenBank. GenBank is a database containing publicly available nucleotide sequences for more than 250,000 described species. This helps to define the extent of the disease and to map all locations where the disease has been reported. To predict how a disease may spread, other epidemiologically relevant environmental variables, such as temperature and rainfall, are added. Using statistics, maps are developed with regions shaded from low to high probability of occurrence. Such maps help organisations to determine which actions are required to limit the spread of the disease. Unfortunately, this process is time-consuming and not very accurate. In the data-driven world that we live in, this could be done differently.

Challenges When Fighting Ebola

The first challenge we face when applying Big Data techniques to combat Ebola is that reliable and accurate data is difficult to collect. Countries like Guinea, Liberia and Sierra Leone have very limited infrastructure, both in terms of roads and IT. Smartphone penetration is extremely low and accurate public data from the government is hard to find. In addition, these countries are among the world’s poorest and lack the funding to overcome these challenges.

Fortunately, the IMF has proposed an additional $127 million in funding to help these countries fight Ebola. In addition, the European Union has recently announced €24.4 million from the EU budget for urgently needed Ebola research. This funding will go to five projects, ranging from a large-scale clinical trial of a potential vaccine to testing existing and novel compounds to treat Ebola.

Stopping an Ebola Outbreak

Although there are some substantial challenges to cope with, there are various new and innovative ways to use data that could help combat Ebola successfully. Apart from helping to find a medicine for Ebola, Big Data can also be used to get the outbreak under control and to prevent more people from becoming infected.

One of the most important tasks in stopping an Ebola outbreak is tracing all the people who were in contact with an infected person and monitoring them for the duration of the incubation period, which can last up to 21 days. This is vital, and doing it correctly can indeed stop an Ebola outbreak, as was proven by Nigeria, which successfully stopped the outbreak despite being the most populous African country. Of course, this is not the only important aspect. You also need sufficient high-quality materials and trained staff, which Nigeria luckily had, to be able to safely monitor anyone who might be infected.
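
To make the monitoring step concrete, here is a minimal Python sketch that derives, for each direct contact of a confirmed case, the last day on which that person should still be monitored. The record layout (two person IDs and a contact date), the IDs and the dates are all made up for illustration; this is not how the Nigerian authorities organised their tracing.

```python
from datetime import date, timedelta

MONITORING_DAYS = 21  # maximum incubation period cited in the article

def monitoring_schedule(contacts, index_case):
    """Return {person: last monitoring day} for direct contacts of a case.

    contacts: iterable of (person_a, person_b, contact_date) tuples;
    this layout is a hypothetical simplification of a tracing record.
    """
    schedule = {}
    for a, b, day in contacts:
        if index_case not in (a, b):
            continue  # this contact does not involve the confirmed case
        other = b if a == index_case else a
        if other == index_case:
            continue  # ignore malformed self-contact records
        end = day + timedelta(days=MONITORING_DAYS)
        # If a person met the case several times, the latest contact counts.
        if other not in schedule or end > schedule[other]:
            schedule[other] = end
    return schedule

# Made-up sample records
sample_contacts = [
    ("case_1", "p2", date(2014, 10, 1)),
    ("p3", "case_1", date(2014, 10, 3)),
    ("case_1", "p2", date(2014, 10, 5)),
    ("p2", "p4", date(2014, 10, 6)),  # not a direct contact of case_1
]
schedule = monitoring_schedule(sample_contacts, "case_1")
```

The sketch only covers first-degree contacts. If a monitored person became a confirmed case, the same function could be run again with that person as the index case.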

Another important measure is to keep people sufficiently informed. The Nigerian authorities used several initiatives to communicate the message, including radio, television and house-to-house campaigns in local dialects, to inform people about the risks of the disease.

Big Data can be used to track how the disease spreads, or could spread, across a region. Based on this information, locals can be better informed and quarantine measures can be taken further in advance. So, how would this work?

Tracking An Ebola Outbreak With Big Data

If you want to communicate with that many people at once and track them as well, even in the poorest regions of the world, the best way to go is the cell phone. Cell phone penetration in Africa stands at 80 percent and is still growing at 4.2 percent annually. This means that most of the population can be reached and tracked via the mobile phone.

With an incubation period of 2 to 21 days, during which victims might not know that they are infected, it is important to know where they are, where they are going and with whom they are in contact. Cell phone data, including call data records, can provide vital information here. Orange Telecom Senegal provided researchers with data from 150,000 cell phones used in 2013, and this data provided clear insights into the travel patterns of the locals. When there is an outbreak in a region, such travel patterns can help predict where the outbreak might be heading.
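
As a rough illustration of how call data records could be turned into travel patterns, the Python sketch below counts movements between regions. The record layout (anonymised user ID, timestamp, region) and the sample values are assumptions made for this example, not the format of the Orange dataset.

```python
from collections import Counter, defaultdict

def travel_flows(records):
    """Count region-to-region movements from anonymised call records.

    records: iterable of (user_id, timestamp, region) tuples; this layout
    is a hypothetical simplification of a call data record.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, region in records:
        by_user[user_id].append((timestamp, region))

    flows = Counter()
    for events in by_user.values():
        events.sort()  # order each user's calls by time
        for (_, origin), (_, destination) in zip(events, events[1:]):
            if origin != destination:  # a change of region counts as one trip
                flows[(origin, destination)] += 1
    return flows

# Made-up sample data: three users, integer timestamps (hours)
sample_records = [
    ("u1", 1, "region_A"), ("u1", 5, "region_B"), ("u1", 9, "region_A"),
    ("u2", 2, "region_A"), ("u2", 6, "region_B"),
    ("u3", 3, "region_B"), ("u3", 4, "region_B"),
]
flows = travel_flows(sample_records)
```

The resulting counts form a simple origin-destination table: the more trips observed from an affected region to another region, the more attention that destination probably deserves.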


When such travel patterns are combined with other data sources, such as cultural behaviour or unstructured data from blogs and social media describing an outbreak, the path of the outbreak could be predicted in a lot more detail. Blogs from health workers can be especially relevant in predicting an outbreak. Foreign health workers quite often keep a blog in which they describe the symptoms they come across. Monitoring these blogs with text analytics and Natural Language Processing techniques can give you an early warning of a possible outbreak. This is exactly what happened this year with HealthMap.
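
The idea of scanning blog posts for early signals can be sketched in a few lines of Python. Note that simple keyword matching stands in here for real text analytics and Natural Language Processing, which are far more sophisticated; the watch list, the threshold and the sample posts are assumptions for illustration, and this is not HealthMap's actual method.

```python
import re
from collections import Counter

# Hypothetical watch list; a real system would use a curated vocabulary.
SYMPTOM_TERMS = {"fever", "vomiting", "bleeding", "diarrhoea"}

def symptom_mentions(posts):
    """Count posts per location that mention at least one watched term.

    posts: iterable of (location, text) pairs.
    """
    counts = Counter()
    for location, text in posts:
        words = set(re.findall(r"[a-z]+", text.lower()))
        if words & SYMPTOM_TERMS:
            counts[location] += 1
    return counts

def early_warnings(posts, threshold=2):
    """Return locations whose mention count reaches the assumed threshold."""
    counts = symptom_mentions(posts)
    return sorted(loc for loc, n in counts.items() if n >= threshold)

# Made-up sample posts
sample_posts = [
    ("district_1", "Two patients arrived with high fever and vomiting."),
    ("district_1", "Another case of unexplained bleeding today."),
    ("district_2", "Quiet day at the clinic, restocked supplies."),
    ("district_2", "One child with a mild fever."),
]
warnings = early_warnings(sample_posts)
```

In practice the threshold would have to be tuned against the normal background level of such mentions, otherwise every flu season would trigger a warning.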

Such data aggregation, even without telecom data, is already very valuable to health organizations around the world. A great example of this is HealthMap. This is an innovative tool developed by Boston Children’s Hospital in 2006, which combines tens of thousands of data sources to provide a global overview of infectious diseases. Sources such as social media websites, government sites, local news and infectious-disease physicians’ social networks are analysed using sophisticated algorithms, and the locations of diseases are shown on a map. This helps travellers, governments and health workers to obtain a better overview of different diseases around the world.


Combining HealthMap With Telecom Data

Imagine combining HealthMap with regional cell phone data. This would result in detailed predictions of where an outbreak might be heading. Based on such patterns, health workers can set up medical centres in advance, or authorities can restrict travel in certain areas. In addition, local authorities can issue warnings via the telecom networks to specific regions that are at risk, urging the locals to be more careful when travelling, to be cautious in their contact with others, or to contact a doctor via text message when possible Ebola symptoms arise.

An important prerequisite in the case of the Ebola outbreak is that the data should be collected at a regional scale. So not only Orange Telecom should provide anonymous call data records, but also other telecom providers, such as Cellcom from Guinea or Africell SL from Sierra Leone. To have the most effect, the data should be combined in real time across regions and with other sources. This would result in predictive warnings based on the patterns that can be discovered, and it could have a significant effect on the Ebola outbreak.
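
One very simple way to combine reported case counts with travel patterns is sketched below in Python. Each region's cases are spread over its destinations in proportion to its outgoing trips, giving a relative importation risk per destination. This weighting scheme, the function name and all the numbers are illustrative assumptions, not an established epidemiological model.

```python
def imported_risk(case_counts, flows):
    """Estimate relative risk of case importation per destination region.

    case_counts: {region: reported cases}
    flows: {(origin, destination): number of observed trips}
    """
    # Total outgoing trips per origin, used to turn trips into shares.
    outgoing = {}
    for (origin, _), trips in flows.items():
        outgoing[origin] = outgoing.get(origin, 0) + trips

    risk = {}
    for (origin, destination), trips in flows.items():
        share = trips / outgoing[origin]
        contribution = case_counts.get(origin, 0) * share
        risk[destination] = risk.get(destination, 0.0) + contribution
    return risk

# Made-up sample data
sample_cases = {"region_A": 40, "region_B": 0}
sample_flows = {
    ("region_A", "region_B"): 30,
    ("region_A", "region_C"): 10,
    ("region_B", "region_C"): 20,
}
risk = imported_risk(sample_cases, sample_flows)
ranked = sorted(risk, key=risk.get, reverse=True)
```

A ranking like this could indicate which regions to warn first, but only if the call data from all providers is combined; flows from a single operator would cover just part of the population.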

Privacy and Ethical Issues

That’s also where potential privacy or ethical issues might come into play. Using anonymous data to track a possible outbreak is very different from knowing which cell phone owner has symptoms of Ebola, for example because he or she replied to a warning message. What should be done if this is known and it can be seen that he or she travels across the region or has a lot of contact with others? How far should local authorities go in trying to stop an outbreak, while possibly infringing on the privacy of their citizens? This is a difficult area that each local authority will most probably approach differently because of cultural differences.

Ebola is a horrible disease with, unfortunately, a very high death rate. A combined approach involving local and regional authorities, telecom organizations, local and global health organizations, a wide variety of structured and unstructured data sources and the best data scientists in the world is therefore probably the only way to bring this outbreak to a stop.


This article originally appeared on Datafloq.