Author: Oscar Sanchez – Data Architect
ABSTRACT
AI models for predicting cardiac diseases [The Colombian example]
Heart disease is the most common cause of death worldwide, often because people do not pay attention to it. At the same time, there are not enough heart specialists for the many people who have, or will develop, heart problems.
For this reason, this article proposes using Artificial Intelligence to predict heart disease before patients suffer serious events such as heart attacks.
It also proposes estimating how many specialists are needed in different areas, so that care is more effective and fewer lives are lost to heart disease.
Image taken from Freepik
Cardiac diseases around the world
Cardiovascular diseases (CVDs) are the leading cause of death worldwide. The percentage of global deaths caused by cardiovascular problems is expected to be 45% (WHO, 2021). In addition, of the 17 million premature deaths (under 70 years old) due to noncommunicable diseases in 2019, 38% were caused by CVDs.
Since the COVID-19 pandemic, a significant share of patients who had complications have shown an increased risk of sudden death, acute myocardial infarction, and arrhythmias, among others (Triana).
Under these circumstances, the percentage of deaths from cardiovascular problems, far from decreasing, is likely to increase significantly after 2020, and it is already a global public health problem. There is also critical concern worldwide that there are not enough heart specialists to cover the high demand.
Although cardiovascular disease is the leading cause of mortality, preventive campaigns have not been effective, and many people have died without knowing they had a heart condition that needed treatment.
Cardiac diseases in Colombia and their treatment
In Colombia, as in the rest of the world, cardiac diseases are the leading cause of death (minsalud, n.d.).
Figure 1: Causes of death in Colombia. Source: DANE
According to the figure, the main cause of death in Colombia is heart disease, with a total of 13,926 deaths in the first quarter of 2022 alone.
Despite many prevention campaigns, the number of deaths keeps growing, and the shortage of specialists makes the situation worse.
In addition, patients and their health provider entities cannot cover many of the treatment costs. This makes heart disease a major problem for Colombia.
In 2017, the Colombian government invested around 6.4 trillion pesos (6.4 "billones" in Spanish usage, roughly 1.5 billion US dollars) to treat cardiac diseases (cost, 2017), but the number of deaths continued to grow, so we have to ask ourselves whether investing more solves the problem.
Prediction of cardiac disease with AI
Artificial Intelligence can support the early diagnosis of cardiovascular diseases, so we ran an experiment aimed at building a reliable solution.
First, we took into account significant variables such as:
- BMI (body mass index)
- Bad habits (tobacco, alcohol) (information provided by the patient)
- High blood sugar levels (diagnostic tests)
- High blood cholesterol levels (diagnostic tests)
- Fruit consumption (information provided by the patient)
- Vegetable consumption (information provided by the patient)
- Having hypertension (diagnostic tests)
- Percentage of body fat (measurable values)
- Percentage of visceral fat (measurable values)
As you can see, these variables fall into two categories:
- Diagnostic tests or measurable values
- Subjective information provided by the patient.
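The two categories can be kept apart in the data model, which makes it easier to track how reliable each input is. The following is a minimal sketch in Python; the class and field names are assumptions made for illustration and do not come from the dataset used in this article.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class MeasuredValues:
    """Diagnostic tests or measurable values (hypothetical field names)."""
    bmi: float
    high_blood_sugar: bool
    high_cholesterol: bool
    hypertension: bool
    body_fat_pct: float
    visceral_fat_pct: float


@dataclass
class SelfReported:
    """Subjective information provided by the patient (hypothetical field names)."""
    smokes: bool
    drinks_alcohol: bool
    eats_fruit: bool
    eats_vegetables: bool


@dataclass
class PatientRecord:
    measured: MeasuredValues
    reported: SelfReported

    def to_features(self) -> List[float]:
        """Flatten both groups into one numeric vector (booleans become 0.0 or 1.0)."""
        values = list(asdict(self.measured).values()) + list(asdict(self.reported).values())
        return [float(v) for v in values]


# Invented example patient, for illustration only.
patient = PatientRecord(
    MeasuredValues(27.5, False, True, True, 31.0, 12.0),
    SelfReported(True, False, True, False),
)
features = patient.to_features()
```

Keeping the self-reported fields in their own class would also make it simple to train a model with and without them and compare the results.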
The proposed solution starts by extracting patient information from clinical histories and searching for the variables described above, in order to build a training dataset that is as reliable as possible.
For this exercise, we used a dataset obtained from Kaggle (kaggle, 2023), which we take as a reasonable approximation of global behaviour.
After this, we trained and tuned the model and exposed it through an API so that new patients can be evaluated.
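The request handling behind such an API could look like the sketch below. It is not the implementation used in the experiment: the field names, the `placeholder_model` rule, and its threshold are all assumptions for illustration, and the wiring to an actual HTTP framework is left out.

```python
import json
from typing import Callable, Dict

# Hypothetical subset of the input variables expected in a request.
REQUIRED_FIELDS = ("bmi", "hypertension", "high_cholesterol")


def placeholder_model(patient: Dict[str, float]) -> int:
    """Stand-in for the trained model: returns 1 (at risk) or 0 (no risk).

    The rule and the BMI threshold are invented for illustration only.
    """
    score = patient["hypertension"] + patient["high_cholesterol"] + (patient["bmi"] >= 30)
    return 1 if score >= 2 else 0


def evaluate_patient(body: str,
                     model: Callable[[Dict[str, float]], int] = placeholder_model) -> str:
    """Handle one evaluation request: JSON text in, JSON text out."""
    try:
        patient = json.loads(body)
    except json.JSONDecodeError:
        return json.dumps({"error": "invalid JSON"})
    if not isinstance(patient, dict):
        return json.dumps({"error": "expected a JSON object"})
    missing = [field for field in REQUIRED_FIELDS if field not in patient]
    if missing:
        return json.dumps({"error": "missing fields", "fields": missing})
    label = model(patient)
    category = "potential heart problem" if label == 1 else "without risk"
    return json.dumps({"category": category})


response = evaluate_patient('{"bmi": 32.1, "hypertension": 1, "high_cholesterol": 0}')
```

Passing the model in as a parameter keeps the request validation independent of whichever trained model is eventually plugged in.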
The response to this evaluation categorises the patient as either without risk or with a potential heart problem. The number of cardiologists needed can then be estimated from the number of patients at risk of developing heart disease.
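As a rough illustration of that last step, the sketch below turns a list of model outputs into a specialist count. The function name and the `patients_per_cardiologist` ratio are assumptions; the real ratio would have to come from health authorities.

```python
import math


def cardiologists_needed(at_risk_patients: int, patients_per_cardiologist: int) -> int:
    """Round up so that every at-risk patient is covered by a specialist."""
    if at_risk_patients < 0:
        raise ValueError("at_risk_patients must not be negative")
    if patients_per_cardiologist <= 0:
        raise ValueError("patients_per_cardiologist must be positive")
    return math.ceil(at_risk_patients / patients_per_cardiologist)


# Invented model outputs for ten patients: 1 = at risk, 0 = without risk.
predictions = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
at_risk = sum(predictions)

# The ratio of 4 patients per cardiologist is a made-up value for the example.
needed = cardiologists_needed(at_risk, patients_per_cardiologist=4)
```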
Applicable AI models
Given the behaviour of the data, the applicable AI models are:
- Decision tree: a supervised learning algorithm used in machine learning and data mining. It uses a flowchart-like structure to visualise the decisions the algorithm makes based on the input features of the data (Liberman, 2017).
- Random forest: a supervised learning algorithm that combines multiple decision trees to improve the accuracy and robustness of the model.
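The relationship between the two models can be illustrated with a small sketch. Each "tree" below is a hand-written chain of if/else decisions, and the "forest" takes a majority vote over them. In a real random forest the trees are learned from random samples of the training data rather than written by hand; the rules, thresholds, and field names here are invented for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List

Patient = Dict[str, float]


def tree_blood_pressure(p: Patient) -> int:
    # Flowchart: hypertension? -> high cholesterol? -> at risk
    if p["hypertension"]:
        return 1 if p["high_cholesterol"] else 0
    return 0


def tree_weight(p: Patient) -> int:
    # Flowchart: BMI >= 30? -> at risk; otherwise check visceral fat
    if p["bmi"] >= 30:
        return 1
    return 1 if p["visceral_fat_pct"] >= 15 else 0


def tree_habits(p: Patient) -> int:
    # Flowchart: smokes? -> at risk; otherwise check vegetable consumption
    if p["smokes"]:
        return 1
    return 0 if p["eats_vegetables"] else 1


FOREST: List[Callable[[Patient], int]] = [tree_blood_pressure, tree_weight, tree_habits]


def forest_predict(p: Patient, trees: List[Callable[[Patient], int]] = FOREST) -> int:
    """Return the label chosen by most trees (1 = at risk, 0 = without risk)."""
    votes = Counter(tree(p) for tree in trees)
    return votes.most_common(1)[0][0]


patient = {"hypertension": 1, "high_cholesterol": 0, "bmi": 31.0,
           "visceral_fat_pct": 10.0, "smokes": 0, "eats_vegetables": 0}
prediction = forest_predict(patient)
```

In this example the first tree votes "without risk" and the other two vote "at risk", so the forest overrules the single dissenting tree. This is the mechanism that tends to make a forest more robust than any one tree.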
The dataset we chose has 253,680 rows. We split it 80/20 into training and test sets and used the variables mentioned above. Random forest reached the best accuracy, 70%.
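An 80/20 split of this kind can be sketched as follows. The function name, the fixed seed, and the use of row indices in place of real records are assumptions for illustration; the original experiment may have used a library routine instead.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(rows: Sequence[T], test_fraction: float = 0.2,
                     seed: int = 42) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the rows, then hold out `test_fraction` of them for testing."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Row indices stand in for the 253,680 records of the dataset.
rows = list(range(253_680))
train, test = train_test_split(rows)
```

With 253,680 rows this leaves 202,944 rows for training and 50,736 for testing.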
This means the model classified about 70 out of every 100 people correctly. That is a useful result for saving lives while avoiding the cost of corrective treatments in a country like Colombia.
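To make the arithmetic concrete, the sketch below computes accuracy from counts of correct and incorrect predictions, together with recall (the share of truly at-risk patients the model finds). The 100 labels are invented so that the accuracy comes out at 70%; they are not results from the experiment.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives (1 = at risk, 0 = without risk)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(c: Dict[str, int]) -> float:
    """Share of all patients classified correctly."""
    return (c["tp"] + c["tn"]) / sum(c.values())


def recall(c: Dict[str, int]) -> float:
    """Share of truly at-risk patients that the model flagged."""
    at_risk = c["tp"] + c["fn"]
    return c["tp"] / at_risk if at_risk else 0.0


# Invented labels for 100 people: 20 truly at risk, 80 not at risk.
y_true = [1] * 20 + [0] * 80
y_pred = [1] * 10 + [0] * 10 + [0] * 60 + [1] * 20
counts = confusion_counts(y_true, y_pred)
```

In this made-up case 70 of the 100 people are classified correctly, yet only 10 of the 20 at-risk patients are flagged, so it may be worth reporting both numbers when the goal is early detection.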
Application and utilities of an accurate prediction
Accurately predicting cardiac disease may make it possible to apply preventive treatments that save lives at low cost. It might also reduce patient risk, because preventive treatments are both cheaper and more effective than corrective treatments (cost, 2017).
In a country like Colombia, this may make a real difference and increase coverage among vulnerable populations.
In addition, classifying people by risk makes it possible to estimate the need for heart specialists in each area of Colombia and to distribute them better, since they are in very short supply.
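One possible way to distribute a fixed number of specialists is in proportion to the predicted at-risk population of each area. The sketch below uses the largest-remainder method for the rounding; the region names and figures are invented, and this allocation rule is an assumption rather than something proposed by health authorities.

```python
from typing import Dict


def allocate_specialists(at_risk_by_region: Dict[str, int],
                         total_specialists: int) -> Dict[str, int]:
    """Split specialists across regions in proportion to at-risk patients."""
    if total_specialists < 0:
        raise ValueError("total_specialists must not be negative")
    total_at_risk = sum(at_risk_by_region.values())
    if total_at_risk == 0:
        return {region: 0 for region in at_risk_by_region}
    # Exact proportional share per region, usually a fraction.
    quotas = {region: n * total_specialists / total_at_risk
              for region, n in at_risk_by_region.items()}
    # Give each region the whole part of its share first.
    allocation = {region: int(quota) for region, quota in quotas.items()}
    leftover = total_specialists - sum(allocation.values())
    # Hand out the remaining specialists to the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda r: quotas[r] - allocation[r], reverse=True)
    for region in by_remainder[:leftover]:
        allocation[region] += 1
    return allocation


# Invented at-risk counts per region, for illustration only.
at_risk = {"Region A": 520, "Region B": 310, "Region C": 170}
plan = allocate_specialists(at_risk, total_specialists=10)
```

Here the exact shares are 5.2, 3.1, and 1.7 specialists, so the one specialist left after rounding down goes to Region C, which has the largest fractional part.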
CONCLUSION
AI may be a great supporting tool, both for patients, who should improve their health habits and receive good preventive treatment to save their lives, and for making good use of heart specialists, who are a scarce resource in Colombia and worldwide.
Many initiatives around the world use AI for different health issues, mainly for diagnostics. An early diagnosis can make the difference between life and death.
Subject to data protection requirements, health entities in Colombia should participate in open data initiatives to provide good input for research on health and on other topics that matter for the country’s growth.
REFERENCES
bayer. (2020). Retrieved from https://www.bayer.com/es/co/las-enfermedades-cardiovasculares-son-la-primera-causa-de-muerte-en-colombia-y-el-mundo
cost. (2017). Retrieved from https://consultorsalud.com/colombia-invierte-64-billones-al-ano-en-tratar-enfermedades-cardiacas/
kaggle. (2023). Retrieved from www.kaggle.com
Liberman, N. (January 2017). Retrieved from https://towardsdatascience.com/decision-trees-and-random-forests-df0c3123f991
minsalud. (n.d.). Retrieved from https://www.minsalud.gov.co/salud/publica/PENT/Paginas/enfermedades-cardiovasculares.aspx
WHO. (2021). Retrieved from https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds)