Machine-learning forecasting model of tuberculosis cases among children in South Africa

DOI:

https://doi.org/10.17159/sajs.2025/16658

Keywords:

ARIMA model, Bayesian, random forest, machine learning, tuberculosis

Abstract

Globally, children and young adolescents under 15 years old constitute approximately 11% of all tuberculosis (TB) cases, with growing concern over TB infections in children under 5 years old, especially in resource-limited settings. Nonetheless, the true extent of the TB burden among children remains inadequately explored in South Africa. The random forest–Bayesian autoregressive integrated moving average (RF-BARIMA) model for infectious disease prediction has not previously been applied to TB in children. In this study, we used the RF-BARIMA model to forecast TB incidence from 2010 to 2019 among children under 5 years old in South Africa’s Eastern Cape Province. Comparative analysis demonstrated that the RF-BARIMA model outperformed other models in predictive accuracy and forecast capability. Our predictions revealed a projected mean of 0.4122 TB cases per month in 2022, with an effective sample size of 4054 TB cases in the Eastern Cape Province. These findings indicate a prospective reduction of 1670.85 TB cases among children under 5 years old in the coming years. The RF-BARIMA model offers enhanced predictive and forecast accuracy in comparison to the single Bayesian ARIMA model. These results provide compelling evidence of significant under-reporting and potentially elevated TB incidence among children under 5 years old in South Africa’s Eastern Cape Province, raising important implications for public health policy and intervention strategies.
Significance:
Childhood tuberculosis (TB) in South Africa is a significant concern, with the majority of cases occurring in children aged 0–4 years. The burden in children mirrors the high burden of the adult epidemic in the country. The RF-BARIMA model integrates the non-linear pattern recognition of random forest with the probabilistic time series forecasting strengths of Bayesian ARIMA, aiming to improve prediction accuracy and quantify uncertainty in the forecasts. The results call for urgent public health policy and intervention strategies to address the under-reporting and elevated TB incidence in this vulnerable demographic, further reinforcing the study’s global significance.
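
The idea of pairing a probabilistic time series component with a non-linear learner can be illustrated with a minimal sketch. This page does not describe how the authors combine the two parts, so the code below assumes one common hybrid design: a Bayesian autoregressive model captures the linear structure, and a tree ensemble is fitted to its residuals. Everything in it is hypothetical and simplified: the series is synthetic (not the study data), the ARIMA component is reduced to an AR(1) term with a conjugate normal prior, and the random forest is reduced to bagged one-split regression stumps. All function names are illustrative.

```python
import math
import random


def bayes_ar1(series, prior_sd=1.0):
    """Normal posterior for phi in z_t = phi * z_{t-1} + e_t, z = centered series.

    Assumption: prior phi ~ N(0, prior_sd^2) and a plug-in noise variance.
    """
    mean = sum(series) / len(series)
    z = [v - mean for v in series]
    x, y = z[:-1], z[1:]
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    phi_ols = sxy / sxx
    # Plug-in noise variance from least-squares residuals (a simplification).
    sigma2 = sum((b - phi_ols * a) ** 2 for a, b in zip(x, y)) / (len(x) - 1)
    post_prec = 1.0 / prior_sd ** 2 + sxx / sigma2
    post_mean = (sxy / sigma2) / post_prec
    return mean, post_mean, math.sqrt(1.0 / post_prec)


def fit_stump(xs, ys):
    """Best single-split regression stump on one feature (squared error)."""
    best = None
    for thr in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:  # every feature value identical: predict the mean
        m = sum(ys) / len(ys)
        return (float("inf"), m, m)
    return best[1:]


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Bagged stumps: each stump is fitted to a bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def forest_predict(stumps, x):
    """Average the predictions of all stumps."""
    return sum(lm if x <= thr else rm for thr, lm, rm in stumps) / len(stumps)


# Synthetic monthly counts with a seasonal cycle (120 months, NOT study data).
rng = random.Random(42)
series = [20 + 5 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 1.5)
          for t in range(120)]

# Step 1: Bayesian AR(1) gives a point estimate of phi and its uncertainty.
mean, phi, phi_sd = bayes_ar1(series)

# Step 2: the ensemble learns what the linear part missed, by calendar month.
resid = [(series[t] - mean) - phi * (series[t - 1] - mean)
         for t in range(1, len(series))]
months = [t % 12 for t in range(1, len(series))]
forest = fit_forest(months, resid)

# Step 3: one-step-ahead forecast = linear part + residual correction.
t_next = len(series)
linear_part = mean + phi * (series[-1] - mean)
forecast = linear_part + forest_predict(forest, t_next % 12)
print(f"phi = {phi:.3f} (posterior sd {phi_sd:.3f}), forecast = {forecast:.2f}")
```

In this sketch the posterior standard deviation of the autoregressive coefficient is what carries the uncertainty, while the ensemble only corrects the point forecast. A fuller treatment would use a complete ARIMA specification, sample from the posterior, and grow deeper trees over several lagged features.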

Published

2025-11-26

Section

Research Article

How to Cite

Azeez, A., Osuji, G., Mutambayi, R., & Ndege, J. (2025). Machine-learning forecasting model of tuberculosis cases among children in South Africa. South African Journal of Science, 121(11/12). https://doi.org/10.17159/sajs.2025/16658