BMC Public Health. 2025 Jul 29;25(1):2577. doi: 10.1186/s12889-025-23862-2.
COVID-19 risk stratification among older adults: a machine learning approach to identify personal and health-related risk factors
Arezoo Abasi, Seyed Abbas Motevalian, Haleh Ayatollahi
Background: The COVID-19 pandemic highlighted the need to understand the factors that influence individuals' risk perceptions and health behaviors. This study aimed to explore the roles of individuals' knowledge, perception, and health-related issues in determining COVID-19 risk by developing a predictive model that classifies individuals into risk categories, incorporating both clustering and model interpretation techniques.
Methods: To identify distinct COVID-19 risk groups, clustering analysis was applied to demographic, health, and behavioral data. Subsequently, several machine learning models (CatBoost, XGBoost, Random Forest, Generalized Linear Model (GLM), Decision Tree, H2O Deep Neural Network (DNN), and L2 SVM) were used to predict the risk classifications. SHAP (SHapley Additive exPlanations) analysis was applied to interpret the contribution of individual features to model predictions.
Results: Three distinct risk classes were identified: Class 0 (high knowledge, low-risk factors, no household COVID-19 diagnosis), Class 1 (health-related issues (e.g., hypertension), low knowledge), and Class 2 (high knowledge, higher health risks (e.g., hypertension, household COVID-19 diagnosis)). L2 SVM achieved the highest accuracy (0.9724), followed by XGBoost (0.9301) and CatBoost (0.9265). SHAP analysis revealed that household hygiene practices and health-related issues, such as hypertension and gastrointestinal symptoms, were key drivers of risk classification.
Conclusion: Integrating individuals' knowledge, perception, and health-related issues into COVID-19 risk assessments enhances predictive accuracy. Public health policies should focus on both physical and psychological factors to effectively mitigate the spread and impact of COVID-19. Data-driven models may inform future efforts to prioritize resource allocation and improve public health responses for vulnerable populations.
Keywords: COVID-19; Health behavior; Machine learning; Perception; Predictive learning models.
- PMID: 40730988
- PMCID: PMC12306135
- DOI: 10.1186/s12889-025-23862-2
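The Methods rely on SHAP to attribute a prediction to individual features. As a minimal sketch of the underlying idea, the Python below computes exact Shapley values for a single instance by enumerating feature coalitions. Everything in it is an assumption made for illustration: the function `toy_risk_score`, its coefficients, the feature names, and the baseline are hypothetical and are not the study's model or data. The SHAP library itself uses faster approximations or model-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features outside a coalition are held at their baseline value.
    The cost grows exponentially with the number of features, so this
    is only practical for toy examples.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {
                    f: (x[f] if f in coalition else baseline[f]) for f in names
                }
                with_target = dict(without)
                with_target[target] = x[target]
                # Marginal contribution of the target feature.
                total += weight * (predict(with_target) - predict(without))
        phi[target] = total
    return phi


def toy_risk_score(f):
    """Hypothetical risk score; the coefficients are invented."""
    score = 0.5 * f["hypertension"] + 0.3 * f["gi_symptoms"] - 0.4 * f["hygiene"]
    # Interaction term, so the attribution is not purely additive.
    score += 0.2 * f["hypertension"] * f["gi_symptoms"]
    return score


# Hypothetical individual and reference point (binary indicators).
person = {"hypertension": 1, "gi_symptoms": 1, "hygiene": 0}
baseline = {"hypertension": 0, "gi_symptoms": 0, "hygiene": 1}

phi = shapley_values(toy_risk_score, person, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the score for `person` and the score for `baseline`, which is the additivity property that makes SHAP values readable as per-feature contributions. The interaction term is split equally between the two features involved.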