Sci Rep. 2025 Nov 26;15(1):42110. doi: 10.1038/s41598-025-26072-3.
Model fit vs. predictive reliability: a case study of the 1978 influenza outbreak
Denis Tverskoi, Grzegorz A Rempala
The 1978 A/H1N1 influenza outbreak in a boarding school in northern England provides a widely used dataset for epidemiological modelling. However, its focus on recovery dynamics rather than transmission presents challenges for standard SIR (SEIR) modelling. Recent attempts to fit the data using a delay differential equation (DDE) model have demonstrated an excellent retrospective fit but raise concerns regarding complexity, overfitting, and an unusually high estimate of the basic reproduction number. This study assesses the robustness of the DDE model, revealing that its parameter estimates and outputs, particularly the basic reproduction number, are highly sensitive to distortions in dataset size, leading to poor predictive performance in the early epidemic stages. To address these limitations, we propose a simpler stochastic SIR model that, while yielding a slightly worse retrospective fit, provides more stable and reliable forward predictions. Our findings challenge the notion that classic SIR-type models cannot capture the dynamics of the 1978 outbreak and highlight the risks of over-parametrization when employing complex models. More broadly, this study underscores the importance of balancing model complexity with predictive reliability in epidemiological modelling, in order to support robust public health decision-making when data availability is limited.
Keywords: Approximate Bayesian computation; Delay differential equation; Forward prediction; Overfitting; SIR model.
- PMID: 41298851
- PMCID: PMC12658211
- DOI: 10.1038/s41598-025-26072-3
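For readers unfamiliar with the model class named in the abstract, the following is a minimal sketch of a generic stochastic SIR simulation (Gillespie algorithm) in a closed population. It is not the authors' implementation: the abstract does not specify their model's details, and the function names, the parameter names `beta` and `gamma`, and all numerical values here are illustrative assumptions.

```python
import random


def simulate_sir(n, i0, beta, gamma, rng):
    """Gillespie simulation of a generic closed-population stochastic SIR model.

    Illustrative assumptions (not taken from the paper): frequency-dependent
    infection at rate beta*S*I/n and recovery at rate gamma*I.
    Returns the event path as a list of (time, S, I, R) tuples.
    """
    s, i, r = n - i0, i0, 0
    t = 0.0
    path = [(t, s, i, r)]
    while i > 0:
        rate_infection = beta * s * i / n
        rate_recovery = gamma * i
        total = rate_infection + rate_recovery
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total)
        # Pick which event occurs, in proportion to its rate.
        if rng.random() * total < rate_infection:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        path.append((t, s, i, r))
    return path


def final_size_spread(n, i0, beta, gamma, runs, seed):
    """Smallest and largest final outbreak size (final R) over repeated runs."""
    rng = random.Random(seed)
    sizes = [simulate_sir(n, i0, beta, gamma, rng)[-1][3] for _ in range(runs)]
    return min(sizes), max(sizes)


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the code.
    print(final_size_spread(n=100, i0=1, beta=1.5, gamma=0.5, runs=20, seed=1))
```

Repeated runs with the same parameters give different trajectories, which is the source of the predictive spread a stochastic model can report; a deterministic model returns a single curve per parameter set.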