Water Res. 2021 Sep 25;205:117710.
doi: 10.1016/j.watres.2021.117710. Online ahead of print.
High-throughput sequencing of SARS-CoV-2 in wastewater provides insights into circulating variants


Rafaela S Fontenele 1 , Simona Kraberger 2 , James Hadfield 3 , Erin M Driver 4 , Devin Bowes 4 , LaRinda A Holland 2 , Temitope O C Faleye 4 , Sangeet Adhikari 5 , Rahul Kumar 4 , Rosa Inchausti 6 , Wydale K Holmes 6 , Stephanie Deitrick 7 , Philip Brown 8 , Darrell Duty 9 , Ted Smith 10 , Aruni Bhatnagar 10 , Ray A Yeager 2nd 10 , Rochelle H Holm 10 , Natalia Hoogesteijn von Reitzenstein 11 , Elliott Wheeler 11 , Kevin Dixon 11 , Tim Constantine 11 , Melissa A Wilson 12 , Efrem S Lim 1 , Xiaofang Jiang 13 , Rolf U Halden 14 , Matthew Scotch 15 , Arvind Varsani 16




Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) likely emerged from a zoonotic spill-over event and has led to a global pandemic. The public health response has been predominantly informed by surveillance of symptomatic individuals and contact tracing, with quarantine and other preventive measures then applied to mitigate further spread. Non-traditional methods of surveillance such as genomic epidemiology and wastewater-based epidemiology (WBE) have also been leveraged during this pandemic. Genomic epidemiology uses high-throughput sequencing of SARS-CoV-2 genomes to inform local and international transmission events, as well as the diversity of circulating variants. WBE uses wastewater to analyse community spread, because SARS-CoV-2 is known to be shed through bodily excretions. Since both symptomatic and asymptomatic individuals contribute to wastewater inputs, we hypothesized that the resultant pooled sample of population-wide excreta can provide a more comprehensive picture of SARS-CoV-2 genomic diversity circulating in a community than clinical testing and sequencing alone. In this study, we analysed 91 wastewater samples from 11 states in the USA, where the majority of samples represent Maricopa County, Arizona (USA). With the objective of assessing the viral diversity at a population scale, we undertook a single-nucleotide variant (SNV) analysis on data from 52 samples with >90% SARS-CoV-2 genome coverage of sequence reads, and compared these SNVs with those detected in genomes sequenced from clinical patients. We identified 7973 SNVs, of which 548 were "novel" SNVs that had not yet been identified in the global clinical-derived data as of 17th June 2020 (the day after our last wastewater sampling date). However, between 17th June 2020 and 20th November 2020, almost half of the novel SNVs were detected in clinical-derived data.
Using the combination of SNVs present in each sample, we identified the more probable lineages present in that sample and compared them to lineages observed in North America prior to our sampling dates. The wastewater-derived SARS-CoV-2 sequence data indicate that more lineages were circulating across the sampled communities than were represented in the clinical-derived data. Principal coordinate analyses identified patterns in population structure based on genetic variation within the sequenced samples, with clear trends associated with increased diversity, likely due to a higher number of infected individuals relative to the sampling dates. We demonstrate that genetic correlation analysis combined with SNV analysis using wastewater sampling can provide a comprehensive snapshot of the SARS-CoV-2 genetic population structure circulating within a community, which might not be observed if relying solely on clinical cases.
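
The comparison of wastewater-derived and clinical-derived SNVs described in the abstract can be illustrated as a set difference. The following is a minimal sketch only and is not the authors' pipeline: the `SNV` record, the function names `novel_snvs` and `later_detected_fraction`, and all positions and bases in the toy data are hypothetical and chosen purely for illustration.

```python
from typing import Iterable, NamedTuple, Set


class SNV(NamedTuple):
    """Hypothetical SNV record: position on the reference plus ref/alt base."""
    position: int
    ref: str
    alt: str


def novel_snvs(wastewater: Iterable[SNV], clinical: Iterable[SNV]) -> Set[SNV]:
    """Return wastewater SNVs that are absent from the clinical-derived set."""
    return set(wastewater) - set(clinical)


def later_detected_fraction(novel: Set[SNV], later_clinical: Iterable[SNV]) -> float:
    """Fraction of 'novel' SNVs that appear in a later clinical-derived set."""
    if not novel:
        return 0.0
    return len(novel & set(later_clinical)) / len(novel)


# Toy data (made up; not taken from the study).
wastewater = {SNV(10, "C", "T"), SNV(20, "A", "G"), SNV(30, "G", "A"), SNV(40, "T", "C")}
clinical_at_cutoff = {SNV(10, "C", "T"), SNV(20, "A", "G")}
clinical_later = clinical_at_cutoff | {SNV(30, "G", "A")}

# SNVs seen in wastewater but not yet in clinical data at the cutoff date.
novel = novel_snvs(wastewater, clinical_at_cutoff)
# Share of those SNVs that show up in the later clinical data.
fraction = later_detected_fraction(novel, clinical_later)

print(sorted(novel))
print(fraction)
```

In this toy example, two of the four wastewater SNVs are "novel" at the cutoff, and one of those two appears in the later clinical set. Treating an SNV as a (position, reference base, alternate base) tuple is an assumption of the sketch; a real analysis would also need to handle sequencing depth, allele-frequency thresholds, and alignment details that the abstract does not specify.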

Keywords: High-throughput sequencing; SARS-CoV-2; Surveillance; Wastewater; Wastewater-based epidemiology.