I have been trying to understand this zoonotic outbreak and have noticed some discussion which, in my opinion, is based on a misinterpretation of the available data. This post aims to give a more comprehensive overview of that data, and its implications, than I have managed in short posts elsewhere on the site. I have posted it in my workshop because it is just my understanding and does not reflect the view of FluTrackers; also, as I hope to add further posts as things progress, I did not want it to get lost in a fast moving discussion thread.
Testing.
To understand the data being released you also need to understand the method and limitations of testing. Once a new pathogen is identified and sequenced, RT-PCR primers can be generated. The primers will be a short length of the pathogen's DNA/RNA chosen from a conserved (slowly changing) part of the genome which can be matched to pathogen samples. The sequence needs to be long enough not to keep finding matching sequences in the host's DNA, but not so long that random variations in the genome start causing false negatives. Once we have a primer we need to distribute it to as many labs as possible that have RT-PCR capacity and can safely handle the pathogen. Such labs are limited in number; this is not a problem outside China at present, due to the small number of cases, but in China the situation is different.
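To make the length trade-off concrete, here is a back-of-envelope sketch of my own (not how real primer design tools work, and assuming random, equal base frequencies): the expected number of chance matches for a primer of length k against a background genome of N bases is roughly N x (1/4)^k.

```python
# Back-of-envelope: expected chance matches of a primer of length k in a
# background genome of N bases, assuming random sequence with equal base
# frequencies. Purely illustrative of the short-vs-long trade-off.
def expected_random_matches(primer_len: int, genome_size: float) -> float:
    return genome_size * (0.25 ** primer_len)

human_genome = 3.2e9  # approx. haploid human genome size in bases

for k in (12, 16, 20, 25):
    print(f"primer length {k:2d}: ~{expected_random_matches(k, human_genome):.2e} chance matches")

# Short primers hit the host genome by chance; very long ones risk false
# negatives if the target region has mutated, hence the 'conserved region,
# moderate length' compromise described above.
```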
Labs are testing as many samples as they can, but there is a backlog collected before the primers were available, and many more suspect cases are being generated, so only a subset of these can be tested. So who to test? There is a rush on to find the animal host, so very early suspect cases, even before the fish market cluster, become a priority along with the more severe cases. The implication is that mild cases, or ones with atypical symptoms, will have to wait their turn, giving a sample heavily skewed towards severe cases, the implications of which I will look at in the section on epidemiology. All of this is normal. In H1N1(2009), before primers were widely distributed, the CDC was testing about 100 samples per day with a backlog in the tens of thousands; in the West African Ebola outbreak the case counts broke down due to ETCs (Ebola Treatment Centres) being full and closed to new patients, who consequently could not be counted.
While it is only now becoming possible to do serological testing, due to the time lag between infection and antibody build-up, this should soon begin to provide useful data. Serological tests put pressure on different lab equipment and personnel, so they should be able to run in parallel to RT-PCR. Here blood is tested, with antisera, for the presence of antibodies; a positive reaction shows the patient was challenged by the virus in the past and overcame it. What this should tell us is how many cases there have been who did not show symptoms, or at least not to the point of warranting hospitalisation and virus testing. These people should now have a reasonable level of protection, making them no longer part of the 'susceptible population' (see epidemiology) and adding to herd immunity. If there are large numbers of infections that are self-limiting and mild, it radically changes the outbreak's sustainability and impact.
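As a sketch of why serology matters for the case counts, here is the simple scaling it allows. Every number below is invented purely for illustration; no serosurvey results exist yet.

```python
# Illustrative only: scaling a seroprevalence estimate up to total infections
# and comparing with confirmed (test-positive) cases. All numbers are made up.
def infections_from_serosurvey(seroprevalence: float, population: int) -> float:
    """Estimated total past infections implied by a serosurvey."""
    return seroprevalence * population

population = 11_000_000        # roughly Wuhan-sized city (illustrative)
confirmed_cases = 5_000        # hypothetical confirmed, test-positive cases
seroprevalence = 0.002         # hypothetical 0.2% antibody-positive

total_infections = infections_from_serosurvey(seroprevalence, population)
print(f"Implied infections: {total_infections:,.0f}")
print(f"Confirmed cases:    {confirmed_cases:,}")
print(f"Undetected fraction: {1 - confirmed_cases / total_infections:.0%}")
```

If the undetected fraction turns out to be large, the denominator for severity estimates changes dramatically, which is the point made above about mild, self-limiting infections.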
Contacts.
It is normal to expect a disease to increase in transmissibility as the patient's symptoms get worse, due to high viral titres. Exactly how this happens is not constant: SARS did not become very infectious until there were very high viral loads late in the disease, while flu (and apparently nCoV) is infectious prior to symptom onset. There is an infectivity curve over time; where and how it peaks is disease dependent but will show some correlation to viral load. 'Close contacts' refers to the unprotected contacts in a family or healthcare setting, where protracted intimate contact can be expected, as opposed to 'contacts', which are what you get if you meet someone while walking or shopping. These are very different in terms of the probability of catching the pathogen.
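Purely to illustrate what such an infectivity curve might look like, here is a made-up gamma-shaped profile. The shape, the peak day and the size of any pre-symptomatic tail are all assumptions for illustration, not measurements for nCoV.

```python
# Hypothetical infectiousness profile over days since infection, drawn as a
# crude text plot. Shape and parameters are illustrative assumptions only.
import math

def gamma_pdf(x: float, shape: float, scale: float) -> float:
    if x <= 0:
        return 0.0
    return (x ** (shape - 1) * math.exp(-x / scale)) / (math.gamma(shape) * scale ** shape)

symptom_onset_day = 6   # roughly matches the ~6 day incubation mentioned later
for day in range(1, 15):
    rel_infectiousness = gamma_pdf(day, shape=4.0, scale=2.0)
    marker = " <- symptom onset" if day == symptom_onset_day else ""
    print(f"day {day:2d}: {'#' * int(rel_infectiousness * 200)}{marker}")
```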
Symptoms, treatment and capacities.
The symptom set is quite wide, but from early data high temperature, cough and breathing difficulty top the list. For a respiratory pathogen it has a high incidence of digestive tract symptoms. Some form of oxygen treatment is used in most cases, but not that many end up on invasive mechanical ventilation or ECMO; in fact one paper shows more patients were on liver function treatment. The key indicators from blood testing are low lymphocyte and platelet counts. Wuhan is having hospital capacity problems which they are working hard to overcome, but as spread continues nationally and internationally this is likely to become a problem for us all. At present we are able to help out the epicentre by bringing in extra capacity from outside, but as it spreads first to other Chinese cities and then globally this option will dry up. Treatment is just alleviation of symptoms as there is nothing specific. A vaccine and antivirals specific to this disease will not be available, probably for years, so various off-label drugs are being tried with some anecdotal evidence of limited success; it is early days. One study suggested human and horse serum could be useful for treatment.
Links:
https://reader.elsevier.com/reader/s...B848F5AD13AA87
While not the paper I took the data in the text from (which I can't now find), it does cover the same ground. One point of interest is Table 3, which shows 3 patients (n=41) on continuous renal replacement therapy. I have been keeping an eye open for this as the kidneys are rich in ACE2 receptors, so kidney damage in later disease may indicate both increased tissue tropism and that ACE2 is the receptor being used.
The second link is to a US case history following a single patient's treatment. It is informative as it gives significant detail on the procedures used and is an example of idealised treatment in an unstressed health system with vast resources. If I catch it I very much doubt I would get the same. https://www.nejm.org/doi/full/10.105...jmO7T4.twitter
Virology.
nCoV is a positive-sense single-stranded RNA virus (+ssRNA) with one very long (30,000 nucleotide) segment. As most readers are going to be more familiar with flu, I will try to show how and where they differ. Flu is -ssRNA with 8 segments of about 12k bases in total, and half of this is used to make its RNP complex, which is only needed for negative-sense viruses. Both are enveloped viruses with prominent surface spike proteins; for flu these are HA and NA, in nCoV it is the S protein. The S protein contains the primary antigenic site and the RBD (Receptor Binding Domain), while HA does the same for flu. Flu uses host proteases to cleave HA for cell entry, but nCoV needs to supply its own RdRp (RNA-dependent RNA polymerase), without which it cannot replicate. RdRp and S's RBD are probably the most likely candidates for a specific nCoV small molecule pharmaceutical. S's RBD probably binds to the host ACE2 receptor (as HA does to sialic acid residues). SARS uses ACE2, and nCoV has been shown to bind to HeLa cells genetically engineered to express various types of ACE2 receptor, including human, bat and pig but not mouse. This does not necessarily mean ACE2 is the only receptor it can bind to, but it is an obvious place to start.
Genetically there are not many samples from the virus's natural genetic reservoir in bats, and this is not due to a lack of effort; one team literally spent years netting in caves across China looking for the SARS reservoir before they found their first match. CoVs are divided into groups, and SARS and nCoV are part of the b (beta) branch. The related bat sequences fall into two groups, some of which are referred to as SARS-like (SL betaCoVs) while the others generally get listed as bat betaCoVs. Most of both these groups have a very short sequence in the Spike RBD, but SARS, MERS and nCoV all have a large insertion making a long appendage which, while attached through the amino acid backbone's covalent bonds, is otherwise largely unlinked to the rest of the protein by either covalent or hydrogen bonding. These inserts are not shared between the viruses that have them, with very different structures in SARS, MERS and nCoV. The sequences we are getting now for nCoV are showing a lot of sequencing errors, which makes it very difficult to say whether the minor deviations from the consensus strand are due to lab error or real genetic change, of which there has been little. All the new sequences are basically identical, and a number of estimates for the date of convergence are all pointing towards late Nov/early Dec 2019, implying they all stem from a single introduction. Testing may yet find other independent introductions, either from the same host animal or others, which may be significantly different, but they do not seem to have been involved in the general expansion of cases found to date.
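To give a crude sense of how those 'late Nov/early Dec' convergence estimates are reached, here is a molecular-clock back-of-envelope. The substitution rate and the number of pairwise differences are assumptions of mine for illustration (the rate is an order-of-magnitude figure typical of coronaviruses), not the values used in the actual published analyses.

```python
# Back-of-envelope molecular clock: how long ago did two near-identical
# sequences share a common ancestor? Rate and difference count are assumed.
genome_length = 30_000            # ~30 kb genome
subs_per_site_per_year = 1e-3     # assumed rate, order-of-magnitude only
pairwise_differences = 5          # hypothetical nt differences between two isolates

# Each lineage accumulates roughly half the pairwise differences since divergence.
expected_subs_per_year = genome_length * subs_per_site_per_year
years_since_divergence = (pairwise_differences / 2) / expected_subs_per_year
print(f"~{years_since_divergence * 12:.1f} months since common ancestor")

# With only a handful of differences across 30 kb, the common ancestor is
# recent (weeks to a couple of months), consistent with a single introduction.
```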
Links
The graphic above is taken from https://www.bilibili.com/read/cv4457676/ ; it is in Chinese but this worked for me: https://translate.google.com/transla...2Fcv4457676%2F . From top to bottom, the first line (0 - 30k) shows the nucleotide sequence position, below it the open reading frames (ORF1, S, N), then we have the data plot showing how well SARS and various bat SL CoVs match nCoV. RaTG13 is much closer than any other sequence, particularly across the spike (S gene, covering nucleotides 22 to 25k); ZC45 is close across ORF1a but reverts to SL CoV consensus homology on S. (More detail can be found in the nCoV-genetics thread.)
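For anyone wanting to reproduce that sort of plot themselves, the underlying calculation is just percentage identity in a sliding window over an alignment. A minimal sketch follows; the sequences here are toy strings, whereas real use would start from an actual alignment of, say, RaTG13 or ZC45 against nCoV.

```python
# Minimal sliding-window identity calculation, the basis of Simplot-style
# homology plots like the one linked above. Toy sequences only.
def sliding_identity(ref: str, query: str, window: int = 500, step: int = 100):
    assert len(ref) == len(query), "sequences must be pre-aligned"
    for start in range(0, len(ref) - window + 1, step):
        r, q = ref[start:start + window], query[start:start + window]
        matches = sum(a == b for a, b in zip(r, q) if a != '-' and b != '-')
        yield start, matches / window

# Toy example: identical except for a divergent middle stretch.
ref   = "ACGT" * 500
query = "ACGT" * 200 + "TTTT" * 100 + "ACGT" * 200
for pos, ident in sliding_identity(ref, query):
    print(f"{pos:5d}  {ident:.2f}")
```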
The second link is to https://www.thelancet.com/journals/l...251-8/fulltext
(Again, I cannot find the paper I originally read, which had clearer protein models, but will update if I do.) The first graphic shows the protein models with their host binding proteins; the point to note is the non-purple sections on the CoV proteins, which are the insertions mentioned in the text. In the second graphic the table shows which sequences have them, the big white block showing that most SL CoVs just don't have them. Note also that while MERS has the insertion, it binds to CD26 rather than ACE2.
As scientists begin to work on the problem towards peer-reviewed publication, they have been using virological.org to clarify their thoughts and then biorxiv.org for pre-print publication before the paper appears on the journal's site; all of these are open access and are go-to sites for early information.
This link is to the article pathfinder translated and posted re. finding the bat samples. It is quite long, but I include it as it shows firstly how much effort goes into getting samples from wild animals, which is why the sequence databases have so few of them. The second point is (from memory) that for all their years of work they got 3 live virus samples and 14(?) sequences, all from one cave. The table above holds most of the relevant sequences that exist, and if 14 of them came from one cave over a short period, how confident should we be that they are a realistic representative gene pool? Within these 14 sequences they had all the genes needed to make SARS, with a bit of cut'n'paste. This is a problem I have found in flu: lots of one sub-type in a waterfowl species, only to find all but a couple were netted by one team at one spot over 2 days.
Epidemiology.
The pattern of introductions has been exactly what you would expect for a respiratory disease with a single introduction: clustering around case zero, with spread to other areas with high traffic to the epicentre creating new geographically and temporally independent clusters. This pattern should become more confused as cross-seeding between the new clusters, rather than all coming from a common point source (Wuhan), starts making the origins of new clusters harder to locate. 'Self-sustaining clusters' are those where you consider only locally generated extended transmission chains and not those patients, or their close contacts, who came (or are still coming in) from another area. We are seeing this in other areas in China and early signs in other countries. Unless these other countries can catch every incubating carrier at the border, which seems highly unlikely, I see no reason this pattern should not be repeated everywhere else. When modelling the spread and impact of a disease, other metrics come into play.
R0 (R naught) is how many new hosts each current host infects directly. It is not a fixed number for a given disease but will vary both by location and over time, and can be depressed deliberately by our attempts at quarantine, hand hygiene, masks, social distancing etc. An R0 above 1 implies continued spread; below 1 the outbreak peters out. Numbers a little above 1 are amenable to being wrestled below 1 with interventions, but 2 or above is going to merely slow, not stop, the spread, as the necessary interventions would be impractical. To get a decent estimate of R0 we need accurate numbers of infected and new infections, but until testing capacity gets close to matching everyone we would like to test, this can only be a rough estimate with a large margin of error. It is however very important in calculating treatment capacity: as each city cluster grows and wanes it puts terrible strain on medical surge capacity. Once capacity is exceeded, treatment of even severe cases becomes impossible and the fatality rate will rise, regardless of how good/advanced a nation's health care facilities are, as anyone above the capacity limit will not benefit from them. If you cannot stop spread then slowing it becomes the priority: while the total patient load may be the same, distributing it over a longer time means fewer patients above the surge limit and better treatment for more people.
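Here is a minimal SIR-style sketch of that 'slow it down' point, with made-up parameters, an assumed share of cases needing hospital care, and an arbitrary surge-capacity line. It is only meant to show why the same disease spread over a longer time keeps fewer patients above capacity, not to model nCoV.

```python
# Toy SIR model run at two effective R0 values, to show why slowing spread
# (same disease, same population) keeps peak demand under a surge-capacity
# ceiling. All parameters are illustrative, not fitted to nCoV.
def peak_concurrent_infections(r0, pop=1_000_000, days=730, infectious_period=7.0):
    gamma = 1.0 / infectious_period   # recovery rate per day
    beta = r0 * gamma                 # transmission rate per day
    s, i, peak = pop - 1.0, 1.0, 1.0
    for _ in range(days):             # simple daily Euler steps
        new_inf = beta * s * i / pop
        s -= new_inf
        i += new_inf - gamma * i
        peak = max(peak, i)
    return peak

severe_fraction = 0.10      # assumed share needing hospital care
surge_capacity = 10_000     # arbitrary number of treatable concurrent severe cases

for r0 in (2.5, 1.4):       # 'uncontrolled' vs 'suppressed by interventions'
    peak_severe = peak_concurrent_infections(r0) * severe_fraction
    status = "ABOVE" if peak_severe > surge_capacity else "below"
    print(f"R0={r0}: peak severe cases ~{peak_severe:,.0f} ({status} capacity)")
```

The total number eventually infected is not hugely different between the two runs; what changes is how many need a bed on the worst day.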
CAR (Clinical Attack Rate) is how many patients catch a disease divided by the susceptible population. Who is susceptible? Homo sapiens would not have been very successful, and may not have survived as a species, without significant diversity in immunity. While you or I may die, it is the variation across all of us that stops us from getting wiped out. For all the diseases I know of, at least some of us have a natural immunity, and we all handle infections differently. Even if everybody is challenged by the virus, some will not get infected, some will get infected and recover (with or without symptoms) while developing antibodies, and others will die. Whichever route they take, all of these are no longer part of the susceptible population and increase the herd immunity while depressing the effective R0. The elephant in the room is the first of these, for which we have no way of deriving a reliable estimate at this stage; the others are insignificant at the moment but will grow over time.
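The arithmetic behind 'increase the herd immunity while depressing the R0' is the standard textbook relation; the numbers plugged in below are examples, not estimates for this outbreak.

```python
# Effective reproduction number and herd-immunity threshold.
# R_effective = R0 * susceptible_fraction; spread stalls once R_effective < 1,
# i.e. once the immune/removed fraction exceeds 1 - 1/R0.
def r_effective(r0: float, susceptible_fraction: float) -> float:
    return r0 * susceptible_fraction

def herd_immunity_threshold(r0: float) -> float:
    return 1.0 - 1.0 / r0

for r0 in (1.5, 2.0, 3.0):
    print(f"R0={r0}: spread stops once ~{herd_immunity_threshold(r0):.0%} "
          f"are immune or otherwise out of the susceptible pool")

# Example: if 30% of the population turns out to be non-susceptible or
# already recovered, an R0 of 2.0 behaves like 1.4.
print(r_effective(2.0, 0.70))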
CFR (Case Fatality Rate) is the number who die divided by the number infected. Used as a proxy for disease severity, it is again dependent on the true number of infections (including mild and subclinical), which again we do not know due to the lack of testing across all of the clinical severity spectrum.
Notes on disease progression.
There is some evidence that patients can be infectious almost immediately and asymptomatically, but I expect them to be much more infectious if they become severely clinically ill. They seem to become symptomatic after ~6 days; if ill enough to warrant testing and found positive they will be treated, but not declared 'cured' until virus free for 2 weeks (the estimated time before they are deemed safe from starting further spread), so there is going to be a time lag between those who died and those declared virus free who came into the system on the same day. To compare the two for CFR estimates you would need to take the number of survivors coming out of the system a couple of weeks after the death figures, with an adjustment for outbreak growth, and that still does not allow for those infected but never tested.
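A toy illustration of that lag problem: with invented daily case counts growing at ~20% a day and an assumed 2% fatality rate among detected cases, the naive deaths-over-cases ratio badly underestimates mid-outbreak, while comparing deaths with cases confirmed ~14 days earlier recovers the assumed figure. Both still ignore the untested infections mentioned above.

```python
# Naive CFR (deaths / confirmed so far) vs a lag-aware estimate comparing
# deaths with cases confirmed ~14 days earlier. Daily counts are invented.
cases_per_day = [int(10 * 1.2 ** d) for d in range(30)]   # hypothetical, ~20%/day growth
lag_days = 14                                             # assumed outcome delay
true_detected_cfr = 0.02                                  # assumed CFR among detected cases

cum_cases = [sum(cases_per_day[:d + 1]) for d in range(30)]
# Deaths on day d come from cases confirmed `lag_days` earlier.
cum_deaths = [true_detected_cfr * (cum_cases[d - lag_days] if d >= lag_days else 0)
              for d in range(30)]

day = 29
naive_cfr = cum_deaths[day] / cum_cases[day]
lagged_cfr = cum_deaths[day] / cum_cases[day - lag_days]
print(f"naive CFR:        {naive_cfr:.2%}")    # far below the assumed 2% during growth
print(f"lag-adjusted CFR: {lagged_cfr:.2%}")   # recovers the assumed 2%
```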
Conclusion
Much of the above is an explanation of my reasoning in posting that I was expecting this to most closely resemble H1N1(2009), in terms of impact and spread, which seemed at odds with the discussion consensus of a CFR of 2%+. That would be similar to, or worse than, H1N1(1918), assuming a similar 'susceptible population' size, leading to over 100 million deaths (allowing for global population growth). 1918 saw 25 million dead in the first 25 weeks (according to Wikipedia), which feels wrong for this outbreak. I may be way off the mark; time will tell.