
Discussion - 2019-nCoV genetics


  • #31
    I have been trying to work out what I can make the nextstrain tool do and was fairly sure it was capable of more than I had found by trial and error. My search led me to a lecture which I will come back to and link below.

    Re rosmarina's post: I don't think there has been anything close to evidence that humans were involved in this virus's evolutionary history, beyond being unwitting hosts. I suspect the short fragment was not deemed worth uploading at the time, but once COVID arrived it, and bits like it, suddenly became a lot more interesting. I expect more partial sequences but few full genomes. I looked at the human sequence data using Nextstrain, at the AA mutation frequency across the full genome and its entropy (first image below), to get a feel for which parts were conserved and which were changing. In the second image I zoom in to just the short section that matches the fragment (note the little black triangles at the bottom). There are a few AA changes scattered at random, with very low entropies, indicating they have little impact on the phylogenetic tree's structure.
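
    For anyone wanting to see what an entropy score like this measures, here is a minimal sketch of per-site Shannon entropy over an alignment. The toy sequences and function names are made up for illustration, and the use of the natural log is an assumption; Nextstrain's own calculation may differ in detail.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (natural log) of the residues in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def site_entropies(aligned_seqs):
    """Entropy at each position of a set of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*aligned_seqs)]


# Hypothetical toy alignment of amino-acid sequences (not real data).
toy_alignment = ["MPLK", "MPLK", "MLLK", "MLLR"]
entropies = site_entropies(toy_alignment)

for position, value in enumerate(entropies, start=1):
    print(f"site {position}: entropy {value:.3f}")
```

    A column where every sequence agrees scores 0, and the score rises as the variants become more evenly split, which is why a rare, scattered change shows up as a very small bar.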

    [Image: orf1b1.JPG]

    [Image: orf1b2.JPG]
    The sequence covers the region from around 15,340 to 15,709, which is part of the RdRp gene, which in turn is part of ORF1b. This accounts for roughly 1/80th of the genome and, lying in a conserved region, would be expected to show high homology. When the full sequence is BLASTed against the NCBI data set I get 89% homology with a range of bat and SARS-1 sequences. This is lower than I expected. It came up in today's TWiV (link below) that when the civet intermediate host for SARS-1 was found, those sequences had 99% homology with the human strain (presumably across the full genome, based on context, although not stated explicitly).
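
    As a rough illustration of what a figure like 89% means, the sketch below computes percent identity between two sequences that have already been aligned. The sequences and the function name are hypothetical, and BLAST computes identity over its own local alignment, so its numbers will not necessarily match this simple count.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, non-gap columns where both sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical 10-nt aligned fragments differing at one position.
query = "ACGTACGTAC"
subject = "ACGTTCGTAC"
identity = percent_identity(query, subject)
print(f"{identity:.1f}% identity")
```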

    In the top image I highlighted one spike in green. This is C14408T (and therefore outside KP876546), which results in ORF1b P314L and defines clade A2a, which is active in Northern Europe, hence its high entropy score.
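
    The step from a nucleotide change such as C14408T to an amino-acid label such as P314L is just codon arithmetic, sketched below. The ORF1b start coordinate (13468) and the reference codon (CCT) are my assumptions about the reference annotation and are not stated in this thread, so check them against the annotation you are using.

```python
ORF1B_START = 13468  # ASSUMPTION: 1-based first nucleotide of ORF1b in the reference


def nt_to_codon(nt_pos, orf_start):
    """Return (codon number, position within codon), both 1-based."""
    offset = nt_pos - orf_start
    if offset < 0:
        raise ValueError("position lies before the start of the ORF")
    return offset // 3 + 1, offset % 3 + 1


def apply_snp(codon, pos_in_codon, new_base):
    """Replace one base (1-based position) in a three-letter codon."""
    return codon[: pos_in_codon - 1] + new_base + codon[pos_in_codon:]


# Only the two codons needed for this example, from the standard genetic code.
CODON_TABLE = {"CCT": "P", "CTT": "L"}

codon_number, pos_in_codon = nt_to_codon(14408, ORF1B_START)
ref_codon = "CCT"  # ASSUMPTION: reference codon at this site
alt_codon = apply_snp(ref_codon, pos_in_codon, "T")
label = f"{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"
print(label)
```

    Under those assumptions, position 14408 lands on the second base of codon 314, which agrees with the P314L label.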

    The promised link is to a lecture by Richard Neher, University of Basel, on 6 March 2019.
    He is a co-developer of Nextstrain and uses it as a research tool. He begins with an introduction to flu and then uses Nextstrain to analyse H3N2 data. Unfortunately, in later parts of the video he is away from the podium, so the sound is variable, and he points out features on graphics that the audience can see but we cannot, which makes it trickier to follow. If you persevere, however, he looks at the predictive ability of the tree structures and how well their predictions for H3N2 over the years have compared with what actually occurred. The system is obviously making a better than average estimate of which branches will show mutations and become dominant.

    This is a plug for the current TWiV, which the panel and I both thought was awesome! I am not even going to try to list the areas covered in detail, as the list of items not covered would be shorter. It is 2 hrs long, but I doubt you could improve your understanding of this epidemic more with 2 hrs spent in any other way. TWiV 591
    Last edited by JJackson; March 15, 2020, 05:37 PM.


    • Kathy
      Kathy commented
      JJackson: Thank you for your comments and links. I still believe it is very suspicious that the authors, Zhou et al., in the article published by Nature "forgot" to cite that RaTG13 was first identified and named KP876546 in the work of Ge et al. on the coexistence of multiple coronaviruses in bat colonies. I can't believe that they did not go on sequencing and working on such an interesting sequence, which they define as a possible novel betacoronavirus species. I stand by my opinion: the backbone of SARS-CoV2 was discovered with that study and they went on working on it, possibly manipulating the spike protein using a CoV from an animal such as a pangolin, which has a better affinity for the ACE2 receptor. The furin cleavage site is extremely unnatural in this group of viruses and was most probably added to the virus. I recently read that SARS-CoV2 has four ways to attack human cells, and that does not look natural to me.

      We also need to keep in mind that the RaTG13 sequence could have been edited before it was submitted to NCBI.
      I will go on with my research on the topic and try to find virologists who could help me write a publication on the possible human manipulation of SARS-CoV2. Too many people are dying because of it; I can't accept it.

  • #32
    Has anyone seen anything on ORF1a L3606F? I was looking at the NT changes which have been regularly cropping up, and this one is interesting because it has occurred several times in different parts of the tree and the subsequent cases are holding their own against the wild type. It is the only change I have seen so far that appears to show signs of host adaptation. As I do not even know which protein this changes, I have no idea what functional change it may make. This is what I have been looking for, but a search on ORF1a L3606F found only a few references, none of which were particularly helpful.
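
    One way to work out which mature protein an ORF1a position falls in is to look it up against the polyprotein cleavage boundaries, as in the sketch below. The boundaries and the ORF1a start coordinate are my assumptions about the reference annotation and are not taken from this thread, so verify them before relying on the result.

```python
ORF1A_START = 266  # ASSUMPTION: 1-based first nucleotide of ORF1a in the reference

# ASSUMPTION: nsp boundaries in ORF1a amino-acid coordinates (name, first, last).
NSP_RANGES = [
    ("nsp1", 1, 180),
    ("nsp2", 181, 818),
    ("nsp3", 819, 2763),
    ("nsp4", 2764, 3263),
    ("nsp5", 3264, 3569),
    ("nsp6", 3570, 3859),
    ("nsp7", 3860, 3942),
    ("nsp8", 3943, 4140),
    ("nsp9", 4141, 4253),
    ("nsp10", 4254, 4392),
    ("nsp11", 4393, 4405),
]


def locate_orf1a_residue(aa_pos):
    """Return (protein name, residue number within that protein)."""
    for name, first, last in NSP_RANGES:
        if first <= aa_pos <= last:
            return name, aa_pos - first + 1
    raise ValueError(f"residue {aa_pos} is outside the listed ranges")


def codon_nt_range(aa_pos, orf_start=ORF1A_START):
    """Return the 1-based (first, last) nucleotides of a codon."""
    first = orf_start + (aa_pos - 1) * 3
    return first, first + 2


protein, local_pos = locate_orf1a_residue(3606)
nt_first, nt_last = codon_nt_range(3606)
print(f"ORF1a 3606 -> {protein} residue {local_pos}, nt {nt_first}-{nt_last}")
```

    If those boundaries hold, the same lookup also gives an alternative name to search on, since papers often number residues within the mature protein rather than within ORF1a.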


    • #33
      This paper is excellent and provides a ton of data I had not seen elsewhere on the coding regions and their proteins, functions, biological conformations. Not an easy read but worth the effort.


      • #34

        First SARS-CoV-2 genomes in Austria openly available
        by CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences

        ...Initial sequence analysis of the 29,900 nucleotide-long SARS-CoV-2 genomes from Austria revealed on average 6 mutations different to the reference genome isolated in Wuhan, the capital city of the province of Hubei, China in December 2019. The observed number of mutations is in line with other recently reported SARS-CoV-2 genomes. Most of the observed mutations lead to changes in viral proteins, providing evidence for positive selection pressure and evolution within the human population. Assessing the actual impact of these mutations for the virus life cycle and its interactions with both the host and the immune system will be within the scope of future investigations. Ongoing in-depth genomic analyses focus on mutational hotspots, dissect viral diversity between the Austrian strains and the strains from other countries as well as study of the mutational dynamics of pandemic SARS-CoV-2...


        • #35
          The link is a useful lecture on SARS-2 and looks into how the virus persuades the host translational system to read its ORFs.


          • #36
            This is an interesting paper in terms of the technique and the interactive heat-map tool for looking at the data. The experiment produced engineered yeast cells, each of which has a different SARS-CoV-2 spike RBD displayed on its surface. They substitute every amino acid at every position in the RBD and then measure, for each AA change, the effect on protein expression and binding affinity. They also show whether that AA is in direct contact with the ACE2 receptor and what the consensus AA was at that position for each of SARS-1, RaTG13 and pangolins. All of this data can be accessed by hovering the mouse over the squares on the heat map. This is useful data for vaccine formulation, and especially for mono- or polyclonal antibodies, as it shows the functionally constrained AA positions, which will be more resistant to escape mutations.
            - Interactive heat-map tool.
            - TWiEVO podcast discussion with the authors.
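
            To give a feel for the size of the data behind such a heat map, the sketch below enumerates every single-AA change for a short peptide, one entry per square. The peptide, the numbering and the function name are hypothetical, and this is only the bookkeeping, not the authors' library construction or measurement method.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_aa_mutants(protein, first_site=1):
    """Yield (label, mutant sequence) for every single-AA substitution."""
    for index, wild_type in enumerate(protein):
        for replacement in AMINO_ACIDS:
            if replacement == wild_type:
                continue  # skip the unchanged residue
            label = f"{wild_type}{first_site + index}{replacement}"
            yield label, protein[:index] + replacement + protein[index + 1:]


# Hypothetical four-residue peptide standing in for the RBD.
toy_peptide = "MKVL"
mutants = dict(single_aa_mutants(toy_peptide))
print(len(mutants), "single-AA variants")
```

            Each position contributes 19 variants, so the grid grows linearly with the length of the region scanned.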

            The heat map works well as an adjunct to the Nextstrain tool, discussed earlier, to examine how well the experimental data matches the actual mutational frequencies. It may also be helpful to have a look at my first post in this thread which looks at additional SL BetaCoV RBDs and the S1 protein structure around the RBD pocket.


            • #37
              Originally posted by Pathfinder
              Mining coronavirus genomes for clues to the outbreak’s origins

              By Jon Cohen, Jan. 31, 2020, 6:20 PM
              “One of the biggest takeaway messages [from the viral sequences] is that there was a single introduction into humans and then human-to-human spread,” says Trevor Bedford, a bioinformatics specialist at the University of Washington and Fred Hutchinson Cancer Research Center.
              The longer a virus circulates in a human population, the more time it has to develop mutations that differentiate strains in infected people, and given that the 2019-nCoV sequences analyzed to date differ from each other by seven nucleotides at most, this suggests it jumped into humans very recently. But it remains a mystery which animal spread the virus to humans.
              According to Xinhua, the state-run news agency, “environmental sampling” of the Wuhan seafood market has found evidence of 2019-nCoV. Of the 585 samples tested, 33 were positive for 2019-nCoV and all were in the huge market’s western portion, which is where wildlife were sold. “The positive tests from the wet market are hugely important,” says Edward Holmes, an evolutionary biologist at the University of Sydney ...
              Yet there have been no preprints or official scientific reports on the sampling, so it’s not clear which, if any, animals tested positive. “Until you consistently isolate the virus out of a single species, it’s really, really difficult to try and determine what the natural host is,” says Kristian Andersen, an evolutionary biologist at Scripps Research.
              It’s not just a “curious interest” to figure out what sparked the current outbreak, Daszak says. “If we don't find the origin, it could still be a raging infection at a farm somewhere, and once this outbreak dies, there could be a continued spillover that’s really hard to stop. But the jury is still out on what the real origins of this are.”


              Jimmy Tobias

              Very interesting email from the Fauci documents obtained by

              1:33 PM · Jun 1, 2021·Twitter Web App
              "Safety and security don't just happen, they are the result of collective consensus and public investment. We owe our children, the most vulnerable citizens in our society, a life free of violence and fear."
              -Nelson Mandela


              • JJackson
                JJackson commented
                The locations of the viral sequences in the market were documented in the WHO origins report. Most were from the drains, as the procedure at the market was to hose down the stalls where animals were slaughtered for food at the end of the day, making it almost impossible to pinpoint any one stall as being contaminated.
                Last edited by JJackson; June 3, 2021, 06:11 AM.