
Discussion - 2019-nCoV genetics


  • gsgs

    here is another paper :
    > The BatCoV RaTG13 sequence was downloaded from the GISAID BetaCov 2019-2020 repository

    IMO it's a shame that this important sequence is not public and at GenBank.
    When it's at GISAID, it may only be shared with other GISAID members;
    that's not really "publicly available", as they claim.

    no recombination RaTG13 -- 2019-nCoV
    Last edited by gsgs; January 30, 2020, 10:37 PM.


  • Emily
    There is an interesting post at “not snakes v2” that discusses fungi and SARS, MERS, and nCoV-2019. They joke that, in spite of the high CAI values, there is no significance to this.

    I'm wondering now.

    Widespread Bat White-Nose Syndrome Fungus, Northeastern China

    Do Viruses Exchange Genes across Superkingdoms of Life?


  • gsgs
    2013 is still 7 years ago, surprisingly long for the 96.5% similarity, compared with the other viruses.
    That suggests lots of bat-coronavirus diversity ...
    I don't know about the different proteins, but I read about the recombinations and different mutation rates.
    That's why I made the pics with mutation rate over the 30000-nucleotide genome.

    The region ~12000-~20000 looks suitable for mutation-timing-comparison
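
    A rough way to see what "7 years" and "96.5% similarity" imply together is a strict-clock back-of-envelope calculation. The sketch below is only that: it assumes one constant substitution rate on both lineages, ignores recombination and repeated hits at the same site, and the rate values are placeholders, not measured ones. The function name is made up for illustration.

```python
def implied_mrca_year(divergence, rate_per_site_per_year, year_a, year_b):
    """Year of the common ancestor of two samples under a strict molecular clock.

    divergence: fraction of sites that differ between the two genomes
    rate_per_site_per_year: ASSUMED substitution rate, identical on both lineages
    year_a, year_b: sampling years of the two genomes
    """
    # Total evolutionary time separating the samples, summed over both branches.
    total_years = divergence / rate_per_site_per_year
    # (year_a - t) + (year_b - t) = total_years, solved for t.
    return (year_a + year_b - total_years) / 2.0


# 96.5% similarity -> 3.5% divergence; the rates below are placeholders only.
for rate in (5e-4, 1e-3, 2e-3):
    print(rate, round(implied_mrca_year(0.035, rate, 2013, 2020), 1))
```

    With the placeholder rate of 1e-3 per site per year, the two branches add up to 35 years, which would put the common ancestor around 1999, well before the 2013 sample. Restricting the comparison to a region with a steadier rate (such as ~12000-~20000) would change the divergence that goes in, not the arithmetic.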


    Where did you find 2013? It's not in the paper. Well, the 13 in RaTG13 may stand for 2013.

    > RaTG13 which we previously detected in Rhinolophus affinis from Yunnan Province showed

    I couldn't find any of the current authors in a related reference.

    Yang, L. et al. Novel SARS-like Betacoronaviruses in Bats, China, 2011.
    Emerg Infect Dis 19, 989-991, (2013)

    Hu, B. et al. Discovery of a rich gene pool of bat SARS-related
    coronaviruses provides new insights into the origin of SARS coronavirus.
    PLoS pathogens 13, e1006698, (2017)

    Wang, N. et al. Serological Evidence of Bat SARS-Related Coronavirus
    Infection in Humans, China. Virol Sin 33, 104-107, (2018)
    Last edited by gsgs; January 27, 2020, 11:17 PM.


  • JJackson
    gs, recombination seems to be a feature of this virus, along with deletions and insertions. There has been an interesting conversation developing over at Virological about the dangers of using whole-genome homology, or even single-gene homology, to judge the true proximity of isolates: large recombination events cause poor homology overall while very high homology remains across the unaffected sections of RNA.

    Do you have a sample collection date for RaTG13? The very high homology across the Spike gene is at odds with everything else except nCoV2019, including all other bat SL CoVs. Given that this is probably the least conserved region, unless the sample was very recent I do not see how it has maintained its sequence so faithfully, nor why this outlier has given its S gene genetics to nCoV. Is there something specific to this unique Spike sequence that makes it well adapted to infect humans? The SL CoVs generally are well known for their ability to spread to other mammalian hosts (civet, raccoon dog etc.), so it seems unlikely that these two Chinese culinary favourites have not presented SL CoVs to humans since SARS, yet this atypical sequence has very successfully made the jump and seems to have little difficulty binding to our ACE2 receptors (assuming that is what they are using for access this time around).

    I found the RaTG13 date and it was 2013 so several years of drift.
    Last edited by JJackson; January 27, 2020, 08:43 PM.


  • gsgs
    I finally got it aligned etc.: 395 coronavirus genomes from GenBank, mostly SARS.
    File corona20.c6, 16MB; does anyone want it?
    The recombinations are not so clear, and there could be variations of the mutation rate along the 30000-nucleotide genome.
    I made these charts, #mutations per 100 nucleotides:

    the fasta was 1.2GB, now downloading the whole genbank records with the dates etc.
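
    For anyone who wants to reproduce the "#mutations per 100 nucleotides" counts on their own alignment, here is a minimal sketch. It is not the program behind the charts above; the function name is made up, and skipping gaps and ambiguous bases is just one possible choice. It expects two rows taken from the same alignment, so both strings have equal length.

```python
def mismatches_per_window(seq_a, seq_b, window=100):
    """Count differing positions in consecutive windows of an aligned pair."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment (equal length)")
    counts = []
    for start in range(0, len(seq_a), window):
        chunk_a = seq_a[start:start + window].upper()
        chunk_b = seq_b[start:start + window].upper()
        n = 0
        for x, y in zip(chunk_a, chunk_b):
            # Only compare unambiguous bases; gaps and N are skipped
            # (an arbitrary choice for this sketch).
            if x in "ACGT" and y in "ACGT" and x != y:
                n += 1
        counts.append(n)
    return counts


# Toy example with 10-nt windows: two changes in the middle, a gap at the end.
a = "ACGTACGTAC" * 3
b = "ACGTACGTAC" + "ACGAACGAAC" + "ACGTACGT-C"
print(mismatches_per_window(a, b, window=10))  # [0, 2, 0]
```

    A recombinant stretch should show up as a run of windows with clearly higher counts than the rest of the genome, which is the pattern that whole-genome similarity alone would hide.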
    Last edited by gsgs; January 27, 2020, 10:12 PM.


  • gsgs

    We need this RaTG13 virus, which is by far the closest.
    Allegedly from Rhinolophus affinis from Yunnan.

    Data Availability statement. Sequence data that support the findings of this study
    have been deposited in GISAID with the accession no. EPI_ISL_402124 and


    Yen Shu Chen • 3 days ago
    The authors did not share (no GenBank/GISAID accession number are
    provided) the genome sequence of the critical bat-CoV that represents a
    close relative to human 2019-nCoV.
    No way to access/reproduce/further use their result. Do scientific journals accept such practice?
    reuns Yen Shu Chen • 2 days ago • edited
    Exactly, there has been thousands of bats coronavirus sequenced since the Sars epidemic,
    it is not unusual that most of them are just lost into some dusty lab, but it is weird they
    didn't upload this one since it is the most astonishing result of their study, and the exact
    genome might contain some useful information on how bat viruses can mutate and
    contaminate human. Also note that the paper is from Wuhan's institute of virology, probably
    the same lab which discovered the aforementioned bat coronavirus
    Last edited by gsgs; January 27, 2020, 01:53 AM. Reason: link


  • JJackson
    I am learning as I go, as I have not looked at CoVs before. What I have gathered is that a fair bit of research was performed into SL SARS, the group of beta CoVs found in the host reservoir that forms the genetic pool from which SARS emerged. Much of the RNA is conserved with 95%+ homology, but like flu there are areas where that drops to 85%ish. One area that seems to be a focus of interest is S 153syn, which sits in the S2 binding pocket, making it a potential drug target, which, if I understood the TWiV you linked to, is an ACE2 receptor. This pocket is part of the low-homology zone and includes the primary antigenic site, making life harder for the immune system. Much of this may be my poor understanding of what I read, as I am getting out of my comfort zone with these papers.


  • gsgs
    There was speculation about a lab release (biggameindia, zerohedge). Are the mutations and comparisons with older bat viruses matching, or are there gaps in the timing? Has this been checked?

    I'm currently downloading and aligning coronaviruses from genbank, got 388 , so slow with mafft
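
    On the mafft speed: a sketch of how the call could be wrapped so the same settings get reused for each batch of downloads. `--auto` and `--thread` are existing mafft options (as far as I know `--thread -1` lets mafft pick the number of cores itself, but check `mafft --help` on your version). The file names and function names are placeholders for illustration, not the files mentioned in this thread.

```python
import shutil
import subprocess


def mafft_command(input_fasta, threads=-1):
    """Build the mafft argument list; --auto lets mafft choose the strategy."""
    return ["mafft", "--auto", "--thread", str(threads), input_fasta]


def run_mafft(input_fasta, output_fasta, threads=-1):
    """Align input_fasta and write the aligned FASTA to output_fasta."""
    if shutil.which("mafft") is None:
        raise RuntimeError("mafft not found on PATH")
    # mafft writes the alignment to stdout, so redirect it into the output file.
    with open(output_fasta, "w") as out:
        subprocess.run(mafft_command(input_fasta, threads), stdout=out, check=True)


# Hypothetical usage (placeholder file names):
# run_mafft("coronaviruses.fasta", "coronaviruses.aln.fasta")
```

    Whether this is any faster than the current run depends on which strategy mafft was already using; with several hundred 30000-nt genomes the alignment step is likely to stay the slow part either way.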


  • JJackson
    started a topic Discussion - 2019-nCoV genetics


    For those of us with an interest in viral genetics I have started this thread to look at the little data that is now available. What follows is not my analysis but based on the work posted on the forum.

    1] The consensus is that much of the sequence data is unreliable due to the high number of sequencing errors.
    2] Based on the more trusted sequences the MRCA (most recent common ancestor) is probably in early Dec. 2019 giving a recent single common ancestor for all sequences. If accurate this means it probably has not been circulating below the radar for a while.
    3] Due to the recent MRCA the cladogram is boringly flat, with only one or two AA differences from WH01 and most sequences being identical. Where there are changes they seem random, with no evidence of host adaptation.

    A note on the Virological site. If you followed the H7N9 discussion thread you may remember Andrew Rambaut from Edinburgh Uni., who posted really useful phylogenetic trees and analysis on the University's Epidemic site (which had sections on MERS, Ebola and flu). When this outbreak occurred I checked the site only to find it had disappeared, but Andrew had reappeared as Administrator of Virological, which at that time only had a couple of threads. In the main discussion thread I posted a speculation that this was a Beta CoV before it was announced. I would not normally have done this based on a single media report, but the article was by Lisa at CIDRAP and she was quoting Marion Koopmans of Erasmus. I trust MK not to have made the comment without good reason, and LS not to use it without reason to think MK had good reason for saying it. Which brings me back to Virological, where MK & AR are contributing and which was used to post the first open source sequence. Currently some sequences are being deposited at GISAID and others at GenBank.