Hi all
I'm new here. I'm currently working on a project for a computational biology class on the evolution of the influenza A virus and the similarities between the genes of various pandemic strains.
Our current hypothesis is that all pandemic strains of influenza A descend from the H1N1 virus that caused the Spanish Flu pandemic. We also suspect that the smaller the genetic distance between a virus and that H1N1 virus, the greater a pandemic threat it may be. However, after more reading I think the H2N8 virus that caused the "Asiatic" pandemic of 1889 may be a better reference point than H1N1. I've also read some research suggesting that the H2N2 virus is a recombination of the H2N8 virus.
Our methods are relatively simple: we use Matlab to retrieve gene sequences from NCBI, locally align them, build phylogenetic trees from the Jukes-Cantor distances, and finally run a bootstrapping algorithm to assess confidence in the tree branches.
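To make the distance step concrete, here is a minimal sketch in Python of the Jukes-Cantor correction, d = -(3/4) ln(1 - (4/3) p), where p is the fraction of differing sites between two aligned sequences. This is not our actual Matlab code; the function names `p_distance` and `jukes_cantor_distance` are made up for illustration, and it assumes the two sequences are already aligned to the same length, with gaps and ambiguous bases simply skipped.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned nucleotide sequences.

    Assumption: sites where either sequence has a gap or an ambiguous
    base (anything other than A, C, G, T) are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = "ACGT"
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    # Toy aligned sequences, not real influenza data.
    print(jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA"))
```

The corrected distance is always at least as large as the raw fraction p, because it accounts for multiple substitutions at the same site; a matrix of these pairwise values is what the tree-building step takes as input.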
We use two sequences of each of the HA, NA, PB1 and PB2 genes for each viral strain. The strains we currently use are H1N1, H3N2, and H5N1.
Our results are sparse, since that is a pretty limited data set. We decided to increase the number of sequences to six of each gene per virus (6 HA, 6 NA, 6 PB1, 6 PB2). This looks promising, but I have a few questions for influenza experts.
What should be our 'baseline' influenza strain? Should we use H1N1, H2N8 (there's very limited data available on NCBI) or H2N2 (which is reportedly either very similar to the virus behind the 1889 pandemic or was itself the cause)?
Should we add more influenza strains? I've considered adding H7N7, H1N2 and H2N2, as well as H2N8.
On notation: I've just assumed this, but if two strains share the same "H" or "N" designation, does that mean they share a certain mutation within the HA or NA gene? And if so, would it be valid to draw a conclusion relating shared mutations (H1/2/3/etc.) to the virulence of a strain, as well as to its pandemic potential compared to the 'baseline' strain?
I can add our initial phylogenetic trees as well as other specifics, but please help!