
genbank update 2018/08/25


  • genbank update 2018/08/25

    genbank 2018/08/25 , available influenza sequences
    -----------------------------------------------------
    641444 sequences , 1053 MB , 42MB headers, 1011MB nucleotides

    3590 commas


    A/:553240,281557hu,184272av,67796sw,7733eq,6763en,2648ca,
    119Seal,110Ferret,107Cat,98Tiger,97Feline,94Mink,56Pika,45Wild boar
    40Penguin,40Bat,20Muskrat,17Civet,16Raccoon dog,15Leopard,15Camel
    13Whale,13Mouse,11Cheetah,8Stone marten,8Sloth bear,8Panda,8Lion
    6Raccoon,4Giant anteater,3Blow fly,2Skunk,1Beetle
    965Reass,240NON,5Unknown
    8 segments :
    553240=55484+55205+55333+122346+56174+82325+68806+57567
    aHu:281557=26984+26720+26813+65591+27766+44005+35515+28163
    aAv:184272=21080+21069+21011+32119+20723+24883+21921+21466
    aSw:67796=5952+5917+5998+16143+6168+11715+9651+6211
    aEq:7733=163+168+156+6364+170+271+210+231
    aEn:6763=787+781+778+1154+789+842+809+823
    aCa:2648=228+264+290+597+245+265+384+375

    B/:86076,85816hu,2seal,258reass.(hu)
    86076=8438+8460+8466+20404+8456+13441+8575+9836

    C/:1726,1698hu,18sw,2ca,28bo
    1726=230+226+225+334+226+257+228

    rest:402,25hu,21av,7sw,82eq,263en,4(lab)
    402=27+22+16+137+18+11+158+13
    A:383=26+20+14+135+16+9+153+10 , B:18=2+2+2+2+2+3+2+3 , ?:1
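The per-segment breakdowns above can be cross-checked by summing the segment counts (eight per row, seven for influenza C) and comparing the result with the stated total. Below is a minimal Python sketch for a few of the rows. The names `SEGMENT_COUNTS` and `check_totals` are illustrative only, and the numbers are copied by hand from the listing, with the stray spaces inside numbers removed.

```python
# Stated totals and per-segment counts, copied from the listing above.
# Only a few rows are included; the names here are illustrative.
SEGMENT_COUNTS = {
    "A":    (553240, [55484, 55205, 55333, 122346, 56174, 82325, 68806, 57567]),
    "aHu":  (281557, [26984, 26720, 26813, 65591, 27766, 44005, 35515, 28163]),
    "aAv":  (184272, [21080, 21069, 21011, 32119, 20723, 24883, 21921, 21466]),
    "B":    (86076,  [8438, 8460, 8466, 20404, 8456, 13441, 8575, 9836]),
    "C":    (1726,   [230, 226, 225, 334, 226, 257, 228]),
    "rest": (402,    [27, 22, 16, 137, 18, 11, 158, 13]),
}


def check_totals(table):
    """Return {label: (stated_total, summed_total)} for every row."""
    return {label: (total, sum(parts)) for label, (total, parts) in table.items()}


for label, (stated, summed) in check_totals(SEGMENT_COUNTS).items():
    status = "ok" if stated == summed else f"differs by {stated - summed}"
    print(f"{label}: stated {stated}, summed {summed} ({status})")

# The four type totals should add up to the overall sequence count (641444).
all_types = sum(SEGMENT_COUNTS[k][0] for k in ("A", "B", "C", "rest"))
print("all types:", all_types)
```

The same check can be extended to the other host rows by adding them to the dictionary.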


    years:
    aHu:
    1918:24,1,0,0,0,0,0,0,0,0,0,0
    1930:00,00,00,77,198,52,8,0,0,0
    1940:18,00,20,39,00, 8,25,39,25,23
    1950:36,51, 1,16,36, 2,17,368,68,45
    1960:42,23,31,70,52,51,35,171,601,90,1969
    1970: 84,113,215, 95,105, 94,227,189,180, 73
    1980:162, 98,161,506, 94,174,163,136,193,345
    1990:201,464,463,1030,713,1074,1160,977,1007,1900
    2000:2669,2350,3105,5742,4784,5555,5508,9788,9764,52832
    2010:14205,12396,11661,12659,13241,19255,31265,31281,11930,0,0
    6776*NON/ , 30*unkn , 2*//

    aAv:
    1900:0,0,17,0,0,0,0,0,0,0
    1910:0,0,0,0,0,0,1,3,0,0
    1920:0,0,0,0,1,8,0,21,0,0
    1930:0,0,0,0,56,0,0,0,0,0
    1940:0,0,0,0,0,1,0,0,0,31
    1950:0,0,0,11,0,0,78,0,0,26
    1960:3,22,2,77,1,49,86,10,45,23
    1970:2,35,69,58,187,455,1092,565,702,826
    1980:572,475,618,803,493,567,713,648,586,357
    1990:326,396,223,309,1154,803,636,912,1347,2487
    2000:3574,3931,4884,4128,5432,8906,11840,13559,11689,13821
    2010:11105,11070,9337,14553,15100,11138,7127,3722,58,0,0
    295*NON/ , 1*unkn , 14*//

    aSw:
    1930:16,40,0,0,0,10,0,4,3,3
    1940:0,0,9,0,0,2,2,0,0,1
    1950:0,0,1,0,2,0,0,10,0,0
    1960:0,12,0,1,0,2,8,9,4,2
    1970:10,8,1,3,2,26,147,644,150,139
    1980:91,189,32,25,43,77,42,75,63,27
    1990:49,64,105,223,214,81,123,147,261,522
    2000:404,666,494,1225,1189,1453,1535,1321,1559,3377
    2010:4921,7785,7318,6672,5196,5464,6133,3668,1234,0,0
    2449*NON/ , 8*Unkn , 1*//





    serotypes aHu :
    392*H1 , 126921*H1N1 , 460*H1N2 , 2*H1N9 , 33*mixed´H1
    2*H2N1 , 1058*H2N2 , 2*mixed´H2
    1258*H3 , 2*H3N- , 11*H3N1 , 145061*H3N2 , 30*mixed´H3
    2577*H5N1 , 24*H5N6
    9*H7N2 , 16*H7N3 , 60*H7N7 , 2752*H7N9 , 1*Mixed´H7
    157*H9N2
    2*H10N7 , 48*H10N8
    5*HxN1 , 90*N1 , 29*mixed´N1
    111*N2 , 28*mixed´N2
    2*mixed´N6
    1*Mixed´N9
    344*mixed , 57*unknown







    serotypes (typesz1o 1 2)

    B: 8437,731>B1 ; 8461,732>B2 ; 8466,733>B3 ;20403,734>B4
    8456,735>B5 ;13440,736>B6 ; 8575,737>B7 ; 9836,738>B8
    1,423>23*5 , 1,680>62.I92_09

    C: 230,739>C1 ; 226,740>C2 ; 225,741>C3 ; 337,742>C4
    223,743>C5 ; 253,744>C7 ; 224,745>C8
    2,546>5bat ; 4,627>1Eg_10 ; 2,718>44.17g_06

    A: human

    #222-229,BM , 1, 1, 1, 17, 1, 1, 1, 7
    #214-221,eh , 18, 25, 5, 19, 8, 36, 26, 15
    #278-285,1_33, 15, 14, 18, 20, 13, 19, 20, 17
    #286-293,1-34, 128, 97, 118, 99, 132, 80, 150, 129
    #206-213,2_57, 30, 19, 2, 22, 19, 12, 4, 9
    #230-237,sJ-76, 4, 7, 7, 13, 6, 10, 9, 5
    #198-205,1_77, 37, 37, 62, 86, 55, 56, 47, 70
    #302-309,1_83, 49, 51, 49, 76, 58, 66, 52, 55
    #246-253,1_86, 28, 39, 31, 71, 32, 29, 66, 28
    #294-301,1_91, 78, 76, 76, 213, 80, 98, 104, 83
    #254-261,1_95, 24, 28, 17, 36, 16, 4, 48, 25
    #262-269,1-99, 688, 508, 751,2238, 490, 955,1044, 515
    #238-245,Mx ,10009,9802,9932,22378,10222,16957,12903,10708
    #270-277,sweu, 9, 3, 4, 1, 4, 33, 1, 12
    #746-753,sweu-11,3, 9, 9, 8, 10, 2, 11, 3
    #334-341,s_91, 4, 5, 4, 16, 6, 6, 5, 5
    #310-317,2_07, 733, 920, 649,3956, 931,4077,1564, 889
    #446-453,.57 , 66, 118, 89, 168, 80, 83, 105, 82
    #438-445,.68 , 160, 128, 131, 213, 183, 226, 192, 218
    #350-357,3_72, 109, 90, 108, 131, 135, 121, 89, 90
    #358-365,3_79, 41, 40, 51, 134, 28, 57, 57, 28
    #366-373,3_83, 49, 40, 41, 175, 61, 41, 85, 58
    #374-381,3_89, 394, 355, 351,1023, 366, 399, 540, 141
    #382-389,3_97, 502, 444, 468,2448, 472, 521, 714, 520
    #470-477,Fujr, 583, 587, 596, 812, 606, 866, 648, 777
    #454-461,Fuj2, 223, 637, 626,1300, 226, 230,1119,1050
    #462-468,Fuj3, 33, 34, 35,1829, 39, 94, 65,
    #390-397,3_05, 639, 363, 308,2005, 386,1588,1442, 3
    #414-421,3_07,1223, 623, 662,3849,1136,9065, ,1351
    #398-405,3_P9, 540, 721, 336,1529, 325, 774, 829, 328
    #406-413,3_I9, 9806,10058,10462,19659,10701, 6241,12761,10068
    #479-485,_12 , 58, 94, 124, 98, 246, 424, 69, 177
    #754-761,32v_11,27, 22, 27, 27, 31, 21, 46, 30
    #510-517,bi51, 224, 236, 164, , 237, 383, 250, 244
    #619-626,BD_11, 2, 2, 1, 2, 2, 2, 1, 11
    #518-541,bi45i-SCUFAJKGOQ , 5, 24, 1,132,128, 94, 1, 15, 66, 5

    ...


    ===========================================================


    counts of sequences with identical names (usually genomes)

    01-08:47845,15224,9857,1794,995,1282,2259,49879
    09-16:1707,537,288,219,302,236,404,2279
    17-24:68,46,24,22,28,14,28,139
    25-32:17,6,8,6,7,8,11,27
    33-40:10,5,3,1,0,4,7,13
    41-48:3,3,0,2,4,4,3,1

    49:1,50:3,55:1,56:1,60:1,61:2,62:2,63:1,64:1
    65:2,67:2,68:1,72:1,73:1,74:2,77:1,80:2
    81:1,82:2,86:1,103:2,116:1,144:1,163:1,165:1,171:1
    289:1,380:2,2567:1,2982:1
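The table above reads as a histogram of multiplicities: for each k, how many distinct names occur exactly k times (the peaks at 8 and 16 fit complete 8-segment genomes). A minimal Python sketch of how such a histogram could be computed from a list of names already extracted from the sequence headers follows. The function name and the toy input are hypothetical; this is not the tool used for the post.

```python
from collections import Counter


def multiplicity_histogram(names):
    """Map k -> number of distinct names that occur exactly k times."""
    per_name = Counter(names)          # name -> how often it occurs
    return Counter(per_name.values())  # occurrence count -> number of names


# Toy input (placeholder names, not real data): two names with all
# 8 segments, one name seen twice, one name seen once.
toy_names = ["strain-1"] * 8 + ["strain-2"] * 8 + ["strain-3"] * 2 + ["strain-4"]

hist = multiplicity_histogram(toy_names)
for k in sorted(hist):
    print(f"{k}:{hist[k]}")
```

How the name is cut out of each header is left open here, since the header layout is not given in the post.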



    more than 40 :
    28468 A/Brisbane/59/2007
    32807 A/California/07/2009
    48623 A/Canterbury/200/2004
    73052 A/Florida/62/2014
    78150 A/Georgia/M5081/2012
    89858 A/Hong Kong/1/1968
    109694 A/Japan/305/1957
    5* 110730 A/Japan/5../2008
    111994 A/Japan/921/2009
    153155 A/Netherlands/602/2009
    192893 A/Panama/2007/1999
    203254 A/Puerto Rico/8/1934
    213133 A/Sendai/TU66/2008
    232797 A/Taiwan/1/1986
    254985 A/USSR/90/1977
    261165 A/Viet Nam/1203/2004
    264490 A/WSN/1933
    277214 A/Wisconsin/67/2005
    365950 A/duck/Moscow/4182-C/2010
    384773 A/gull/Maryland/704/1977
    401527 A/mallard/Italy/3401/2005
    470864 A/swine/England/453/2006
    535032 A/equine/Newmarket/1/1993
    538030 A/equine/Newmarket/5/2003
    30* 538913 A/equine/United Kingdom/30..../2003
    550450 A/canine/Pennsylvania/10915/2007
    583910 B/Japan/700/2009
    641083 Equine influenza virus H3N8
    641220 unidentified influenza virus


    mutations-charts :
    84752 http://magictour.free.fr/32-18a.GIF
    13145 http://magictour.free.fr/32G-18A1.GIF
    39393 http://magictour.free.fr/32g-18aa.GIF
    34678 http://magictour.free.fr/32g-182.GIF
    17061 http://magictour.free.fr/SW-18B1.GIF
    8552 http://magictour.free.fr/SW-18C1.GIF
    6892 http://magictour.free.fr/SW-18D1.GIF
    6719 http://magictour.free.fr/51G-18B1.GIF
    7802 http://magictour.free.fr/H5-18A.GIF
    27340 http://magictour.free.fr/H5-18B2.GIF
    7478 http://magictour.free.fr/92G-18B.GIF
    308278 http://magictour.free.fr/H9-18Q.BMP
    http://magictour.free.fr/BG-18.GIF
    http://magictour.free.fr/CG-18A.GIF



    
    Last edited by gsgs; September 13th, 2018, 11:23 PM.
    I'm interested in expert panflu damage estimates
    my current links: [url]http://bit.ly/hFI7H[/url] ILI-charts: [url]http://bit.ly/CcRgT[/url]

  • #2
    keyword : gbrel.txt
    https://ftp.ncbi.nlm.nih.gov/genbank/gbrel.txt
    December 15 2019
    NCBI-GenBank Flat File Release 235.0
    - the VRL division is now composed of 35 files (+1)
    2353. gbvrl1.seq - Viral sequence entries, part 1.
    ....
    2381. gbvrl35.seq - Viral sequence entries, part 35.
    average size=500MB
    release  date         base pairs     entries
    227      Aug 2018   260806936411   208831050
    235      Dec 2019   388417258009   215333020
    WGS:
    227      Aug 2018  3204855013281   665309765
    235      Dec 2019  6277551200690  1127023870
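To put the two releases side by side, the growth factors can be computed directly from the four rows quoted above. A small Python sketch follows, assuming the first pair of rows refers to the non-WGS GenBank records and the second pair to WGS. The dictionary layout and function name are illustrative.

```python
# (base pairs, entries) per release, copied from the rows above.
RELEASES = {
    "GenBank": {227: (260806936411, 208831050), 235: (388417258009, 215333020)},
    "WGS":     {227: (3204855013281, 665309765), 235: (6277551200690, 1127023870)},
}


def growth(table, old=227, new=235):
    """Return (base-pair ratio, entry ratio) between two releases."""
    bp_old, entries_old = table[old]
    bp_new, entries_new = table[new]
    return bp_new / bp_old, entries_new / entries_old


for name, table in RELEASES.items():
    bp_ratio, entry_ratio = growth(table)
    print(f"{name}: base pairs x{bp_ratio:.2f}, entries x{entry_ratio:.2f}")
```

On these figures, base pairs grew by roughly half in the first table and roughly doubled for WGS between August 2018 and December 2019.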
    the files are here:
    https://ftp.ncbi.nlm.nih.gov/genbank/
    the 35 virus files :
    https://ftp.ncbi.nlm.nih.gov/genbank/gbvrl1.seq.gz
    ...
    https://ftp.ncbi.nlm.nih.gov/genbank/gbvrl35.seq.gz
    (this is a huge download; I will run it overnight.
    The last time I did this was probably 2015/11/21.)
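The 35 file names follow the pattern shown above, so the full list of URLs can be generated rather than typed. A minimal Python sketch, assuming the numbering runs from 1 to 35 without gaps as the release note suggests; `vrl_urls` is an illustrative name, and nothing is downloaded here.

```python
BASE_URL = "https://ftp.ncbi.nlm.nih.gov/genbank/"


def vrl_urls(n_files=35):
    """Build the gbvrlN.seq.gz URLs for N = 1..n_files."""
    return [f"{BASE_URL}gbvrl{i}.seq.gz" for i in range(1, n_files + 1)]


urls = vrl_urls()
print(len(urls), "files")
print(urls[0])
print(urls[-1])
```

The resulting list can be written to a text file and handed to whatever download tool is preferred. The file count changes between releases, so `n_files` should be taken from the current gbrel.txt.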
