genbank sequences

    I finished my first version (untested) of flu-genbank at:

    5MB compressed, 110MB expanded, version from 2008/04/18
    Description at

    copy below

    errors are corrected, notation is made uniform, and the file is
    computer-readable, so hopefully future changes will be easy.

    some improvements are still possible...

    I also have tools to extract/merge headers,
    extract subsets by keyword,
    make mutation tables, draw mutation graphs etc.

    these will be uploaded later;
    work in progress, I can send them by email if someone is interested

    names.exe
    xtract.exe
    merge.exe
    seq1.exe
    seqa.exe
    align.exe
    mn.exe
    seq1q.exe

    source-code attached to the executables



    --------------------------------

    file flu.gz
    62869 records of 2 lines each: the first line is a header
    with 16 entries separated by commas, the 2nd line has
    the nucleotide sequence.
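
A minimal sketch of reading such two-line records, assuming flu.gz is an ordinary gzip-compressed text file and that every header line starts with ">" as in the examples in this post. The function names read_records and read_flu_gz are made up for illustration; they are not part of the tools listed above.

```python
import gzip


def read_records(lines):
    """Yield (header, sequence) pairs from an iterable of text lines.

    Assumes each record is one header line starting with '>' followed by
    one sequence line; blank lines are skipped.
    """
    header = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
        elif header is not None:
            yield header, line
            header = None


def read_flu_gz(path):
    """Read records from a gzip file (assumption: flu.gz is plain gzip)."""
    with gzip.open(path, "rt", encoding="ascii") as handle:
        yield from read_records(handle)
```

Usage would be something like `for header, seq in read_flu_gz("flu.gz"): ...`, which streams the file without expanding all 110MB in memory.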

    my current headers:

    examples:
    >AB000605,H,6,,Japan,1971,1136,C,C/Sapporo/71,,,y,199356,,26-MAR-2003,
    >CY009388,H,4,H3N2,New Zealand,2000,1721,A,A/Canterbury/94/00(H3N2), 31411,F,y,363048,20-10-2000,15-MAR-2006,36817

    1) genbank access code
    2) species (H:human,A:avian,S:swine)
    3) segment (1..8, 1..7 for C)
    4) serotype (empty for B,C,u)
    5) country
    6) year
    7) length
    8) type (A,B,C,u)
    9) name
    10) host-age in days
    11) host-sex (m,f)
    12) full-length ?
    13) taxon
    14) collection date (year and month at least, else empty)
    15) submission date
    16) days since 1900/01/01 (if collection date is given)
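
The 16 entries above can be split into named fields roughly as follows. This is only a sketch: the field names are my own labels for the list above, it assumes no entry contains a comma, and it assumes the collection date is written DD-MM-YYYY as in the second example header (20-10-2000). Under that assumption, entry 16 is the plain day difference to 1900/01/01.

```python
from datetime import date

# Hypothetical field names, one per entry in the numbered list above.
FIELDS = [
    "accession", "host", "segment", "serotype", "country", "year",
    "length", "type", "name", "host_age_days", "host_sex",
    "full_length", "taxon", "collection_date", "submission_date",
    "days_since_1900",
]


def parse_header(line):
    """Split one header line into a dict of 16 string fields."""
    values = [v.strip() for v in line.lstrip(">").split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} entries, got {len(values)}")
    return dict(zip(FIELDS, values))


def days_since_1900(collection_date):
    """Days from 1900/01/01 to a DD-MM-YYYY date (format is an assumption)."""
    day, month, year = (int(part) for part in collection_date.split("-"))
    return (date(year, month, day) - date(1900, 1, 1)).days
```

For the CY009388 example, `days_since_1900("20-10-2000")` gives 36817, the same value as entry 16 of that header.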

    the nucleotide sequences are aligned by inserting "-" for
    influenza-A: segments 1,2,3,5,7,8, 4-H1N1,4-H3N2,4-H5N1,6-H1N1,6-H3N2,6-H5N1

    (simple alignment: "-"s are only attached at the start and end)
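
One way such an end-padding alignment could be done is sketched here. The post does not say how the offset is chosen, so this is an assumption: the sequence is slid along a full-length reference and placed at the offset with the fewest mismatches. The name pad_to_reference is hypothetical and this is not the code in align.exe.

```python
def pad_to_reference(seq, ref):
    """Pad seq with '-' at the start and end so it lines up with ref.

    No gaps are inserted inside the sequence. The offset with the fewest
    mismatches against ref is used (first one wins on ties).
    """
    if len(seq) > len(ref):
        raise ValueError("sequence is longer than the reference")
    best_offset, best_mismatches = 0, len(seq) + 1
    for offset in range(len(ref) - len(seq) + 1):
        # zip stops at the end of seq, so only the overlap is compared
        mismatches = sum(1 for a, b in zip(seq, ref[offset:]) if a != b)
        if mismatches < best_mismatches:
            best_offset, best_mismatches = offset, mismatches
    tail = len(ref) - best_offset - len(seq)
    return "-" * best_offset + seq + "-" * tail
```

For example, `pad_to_reference("TACG", "ACGTACGTTT")` returns `"---TACG---"`. Sequences with real insertions or deletions would not be handled well by this kind of padding.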


    if no neighbor is closer than 5% then print to an extra file instead

    don't calculate all of d(f,g): if d>min then exit-for
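
The two notes above can be sketched as a nearest-neighbor search with an early exit. The assumptions here are mine: d(f,g) is taken as the number of differing positions between two aligned sequences, positions where either one has "-" are ignored, and "5%" is measured against the length of the query sequence. The function names are hypothetical.

```python
def distance(f, g, limit):
    """Count differing positions of aligned f and g.

    Stops as soon as the count exceeds limit (the 'if d>min then exit-for'
    idea), so the returned value is only exact when it is <= limit.
    """
    d = 0
    for a, b in zip(f, g):
        if a != b and a != "-" and b != "-":
            d += 1
            if d > limit:
                break
    return d


def nearest_neighbor(query, candidates, max_fraction=0.05):
    """Return (index, distance) of the closest candidate.

    Returns (None, distance) when even the closest candidate is not
    under max_fraction of the query length; such sequences would go
    to the extra file.
    """
    best_index, best = None, len(query) + 1
    for i, cand in enumerate(candidates):
        d = distance(query, cand, best)
        if d < best:
            best_index, best = i, d
    if best_index is None or best >= max_fraction * len(query):
        return None, best
    return best_index, best
```

Once a close neighbor has been found, most of the remaining comparisons stop after a few mismatches, which is where the time is saved.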
    I'm interested in expert panflu damage estimates
    my current links: http://bit.ly/hFI7H ILI-charts: http://bit.ly/CcRgT