Difference between revisions of "EMBL format"

From BioUML platform

Revision as of 15:24, 4 April 2013

EMBL sequence format

The EMBL flat-file format is a text format for storing sequences together with their associated meta-information, feature coordinates, and annotations.

Each sequence entry starts with an identifier line ("ID"), followed by further annotation lines. Every annotation line begins with a two-letter line code, and "XX" lines act as spacers. A line starting with "SQ" marks the start of the sequence, and a line consisting of two slashes ("//") marks its end.
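
The entry layout described above can be read with a small line-oriented parser. The following is a minimal Python sketch, not the BioUML implementation: the function name `parse_embl_entries` and the shape of the returned dictionaries are assumptions made for illustration. It groups annotation lines by their two-letter code, skips "XX" spacer lines, and collects the sequence between the "SQ" line and the "//" terminator.

```python
from collections import defaultdict


def parse_embl_entries(text):
    """Yield one dict per entry: line code -> list of values, plus 'sequence'.

    Hypothetical helper for illustration; it keeps every annotation line as a
    raw string and does not interpret individual line types.
    """
    fields = defaultdict(list)
    seq_parts = []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):
            # Terminator: emit the finished entry and reset the state.
            if fields or seq_parts:
                entry = dict(fields)
                entry["sequence"] = "".join(seq_parts)
                yield entry
            fields = defaultdict(list)
            seq_parts = []
            in_sequence = False
        elif in_sequence:
            # Sequence lines: keep letters only, which drops the spaces
            # between blocks and the running position count at the end.
            seq_parts.append("".join(ch for ch in line if ch.isalpha()))
        else:
            code, value = line[:2], line[2:].strip()
            if code == "XX" or not code.strip():
                continue  # spacer line or blank line
            fields[code].append(value)
            if code == "SQ":
                in_sequence = True
```

Calling `list(parse_embl_entries(text))` on the contents of a flat file returns one dictionary per entry. For example, `entry["ID"][0]` holds the text of the identifier line and `entry["sequence"]` the concatenated sequence letters.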

Example

ID   ADHBADA2   standard; DNA; VRT; 1145 BP.
XX
AC   J00923; J00924;
XX
DT   13-JUN-1985 (Rel. 06, Created)
DT   22-NOV-1994 (Rel. 41, Last updated, Version 2)
XX
DE   Duck alpha-A-globin gene and 5' flank.
XX
KW   alpha-globin; globin.
XX
OS   Cairina moschata (duck)
OC   Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Aves;
OC   Neornithes; Neognathae; Anseriformes; Anatidae.
XX
RN   [1]
RP   603-696
RX   MEDLINE; 83028533.
RA   Niessing J., Erbil C., Neubauer V.;
RT   "The isolation and partial characterization of linked alpha-A- and
RT   alpha-D-globin genes from a duck DNA recombinant library";
RL   Gene 18:187-191(1982).
XX
RN   [2]
RP   1-1145
RX   MEDLINE; 83158759.
RA   Erbil C., Niessing J.;
RT   "The complete nucleotide sequence of the duck alpha-A-globin
RT   gene";
RL   Gene 20:211-217(1982).
XX
DR   EPD; 33033; Cm  a'A-globin.
DR   SWISS-PROT; P01987; HBA_CAIMO.
XX
CC   The alpha-A-globin gene is linked to the alpha-D-globin gene. [1]
CC   compared their alpha-A-globin gene sequence with chicken alpha-A-
CC   and alpha-S-globin gene sequences, as well as with other avian and
CC   mammalian alpha-A-globin gene sequences. NCBI gi: 212911
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..1145
FT                   /organism="Cairina moschata"
FT   prim_transcript 331..1145
FT                   /note="alpha-A-globin mRNA"
FT   CDS             join(367..461,612..816,921..1049)
FT                   /note="alpha-A globin; NCBI gi: 212914"
FT                   /codon_start=1
FT   exon            367..461
FT                   /note="alpha-A globin"
FT                   /number=1
FT   intron          462..611
FT                   /note="alpha-A-globin intron A"
FT   exon            612..816
FT                   /number=2
FT   intron          817..920
FT                   /note="alpha-A-globin intron B"
FT   exon            921..>1049
FT                   /note="alpha-A globin"
FT                   /number=3
XX
SQ   Sequence 1145 BP; 193 A; 435 C; 291 G; 226 T; 0 other;
     ctcatgctgg ggttgcctcc ccccctcaaa ccctaacctt aatcccatct cgtgctgggg        60
     tcagaccccc ctaaccctaa cccagttcat gccgggatca gcccccccaa accctaaccc       120
     taaacccatc tcgtgccggg gtcagacccc ccccaaccct aaccccgacc ccagttcatg       180
     ccggggtcgc ccccccccgg tggtgccggt gccgcaggcg gggcagggcg gcggccccgc       240
     ctggccgagg tccagccgcg acggggcggg cggggcgggg cggcgcccgg gccggcacgg       300
     ggatataagg ccggcggcac cagtgggggc acccgtgctg ggggctgcca acgcggagct       360
     gcaaccatgg tgctgtctgc ggctgacaag accaacgtca agggtgtctt ctccaaaatc       420
     ggtggccatg ctgaggagta tggcgccgag accctggaga ggtaggtgtc tgtccccgtc       480
     ctttgtccgt ccctgatcct ctcctctcta accccatgct ctcccccacc ataactgtcc       540
     gtgtcctacc ccaccccatc catcccccct gtccgttgat cccgctggcc ctgactcgct       600
     ctgctccaca ggatgttcat cgcctacccc cagaccaaga cctacttccc ccactttgac       660
     ctgcagcacg gctctgctca gatcaaggcc catggcaaga aggtggcggc tgccctagtt       720
     gaagctgtca accacatcga tgacattgcg ggtgctctct ccaagctcag tgacctccac       780
     gcccaaaagc tccgtgtgga ccctgtcaac ttcaaagtga gtctggtgac tccccccagc       840
     tcctcttcag cacccatcct gggccatccg gccacccctt tacctccccc actcgctcac       900
     cgtctccttt tgcctttcag ttcctgggcc actgcttcct ggtggtggtt gccatccacc       960
     accccgctgc cctgacccca gaggtccacg cttccctgga caagttcatg tgcgccgtgg      1020
     gtgctgtgct gactgccaag taccgttaga cggcaccgtg gctagagctg gacccaccct      1080
     gttgccagcc ttccaactgc aagcagccaa atgatctgaa ataaaatctg ttgcatttgt      1140
     gctcc                                                                  1145
//
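
Two parts of the example lend themselves to a quick consistency check: the "SQ" summary line, which declares the sequence length and base counts, and the feature locations such as `join(367..461,612..816,921..1049)`, which use 1-based inclusive coordinates. The Python sketch below is illustrative only. The function names are hypothetical, and the location parser deliberately handles only plain ranges and `join(...)`, treating partial markers such as `>` as if they were absent. Operators such as `complement(...)`, single-base positions, and references to other entries are not supported.

```python
import re
from collections import Counter


def parse_sq_header(sq_value):
    """Parse the text after the 'SQ' code into (length, base counts)."""
    length_match = re.search(r"Sequence\s+(\d+)\s+BP", sq_value)
    if length_match is None:
        raise ValueError("not an SQ summary line: %r" % sq_value)
    counts = {
        name: int(number)
        for number, name in re.findall(r"(\d+)\s+(A|C|G|T|other);", sq_value)
    }
    return int(length_match.group(1)), counts


def sequence_matches_header(sq_value, sequence):
    """Return True if the sequence has the declared length and composition."""
    length, declared = parse_sq_header(sq_value)
    observed = Counter(sequence.lower())
    actual = {base.upper(): observed.get(base, 0) for base in "acgt"}
    actual["other"] = len(sequence) - sum(actual.values())
    return len(sequence) == length and actual == declared


def location_spans(location):
    """Return (start, end) pairs for a plain range or a join(...) of ranges."""
    pairs = re.findall(r"[<>]?(\d+)\.\.[<>]?(\d+)", location)
    return [(int(start), int(end)) for start, end in pairs]


def extract_location(sequence, location):
    """Concatenate the subsequences named by a location (1-based, inclusive)."""
    return "".join(sequence[start - 1:end] for start, end in location_spans(location))
```

For the entry above, the declared base counts (193 + 435 + 291 + 226 + 0) add up to the declared length of 1145 BP, and `location_spans` applied to the CDS location returns the three ranges 367..461, 612..816, and 921..1049, which together cover 429 bases.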

References

  1. http://www.embl.org