Analysis and Annotation Email Server

Xiaoqiu Huang huang at mtu.edu
Fri Jul 11 15:29:13 EST 1997


The AAT email server at Michigan Tech identifies genes in a DNA sequence
by comparing the query sequence against cDNA and protein sequence databases:
(1) Human_Gene_Index, a database of human cDNA sequences at TIGR,
(2) dbEST, a database of EST sequences at NCBI,
(3) SwissProt, a database of protein sequences at the University of Geneva,
(4) nr, a database of non-redundant protein sequences at NCBI.

Author: Xiaoqiu Huang (Email: huang at cs.mtu.edu)
Dept. of Computer Science, Michigan Technological Univ., Houghton, MI 49931

The analysis and annotation tool (AAT) includes two sets of programs:
one compares the query sequence with a protein database,
and the other compares the query with a cDNA database.
Each set contains a fast database search program and
a rigorous alignment program. The database search program
quickly identifies regions of the query sequence that
are similar to a database sequence. The alignment program
then constructs an optimal alignment between each region
and the database sequence, and reports the coordinates
of the exons in the query sequence.
Both alignment programs account for introns in the query sequence.
The DNA-protein alignment program also corrects frameshifts.
AAT reduces the labor-intensive work of locating the exons
of the query sequence and improves the definition of intron/exon
boundaries by using the wealth of available protein and cDNA data.

Obtaining Help

To receive information on using the AAT email server,
send a mail message to:

	aat at cs.mtu.edu

Put the word 'HELP' on a single line in the body of the mail message.
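
As an illustration only, the help request described above could be
composed with Python's standard email library. This is a minimal sketch:
the function name build_help_request is hypothetical, the subject line is
an assumption (the instructions above only specify the message body), and
the address is the one given above with "at" written as "@". The sketch
builds the message and does not send it.

```python
from email.message import EmailMessage


def build_help_request(sender: str) -> EmailMessage:
    """Compose a help request for the AAT email server (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "aat@cs.mtu.edu"   # address from the announcement
    msg["Subject"] = "help"        # assumption: no subject is specified
    msg.set_content("HELP\n")      # the word HELP on a single line in the body
    return msg


if __name__ == "__main__":
    print(build_help_request("user@example.org"))
```

The resulting message can be handed to any mail client or SMTP library
for delivery.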

Examples of Results by AAT

A portion of a DNA-protein alignment:

The top sequence is the query and the bottom one is a database sequence.

Accession: SP|P32198|CPT1_RAT MITOCHONDRIAL CARNITINE PALMITOYLTRANSFERASE I
Score: 2003  Identity: 482/773 (62%)  Strand: plus

Script                                                    M  A  E  A
  62479 GCGCCCGCGCACCCATCTGCCCCCGTCCTAGGTGCCGACCAACCCCCAGGATGGCGGAAG
        --------------------------------------------------::::::::::
      1                                                   M  A  E  A

Script    H  Q  A  V  A  F  Q  F  T  V  T  P  D  G  V  D  F  R  L  S
  62539 CTCACCAGGCCGTGGCCTTCCAGTTCACGGTGACCCCAGACGGGGTCGACTTCCGGCTCA
        :::::::::::::::::::::::::::::::::::::::::::: ..:::.. :::::::
      5   H  Q  A  V  A  F  Q  F  T  V  T  P  D  G  I  D  L  R  L  S

Script    R  E  A  L  K  H  V  Y  L  S  G  I  N  S  W  K  K  R  L  I
  62599 GTCGGGAGGCCCTGAAACACGTCTACCTGTCTGGGATCAACTCCTGGAAGAAACGCCTGA
        ::.  ::::::::::::..  ... .::::::::: .. ..::::::::::::    . :
     25   H  E  A  L  K  Q  I  C  L  S  G  L  H  S  W  K  K  K  F  I

EXON  1    62529    62669  CONFIDENCE: 100  66

Script    R  I  K                                                   
  62659 TCCGCATCAAGGTGCGCACAGGTGCTTCTCCCAGAGCGTAGGCAGAGGCCGGCTGTCAGC
        ::::: ..:::-------------------------------------------------
     45   R  F  K                                                   

Script                                                              
  62719 TGTTAAGCGCTTTGTTAGGGTCCCTCACTGCCTCCTTGGCTGGCACTTCTGCCCGGTACA
        ------------------------------------------------------------
     48                                                             

Script                                                              
  62779 GGTTGTGGAAGTACAGACACCAGAGGGGTGCACAGGATGTGGTCGGACACAGGGAGCTGT
        ------------------------------------------------------------
     48                                                             

Script                                                              
  62839 GGGTGTGGCGGAGGAAGGAGCACAGCAGGGCATCAGGAGAGAAAGCCTTCCAGGCCAAGA
        ------------------------------------------------------------
     48                                                             

Script                                                              
  62899 CCAGGAGCCAGTTCCCAAGACTTCACAGGCAGGCTAACCTCCCGCCTTCCGGCTCCATAA
        ------------------------------------------------------------
     48                                                             

Script                        N  G  I  L  R  G  V  Y  P  G  S  P  T 
  62959 GGGCGCCTGTTTCTGCCCACAGAATGGCATCCTCAGGGGCGTGTACCCTGGCAGCCCCAC
        ----------------------::::::::: ... .::::::. .:::. .. .:::. 
     48                       N  G  I  I  T  G  V  F  P  A  N  P  S 

Script   S  W  L  V  V  I  M  A  T  V  G  S  S  F  C  N  V  D  I  S 
  63019 CAGCTGGCTGGTCGTCATCATGGCAACAGTGGGTTCCTCCTTCTGCAACGTGGACATCTC
        .::::::::: ..::: .. ... .  . .  ..:::     .  ... ::::::  .::
     61  S  W  L  I  V  V  V  G  V  I  S  S  M  H  A  K  V  D  P  S 

EXON  2    62981    63120  CONFIDENCE:  60  40

Script   L  G  L  V  S  C  I  Q  R  C  L  P  Q  G                   
  63079 CTTGGGGCTGGTCAGTTGCATCCAGAGATGCCTCCCTCAGGGGTAAGGAGTGAAACTGGA
        ::::::: .. ..  .   :::  .:::  .:::  .  .  ------------------
     81  L  G  M  I  A  K  I  S  R  T  L  D  T  T                   


A portion of a DNA-cDNA alignment:

The top sequence is the query and the bottom one is a database sequence.

EXON  1    12304    12896  CONFIDENCE: 100 100

  12853 TTCACAGACTTCTACGTGCCTGTGTCTCTGTGCACACCCTCTAGGTAAAGAGGGGGCCGC
        ||||||||||||||||||||||||||||||||||||||||||||----------------
    549 TTCACAGACTTCTACGTGCCTGTGTCTCTGTGCACACCCTCTAG                

  12913 GCCTCTTCCCCGCCCCGACCCTCCATCCCTTTCCTCCCAATGGATTGCAGGGGGGCGGGA
        ------------------------------------------------------------
    593                                                             

  12973 AAAACGTCTGTCTCTCTCTCTAGGGAAGGCCACATTTCTGTCTGTCTCAGGGACTCTGTG
        ------------------------------------------------------------
    593                                                             

  13033 ACTTGTCCCGCAGGGCCGCCCTCCTGACCGGCCGGCTCCCGGTTCGGATGGGCATGTACC
        -------------|||||||||||||||||||||||||||||||||||||||||||||||
    593              GGCCGCCCTCCTGACCGGCCGGCTCCCGGTTCGGATGGGCATGTACC

  13093 CTGGCGTCCTGGTGCCCAGCTCCCGGGGGGGCCTGCCCCTGGAGGAGGTGACCGTGGCCG
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    640 CTGGCGTCCTGGTGCCCAGCTCCCGGGGGGGCCTGCCCCTGGAGGAGGTGACCGTGGCCG

  13153 AAGTCCTGGCTGCCCGAGGCTACCTCACAGGAATGGCCGGCAAGTGGCACCTTGGGGTGG
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    700 AAGTCCTGGCTGCCCGAGGCTACCTCACAGGAATGGCCGGCAAGTGGCACCTTGGGGTGG

  13213 GGCCTGAGGGGGCCTTCCTGCCCCCCCATCAGGGCTTCCATCGATTTCTAGGCATCCCGT
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    760 GGCCTGAGGGGGCCTTCCTGCCCCCCCATCAGGGCTTCCATCGATTTCTAGGCATCCCGT

EXON  2    13046    13286  CONFIDENCE: 100  96

  13273 ACTCCCACGACCAGGTAGGAACCACCCGGGCCCTCAGCCACCCTCCCACCTCCCAAAGTC
        ||||||| ||||||----------------------------------------------
    820 ACTCCCAyGACCAG                                              

  13333 CCCCAGCCCTTGATGCTCCCGCAGCCCCACCTGCCAGCCCAGCCCTCACGGCAGCTGCCC
        ------------------------------------------------------------
    834                                                             

  13393 GCCTCAGGGCCCCTGCCAGAACCTGACCTGCTTCCCGCCGGCCACTCCTTGCGACGGTGG
        -------||||||||||||||||||||||||||||-||||||||||||||||||||||||
    834        GGCCCCTGCcAGAACCTGACCTGCTTCC gCCGGCCACTCCTTGCGACGGTGG

  13453 CTGTGACCAGGGCCTGGTCCCCATCCCACTGTTGGCCAACCTGTCCGTGGAGGCGCAGCC
        |||||||||||||||||||||-||||||||||||||||||||||||||||||||||||||
    886 CTGTGACCAGGGCCTGGTCCC aTCCCACTGTTGGCCAACCTGTCCGTGGAGGCGCAGCC

  13513 CCCCTGGCTGCCCGGACTAGAGGCCCGCTACATGGCTTTCGCCCATGACCTCATGGCCGA
        |---|||||-|||||||||||||||||||||||||||||||||-||||||||||||||||
    945 C   tGGCT cCCGGACTAGAGGCCCGCTACATGGCTTTCGCC aTGACCTCATGGCCGA

EXON  3    13400    13618  CONFIDENCE:  96  96

  13573 CGCCCAGCGCCAGGATCGCCCCTTCTTCCTGTACTATGCCTCTCACGTAAGTGATCTTGG
        ||||-|||||||||||||||||||||||||||||||||||||| ||--------------
   1000 CGCC aGCGCCAGGATCGCCCCTTCTTCCTGTACTATGCCTCTmAC              
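
The EXON lines in the results above are easy to pull out of a saved
report. The following is a minimal sketch, not part of AAT: the names
Exon, parse_exons and intron_spans are hypothetical, the line format is
inferred only from the examples shown here, and the two CONFIDENCE values
are stored without interpretation because their meaning is not described
above. Coordinates are treated as 1-based and inclusive, which agrees
with the sample alignments.

```python
import re
from typing import List, NamedTuple, Tuple


class Exon(NamedTuple):
    number: int
    start: int                     # first query position of the exon
    end: int                       # last query position of the exon
    confidence: Tuple[int, int]    # the two CONFIDENCE values, uninterpreted


# Format inferred from the sample output, for example:
# EXON  1    12304    12896  CONFIDENCE: 100 100
EXON_RE = re.compile(
    r"^EXON\s+(\d+)\s+(\d+)\s+(\d+)\s+CONFIDENCE:\s+(\d+)\s+(\d+)\s*$"
)


def parse_exons(report: str) -> List[Exon]:
    """Collect every EXON line found in an AAT report."""
    exons = []
    for line in report.splitlines():
        match = EXON_RE.match(line.strip())
        if match:
            number, start, end, c1, c2 = map(int, match.groups())
            exons.append(Exon(number, start, end, (c1, c2)))
    return exons


def intron_spans(exons: List[Exon]) -> List[Tuple[int, int]]:
    """Return the query intervals lying between consecutive exons."""
    ordered = sorted(exons, key=lambda exon: exon.start)
    return [(a.end + 1, b.start - 1) for a, b in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    sample = (
        "EXON  1    12304    12896  CONFIDENCE: 100 100\n"
        "EXON  2    13046    13286  CONFIDENCE: 100  96\n"
        "EXON  3    13400    13618  CONFIDENCE:  96  96\n"
    )
    for span in intron_spans(parse_exons(sample)):
        print(span)
```

For the three exons of the DNA-cDNA example, the intervals between exons
are 12897-13045 and 13287-13399, matching the gapped regions of the
alignment.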




