
49171 Human Genes/NEW genome annotation/Search

webmaster webmaster at softberry.com
Fri Dec 8 05:20:21 EST 2000


Ab initio annotation of sequences in Human genome draft: 
             (49171 Genes and 282378 exons)

The nucleotide sequence of nearly 90% of the Human genome (3 GB) has been 
determined by the worldwide sequencing community. We annotated these sequences 
by predicting genes with FGENESH, one of the most accurate gene-finding programs 
(at http://www.softberry.com/nucleo.html), and 
annotated the similarity of each exon to the PfamA protein domain database.
The results of this analysis are summarized in Table 1 and can be seen in full 
in the InfoGene database at:
              http://www.softberry.com/inf/infodb.html

where the InfoGene Java viewer can be used to visualize the predictions along
the chromosomes, and the Action menu and Obtain Locus can be used to get the 
prediction data.

A BLAST search against the predicted Human proteins is provided at: 

http://www.softberry.com/scan.html

The exon sequences and gene annotation data can be copied 
for local use or to design microarray oligos: 

>Human genome predicted genes/exons 
>Predicted amino acid sequences of exons with PfamA annotation 

Table 1. Summary of predicted genes and proteins in Human genome sequences

           GENES    EXONS     BASES       MASKED+N  %N %N+M  GENE_PER EXON_PER
   Total:  49171   282378   3374262130   1755813225  19  52    68623    11949
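
The derived columns of Table 1 can be checked from the totals. Below is a
minimal sketch in Python. It assumes that GENE_PER and EXON_PER mean bases per
gene and bases per exon, and that %N+M is MASKED+N as a percentage of BASES.
These column meanings are inferred from the numbers, not stated in the table,
and the exons-per-gene figure is an extra value that is not in Table 1.

```python
# Totals copied from Table 1.
GENES = 49171
EXONS = 282378
BASES = 3374262130
MASKED_PLUS_N = 1755813225


def summarize(genes, exons, bases, masked_plus_n):
    """Return derived densities; the column meanings are assumptions."""
    return {
        # Assumed meaning of GENE_PER: bases per predicted gene (floor).
        "bases_per_gene": bases // genes,
        # Assumed meaning of EXON_PER: bases per predicted exon (floor).
        "bases_per_exon": bases // exons,
        # Assumed meaning of %N+M: masked plus N bases, percent of all bases.
        "percent_masked_plus_n": round(100 * masked_plus_n / bases),
        # Not in Table 1: average number of exons per predicted gene.
        "exons_per_gene": round(exons / genes, 2),
    }


summary = summarize(GENES, EXONS, BASES, MASKED_PLUS_N)
print(summary)
```

Under these assumptions the computed values agree with the GENE_PER, EXON_PER
and %N+M columns of Table 1.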

Predicted genes contain 1154 different types of PfamA domains in total.
   (The same domain found in neighboring exons is counted here once.)

467 pkinase Eukaryotic protein kinase domain
372 7tm_1 7 transmembrane receptor (rhodopsin family)
308 Myc_N_term Myc amino-terminal region
256 Topoisomerase_I Eukaryotic DNA topoisomerase I
224 ig Immunoglobulin domain
183 rrm RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain)
182 PH PH domain
180 Myosin_tail Myosin tail
166 EGF EGF-like domain
159 filament Intermediate filament proteins
154 Syndecan Syndecan domain
143 ras Ras family
138 RNA_pol_A2 RNA polymerase A/beta'/A" subunit
123 BTB BTB/POZ domain
119 Granin Granin (chromogranin or secretogranin)
119 Troponin Troponin
113 Herpes_glycop_D Herpesvirus glycoprotein D
111 homeobox Homeobox domain
110 SH3 SH3 domain
102 trypsin Trypsin
102 helicase_C Helicases conserved C-terminal domain
100 KRAB KRAB box
98 dehydrin Dehydrins
96 ABC_tran ABC transporter
   etc.
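
The rule that the same domain in neighboring exons is counted once can be
illustrated with a small sketch. The function name, the input layout and the
sample data below are hypothetical. This post does not describe how the
domain hits were actually tallied, so this is only one plausible reading
of the rule.

```python
from collections import Counter
from itertools import groupby


def count_domains(genes):
    """Count domain hits, collapsing runs of the same domain in
    consecutive exons of one gene into a single hit.

    `genes` maps a gene id to its per-exon list of domain names,
    with None for an exon that has no domain hit.
    """
    counts = Counter()
    for exon_domains in genes.values():
        # groupby merges adjacent equal entries, so a domain that spans
        # neighboring exons yields one group and is counted once.
        for domain, _run in groupby(exon_domains):
            if domain is not None:
                counts[domain] += 1
    return counts


# Made-up example input, not taken from the InfoGene data.
example = {
    "gene1": ["pkinase", "pkinase", None, "SH3"],
    "gene2": ["ig", None, "ig", "ig"],
}
domain_counts = count_domains(example)
print(domain_counts.most_common())
```

In this reading, a domain that reappears after an exon without a hit is
counted again, as with "ig" in the second made-up gene.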

