Blastp search in DB of Human Predicted Genes and Proteins
------------------------------------------------------------------
We have installed an NCBI gapped BLASTP search against the database of
protein sequences of predicted genes (INFOGENEP) from finished and
unfinished human sequences. It is available at
http://genomic.sanger.ac.uk/db.html
on the Web page of the Computational Genomic Group of the Sanger Centre.
If you find an interesting similarity to your sequence, you can use the
ID of the hit to check the gene structure of that protein in the INFOGENEP DB
and to get the clone name and sequence.
Example:
==============
Sequence T0078 from CASP3 has significant similarity to the protein of
predicted gene GHS005230 from clone dJ337O18 of the Sanger finished sequences.
Query= Query:
(288 letters)
Database: INFOGENE_PREDICTIONS
18,574 sequences; 3,010,387 total letters
Searching..................................................done
                                                        Score     E
Sequences producing significant alignments:            (bits)  Value
ID GHS005230 dJ337O18 human_pg Sanger_finished 319 212 8e-56
ID GHS005240 dJ337O18 human_pg Sanger_finished 477 124 2e-29
>ID GHS005230 dJ337O18 human_pg Sanger_finished 319
Length = 319
Score = 212 bits (533), Expect = 8e-56
Identities = 120/282 (42%), Positives = 167/282 (58%), Gaps = 10/282 (3%)
Query: 12 TLLNLEKIEEGLFRGQSEDLGLRQVFGGQVVGQALYAAKETVPEERLVHSFHSYFLRPGD 71
T+LNLE ++E LFRG+ + +++FGGQ+VGQAL AA ++V E+ VHS H YF+R GD
Sbjct: 30 TVLNLEPLDEDLFRGRHYWVPAKRLFGGQIVGQALVAAAKSVSEDVHVHSLHCYFVRAGD 89
Query: 72 SKKPIIYDVETLRDGNSFSARRVAAIQNGKPIFYMTASF-QAPEAGFEHQKTMPSAPAPD 130
K P++Y VE R G+SFS R V A+Q+GKPIF ASF QA + +HQ +MP+ P P+
Sbjct: 90 PKLPVLYQVERTRTGSSFSVRSVKAVQHGKPIFICQASFQQAQPSPMQHQFSMPTVPPPE 149
Query: 131 G-LPSETQIAQ-----SLAHLLPPVLKDKFICDRPLEVRPVEFHNPLKGHVAEPHRQVWI 184
L ET I Q +L P L + P+E++PV + EP + W+
Sbjct: 150 ELLDCETLIDQYLRDPNLQKRYPLALNRIAAQEVPIEIKPVNPSPLSQLQRMEPKQMFWV 209
Query: 185 RANGSVPD-DLRVHQYLLGYASDLNFLPVALQPHGIGFLEPGIQIATIDHSMWFHRPFNL 243
RA G + + D+++H + Y SD FL AL PH + + ++DHSMWFH PF
Sbjct: 210 RARGYIGEGDMKMHCCVAAYISDYAFLGTALLPH--QWQHKVHFMVSLDHSMWFHAPFRA 267
Query: 244 NEWLLYSVESTSASSARGFVRGEFYTQDGVLVASTVQEGVMR 285
+ W+LY ES A +RG V G + QDGVL + QEGV+R
Sbjct: 268 DHWMLYECESPWAGGSRGLVHGRLWRQDGVLAVTCAQEGVIR 309
-----------------------------------------------
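The hit summary lines in the example carry the ID and clone name needed for the follow-up lookup in the gene structure DB. A minimal Python sketch of pulling those fields out is given here. It assumes each summary line has exactly the eight whitespace-separated fields seen in the example; the field names (`category`, `source`, and so on) and the helper `parse_hit_line` are illustrative guesses, not part of the database's documented format.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    gene_id: str    # e.g. GHS005230
    clone: str      # e.g. dJ337O18
    category: str   # e.g. human_pg (meaning assumed, not documented here)
    source: str     # e.g. Sanger_finished
    length: int     # protein length in residues
    bits: float     # BLAST bit score
    evalue: float   # BLAST E value


def parse_hit_line(line: str) -> Hit:
    """Split one 'ID ...' summary line into named fields (assumed layout)."""
    fields = line.split()
    if len(fields) != 8 or fields[0] != "ID":
        raise ValueError(f"unexpected hit line: {line!r}")
    _, gene_id, clone, category, source, length, bits, evalue = fields
    return Hit(gene_id, clone, category, source,
               int(length), float(bits), float(evalue))


# The two summary lines from the example output above.
lines = [
    "ID GHS005230 dJ337O18 human_pg Sanger_finished 319 212 8e-56",
    "ID GHS005240 dJ337O18 human_pg Sanger_finished 477 124 2e-29",
]
hits = [parse_hit_line(line) for line in lines]

# The most significant hit is the one with the smallest E value.
best = min(hits, key=lambda h: h.evalue)
print(best.gene_id, best.clone)
```

For the example output this selects GHS005230 on clone dJ337O18, the ID and clone name to use when checking the gene structure.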
Currently the DB includes
genes predicted for Sanger finished and unfinished sequences.
It holds about 1500 loci and 18000 protein sequences corresponding to
predicted genes (by the Fgenes and Genscan programs).
Exons predicted by both programs are much more often the real ones.
Known protein and EST similarities are included in the data.
This DB will include all predicted genes and proteins for
the Human genome draft as well as genes and proteins predicted for
other model organisms.
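The remark that exons predicted by both programs are more reliable suggests a simple filter. The Python sketch below assumes that "predicted by both" means identical start and end coordinates on the same clone; the announcement does not say how agreement is actually determined. The function `shared_exons` and all coordinates are invented for illustration only.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end) on the clone, inclusive


def shared_exons(calls_a: List[Exon], calls_b: List[Exon]) -> List[Exon]:
    """Return exons reported with identical coordinates by both gene finders."""
    return sorted(set(calls_a) & set(calls_b))


# Hypothetical predictions from two gene finders for one locus.
calls_a = [(100, 250), (400, 520), (800, 910)]
calls_b = [(100, 250), (410, 520), (800, 910)]

print(shared_exons(calls_a, calls_b))
# prints [(100, 250), (800, 910)]
```

Exact matching is the strictest choice. A looser rule, such as accepting exons that merely overlap, would keep more candidates at the cost of less certain boundaries.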
The database list:
Predicted GENES Structure Database (INFOGENEP Rel 1.)
Nucleotide and Protein sequences of INFOGENEP genes
GENES Structure and Functioning Database (INFOGENE Rel 1.)
Nucleotide and Protein sequences of INFOGENE genes
--
Victor Solovyev
The Sanger Centre, Hinxton, Cambridge CB10 1SA, UK
Email: solovyev at sanger.ac.uk
http://genomic.sanger.ac.uk
Phone: 44-1223-494799 FAX: 44-1223-494919