New BCM Gene-Finder service:
===========================================================================
POLYAH - Recognition of 3'-end cleavage and polyadenylation region
of human mRNA precursors
===========================================================================
Department of Cell Biology, Baylor College of Medicine
Analysis of uncharacterized human sequences is available through the WWW:
http://dot.imgen.bcm.tmc.edu:9331/gene-finder/gf.html
or by sending a file containing your sequence (the sequence format is
described below) to the e-mail services of the University of Houston and,
soon, the Weizmann Institute of Science:
service@bchs.uh.edu or services@bioinformatics.weizmann.ac.il
with the subject line "polyah".
Examples: mail -s polyah service@bchs.uh.edu < test.seq
mail -s polyah services@bioinformatics.weizmann.ac.il < test.seq
where test.seq is a file containing the sequence.
METHOD DESCRIPTION:
The algorithm predicts potential positions of poly-A regions using linear
discriminant functions (LDF) that combine characteristics describing various
contextual features of these sites. The default LDF threshold on the server
is 0.
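The decision rule described above can be illustrated with a minimal Python sketch. The feature values and weights are hypothetical placeholders (the announcement does not list the actual characteristics or coefficients), and treating a score at or above the threshold as a prediction is an assumption. Only the idea of thresholding a linear score, and the default threshold of 0, come from the description.

```python
from typing import Sequence


def ldf_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Linear discriminant score: weighted sum of the feature values."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def is_predicted_site(score: float, threshold: float = 0.0) -> bool:
    """Assumed rule: report a candidate when its score reaches the threshold."""
    return score >= threshold


# Hypothetical weights and feature values for a single AATAAA candidate.
example_weights = [1.5, -0.5, 2.0]
example_features = [1.0, 2.0, 0.25]
example_score = ldf_score(example_features, example_weights)
```

With these made-up numbers the candidate passes the default threshold of 0 but would be rejected at a stricter threshold such as 1.5.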
Accuracy:
The accuracy has been estimated on a set of 131 poly-A regions
and 1466 non-poly-A regions of human genes containing the AATAAA sequence.
At 86% accuracy of poly-A region prediction, the algorithm gives 8% false
predictions (Sp=50%; C=0.62). For example, with a threshold of 0.7 it
predicts 8 of the 9 poly-A sites of the AD2 genome (35937 bp) and
overpredicts 4 false sites. (Compare with the poly-A site prediction method
of CABIOS 1994, 10, 597-603, which gives 968 false positive sites for
8 correctly predicted sites.)
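As a rough consistency check on the figures above, the following Python sketch recomputes Sp and C from the stated set sizes. It assumes that "86% accuracy" is the fraction of the 131 true regions recovered, that "8% false predictions" is the fraction of the 1466 non-poly-A regions reported, that Sp is TP/(TP+FP), and that C is the Matthews correlation coefficient. None of these definitions is spelled out in the announcement.

```python
import math

N_TRUE = 131    # poly-A regions in the test set (from the text)
N_FALSE = 1466  # non-poly-A regions containing AATAAA (from the text)

# Assumed interpretation of the reported rates.
tp = round(0.86 * N_TRUE)    # true regions predicted
fn = N_TRUE - tp             # true regions missed
fp = round(0.08 * N_FALSE)   # non-poly-A regions predicted
tn = N_FALSE - fp            # non-poly-A regions correctly rejected

specificity = tp / (tp + fp)  # "Sp" taken as TP / (TP + FP)
correlation = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"TP={tp} FP={fp} Sp={specificity:.2f} C={correlation:.2f}")
```

Under these assumptions the recomputed values come out close to the reported Sp=50% and C=0.62, which suggests the interpretation is at least plausible.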
SUBMITTING SEQUENCES VIA EMAIL:
For email submission the sequences must have the following format:
Name of your sequence
ccatctctgtcttgcaggacaatgccgtcttctgtctcgtggggcatcctcctgctggca
ggcctgtgctgcctggtccctgtctccctggctgaggatccccagggagatgctgcccag
aagacagatacatcccaccatgatcaggatcacccaaccttcaacaagatcacccccaac
ctggctgagttcgccttcagcctataccgccagctggcacaccagtccaacagcaccaat
atcttcttctccccagtgagcatcg...............
(The line length must be less than 80 letters).
You have to send the file containing the sequence to:
service@theory.bchs.uh.edu
Subject line must be: polyah
Example: mail -s polyah service@bchs.uh.edu < test.seq
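A file in the format described above can be prepared with a short script. The following Python sketch is one possible way to do it; the function name, the default width of 60 letters and the sample sequence are illustrative choices, not part of the service.

```python
def format_submission(name: str, sequence: str, width: int = 60) -> str:
    """Return submission text: a name line followed by wrapped sequence lines."""
    if not 0 < width < 80:
        raise ValueError("line length must be less than 80 letters")
    cleaned = "".join(sequence.split())  # drop spaces and line breaks
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return "\n".join([name] + lines) + "\n"


# Illustrative input only: a short made-up repeat, not a real gene.
example_sequence = "ccatctctgtcttgcaggacaatgccg" * 5
example_text = format_submission("Name of your sequence", example_sequence)
```

Writing `example_text` to a file such as test.seq gives something that can be passed to the mail command shown above.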
POLYAH output:
1st line - name of your sequence; 2nd line - length of your sequence.
Next lines - positions of predicted sites and their 'weights'.
The position is that of the first nucleotide of the AATAAA consensus in
the predicted region.
FOR EXAMPLE:
HSG11C4A 1741 bp DNA PRI 21-FEB
Length of sequence- 1741
1 potential polyA site was predicted
Pos.: 988 LDF- 4.06
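For processing many replies, the output can be parsed with a small script. The Python sketch below is based only on the single example shown above, so the exact spacing and wording of real POLYAH output are assumptions, and the names `PolyahResult` and `parse_polyah_output` are hypothetical.

```python
import re
from typing import List, NamedTuple, Tuple


class PolyahResult(NamedTuple):
    name: str
    length: int
    sites: List[Tuple[int, float]]  # (position, LDF weight)


def parse_polyah_output(text: str) -> PolyahResult:
    """Extract the name, sequence length and predicted sites from a reply."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty output")
    length_match = re.search(r"Length of sequence-\s*(\d+)", text)
    if length_match is None:
        raise ValueError("no 'Length of sequence-' line found")
    # The hyphen after "LDF" is treated as a separator, as in the example.
    site_pattern = r"Pos\.:\s*(\d+)\s+LDF-\s*(-?\d+(?:\.\d+)?)"
    sites = [(int(pos), float(ldf)) for pos, ldf in re.findall(site_pattern, text)]
    return PolyahResult(lines[0], int(length_match.group(1)), sites)


EXAMPLE = """HSG11C4A 1741 bp DNA PRI 21-FEB
Length of sequence- 1741
1 potential polyA site was predicted
Pos.: 988 LDF- 4.06
"""

result = parse_polyah_output(EXAMPLE)
```

Here `result.sites` holds one (position, weight) pair per predicted site, which makes it easy to re-filter predictions at a stricter threshold.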
REFERENCE:
Salamov A.A., Lawrence C.B., Solovyev V.V. Recognition of
3'-end cleavage and polyadenylation region of human mRNA precursors.
(1995) (in preparation).
Questions: solovyev@cmb.bcm.tmc.edu
===============================================================
The other services are
===============================================================
FGENEH - search for gene structure, with exon assembly by dynamic programming
FEXH - search for 5'-, internal and 3'-exons
HEXON - search for internal exons
HSPL - search for splice sites
RNASPL - prediction of exon-exon junctions in cDNA sequences
CDSB - prediction of bacterial coding regions
HBR - recognition of human and bacterial sequences to test a library
for E. coli contamination by sequencing example clones
SSP - prediction of a-helix and b-strand segments in globular proteins
by a segment-oriented approach.
NSSP - prediction of a-helix and b-strand segments in globular proteins
by a nearest-neighbor algorithm.