SPLM - Canonical and Non-canonical splice site prediction in
Human DNA sequences
SPLM is a program for predicting splice sites in human DNA sequences,
developed by Salamov A. and Solovyev V.
It is available at http://www.softberry.com/nucleo.html
It locates potential splice sites using 5 weight matrices for donor
sites and, for acceptor sites, a model that combines dinucleotide
composition with a weight matrix.
The program also predicts potential GC donor sites and non-standard
splice sites such as AT-AC.
The program does not EXCLUDE splice sites that lie close to sites
predicted with higher scores, or to sites on the other chain (strand).
Users can post-process the results using the reported scores. The
program is designed for analyzing ALTERNATIVE splice variants and
NON-CANONICAL splice sites. It reports a much higher number of
overpredicted sites than the SPL program.
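One possible form of such post-processing is sketched below in Python.
It is not part of SPLM. It assumes the predictions have already been
read into (position, score, strand) tuples, and it drops any site that
lies within a chosen distance of a higher-scoring site on the same
strand. The function name suppress_nearby and the min_distance
parameter are hypothetical, and the value 10 is only a placeholder.

```python
def suppress_nearby(sites, min_distance=10):
    """Keep the best-scoring site in each neighbourhood, per strand.

    sites: iterable of (position, score, strand) tuples.
    min_distance: hypothetical cutoff in bases; not an SPLM parameter.
    """
    kept = []
    # Visit sites from highest to lowest score (ties: leftmost first).
    for pos, score, strand in sorted(sites, key=lambda s: (-s[1], s[0])):
        # Only a previously kept site on the same strand can suppress this one.
        clashes = any(
            k_strand == strand and abs(k_pos - pos) < min_distance
            for k_pos, _, k_strand in kept
        )
        if not clashes:
            kept.append((pos, score, strand))
    return sorted(kept)
```

Whether to filter at all depends on the goal: when looking for
alternative splice variants, closely spaced sites may be exactly what
is of interest, so the unfiltered list should be kept as well.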
For a description of the method, see:
Solovyev V.V. (2001) Statistical approaches in Eukaryotic gene
prediction. In Handbook of Statistical genetics (eds. Balding D.
et al.), John Wiley & Sons, Ltd., p. 83-127.
TO RUN LOCALLY: ./splm param sequence
where param is the name of the file with parameters
and sequence is the name of the file with the sequence
Options:
  -d     threshold for donor splice sites (default = 95: -d 95)
  -a     threshold for acceptor splice sites (default = 95: -a 95)
  -dGC   threshold for GC donor splice sites (default = 95: -dGC 95)
  -nc 1  allow search for AT-AC sites (default = 0: -nc 0)
Threshold values range from 1 to 100.
For example, a value of 30 sets the threshold at the level that
detects the 30% highest-scoring sites in the database of all known
splice sites.
A reported score of 20 means that the site scores better than the
bottom 20% of the known sites, ordered by score.
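The percentile reading of a score can be illustrated with a small
Python sketch. It is only an illustration of the arithmetic: how SPLM
computes raw scores internally is not described here, and the names
percentile_score, raw_score and known_scores are hypothetical.

```python
from bisect import bisect_left

def percentile_score(raw_score, known_scores):
    """Percentage of known sites whose raw score is strictly lower.

    raw_score and known_scores are hypothetical inputs used only to
    illustrate the percentile reading; they are not SPLM data.
    """
    ordered = sorted(known_scores)
    below = bisect_left(ordered, raw_score)  # count of strictly lower scores
    return 100.0 * below / len(ordered)
```

With ten known sites whose raw scores are 1 to 10, a site with raw
score 3 beats two of them, so it would be reported as 20. Under this
reading a threshold T would roughly correspond to keeping sites with a
reported score of about 100 - T. That is an inference from the
description above, not a documented rule.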
Example run with default parameters:
splm hum_spl.dat t.seq > t.res
Or, with explicit options:
splm hum_spl.dat t.seq -d 90 -a 90 -dGC 90 -nc 1 > t.res
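For scripted runs, the same command lines can be assembled from
Python. The sketch below is a thin wrapper, not part of SPLM. It
assumes an executable named splm is on the PATH and that options
follow the two file names, as in the examples above. The function
names and keyword arguments are hypothetical; only the flags -d, -a,
-dGC and -nc come from the option list.

```python
import subprocess

def build_splm_command(param_file, sequence_file, donor=95, acceptor=95,
                       gc_donor=95, search_at_ac=False, executable="splm"):
    """Return the argument list for one SPLM run (defaults as documented)."""
    for flag, value in (("-d", donor), ("-a", acceptor), ("-dGC", gc_donor)):
        # Thresholds are documented as values from 1 to 100.
        if not 1 <= value <= 100:
            raise ValueError(f"{flag} must be between 1 and 100, got {value}")
    return [executable, param_file, sequence_file,
            "-d", str(donor), "-a", str(acceptor), "-dGC", str(gc_donor),
            "-nc", "1" if search_at_ac else "0"]

def run_splm(param_file, sequence_file, result_file, **options):
    """Run SPLM and write its standard output to result_file."""
    cmd = build_splm_command(param_file, sequence_file, **options)
    with open(result_file, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)
```

For instance, run_splm("hum_spl.dat", "t.seq", "t.res", donor=90,
acceptor=90, gc_donor=90, search_at_ac=True) would issue the second
command line shown above.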
Example of output:
splm Wed Apr 11 23:16:32 EDT 2001
Prediction of splice sites on Human sequences
Length of sequence 2040
Number of Donor sites: 10 Threshold: 90
1 130 68 - GT
2 463 14 + GT
3 642 26 + GT
4 710 12 + GT
5 845 30 + GT
6 962 55 - GT
7 1024 48 + GT
8 1255 22 + GT
9 1363 42 + GT
10 2029 70 + GT
Number of Acceptor sites: 29 Threshold: 90
1 23 43 - AG
2 131 13 - AG
3 188 13 - AG
4 191 91 - AG
5 314 44 - AG
6 359 14 - AG
7 380 29 - AG
8 446 74 - AG
9 499 14 - AG
10 704 15 - AG
11 805 19 - AG
12 839 39 - AG
13 900 14 - AG
14 925 9 - AC
15 940 26 - AG
16 1065 93 + AG
17 1401 36 + AG
18 1488 80 + AG
19 1542 41 + AG
20 1593 62 + AG
21 1626 49 + AG
22 1637 18 - AG
23 1674 32 + AG
24 1708 41 + AG
25 1786 11 + AG
26 1825 15 + AG
27 1859 84 + AG
28 2003 13 + AG
29 2020 23 - AG
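A report like the one above can be loaded into Python for further
processing. The parser below is a minimal sketch, not part of SPLM.
The column meanings are not stated in this file, so it assumes each
site line holds: running number, position in the sequence, score,
strand (+ or -) and the site dinucleotide. It also assumes integer
scores, as in the sample. The names Site and parse_splm_output are
hypothetical.

```python
import re
from typing import NamedTuple

class Site(NamedTuple):
    kind: str          # "donor" or "acceptor", taken from the section header
    index: int         # running number within the section
    position: int      # assumed: position in the input sequence
    score: int         # assumed: percentile-style score
    strand: str        # "+" or "-"
    dinucleotide: str  # e.g. GT, AG, AC

HEADER = re.compile(r"Number of (Donor|Acceptor) sites:", re.IGNORECASE)
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+([+-])\s+([ACGT]{2})\s*$")

def parse_splm_output(text):
    """Return a list of Site records from the text of an SPLM report."""
    sites = []
    kind = None
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            kind = header.group(1).lower()
            continue
        row = ROW.match(line)
        # Ignore anything that is not a site row inside a known section.
        if row and kind is not None:
            idx, pos, score, strand, dinuc = row.groups()
            sites.append(Site(kind, int(idx), int(pos), int(score), strand, dinuc))
    return sites
```

The resulting records can then be sorted or filtered by score, for
example to list only acceptor sites on the + strand.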
---