Nicotiana tabacum gene-finding parameters for FGENESH
The program, with parameters for many major model organisms,
is available for online use at:
http://www.softberry.com/berry.phtml?topic=gfind
Method description:
A new parameter set for gene prediction in the tobacco genome has been
developed for the FGENESH program. The accuracy of prediction of Nicotiana
tabacum protein-coding genes is about 98% at the nucleotide level. Note that
the Arabidopsis parameters are about 20% less accurate for predicting tobacco
genes.
The FGENESH algorithm is based on pattern recognition of different types of
signals and on Markov chain models of coding regions. The optimal combination
of these features is then found by dynamic programming, and a set of gene
models is constructed along the given sequence.
FGENESH is the fastest and most accurate ab initio gene prediction program
available. It can process whole chromosome sequences.
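The dynamic programming step mentioned above can be illustrated with a toy
example. The sketch below is not the FGENESH algorithm: it ignores reading
frames, splice-site compatibility and the Markov chain scoring, and simply
picks the highest-scoring set of non-overlapping candidate exons. The function
name best_exon_chain and the candidate list are hypothetical; two of the
candidates reuse coordinates and scores from the sample output, and the two
overlapping ones are invented for illustration.

```python
from bisect import bisect_left
from typing import List, Tuple

Exon = Tuple[int, int, float]  # (start, end, score), 1-based inclusive


def best_exon_chain(candidates: List[Exon]) -> Tuple[float, List[Exon]]:
    """Return the best total score and the non-overlapping exons achieving it."""
    exons = sorted(candidates, key=lambda e: e[1])  # order by end position
    ends = [e[1] for e in exons]
    n = len(exons)
    best = [0.0] * (n + 1)   # best[i]: best score using the first i exons
    take = [False] * (n + 1)  # take[i]: whether exon i is used in best[i]
    prev = [0] * (n + 1)      # prev[i]: how many exons end before exon i starts

    for i, (start, _end, score) in enumerate(exons, start=1):
        p = bisect_left(ends, start)  # exons with end < start
        prev[i] = p
        with_exon = best[p] + score
        if with_exon > best[i - 1]:
            best[i] = with_exon
            take[i] = True
        else:
            best[i] = best[i - 1]

    # Trace back from the last exon to recover the chosen chain.
    chain: List[Exon] = []
    i = n
    while i > 0:
        if take[i]:
            chain.append(exons[i - 1])
            i = prev[i]
        else:
            i -= 1
    chain.reverse()
    return best[n], chain


# Hypothetical candidates: the 2nd and 4th entries are invented overlaps.
candidates = [
    (201, 426, 33.05),
    (400, 500, 5.00),
    (608, 684, 19.89),
    (650, 700, 25.00),
]
total, chain = best_exon_chain(candidates)
print(total, chain)
```

A real gene finder adds many constraints to each transition (frame, donor and
acceptor signals, intron length), but the table-filling and traceback pattern
is the same idea.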
FGENESH output:
fgenesh Thu Dec 19 17:05:00 EST 2002
FGENESH 1.1 Prediction of potential genes in Nicotiana_dicot genomic DNA
Time : Thu Dec 19 17:05:00 2002
Seq name: >putrescine N-methyltransferase, NsPMT1
Length of sequence: 2209
Number of predicted genes 1 in +chain 1 in -chain 0
Number of predicted exons 8 in +chain 8 in -chain 0
Positions of predicted genes and exons:
 G Str   Feature   Start        End    Score           ORF          Len
 1 +      TSS         83               -4.38
 1 +    1 CDSf       201 -      426    33.05       201 -      425   225
 1 +    2 CDSi       608 -      684    19.89       610 -      684    75
 1 +    3 CDSi       767 -      994    37.57       767 -      994   228
 1 +    4 CDSi      1100 -     1172    17.59      1100 -     1171    72
 1 +    5 CDSi      1283 -     1354     6.75      1285 -     1353    69
 1 +    6 CDSi      1444 -     1639    24.78      1446 -     1637   192
 1 +    7 CDSi      1802 -     1934    19.43      1803 -     1934   132
 1 +    8 CDSl      2033 -     2089    12.26      2033 -     2089    57
 1 +      PolA      2173               -0.55
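The exon rows of the table above are regular enough to parse with a short
script. The following is a minimal sketch, assuming whitespace-separated
columns in the order shown (gene, strand, exon number, type, start, end,
score); the names Cds, CDS_LINE and parse_cds_lines are hypothetical and not
part of FGENESH. It also checks that the summed exon lengths agree with the
353 aa reported for the predicted protein, assuming the final codon of the
last exon is a stop codon.

```python
import re
from typing import List, NamedTuple


class Cds(NamedTuple):
    gene: int
    strand: str
    number: int
    kind: str   # CDSf (first), CDSi (internal), CDSl (last)
    start: int
    end: int
    score: float


# Matches rows such as "1 + 1 CDSf 201 - 426 33.05 ..."; TSS and PolA rows
# have no exon number in the third column and are skipped.
CDS_LINE = re.compile(
    r"^\s*(\d+)\s+([+-])\s+(\d+)\s+(CDS[a-z])\s+(\d+)\s+-\s+(\d+)\s+(-?\d+(?:\.\d+)?)"
)


def parse_cds_lines(report: str) -> List[Cds]:
    """Extract the CDS rows from the text of an FGENESH-style report."""
    rows = []
    for line in report.splitlines():
        m = CDS_LINE.match(line)
        if m:
            gene, strand, number, kind, start, end, score = m.groups()
            rows.append(Cds(int(gene), strand, int(number), kind,
                            int(start), int(end), float(score)))
    return rows


SAMPLE = """\
1 + TSS 83 -4.38
1 + 1 CDSf 201 - 426 33.05 201 - 425 225
1 + 2 CDSi 608 - 684 19.89 610 - 684 75
1 + 3 CDSi 767 - 994 37.57 767 - 994 228
1 + 4 CDSi 1100 - 1172 17.59 1100 - 1171 72
1 + 5 CDSi 1283 - 1354 6.75 1285 - 1353 69
1 + 6 CDSi 1444 - 1639 24.78 1446 - 1637 192
1 + 7 CDSi 1802 - 1934 19.43 1803 - 1934 132
1 + 8 CDSl 2033 - 2089 12.26 2033 - 2089 57
1 + PolA 2173 -0.55
"""

cds = parse_cds_lines(SAMPLE)
coding_nt = sum(c.end - c.start + 1 for c in cds)  # inclusive coordinates
codons = coding_nt // 3
# Assumption: the last codon is the stop codon, so it adds no residue.
print(len(cds), coding_nt, codons - 1)  # prints: 8 1062 353
```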
Predicted protein(s):
>FGENESH: 1 8 exon (s) 201 - 2089 353 aa, chain +
MEVISTNTNGSTIFKSGAIPMNGHQNGTSKHQNGHKNGTSEEQNGTISHDNGNELLGNSN
CIKPGWFSEFSALWPGEAFSLKVEKLLFQGKSDYQDVMLFESATYGKVLTLDGAIQHTEN
GGFPYTEMIVHLPLGSIPNPKKVLIIGGGIGFTLFEMLRYPTIEKIDIVEIDDVVVDVSR
KFFPYLAANFNDPRVTLVLGDGAAFVKAAQAEYYDAIIVDSSDPIGPAKDLFERPFFEAV
AKALRPGGVVCTQAESIWLHMHIIKQIIANCRQVFKGSVNYAWTTVPTYPTGVIGYMLCS
TEGPEIDFKNPVNPIDKETAQVKSKLAPLKFYNSDIHKAAFILPSFARSMIES
---