karplus at cse.ucsc.edu (Kevin Karplus) writes:
> Determining what transmembrane predictor works best is quite
> difficult, as there is very little solid experimental data determining
> transmembrane regions---most of the transmembrane regions are known
> only as a result of homology to other proteins or by prediction with
> one of the methods. Determining who has the most reliable results is
> difficult when most of the "known answers" are just someone else's
> predictions.
>
> If you use several different transmembrane predictors that use
> different methods, and they all give you the same transmembrane
> helices, then you can be pretty sure of them---the signal is quite strong.
> When some predict transmembrane helices and some do not, then some
> experimental evidence should be obtained before you put too much
> belief in the predictions.
>
> One can reasonably expect that predictors that use multiple alignments
> are more likely to be accurate than predictors that use only a single
> sequence, based on the experience of secondary structure prediction of
> globular proteins. The lack of good large test sets makes even this an
> untested conjecture.
If, instead of counting correctly predicted residues, you count
correct topologies (i.e. the number of TM segments and the location
of the N terminus), the predictions are not that good. The hidden
Markov model based HMMtop and TMHMM seem to be significantly better
than earlier methods. Interestingly enough, TMHMM does not use
multiple sequence information but still works better than programs
that do.
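
To make the difference between the two measures concrete, here is a
minimal sketch. It assumes each protein is annotated with one label
per residue, "i" (inside), "o" (outside) or "M" (membrane), and that
the sequence starts with a non-membrane label. The function names and
the toy strings are made up for illustration, and this topology check
compares only the segment count and the N-terminal side; published
evaluations usually also require the segments to overlap.

```python
from itertools import groupby


def residue_accuracy(observed, predicted):
    """Fraction of positions where the predicted label matches."""
    if len(observed) != len(predicted):
        raise ValueError("label strings must have equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


def topology(labels):
    """Return (number of TM segments, side of the N terminus).

    Assumes labels[0] is "i" or "o", i.e. the chain does not start
    inside the membrane.
    """
    runs = [label for label, _ in groupby(labels)]  # collapse repeats
    return runs.count("M"), labels[0]


def same_topology(observed, predicted):
    """True if segment count and N-terminal side both agree."""
    return topology(observed) == topology(predicted)


# Toy example (invented): the prediction merges two helices into one.
observed = "iiiiMMMMMMooooMMMMMMiiii"
predicted = "iiiiMMMMMMMMMMMMMMMMiiii"

print(residue_accuracy(observed, predicted))  # 20 of 24 residues agree
print(topology(observed), topology(predicted))
print(same_topology(observed, predicted))
```

Here 20 of the 24 residues are labelled correctly, yet the predicted
topology (one segment) is wrong, which is why the per-residue numbers
look better than the topology numbers.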
> (Note: if someone has put together a large, experimentally verified
> set of transmembrane helices, I'd be interested in hearing about it.
> Maybe then I'd try making a transmembrane helix predictor.)
Look in the paper from Krogh et al. (ICMB-1998) or at the hmmtop
program; they have decently large test sets (about 150 proteins).