David Jones (jones at globin.bio.warwick.ac.uk) wrote:
: Robert C Oehmke <oehmke at engin.umich.edu> wrote:
: > Now, given that my program takes around 6 hours to run on a parallel
: > computer, does it seem useful to be able to compare sequences exactly
: > for this range of number of sequences and length of protein, or are
: > the approximation methods good enough to render the extra work
: > useless? Note that the program is not fixed at a size of 6 and 200;
: > the number of sequences could also be adjusted down to vastly increase
: > the length of the protein.
: At the end of the day, no matter whether you use a rigorous MSA
: method or an approximation, the alignments are still not
: going to be "biologically optimal" unless the sequences are closely
: related. The additional rigour obtained from using a full N-way dynamic
: programming method is going to be swamped by the inadequacies of the
: amino acid substitution scoring scheme.
That's exactly what I was about to say :-)
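To put rough numbers on the trade-off between number of sequences and
length mentioned above, here is a back-of-the-envelope sketch in Python.
It assumes a plain N-way dynamic programming lattice with one cell per
combination of positions, i.e. (L+1)^N cells for N sequences of length L,
and 2^N - 1 possible moves into each cell. The function names are made up
for illustration, and the actual program may well prune the lattice, so
treat this only as an upper bound on the work.

```python
def nway_dp_cells(num_seqs, length):
    """Cells in a full N-way DP lattice for num_seqs sequences of equal length."""
    return (length + 1) ** num_seqs


def nway_dp_moves_per_cell(num_seqs):
    """Predecessor cells examined per lattice cell (every non-empty subset of sequences advances)."""
    return 2 ** num_seqs - 1


def max_length_for_budget(num_seqs, cell_budget):
    """Largest sequence length whose full lattice fits in cell_budget cells (binary search)."""
    lo, hi = 0, cell_budget
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if (mid + 1) ** num_seqs <= cell_budget:
            lo = mid
        else:
            hi = mid - 1
    return lo


if __name__ == "__main__":
    # Budget taken from the quoted case: 6 sequences of length 200.
    budget = nway_dp_cells(6, 200)
    print("cells for 6 x 200:", budget)
    for n in (6, 5, 4, 3, 2):
        print(n, "sequences -> max length", max_length_for_budget(n, budget),
              "with", nway_dp_moves_per_cell(n), "moves per cell")
```

Under these assumptions the 6 x 200 case is on the order of 6.6e13 cells,
and dropping to 3 sequences buys a length of about 40,000 for the same
number of cells.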
Further to that, what is generally required from a sequence alignment
is that it extrapolates to a structurally meaningful alignment, i.e.
when the sequences are aligned, the aligned positions map to equivalent
positions in the 3D structure. Now there are certainly cases where
a simple 2-sequence alignment which looks obviously "correct"
in sequence terms does not generate a correct structural alignment.
A case of this occurred in the CASP2 homology modelling contest.
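To make the point about "correct in sequence terms" concrete, here is a
minimal sketch of pairwise global alignment by dynamic programming
(Needleman-Wunsch) in Python. The match/mismatch scores and the linear
gap penalty are toy values chosen for illustration, not a real amino acid
substitution matrix. Whatever this returns is optimal only with respect
to that scoring scheme; nothing in it knows about the 3D structure.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = needleman_wunsch("ACGT", "AGT")
    print(s)
    print(x)
    print(y)
```

Changing the scoring parameters can change which alignment comes out as
"optimal", which is the sense in which the scoring scheme, rather than
the rigour of the search, tends to be the limiting factor.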
Dr. Andrew C.R. Martin (at home)
Inpharmatica and UCL