Robert C Oehmke <oehmke at engin.umich.edu> wrote:
> Now, given that my program takes around 6 hours to run on
> a parallel computer, does it seem useful to be able to compare
> sequences exactly for this range of number of sequences and length of
> protein, or are the approximation methods good enough to render the extra
> work useless? Note that the program is not fixed at a size of 6 and 200;
> the number of sequences could also be adjusted down to vastly increase the
> length of the protein.
Whether you use a rigorous MSA method or an approximation, the
alignments are still not going to be "biologically optimal" unless the
sequences are closely related. The additional rigour obtained from using
a full N-way dynamic programming method is going to be swamped by the
inadequacies of the amino acid substitution scoring scheme.

The bottom line is that both methods are going to produce alignments which
are not biologically correct - so why not use the faster approximate method?
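
To put rough numbers on that trade-off, here is a back-of-the-envelope
sketch in Python. It assumes all sequences have the same length, that the
exact method fills the whole N-dimensional lattice with no pruning, and
that the approximate method is built on one ordinary 2-D alignment per
pair of sequences. The function names are made up for illustration and
are not taken from any particular MSA program.

```python
def exact_msa_cells(n_seqs: int, length: int) -> int:
    """Cells in the full N-dimensional dynamic programming lattice."""
    return (length + 1) ** n_seqs


def exact_msa_steps(n_seqs: int, length: int) -> int:
    """Cells times the 2**N - 1 predecessor moves examined per cell."""
    return exact_msa_cells(n_seqs, length) * (2 ** n_seqs - 1)


def all_pairs_cells(n_seqs: int, length: int) -> int:
    """Cells filled when every pair is aligned once with 2-D DP."""
    n_pairs = n_seqs * (n_seqs - 1) // 2
    return n_pairs * (length + 1) ** 2


if __name__ == "__main__":
    n, length = 6, 200  # the sizes mentioned in the question
    print("exact lattice cells:", exact_msa_cells(n, length))
    print("exact DP steps     :", exact_msa_steps(n, length))
    print("all-pairs 2-D cells:", all_pairs_cells(n, length))
```

For 6 sequences of length 200 this gives about 6.6e13 lattice cells for
the exact method against roughly 6e5 cells for the pairwise route, which
is why the extra rigour has to buy a lot to be worth the run time.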

On a more practical note, we now face situations where we have
to produce good multiple alignments for hundreds or even thousands of
sequences. Even the faster approximate MSA programs take a long time to
align this many sequences.
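
A quick sketch of why large families are slow even for the approximate
programs, assuming a progressive-style method that first compares every
pair of sequences (an assumption for illustration; individual programs
differ). The function name is hypothetical.

```python
def pairwise_alignment_count(n_seqs: int) -> int:
    """Number of distinct sequence pairs, N * (N - 1) / 2."""
    return n_seqs * (n_seqs - 1) // 2


if __name__ == "__main__":
    # The pair count grows quadratically with the number of sequences.
    for n in (6, 100, 1000):
        print(n, "sequences ->", pairwise_alignment_count(n), "pairs")
```

Going from 6 sequences to 1000 takes the number of pairwise comparisons
from 15 to 499500, before any of the actual multiple alignment is built.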
>---------------------------------------------------------------------------<
This message was written, produced and executively directed by Dr David Jones
Address: Dept. of Biological | Email: jones at globin.bio.warwick.ac.uk
Sciences, University of Warwick, | Tel: +44 1203 523729
Coventry CV4 7AL, U.K. | Fax: +44 1203 523568