wick at netcom.com wrote:
Isn't it true that when you follow a procedure like this you bias the
consensus in favor of the ones first entered?
That is exactly what I'm trying to determine. What do you think?
LINEUP constantly updates the consensus as I add a new sequence, so
I feel that the work I have done so far (7 full-lengths aligned with
PILEUP + 23 partials added in LINEUP) should accurately represent the
consensus. The problem I see is that residues that are just barely a
consensus might be misrepresented if I make a second-order consensus
(i.e. a consensus of consensuses). For example, consider where 50/100
residues in a column are "X". If I do 10 smaller alignments of 10 sequences
each, it is possible that 6 of the first-order consensuses are not X, because
they may have 4/10 X at that position. The other 26 X's may be distributed
among the other 4 smaller consensuses, but the end result is that the
consensus will not be X at that position in the final consensus! This
does not even address the problem that, because the partial sequences are
scattered all over the place, an inherent bias could arise from a
first-order consensus that weighs only a single sequence at a given position
being compared equally with a consensus that weighs, say, 10 sequences at
that same position. Still with me? Anyway, I have realized there are some flaws in
this approach and feel that I should do one of the following:
1- Find a program that can handle the number of sequences I want
2- Modify LINEUP to do the same.
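
The 50/100 example above can be checked with a small Python sketch. It is only an illustration under stated assumptions: the residue letters A and B, the way the X's are laid out across the ten groups, and the simple plurality rule in consensus() are all made up for the example, and none of this is meant to reflect how PILEUP or LINEUP actually computes a consensus.

```python
from collections import Counter

def consensus(residues):
    """Plurality rule (an assumption): return the most common residue."""
    return Counter(residues).most_common(1)[0][0]

# Hypothetical column of 100 residues, split into 10 groups of 10.
# Six groups carry 4 X's each (24 X's); X loses to A in each of them.
groups = [list("XXXXAAAAAA") for _ in range(6)]
# The other 26 X's are spread over the remaining four groups.
groups += [list("XXXXXXXBBB"), list("XXXXXXXBBB"),
           list("XXXXXXBBBB"), list("XXXXXXBBBB")]

# Consensus over all 100 residues at once.
column = [r for g in groups for r in g]
direct = consensus(column)

# Consensus of the ten first-order consensuses.
first_order = [consensus(g) for g in groups]
second_order = consensus(first_order)

# Keeping the per-group counts and summing them avoids the problem,
# because each group is then weighted by how many residues it holds.
pooled_counts = sum((Counter(g) for g in groups), Counter())
pooled = pooled_counts.most_common(1)[0][0]

print("direct:", direct)
print("first-order:", "".join(first_order))
print("second-order:", second_order)
print("pooled counts:", pooled)
```

With this layout the direct consensus and the pooled-count consensus are both X, while the consensus of consensuses is A. Carrying counts forward rather than single consensus letters would also address the weighting bias between a group with one sequence at a position and a group with ten.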
brett at borcim.wustl.edu