Hmm, Michael Baron (BARON at AVRI.AFRC.AC.UK) has already tried all the
alignment display programs that I would have recommended:
-- The prettyprint option within SeqApp
None of the programs is to his liking; they all have faults. This
is not, IMHO, uncommon among people who do protein alignments. I have
yet to find a common method of doing protein consensus that satisfies
everyone. I can't speak for the other programs, but the statement:
>I HAVE ALREADY TRIED the GCG program Prettybox, but found that there
>are still some bugs in it, in that it makes some pretty strange
>decisions as to what is the consensus, and doesn't respond properly to
>the PLURALITY setting (which *should* allow one to say that one wants
>at least X residues to agree before there is a consensus).
shows a lack of understanding of how PrettyBox works, which is perhaps
not surprising since I've never written documentation for it. PrettyBox
does a consensus by 'voting' among the amino acids. The 'votes' are
the scores as defined by the file 'prettypep.cmp'; said file can be
changed if one doesn't like the scores. Whichever amino acid gathers
the most votes is considered to be the consensus amino acid.
Because of the default scores, this can sometimes lead to what some
people would consider strange results. An example:
Given 5 aligned amino acids: Y Y Y W W
What is the consensus? Some people would say 'Y' because 3 of the 5
amino acids are 'Y'.
However, if you look at the scoring table and tally up the votes, you
will find that 'F' gathers the most. Why?
Y vs. Y has a score of 1.5
Y vs. W has a score of 1.1
Y vs. F has a score of 1.4
W vs. F has a score of 1.3
W vs. W has a score of 1.5
Looking at how the aligned AAs vote:
The 3 'Y's give 3 times 1.5 or 4.5 votes to a 'Y' consensus
The 3 'Y's give 3 times 1.1 or 3.3 votes to a 'W' consensus
The 3 'Y's give 3 times 1.4 or 4.2 votes to a 'F' consensus
The 2 'W's give 2 times 1.5 or 3.0 votes to a 'W' consensus
The 2 'W's give 2 times 1.1 or 2.2 votes to a 'Y' consensus
The 2 'W's give 2 times 1.3 or 2.6 votes to a 'F' consensus
Totaling everything up:
A 'Y' consensus receives 4.5 + 2.2 or 6.7 votes
A 'W' consensus receives 3.3 + 3.0 or 6.3 votes
A 'F' consensus receives 4.2 + 2.6 or 6.8 votes
So the 'F' consensus 'wins' and PrettyBox will shade all 5 aligned
amino acids as 'similar' but not 'identical'.
More complex examples can be created but the process is the same.
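For anyone who wants to check the arithmetic, the tally above can be
reproduced with a few lines of Python. This is only a sketch of the
voting idea as described here, not PrettyBox's actual code; the names
SCORES, score and tally are made up for illustration, and only the five
scores quoted above are included (the real 'prettypep.cmp' table covers
all amino acid pairs).

```python
# Pairwise scores quoted in the post. The table is treated as symmetric,
# so each pair is stored once and looked up in either order.
SCORES = {
    ("Y", "Y"): 1.5,
    ("Y", "W"): 1.1,
    ("Y", "F"): 1.4,
    ("W", "F"): 1.3,
    ("W", "W"): 1.5,
}


def score(a, b):
    """Return the score for a pair of residues, in either order."""
    if (a, b) in SCORES:
        return SCORES[(a, b)]
    return SCORES[(b, a)]


def tally(column, candidates):
    """Sum the votes that every residue in the column gives each candidate."""
    return {c: round(sum(score(res, c) for res in column), 1)
            for c in candidates}


column = ["Y", "Y", "Y", "W", "W"]
votes = tally(column, ["Y", "W", "F"])
winner = max(votes, key=votes.get)
print(votes)   # vote totals per candidate consensus
print(winner)  # candidate with the most votes
```

Note that the candidate list includes 'F' even though no 'F' appears in
the column; that is what lets a residue absent from the alignment win.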
Aside from changing the score data file, there are a couple
of command line switches that can modify the scores. '/threshold'
will keep low scoring amino acids from voting. '/plurality'
will only consider consensuses that gather a minimum number of votes (note
that this is not the same as saying 'X residues must agree', just that
'X number of votes must be gathered'). '/simplify' can sometimes be
useful by making similar amino acids act the same.
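To make the '/plurality' distinction concrete, here is a small Python
sketch that applies a vote cutoff to the totals from the Y Y Y W W
example. The function name and the use of '>=' for the comparison are
assumptions for illustration; the exact rule inside PrettyBox may
differ.

```python
def consensus_with_plurality(votes, plurality):
    """Return the top-voted residue, or None if it has too few votes.

    `votes` maps each candidate residue to its vote total; `plurality`
    is a minimum vote total, not a count of agreeing residues.
    """
    best = max(votes, key=votes.get)
    return best if votes[best] >= plurality else None


# Vote totals from the Y Y Y W W example.
votes = {"Y": 6.7, "W": 6.3, "F": 6.8}

low_cutoff = consensus_with_plurality(votes, 6.5)
high_cutoff = consensus_with_plurality(votes, 7.0)
print(low_cutoff)   # 'F' clears a cutoff of 6.5 votes
print(high_cutoff)  # nothing reaches 7.0 votes, so no consensus
```

The cutoff is compared against vote totals, so the fact that 3 of the 5
residues are 'Y' never enters into it.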
I am, slowly, working on another version of PrettyBox which will have
even more command line switches that enable even finer control of the
consensus algorithm. But even then, I suspect, not everyone will be