Hello,
I have obtained sequences from a number of different parasite
populations, some from the same species and some from a different
"strain/species". The sequences consist of a series of 23-base repeats
which can differ from each other at 1-6 bases. The order in which the
repeats occur and the total number of repeats also vary between some
populations. A number of populations have "unique" repeats, but these
may differ by just one base from others. A number of repeat types
occur only in a given "strain/species". The overall level of variation
is not high. I would like to try to analyse these sequences using
distance/phylogenetic techniques to see if, at least, the different
strains/species can be grouped together.
Is this possible using this type of data? I am not certain how to
align the sequences. Even though one repeat could be changed to another
type through one mutation, I suspect that variation in the order and
number of repeats is due to unequal crossing over. Treating the repeats
as elements may also introduce error, as two different repeats
differing in only one nucleotide would be treated as two features,
where only one difference actually exists.
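To make the concern above concrete, here is a rough Python sketch of one
way I could imagine scoring two repeat arrays so that near-identical
repeats are not counted as wholly different features. It is only an
illustration: the function names, the gap cost of 1.0 per gained or lost
repeat, and the scaling of the substitution cost by repeat length are my
own assumptions, not an established method. It is an ordinary edit
distance and does not explicitly model unequal crossing over.

```python
def split_repeats(seq, repeat_len=23):
    """Cut a sequence into consecutive repeat units (assumes exact tiling)."""
    if len(seq) % repeat_len:
        raise ValueError("sequence length is not a multiple of repeat_len")
    return [seq[i:i + repeat_len] for i in range(0, len(seq), repeat_len)]


def hamming(a, b):
    """Number of positions at which two equal-length repeats differ."""
    if len(a) != len(b):
        raise ValueError("repeats must have equal length")
    return sum(x != y for x, y in zip(a, b))


def repeat_array_distance(s, t, repeat_len=23, gap_cost=1.0):
    """Edit distance between two lists of repeats.

    Replacing one repeat by another costs (mismatches / repeat_len), so a
    one-base difference costs 1/23 rather than a full unit. Gaining or
    losing a whole repeat costs gap_cost (an arbitrary assumption).
    """
    prev = [j * gap_cost for j in range(len(t) + 1)]
    for i, rs in enumerate(s, 1):
        curr = [i * gap_cost]
        for j, rt in enumerate(t, 1):
            substitute = prev[j - 1] + hamming(rs, rt) / repeat_len
            delete = prev[j] + gap_cost
            insert = curr[j - 1] + gap_cost
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]
```

Computed for every pair of populations, such a score would give a
distance matrix that could in principle be passed to a distance-based
clustering method such as neighbour joining, though whether the chosen
costs are biologically sensible is exactly what I am unsure about.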
I would be very grateful for any help.
Thanking you in advance.