I'm looking for an elegant (failing that, feasible) method to map genomic
sequences to protein sequences. The net result should look something like this:
Protein: ****** *********** ******
Where the "+" are exon codons, "-" intron codons, and "*" amino-acids.
Naturally, the data can appear as a mapping table, or anything else
which is parseable.
Given a protein sequence (e.g. from SwissProt), how do I:
1) Trace the genomic sequence? Problem: usually I cannot find it; rather,
I come up with a cDNA. I do not actually need the intron information,
but I do need to know where an exon begins and ends on the cDNA.
2) Map the above information automatically to my protein sequence?
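
To make point 2 concrete, here is a minimal Python sketch of the bookkeeping I have in mind. It is only an illustration under stated assumptions: the exon boundaries are already known as 1-based, inclusive cDNA coordinates, the cDNA position of the first base of the start codon is known, and the protein is translated from the cDNA without gaps. The function name and its inputs are hypothetical and not part of GCG or any database tool.

```python
def map_exons_to_protein(exons, cds_start, protein_len):
    """Map exon boundaries on a cDNA to residue ranges on the protein.

    exons       -- list of (start, end) cDNA coordinates, 1-based inclusive,
                   in 5'->3' order (assumed to be known already)
    cds_start   -- 1-based cDNA position of the first base of the start codon
    protein_len -- number of amino acids (stop codon not counted)

    Returns a list of (exon_number, first_residue, last_residue), 1-based.
    Exons lying entirely in the UTRs are skipped.
    """
    cds_end = cds_start + 3 * protein_len - 1  # last coding base on the cDNA
    mapping = []
    for number, (start, end) in enumerate(exons, 1):
        # Clip the exon to the coding region.
        lo = max(start, cds_start)
        hi = min(end, cds_end)
        if lo > hi:
            continue  # exon is purely non-coding
        # Integer division by 3 converts a coding-base offset to a codon index.
        first_residue = (lo - cds_start) // 3 + 1
        last_residue = (hi - cds_start) // 3 + 1
        mapping.append((number, first_residue, last_residue))
    return mapping


if __name__ == "__main__":
    # Made-up example: three exons, CDS starting at cDNA base 21, 50 residues.
    for number, first, last in map_exons_to_protein(
        [(1, 50), (51, 120), (121, 200)], cds_start=21, protein_len=50
    ):
        print(f"exon {number}: residues {first}-{last}")
```

When an exon boundary falls inside a codon, the same residue number appears as the last residue of one exon and the first residue of the next; in the made-up example, residue 34 is shared by exons 2 and 3.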
I have GCG v9.0, and WWW access. My platform is a Silicon-Graphics Indy.
Any comment would be useful.
email: idoerg at cc.huji.ac.il
More info: finger idoerg at cc.huji.ac.il