Hi,
I'm looking for an elegant (failing that, feasible) method to map
genomic sequences to protein sequences. The net result should look
something like this:
Genomic: -------++++++-----------+++++++++++-----++++++
Protein:        ******           ***********     ******
Here "+" marks exon positions, "-" marks intron positions, and "*" marks amino acids.
Naturally, the data can appear as a mapping table, or anything else
which is parseable.
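To make the desired output concrete, here is a minimal Python sketch that draws the two tracks from a list of exon intervals. The function name render_tracks and the coordinates are made up for illustration (chosen to reproduce the diagram above), and the picture is schematic: one column per symbol, not a real base-to-residue scale.

```python
def render_tracks(genomic_len, exons):
    """Draw schematic genomic and protein tracks.

    exons: list of (start, end) intervals, 0-based and half-open,
    given in genomic coordinates (hypothetical input format).
    """
    genomic = ["-"] * genomic_len   # default: intron position
    protein = [" "] * genomic_len   # default: nothing translated
    for start, end in exons:
        for i in range(start, end):
            genomic[i] = "+"        # exon position
            protein[i] = "*"        # amino acid drawn under the exon
    return "".join(genomic), "".join(protein)


# Hypothetical exon coordinates matching the diagram in this post.
exons = [(7, 13), (24, 35), (40, 46)]
g, p = render_tracks(46, exons)
print("Genomic: " + g)
print("Protein: " + p)
```

A table of (exon number, start, end) rows would serve just as well; the point is only that the exon boundaries are explicit and machine-readable.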
Given a protein sequence (e.g. from SwissProt), how do I:
1) Trace the genomic sequence? Problem: usually I cannot find it;
instead I come up with a cDNA. I do not actually need the intron
information, but I do need to know where each exon begins and ends on
the cDNA sequence.
2) Map the above information automatically to my protein sequence?
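For point 2, assuming the exon boundaries on the cDNA and the position of the start codon were already known (which is exactly what point 1 asks for), the mapping I have in mind could be sketched roughly as below. The function name, the argument layout, and the example coordinates are all hypothetical; this is only meant to show the kind of table I am after, not a finished tool.

```python
def map_residues_to_exons(protein_len, cds_start, exon_bounds):
    """Build a residue-to-exon mapping table on cDNA coordinates.

    protein_len: number of amino acids in the protein.
    cds_start:   0-based cDNA index of the first base of the start codon.
    exon_bounds: list of (start, end) intervals on the cDNA, 0-based and
                 half-open, sorted and contiguous (assumed input format).

    Returns a list of (residue_number, codon_start, exon_numbers), where
    exon_numbers is a tuple of 1-based exon numbers the codon overlaps.
    """
    def exon_of(pos):
        for number, (start, end) in enumerate(exon_bounds, start=1):
            if start <= pos < end:
                return number
        raise ValueError("cDNA position %d is outside all exons" % pos)

    table = []
    for r in range(protein_len):
        codon_start = cds_start + 3 * r
        # A codon may straddle an exon-exon junction, so check all 3 bases.
        exon_numbers = sorted({exon_of(codon_start + k) for k in range(3)})
        table.append((r + 1, codon_start, tuple(exon_numbers)))
    return table


# Hypothetical example: three 10-base exons, coding region starts at base 2.
table = map_residues_to_exons(4, 2, [(0, 10), (10, 20), (20, 30)])
for residue, codon_start, exon_numbers in table:
    print(residue, codon_start, exon_numbers)
```

In this toy example residue 3 is flagged as spanning exons 1 and 2, because its codon straddles the junction at cDNA position 10.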
I have GCG v9.0 and WWW access. My platform is a Silicon Graphics Indy.
Any comments would be useful.
Many thanks,
Iddo
--
Iddo Friedberg
Phone: (972)-2-6758647
email: idoerg at cc.huji.ac.il
web: http://www.ls.huji.ac.il/~idoerg
More info: finger idoerg at cc.huji.ac.il