NEW: RNA_MAP (RNA/EST mapping program), EST-set mapping, and OLIGO_MAP
The new program RNA_MAP belongs to a group of programs (which also
includes ESTS_MAP, OLIGO_MAP and DBSAN) devoted to comparisons with
long genomic sequences of 20-300 MB.
It is available at:
http://www.softberry.com/scan.html
RNA_MAP is a fast algorithm that accurately maps an mRNA sequence to a
genomic sequence, taking into account the splice sites that flank
intron sequences. Mapping a 300 bp mRNA to 52 MB of unmasked Y
chromosome takes about 19 sec; for a 7300 bp mRNA the time is about
47 sec (both strands checked, on one 500 MHz DEC Alpha processor).
EST_SMAP maps a whole set of sequences to a chromosome sequence. For
example, 11000 full mRNA sequences from the NCBI reference set are
mapped to 52 MB of unmasked Y chromosome in 18-25 min (depending on
computer memory size).
OLIGO_MAP is designed to map a set of oligonucleotides used for
microarray production. The program maps 300000 oligos 25-30 bp long to
49 MB of unmasked chromosome 22 in 8 min. It is useful for checking
the location of oligos and their uniqueness in the genome.
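
To illustrate what checking oligo location and uniqueness involves,
here is a minimal Python sketch. It is not OLIGO_MAP's algorithm: it
uses plain exact string search on both strands, which would be far too
slow for 300000 oligos on a whole chromosome, and a production tool
would presumably use an index and may tolerate mismatches. The function
names find_exact_hits and classify_oligos are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_exact_hits(genome, oligo):
    """Return (strand, 0-based start) for every exact occurrence of oligo.

    Assumption: exact matching only; both strands are searched by also
    looking for the reverse complement of the oligo in the forward genome.
    """
    genome = genome.upper()
    forward = oligo.upper()
    reverse = reverse_complement(forward)
    queries = [("+", forward)]
    if reverse != forward:  # a palindromic oligo would otherwise be counted twice
        queries.append(("-", reverse))
    hits = []
    for strand, query in queries:
        pos = genome.find(query)
        while pos != -1:
            hits.append((strand, pos))
            pos = genome.find(query, pos + 1)  # step by 1 to allow overlapping hits
    return hits


def classify_oligos(genome, oligos):
    """Label each named oligo as 'absent', 'unique' or 'repeated'."""
    labels = {}
    for name, seq in oligos.items():
        count = len(find_exact_hits(genome, seq))
        labels[name] = "absent" if count == 0 else "unique" if count == 1 else "repeated"
    return labels
```

For example, classify_oligos("AAAAACCGTAAAAA", {"a": "CCGTA", "b": "AAAAA"})
labels "a" as unique and "b" as repeated.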
Example of RNA_MAP output:
Sequence hsNM_005405 RefSeq human
[D] Sequence: 0, S: 1040, chrY
1 ----------(..)----------AAATCATCCACTTTCCCGAGAATCTAGGGATTATGC
1 ggtagctcag(..)atgcccacagAAATCATCCACTTTCCCGAGAATCTAGGGATTATGC
37 TCCACTGTCTAGAGACTATGCATACCATGATTATGGTCCTTCTAGTTGGGATCAACATTT
26343643 TCCACTGTCTAGAGACTATGCATACCATGATTATGGTCATTCTAGTTGGGATGAACATTT
97 CTCTAGAGGATATAG----------(..)----------TGATTGTGATGGCTGTGGTGA
26343703 CTCTAGAGGATATAGgtattacaac(..)ttcaatttagTGATTGTGATGGCTGTGGTGA
133 GGTGATGTTAGAGATCATTCTGAACGTCCAAGTGGAAGTTCTTATAGAGATGCATTTCAG
26343821 GGTGATGTTAGAGATCATTCTGAACGTCCAAGTGGAAGTTCTTATAGAGATGCATTTCAG
193 AGATAGG----------(..)----------GAACCTCTCATGGTGCACCATCTGCAGGA
26343881 AGATAGGgtaagggtcc(..)tcccctgcagGGACCTCTCATGGTGCACCATCTGCAGGA
229 GTGCCTCTGTTGTCTTATGGNGGAAGCAGCCACCATGATTATAGCAATAAATGAGATAGA
26344341 GTGCCTCTGTTGTCTTATGGTGGAAGCAGCCACCATGATTATAGCAATAAATGAGATAGA
289 TATGGCAT----------(..)----------
26344401 TATGGCATaagtcgggag(..)nnnnnnnnnn
[R] Sequence: 0, S: 1040, chrY
1 ----------(..)----------ATGCCATATCTATCTCATTTATTGCTATAATCATGG
1 ggtagctcag(..)ctcccgacttATGCCATATCTATCTCATTTATTGCTATAATCATGG
37 TGGCTGCTTCCNCCATAAGACAACAGAGGCACTCCTGCAGATGGTGCACCATGAGAGGTT
21018059 TGGCTGCTTCCACCATAAGACAACAGAGGCACTCCTGCAGATGGTGCACCATGAGAGGTC
97 C----------(..)----------CCTATCTCTGAAATGCATCTCTATAAGAACTTCCA
21018119 Cctgcagggga(..)ggacccttacCCTATCTCTGAAATGCATCTCTATAAGAACTTCCA
133 CTTGGACGTTCAGAATGATCTCTAACATCACCTCACCACAGCCATCACAATCA-------
21018576 CTTGGACGTTCAGAATGATCTCTAACATCACCTCACCACAGCCATCACAATCActaaatt
186 ---(..)----------CTATATCCTCTAGAGAAATGTTGATCCCAACTAGAAGGACCAT
21018636 gaa(..)gttgtaatacCTATATCCTCTAGAGAAATGTTCATCCCAACTAGAATGACCAT
229 AATCATGGTATGCATAGTCTCTAGACAGTGGAGCATAATCCCTAGATTCTCGGGAAAGTG
21018754 AATCATGGTATGCATAGTCTCTAGACAGTGGAGCATAATCCCTAGATTCTCGGGAAAGTG
289 GATGATTT----------(..)----------
21018814 GATGATTTctgtgggcat(..)nnnnnnnnnn
---