In article <CDEpLM.7vs at usenet.ucs.indiana.edu>
Will Fischer, wfischer at bio.indiana.edu writes:
>I need to extract given pieces of sequence from a set of EMBL/GenBank
>entries, using ranges defined in the features table. For example, I'd
>like to be able to extract, from a set of entries, the DNA sequence for
>each exon, or again for every complete CDS feature (all exons
>Surely not everyone does this manually?
>What software exists that can actually parse the (eminently parsible)
>joint features table format? Please post reviews of programs you have
>used, or mail me directly and I will summarize.
I won't claim that my software will do what you have requested exactly,
but if you have access to a Macintosh and HyperCard 2.x, it might be
worth checking out. In its EMBL/GenBank format -> simple string format
conversion, if it finds a features table, it will offer the user the
option of also exporting a spreadsheet-style features table. This table can
then be imported to create "smart" graphical gene (or feature) maps.
For example, I have created maps for 11 different mtDNA and 3 chloroplast
genomes, which are distributed with my stacks. I call them "smart"
because you can zoom in on the map to display a portion of the entire
map and you can "click extract" one or more sequences for a particular
feature represented on the map (or select them from a scrolling list
of features). For example, you can simultaneously export 9 metazoan
CO III sequences from the metazoan mtDNA gene map card in the stack,
with or without translation to amino acid sequences, each to a file and
also directly to my manual sequence aligner stack if you wish. There is no
facility to assemble exons because I have mostly been using this for my own
purposes related to animal mtDNA, which lacks introns (but adding one
wouldn't be a hard modification). There is a published description of an earlier
version of this software in CABIOS 8:177-184 (1992), and it is
available on the iubio archives (ftp.bio.indiana.edu) in the /molbio/mac
directory as something like "DNAstacks_1xx.hqx" (the current released
version is 1.0m7 but I'm not sure I uploaded it to iubio yet -- you can
get it at um.cc.umich.edu in /gdef as "dnastack.hqx").
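
For anyone who wants to script the range extraction outside HyperCard,
here is a minimal sketch in Python (not the HyperTalk used in the stacks).
It assumes the sequence is already a plain string and that locations use
the feature table's 1-based, inclusive "a..b", join(...) and complement(...)
forms. The function names are hypothetical, and more exotic location
syntax (references to other entries, "a^b" sites) is not handled.

```python
import re

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            current.append(ch)
    parts.append("".join(current))
    return parts


def extract(seq, location):
    """Return the part of seq named by a feature-table location string."""
    location = location.strip()
    if location.startswith("complement(") and location.endswith(")"):
        inner = location[len("complement("):-1]
        return reverse_complement(extract(seq, inner))
    if location.startswith("join(") and location.endswith(")"):
        inner = location[len("join("):-1]
        # Assembling exons is just concatenating the pieces in order.
        return "".join(extract(seq, part) for part in split_top_level(inner))
    # Plain range or single base; "<" and ">" (partial ends) are ignored.
    match = re.fullmatch(r"[<>]?(\d+)(?:\.\.[<>]?(\d+))?", location)
    if match is None:
        raise ValueError("unsupported location: " + location)
    start = int(match.group(1))
    end = int(match.group(2) or start)
    return seq[start - 1:end]  # convert 1-based inclusive to a Python slice
```

With this, extract(seq, "join(1..3,10..12)") would return the two exons
spliced together, which is the piece the stack itself does not do.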
Please don't expect this parsing feature to perform miracles as this is
just something that I have done on the side which hasn't had very many
users besides myself. Let me know if you create new maps that may be
of general interest. Also, email me for an explanation of how to modify
which features get ignored in the exported features table.
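
To give a rough idea of what a spreadsheet-style features export with an
ignore list could look like, here is a small Python sketch. It is not the
stack's own code: the function names, the tab-separated output, and the
default of ignoring "source" features are all assumptions made for
illustration. It relies on the column layout that EMBL ("FT" lines) and
GenBank feature tables share: key in columns 6-20, location from column 22.

```python
def feature_rows(lines, ignore=frozenset({"source"})):
    """Collect (key, location) pairs from feature-table lines.

    Qualifier lines (starting with "/") and their continuation lines
    are skipped; feature keys listed in `ignore` are dropped.
    """
    rows = []
    key, location, in_location = None, "", False
    for line in lines:
        if line.startswith(("FEATURES", "FH")):
            continue  # table header lines
        name, rest = line[5:20].strip(), line[21:].strip()
        if name:  # a new feature starts on this line
            if key is not None and key not in ignore:
                rows.append((key, location))
            key, location, in_location = name, rest, True
        elif rest.startswith("/"):
            in_location = False  # qualifiers follow; the location is complete
        elif in_location:
            location += rest  # location wrapped onto another line
    if key is not None and key not in ignore:
        rows.append((key, location))
    return rows


def to_tsv(rows):
    """Format the rows as tab-separated text, one feature per line."""
    return "\n".join("\t".join(row) for row in rows)
```

Changing which features are left out is then just a matter of passing a
different set, e.g. feature_rows(lines, ignore={"source", "misc_feature"}).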
Doug_Ee at um.cc.umich.edu