SEALS
A System for Easy Analysis of Lots of Sequences
http://www.ncbi.nlm.nih.gov/Walker/SEALS/index.html
Version 0.82
Roland Walker
National Center for Biotechnology Information
walker@ncbi.nlm.nih.gov
-------------------------------------------------------------------------------
README
-------------------------------------------------------------------------------
Special warning regarding SEALS scripts involving Entrez retrievals:
    gi2fasta
    gi2asn1
    gi2genbank
    gi2defline
    gi2abstract
    gi2sibling
    boundtable2fasta
    taxnode2gi            # these cause load to a much lesser
    taxnode2amino_gi      # extent
    taxnode2nucleic_gi    #
    tax_filt              # these cause excessive load only
    tax_break             # if your taxonomy index ('megatable')
    tax_collector         # becomes outdated
If you are performing retrievals on a large scale, you must set up your
local fasta indexes ("hints"). This will reduce load on the NCBI servers,
and give you a speed increase of many orders of magnitude. Hints are
created with the command 'fasta2index' and stored on the path specified
under the fasta2index option '-fast_index_path'. (See the readme for
fasta2index.) At a minimum, indexes for the nr and nt databases which
are less than 60 days old must be installed. The recommendation is that
indexes be created for all available databases which contain gi numbers.
If you are retrieving fasta records which are in your hints databases, the
retrieval will occur on your local system only. Other retrievals will
search Entrez through the NCBI servers.
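The 60-day limit on index age can be checked with standard tools. The
following is a minimal sketch in POSIX sh, not part of SEALS: the function
name stale_indexes and the example directory are hypothetical, and it
assumes your hints are ordinary files under the directory you gave to
'-fast_index_path'.

```shell
# Sketch: list files under an index directory that are older than a given
# number of days (default 60, the limit given above for nr and nt).
# ASSUMPTION: hints are plain files under one directory; SEALS itself may
# lay them out differently.

stale_indexes() {
    index_dir=$1
    max_age_days=${2:-60}
    # -mtime +N matches files last modified more than N days ago
    # (find counts age in whole 24-hour periods)
    find "$index_dir" -type f -mtime +"$max_age_days" -print
}

# Example use (hypothetical path):
#   stale_indexes /usr/local/seals/fast_index
```

Any file this prints is a candidate for rebuilding with fasta2index.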
Be very careful with large operations that may cause overloading of the
NCBI servers. In the interest of providing reliable service to all, NCBI
must request that large retrieval operations be planned carefully to
minimize impact on NCBI systems. Large retrievals should be scheduled to
run outside of the normal business hours at NCBI, namely 8 AM to 9 PM,
Eastern Standard Time.
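A wrapper script can refuse to start a large job during those hours. This
is a minimal sketch in POSIX sh, not part of SEALS. The function names are
hypothetical, and it assumes that the window follows local US Eastern time
(the America/New_York zone of the tz database) and that 9 PM onward counts
as off-peak.

```shell
# Sketch: decide whether an hour falls outside the 8 AM to 9 PM window.
# ASSUMPTION: hours 8 through 20 are business hours; 21 onward is off-peak.

is_offpeak_hour() {
    # $1: hour of day, 0-23, in Eastern time
    hour=$1
    [ "$hour" -lt 8 ] || [ "$hour" -ge 21 ]
}

eastern_hour() {
    # %H prints a zero-padded hour; strip one leading zero so the value
    # is a plain decimal number ("08" becomes "8", "00" becomes "0")
    TZ=America/New_York date +%H | sed 's/^0//'
}

# Example use:
#   if is_offpeak_hour "$(eastern_hour)"; then
#       echo "off-peak: OK to start a large retrieval"
#   else
#       echo "NCBI business hours: postpone large retrievals"
#   fi
```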
All transactions are logged. Your site administrator will be contacted
if overuse occurs. Continued abuse of NCBI servers may result in limited
access for your site.
-------------------------------------------------------------------------------
To install the SEALS package, change to the seals/install directory and follow
the instructions in the README in that directory.
If you are not the SEALS administrator, but wish to use an already installed
version of SEALS, you only need to
    0  be using tcsh as your shell
    1  add a line sourcing .seals_login.tcshrc (from this directory), as
       near to the end of your ~/.login as possible
    2  add a line sourcing .seals_shell.tcshrc (from this directory), as
       near to the beginning of your ~/.tcshrc as possible. If you don't
       have a ~/.tcshrc, create one that sources .seals_shell.tcshrc
       and your ~/.cshrc, if any.
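Steps 1 and 2 might look like the fragment below. The installation path
/usr/local/seals is hypothetical; substitute the directory that contains
this README.

```tcsh
# near the end of ~/.login
source /usr/local/seals/.seals_login.tcshrc

# near the beginning of ~/.tcshrc
source /usr/local/seals/.seals_shell.tcshrc

# only if ~/.tcshrc was newly created: also read an existing ~/.cshrc
if ( -e ~/.cshrc ) source ~/.cshrc
```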
When you are done with all that, create a new login shell, and execute
    seals
with no arguments to see a list of scripts.
    seals --status
will give you some status information about your installation, and
    seals --describe
will return single-line descriptions of each command.
The rwidget man page will give you a very important overview of the SEALS
command-line interface. Please read it.
The listing returned by
    seals --bugs
is recommended reading as well.
For most programs, a readme will be given if the program is executed without
arguments. In all cases, executing
    command --help
will give you a readme.
NOTES:
There are a few scripts missing, and blastmore is not in this release. These
programs must be prettified a bit more before I am happy with distributing
them. There are also many scripts here which are not described in the ISMB
paper. The best reference is the output of the 'seals' command.
gi2report is not included, because the report format is no longer supported
by NCBI.
If you have difficulties with retrievals using gi2fasta, etc., there may be
a problem with the Entrez server. Please try 'entrezping' to see if the
server is up.
There are man pages available for rwidget and gref, but other programs have
no documentation other than the built-in README.
There will be more docs whenever I get the chance to write them, and a 1.0
release to go with our paper in Bioinformatics (formerly CABIOS), which is in
preparation. If you have any specific needs, especially regarding things
mentioned in the paper but not listed, but also any suggestions, thoughts, or
bug reports, write me without hesitation.
Roland
Roland Walker
walker@ncbi.nlm.nih.gov