Summary
SeqPup, version 0.9, September 1999
SeqPup is a biological sequence editor and analysis program. It includes links
to network services and external analysis programs. It is usable on common
computer systems that support the Java 1.1 runtime environment, including
Macintosh, MS-Windows and X-Windows.
Features include
Many of the features have been significantly improved in this release. As well, this release is much easier to install and use. This application is a work in progress; it has bugs.
Originally written in C++, this program has been ported to the new Java language.
SeqApp/SeqPup was started in 1990 as a sequence editor/analysis platform on which
analysis programs from other authors could be easily incorporated into a usable interface.
You can obtain this release thru anonymous FTP or HTTP to iubio.bio.indiana.edu, in
folder /molbio/seqpup/java/. This version will work on any computer system that supports
Java runtime version 1.1, including Macintosh, MS Windows, and Unix/XWindows
systems. The Internet URLs to this software are
=== Brief Usage and installation, release 0.9 =====
You will need to fetch the SeqPup.jar java archive. It includes help documentation and data files, which will be installed when you first run the program.
To start SeqPup from MSWindows or Unix, use one of these command lines
    jre -cp SeqPup.jar run    (for Java 1.1.x with jre)
    java -cp SeqPup.jar run    (for Java 1.2.x - not a recommended version)
    java -classpath SeqPup.jar:$CLASSPATH run    (for Java 1.1.x without jre)
To start SeqPup from MacOS, use the MRJ application, in SeqPup9-macos.hqx. Apple Java (MRJ) version 2.1.2 or later is recommended.
There are several external analysis applications available for SeqPup, as compiled programs for MacOS and MSWindows. Find these in
    seqpup9-methods-msdos.zip (MS Windows ZIP archive)
    seqpup9-methods-macos.sit (MacOS Stuffit archive)
These should be installed in a methods/ folder in the same folder as SeqPup.jar.
Java version: This program will not work properly with the Java 1.2 runtime that is now commonly available on MSWindows systems. SeqPup will work with the Java 1.1.8 for MSWindows, which can be installed in addition to Java 1.2.
Developers will find the source code for this application and others in the iubio:/molbio/java/source folders.
Comments, bug reports and suggestions for new features are very welcome and should be sent via e-mail to SeqPup@Bio.Indiana.Edu.
Fetching
The current state of Java applications on computer systems means that you will need to
fetch a Java runtime system as well as the files for this specific application. If you have
other Java applications on your computer, you may already have the Java runtime system
needed. Some current operating systems, including MacOS 8.1 and Sun Solaris 2.6,
include the Java 1.1 runtime already.
SeqPup is available over the Internet at
This current release is based on Java version 1.1. To run it, you should have installed a Java Runtime Environment (JRE), version 1.1 or later. This can be found through Javasoft, and at various mirror sites around the world.
This link provides Javasoft's runtime for Solaris and MSWindows systems
http://www.javasoft.com/products/jdk/1.1/jre/index.html
For Java runtimes for various operating systems, see
http://www.javasoft.com/cgi-bin/java-ports.cgi
Installing
The program is installed as follows. On all systems, the following items should be in one
folder: the SeqPup.jar Java class archive file, the data/ and classes/ folders, and the
seqpup-doc.html document. Each system also needs in this folder an application or
script that starts the application in the Java runtime environment.
1. Keep in the same folder the application SeqPup, the SeqPup.jar Java class archive, and the data and classes folders. Move the application SeqPup from the local/Macintosh/ folder into this main folder.
2. Use the MRJ installer from Apple Computer to install this Java runtime software. It needs to be version 2.0 or later. If you have MacOS 8.1, this is included as part of the OS. With MacOS 8, an earlier version of MRJ is included, but it isn't compatible with this software. You should upgrade to the MRJ 2.0 release.
3. Unpack the local/Macintosh/data-methods-macos.sit archive. It includes child app methods that need to be placed in the data/methods/ folder.
This program calls an Internet browser to display HTML documents. The default is to call
Netscape. Also you now need to have this browser application already open; SeqPup
won't yet open it for you (a bug). The browser can be changed by editing SeqPup
preferences and changing the user.openurl variable. It needs to be set to the MacOS
"creator" name for the program you want (sorry I don't yet have an easy interface to set
this). For Netscape, the creator is "MOSS", for MS Internet Explorer, the creator is
"MSIE". This is what the setting looks like now
user.openurl=MOSS
1. Keep in the same folder the program batch file SEQPUP.BAT, the SeqPup.jar
Java class archive, and the data and classes folders. Move the SEQPUP.BAT file
from the local/MSWindows/ folder to this main seqpup folder.
2. Install a Java Runtime system for MS Windows. A recommended Java runtime is
found at http://www.javasoft.com/products/jdk/1.1/jre/index.html. You may want to
install this in a general MSWindows folder, perhaps C:\WINDOWS\JAVA. I don't
know of a preferred location yet on MS Windows systems for Java runtime files. You
will need to edit the batch file (step 3) to account for this location.
3. Edit the SEQPUP.BAT file to make the path names match the file locations on your
computer.
set JAVA=C:\WINDOWS\JAVA
set APPPATH=C:\seqpup08
NOTE: If you get this message when running the batch file OUT OF ENVIRONMENT SPACE
user.openurl="C:\\Program Files\\Netscape\\Navigator\\Program\\netscape.exe"
1. Keep in the same folder, the program start script seqpup, the SeqPup.jar Java class archive, and the data and classes folders. Move the seqpup script file from the local/Unix/ folder to this main seqpup folder.
2. Install a Java Runtime system for your system. For Sun Solaris systems, see
http://www.javasoft.com/products/jdk/1.1/jre/index.html. For other systems, see
http://www.javasoft.com/cgi-bin/java-ports.cgi
3. Edit the seqpup file to make the path names match the file locations on your computer.
set java=/usr/local/java
4. You will want to compile or install binaries of the child applications for your system to
use this feature (see Child Tasks below). Source code is provided for example child
apps in the data/methods/ folder. You can use other pre-compiled versions of these on
your system. You will need to edit the .command files in data/methods/ if these apps
are located in other folders.
You need to define the user.openurl= variable to find your Netscape or equivalent. You
can do this from within the application; see the Options/Edit basic prefs... menu. Or edit
the file ~/.dclaprc directly to enter such a line. The variable line for my Unix system is
user.openurl=/usr/local/bin/netscape
Also, you might instead use a shell script (like the "netscape.sh" included). If you rename
that to netscape, edit it to suit, and put it in the folder with the SeqPup.jar file, it may take the
place of editing the preference file.
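Because the preference file is plain key=value text, the user.openurl edit can also be scripted. Here is a minimal Python sketch, assuming one key=value pair per line; set_pref is a hypothetical helper for illustration, not something shipped with SeqPup.

```python
def set_pref(text: str, key: str, value: str) -> str:
    """Return prefs text with `key` set to `value`, replacing or appending the line."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # Compare only the part before the first '=' so values may contain '='.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            break
    else:
        # Key was not present: add it at the end.
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


old_prefs = "AWTs.clicksToActivate=1\nuser.openurl=MOSS\n"
new_prefs = set_pref(old_prefs, "user.openurl", "/usr/local/bin/netscape")
```

As with editing by hand, run such a script only while the application is not running.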
See also the Bugs section below for a list of known program bugs and some work-around hints.
Views, Windows and Dialogs
About the application
The first window displayed when you start SeqPup is a splash screen that tells you a bit
about the application and has active buttons to perform some basic commands. These
include opening sequence files, fetching sequences from Internet servers, opening the help
information, and network links to application updates, e-mail comments, and application
source code.
All these functions are also accessible from the standard application menus. This form of Hypercard-like picture window with active buttons is used in all the DCLAP applications. Active button areas are highlighted when your mouse moves over them, and their function is explained at the bottom of the window. Clicking the mouse, once or more depending on the clicksToActivate preference, will activate that function. These Hypercard-like windows are configured as per standard HTTP NCSA-Imagemap information, stored in the pix/about.gif.map file in the data/ folder. Functions can be changed and new images substituted if you desire.
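To illustrate how an NCSA-style imagemap ties button areas to functions, here is a small Python sketch that reads "rect" entries and finds which one contains a mouse position. The map entries and target names are made up for the example; the real contents of about.gif.map are not shown in this document, and other imagemap shapes are simply skipped.

```python
from typing import List, Optional, Tuple

# (target, x1, y1, x2, y2) with x1 <= x2 and y1 <= y2
Rect = Tuple[str, int, int, int, int]


def parse_imagemap(text: str) -> List[Rect]:
    """Collect 'rect target x1,y1 x2,y2' entries; comments and other shapes are skipped."""
    rects = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if parts[0].lower() == "rect" and len(parts) >= 4:
            x1, y1 = (int(v) for v in parts[2].split(","))
            x2, y2 = (int(v) for v in parts[3].split(","))
            rects.append((parts[1], min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)))
    return rects


def hit(rects: List[Rect], x: int, y: int) -> Optional[str]:
    """Return the target of the first rectangle containing (x, y), or None."""
    for target, x1, y1, x2, y2 in rects:
        if x1 <= x <= x2 and y1 <= y <= y2:
            return target
    return None


# Hypothetical map entries, for illustration only.
sample_map = """\
# about window buttons
rect open-sequence 10,20 110,50
rect help 10,60 110,90
"""
buttons = parse_imagemap(sample_map)
```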
Data views
The program has these main kinds of views and windows onto data:
A multiple-sequence view which is the primary display when you open a sequence
document; the single sequence editing view; various print views which result from an
analysis, like the Restriction map; and dialog views where you control some function.
Many of these views have dialog controls -- push buttons, check boxes, radio controls and editable text items -- to let you fine-tune a view to fit your preference. Many of these views also will remember your last preferences.
When a view has editable text items, including the sequence entry views, most usual undo/cut/copy/paste features will work.
Two or more views of the same data are possible. Some of these are truly views of the same data -- changes made in one view are reflected in another. For instance, one can have a single sequence view open, select a feature and mark that feature position on the main document view, and also have that feature mark show in any open pretty print of that sequence.
Other views are static pictures taken of the data at the time the analysis was performed -- later changes to the data do not affect that picture.
Aligned multi-sequence view
The main view into a sequence document is the multiple sequence editor window, which
lists sequence names to the left and sequence bases as one line that can be scrolled thru.
Bases can be colored or black. Sequences can be edited here, especially to align them, and
subranges and subgroupings can be selected for further operations or analysis. Entire
sequence(s) can be cut/copied/pasted by selecting the left name(s). Mouse-down selects
one. Shift-mouse-down selects a contiguous group; Command-mouse-down selects several
unconnected names. Double-click a name to open a single sequence view. Select a name, then grab
and move it up or down to relocate it.
Select the lock/unlock button at the view top to lock/unlock text editing in the sequence line. With lock on (no editing) you can use shift and command mouse to select a subrange of sequences to operate on.
Bases can be slid to left and right, like beads on an abacus, when the edit lock is On (now default). Select a base or group of bases (over one or several sequences), using mouse, shift+mouse, option+mouse, command+mouse. Then grab selected bases with mouse (mouse up, then mouse down on selection), and slide to left or right. Indels "-" or spacing on ends "." will be added and squeezed out as needed to slide the bases. See also the "Degap" menu selection to remove all gaps thus entered from a sequence.
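The abacus idea can be sketched in a few lines of Python. This is a simplified illustration, not SeqPup's code: it only slides a block to the right, and only as far as there are gap characters directly after it to squeeze out, so the sequence length never changes. The function name and the choice of gap character are assumptions.

```python
def slide_right(seq: str, start: int, end: int, steps: int, gap: str = "-") -> str:
    """Slide seq[start:end] right by up to `steps`, squeezing out gaps just after it."""
    available = 0
    # Count how many gap characters directly follow the block, up to `steps`.
    while (available < steps
           and end + available < len(seq)
           and seq[end + available] == gap):
        available += 1
    block = seq[start:end]
    # The squeezed-out gaps reappear on the left side of the block.
    return seq[:start] + gap * available + block + seq[end + available:]


slid = slide_right("ACGT--TT", 0, 4, 2)
```

Sliding left would be the mirror image, consuming gaps before the block instead of after it.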
Single sequence view
For entering/editing a single sequence, this view displays one sequence with more info and
control. Edit the name here (later other documentation). Bring out this view by
double-clicking sequence name in align view, or choosing Edit from Sequence menu.
Print views
Various analyses provide non-editable displays. These are usually saveable as PICT,
POSTSCRIPT and GIF formats for editing in your favorite graphic editor program, or
printing. When a print or graphic view is displayed, choosing the File/Save As command
will offer you the choice of where to save and in what format.
Data files
SeqPup uses plain text files for its basic sequence data. These files can be exchanged
without modification with many other sequence analysis programs. SeqPup automatically
determines the sequence format of a data file when opening it. You have a choice of
several formats to save it as.
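Format detection of this kind usually looks at how the first non-blank line of the file begins. The Python sketch below shows the idea for three common formats; it is an illustration under that assumption, not the detection logic SeqPup actually uses, which handles more formats.

```python
def guess_format(text: str) -> str:
    """Guess a sequence file format from its first non-blank line."""
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith(">"):
            return "fasta"      # FASTA records begin with a '>' title line
        if line.startswith("LOCUS"):
            return "genbank"    # GenBank entries begin with a LOCUS line
        if line.startswith("ID "):
            return "embl"       # EMBL entries begin with an ID line
        return "unknown"
    return "unknown"


kind = guess_format(">5srna\nACGTTGCA\n")
```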
The program looks in the folder "data/prefs" for text files containing various data. At present these files include "codon.prefs", "renzyme.table" and "color.prefs".
Various temporary files are created for child tasks, currently in the main folder where the
program lives. Currently you cannot run the Child Tasks portion of SeqPup from a locked
file server because these temporary files need to be created. Otherwise, SeqPup should
operate from a locked fileserver properly, and can be launched by several users at once.
In the data/prefs/ folder that comes with the application, you will find these files
color.prefs -- for base colors in displays
seqmasks.prefs -- for pretty printing displays
renzyme.table -- for restriction maps
codon.prefs -- for protein translation
Any of these can substitute for the codon.prefs file
codon-drosophila.prefs
codon-human.prefs
codon-ecoli.prefs
codon-rat.prefs
codon-tobacco.prefs
Restriction Enzyme Table
The file called "renzyme.table" contains restriction enzyme data, as distributed in REBASE
by R.Roberts. The format used is identical to that used by GCG software.
{ documentation ...}
Commercial sources of restriction enzymes are abbreviated as follows:
A  Amersham (12/91)
B  BRL (6/91)
...
X  New York Biolabs (4/91)
Y  P.C. Bio (9/91)
..  { separates data}
;AatI 3 AGG'CCT 0 ! Eco147I,StuI >OU
AatII 5 G_ACGT'C -4 ! >EJLMNOPRSUVX
AccI 2 GT'mk_AC 2 ! >ABDEIJKLMNOPQRSUVXY
;AccII 2 CG'CG 0 ! Bsp50I,BstUI,MvnI,ThaI >DEJKQVXY
;AccIII 1 T'CCGG_A 4 ! BseAI,BsiMI,Bsp13I,BspEI,Kpn2I,MroI >DEJKQRVY
;Acc65I 1 G'GTAC_C 4 ! Asp718I,KpnI >DFNY
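The enzyme lines above can be split into fields with a short Python sketch. The field meanings are my reading of the sample lines (the second column appears to be the position of the ' cut mark, the fourth the overhang); the leading ";" is only recorded as a flag, since its meaning is not stated here. The Enzyme type and parse_enzyme are names invented for this example.

```python
from typing import List, NamedTuple


class Enzyme(NamedTuple):
    name: str
    cut_offset: int           # second column
    pattern: str              # recognition pattern with ' and _ marks
    site: str                 # pattern with the marks removed
    overhang: int             # fourth column
    isoschizomers: List[str]  # names listed after '!'
    suppliers: str            # supplier codes listed after '>'
    flagged: bool             # True when the line starts with ';'


def parse_enzyme(line: str) -> Enzyme:
    """Split one data line of the enzyme table into its fields."""
    flagged = line.startswith(";")
    body = line.lstrip(";").strip()
    fields, _, rest = body.partition("!")
    name, offset, pattern, overhang = fields.split()
    isos, _, suppliers = rest.partition(">")
    site = pattern.replace("'", "").replace("_", "")
    iso_list = [s for s in isos.strip().split(",") if s]
    return Enzyme(name, int(offset), pattern, site, int(overhang),
                  iso_list, suppliers.strip(), flagged)


aat2 = parse_enzyme("AatII 5 G_ACGT'C -4 ! >EJLMNOPRSUVX")
aat1 = parse_enzyme(";AatI 3 AGG'CCT 0 ! Eco147I,StuI >OU")
```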
#Escherichia coli
#
# any documentation
#
#Codon AmAcid Number /1000 Fraction ..
GGG= Gly 1743.00 9.38 0.13
GGA= Gly 1290.00 6.94 0.09
GGT= Gly 5243.00 28.22 0.38
GGC= Gly 5588.00 30.08 0.40
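A codon table in this layout is easy to read into a lookup structure. The Python sketch below assumes one codon per line, in the form shown in the sample, with "#" lines treated as documentation; parse_codon_prefs is a hypothetical name.

```python
from typing import Dict, Tuple


def parse_codon_prefs(text: str) -> Dict[str, Tuple[str, float, float, float]]:
    """Map codon -> (amino acid, number, per thousand, fraction)."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank or documentation line
        codon, _, rest = line.partition("=")
        fields = rest.split()
        if len(fields) != 4:
            continue  # not a data line in the expected layout
        amino, number, per_thousand, fraction = fields
        table[codon.strip().upper()] = (
            amino, float(number), float(per_thousand), float(fraction))
    return table


sample = """\
#Escherichia coli
#Codon AmAcid Number /1000 Fraction ..
GGG= Gly 1743.00 9.38 0.13
GGA= Gly 1290.00 6.94 0.09
GGT= Gly 5243.00 28.22 0.38
GGC= Gly 5588.00 30.08 0.40
"""
codons = parse_codon_prefs(sample)
```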
File
New will create a document of sequence data (alignment view). With a new document one
can add new sequences, or copy selections from another document.
Open commands will open existing files.
The Open as Sequence... choice will open a file of sequences into a new align view
document.
You can also open a file and append its sequences to the current document (Append to sequence
list).
You can fetch sequences from an Internet server (see SRS information below) with the
Open sequence from databanks... command.
The Open Text command will open and display a file as plain text.
The Open URL command will open an Internet connection (or local file) given a URL of
the format http://internet.address:port/path/to/data.file, as in
http://iubio.bio.indiana.edu/Readme. If the file is sequence data it will be displayed in an
alignment window. Currently only the HTTP protocol is supported for this command.
Save and Save as will save the current document to disk files. Save is context sensitive and will be active when a document has been changed.
Revert will restore the open align view to the last version saved to disk.
Save selection will save only highlighted sequences to a new disk file. It doesn't affect
the save status of the current full alignment document.
Print setup, Print will print the current view (see bugs).
Check Updates will connect to the home server for the application and offer information on new versions and updates to the application.
Help brings up a view to page thru the help file.
Quit - terminate the program
Editing
Undo, redo -- Standard application commands to return a document to its state before a
command was performed (undo), and to again do the command (redo) after an undo. For
instance, complementing (changing) a sequence should be undoable. These are context
sensitive, and should be enabled only when possible. Current design is to offer several
levels of undo and redo, but see bugs.
Cut, copy, paste, clear, select all -- Standard application commands that are available in a context-sensitive manner. Cut moves a selection from the document to the clipboard. Copy makes a copy to the clipboard. Paste copies from the clipboard to the active document. Clear removes a selection without copying to the clipboard. The clipboard is an application-wide special document that stores these data until overwritten by new data. Clipboard data is potentially copyable to other applications (see Bugs).
For instance, selected editable text should have these functions to manipulate the text. Sequence selections enable these functions to move sequence data within and between alignment documents. Not all appropriate contexts may yet have these commands enabled (see Bugs).
Find, Find same, Find "selection" will search for strings in text.
Find ORF, this will select the first or next open reading frame of the selected sequence.
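For readers who want to see what an ORF search involves, here is a minimal Python sketch. It assumes an ORF runs from an ATG to the first in-frame stop codon (TAA, TAG or TGA) on the forward strand only; SeqPup's own rules (minimum length, strands searched, genetic code) are not described here and may differ.

```python
from typing import Optional, Tuple

STOPS = {"TAA", "TAG", "TGA"}


def find_orf(seq: str, start_at: int = 0) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first ATG..stop ORF at or after start_at.

    `end` is exclusive and includes the stop codon; None means no ORF found.
    """
    s = seq.upper().replace("U", "T")
    pos = s.find("ATG", start_at)
    while pos != -1:
        # Walk codon by codon in the frame set by this ATG.
        for i in range(pos + 3, len(s) - 2, 3):
            if s[i:i + 3] in STOPS:
                return pos, i + 3
        pos = s.find("ATG", pos + 1)
    return None


first = find_orf("CCATGAAATAGGG")
```

Calling find_orf again with start_at set past the previous start gives the "next" ORF.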
Sequence manipulations
New sequence -- append a new, blank sequence to the sequence document.
Edit -- open single sequence editing view for selected items.
Reverse, Complement, Rev-complement -- Reverse, complement or reverse+complement a sequence. Works on one or more sequences, and the selected subrange.
Rna-Dna,Dna-Rna -- Convert dna to rna (t->u) and vice versa. Works on one or more sequences, and the selected subrange.
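These manipulations are simple character mappings. A Python sketch, assuming plain A/C/G/T(U) codes; other characters, including IUPAC ambiguity codes and gaps, pass through unchanged here, which may not match how SeqPup treats them.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
_TO_RNA = str.maketrans("Tt", "Uu")
_TO_DNA = str.maketrans("Uu", "Tt")


def reverse(seq: str) -> str:
    return seq[::-1]


def complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    return complement(seq)[::-1]


def dna_to_rna(seq: str) -> str:
    return seq.translate(_TO_RNA)   # t -> u


def rna_to_dna(seq: str) -> str:
    return seq.translate(_TO_DNA)   # u -> t


revcomp = reverse_complement("ATGC")
```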
Degap -- remove alignment gaps "~". Gaps of "-" are locked and not affected by Degap.
Works on one or more sequences, and the selected subrange.
Lock Indel & Unlock Indel -- Convert between unlocked gaps "~" and locked gaps "-".
Unlocked gaps will disappear and appear as needed as you slide bases left and right.
Locked gaps are not affected by sliding nor by Degap. Works on one or more sequences,
and the selected subrange.
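The difference between locked and unlocked gaps can be shown in a few lines of Python. This is a sketch of the behavior described above, with hypothetical function names; start and end mark the selected subrange.

```python
from typing import Optional


def _apply(seq: str, start: int, end: Optional[int], old: str, new: str) -> str:
    """Replace `old` with `new` inside seq[start:end] only."""
    end = len(seq) if end is None else end
    return seq[:start] + seq[start:end].replace(old, new) + seq[end:]


def degap(seq: str, start: int = 0, end: Optional[int] = None) -> str:
    """Remove unlocked gaps '~'; locked gaps '-' are left alone."""
    return _apply(seq, start, end, "~", "")


def lock_indels(seq: str, start: int = 0, end: Optional[int] = None) -> str:
    """Turn unlocked gaps '~' into locked gaps '-'."""
    return _apply(seq, start, end, "~", "-")


def unlock_indels(seq: str, start: int = 0, end: Optional[int] = None) -> str:
    """Turn locked gaps '-' into unlocked gaps '~'."""
    return _apply(seq, start, end, "-", "~")


cleaned = degap("A~C-G~T")
```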
Consensus -- generate a consensus sequence of the selected sequences. The Options/Seq
Prefs... dialog modulates this function.
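As a rough picture of what a consensus is, the Python sketch below takes the most common base in each alignment column. It is a plain majority rule chosen for illustration; the options in the Seq Prefs dialog that control SeqPup's own consensus are not described here, and ties are resolved in favor of the base seen first.

```python
from collections import Counter
from typing import List


def consensus(seqs: List[str], gap_chars: str = "-~.") -> str:
    """Majority-rule consensus over aligned sequences; gaps are not counted."""
    width = max(len(s) for s in seqs)
    out = []
    for col in range(width):
        counts = Counter(
            s[col].upper() for s in seqs
            if col < len(s) and s[col] not in gap_chars
        )
        # A column made only of gaps yields a gap in the consensus.
        out.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(out)


cons = consensus(["ACGT", "ACGA", "TCGA"])
```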
Translate -- translate to/from amino acid. This relies on Codon.prefs data, which can be
changed for specific needs (see optional species-specific codon preference files).
Distance -- generate a distance or similarity matrix of the selected sequences. The Options/Seq Prefs... dialog modulates this function.
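One common distance measure is the proportion of differing positions between two aligned sequences. The Python sketch below computes that for every pair; it is an example measure chosen for illustration, since the measure SeqPup uses depends on the Seq Prefs settings, which are not detailed here.

```python
from typing import List


def p_distance(a: str, b: str, gap_chars: str = "-~.") -> float:
    """Fraction of compared columns where a and b differ; gap columns are skipped."""
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in gap_chars or y in gap_chars:
            continue
        compared += 1
        diffs += x != y
    return diffs / compared if compared else 0.0


def distance_matrix(seqs: List[str]) -> List[List[float]]:
    """Square matrix of pairwise p-distances."""
    return [[p_distance(a, b) for b in seqs] for a in seqs]


matrix = distance_matrix(["ACGT", "ACGA", "AC-T"])
```

A similarity matrix is then simply one minus each entry.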
Pretty print -- a prettier view of a single or aligned sequences. Use these views to print your sequences. Printing from the editing display will not be supported fully, and may not print all of your sequence(s).
Restriction map -- Restriction enzyme cut points of selected sequence. Also protein
translation options.
Dotty plot -- provide a dot plot comparison of two sequences.
Nucleic, amino codes -- These provide both reminders of the base codes, and a way to
select colors to associate with each code. See below for some discussion of the two
"aa-color" documents that now ship with SeqPup.
Single sequences - editing and features
The Edit sequence function opens a single sequence editing window for selected
sequence(s). One can edit sequence bases here, change sequence name and perform some
sequence manipulations and analyses.
A recent addition is Document and Features sections, along with the Sequence editing window. These sections are currently editable text. The format is not yet formalized but follows the specific sequence file format. Currently only Genbank and EMBL formats are parsed for documentation and features.
The features section includes sequence position information, as per
GC_signal 115..122
exon <447..571
CDS join(447..571,1786..2005,3441..3554)
These positions will be read by the program when you highlight the text, then choose the
commands in the Features/ menu. Mark on main view command will copy the selected
position to the alignment window, erasing any other mark for that sequence. Add to
main view command will copy the position, adding it to any other marks. This is most
useful when the main view has a Mask level selected. One can add feature marks to
different mask levels. Then one can pretty print the sequence and these marked features will
be highlighted according to the current styles for those masks.
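Reading positions out of a feature line comes down to pulling the start..end pairs from the location text. A minimal Python sketch, assuming only simple ranges and join(...) lists like the examples above; complement(...), single-base positions and other location forms are not handled, and parse_location is a hypothetical name.

```python
import re
from typing import List, Tuple

# start..end, where either number may carry a '<' or '>' partial marker
_RANGE = re.compile(r"[<>]?(\d+)\.\.[<>]?(\d+)")


def parse_location(location: str) -> List[Tuple[int, int]]:
    """Return the (start, end) pairs found in a feature location string."""
    return [(int(a), int(b)) for a, b in _RANGE.findall(location)]


cds = parse_location("join(447..571,1786..2005,3441..3554)")
```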
Also among the options are dialogs to directly edit the application and framework preference files. Generally you can ignore these, as other dialogs handle this. But some options don't yet have an easier interface for changing. One important one is the framework preference AWTs.clicksToActivate=1 which sets the number of mouse clicks to activate an icon button or other relevant item. Many people prefer clicksToActivate=2 (double-click).
For MSWindow and XWindow systems, the framework preference user.openurl is
important. For Macintoshes, the equivalent is done using InternetConfig software.
See the above Installing section for details of setting the user.openurl preference.
An application preference with no current dialog choice is Adorns.backColor=
0xe8f0ff, which sets window background color (0xE8F0FF is a light blue).
Option files are stored in a system specific location as text files. One can edit them, when
the application is not running, with a text editor. On Macintosh systems, the files are
stored in System Folder:Extensions:MRJ Libraries: as dclap.prefs and seqpup.prefs files
(when using the MRJ runtime). On MS Windows systems, they are stored in the
C:\WINDOWS folder as dclap.ini and seqpup.ini. On Unix systems, these options are
stored in ~/.xxxrc files, including .dclaprc for the framework prefs, and .seqpuprc for the
application prefs.
Pretty Print configuration
This is the syntax for specifying style information used in the pretty print function.
This information is currently stored in the data/prefs/seqmasks.prefs file, and can be edited
with the Options/Base style table... command, or with a standard text editor. Each Style
label should now be prefixed with the mask level it applies to, as in
mask1.style=bold underline uppercase
mask2.style=italic box lowercase
Style tags for pretty print include
style=
bold - bold font
italic - italic font
underline - underline font
box - put a box line around selected mask region
uppercase - convert base to uppercase
lowercase - convert base to lowercase
invertcolor - invert the colors of the font and background
Use any combination of values for style, separated by space or commas
repeatchar=.
- use this if you want mult-align repeated chars set to a single character
fontname=
- set a valid computer font name, like Courier, Helvetica, Times, ...
fontsize=
- set point size of the font
fontcolor=
- set rgb color of the font, using 6 digit hexadecimal value, see sample values in table
(e.g., 0xff0000 is red, 0x00ff00 is green, and 0x0000ff is blue, 0x000000 is black and
0xffffff is white, 0xaaaaaa is one shade of grey).
backcolor=
- set rgb color of the background behind font
boxstyle=solid
- set the style of the boxing line
- current values are dashed, dotted, solid, dark, medium or light
fillpattern=
- set the pattern used to draw the background color or fill. This
will allow "hatching" types of shades. Not well tested yet (mostly needs
printer output to see).
- set this with two 8-digit hexadecimal values (to create an 8x8
pattern array). You need to experiment with values to find a nice
pattern. An example is fillpattern=0xaa55aa55 0xaa55aa55
Currently you can set four mask styles in this table. These should start with a header like
below, but name as you like. Lines starting with "#" or "!" are comments that are ignored.
Style names starting with "mask1." are associated with the sequence alignment mask called
"Select mask 1...". Start the names with "mask2." to associate with "Select mask 2...",
start with "mask3." to use with "Select mask 3..." and start with "mask4." to use with
"Select mask 4...".
##----------------------------
##[mask1]
mask1.description=a test style
## style values=bold,italic,underline,box,
## uppercase/lowercase,invertcolor
mask1.style=bold uppercase box
## repeatchar -- use if you want mult-align repeated chars
mask1.repeatchar=.
## font selection
mask1.fontname=Courier
mask1.fontsize=9
mask1.fontcolor=0xff0000 # red
mask1.backcolor=0x80e0e0 # lt.blue
mask1.boxcolor= 0x0000ff # blue
## boxstyle values= dashed dotted solid dark medium light
mask1.boxstyle=dashed
## fillpattern= use 2 hex-long values for this 8-byte pattern
mask1.fillpattern=0x88228822 0x88228822
mask1.fillpattern=0xaa55aa55 0xaa55aa55
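A style table like the one above can be grouped by mask level with a short Python sketch. It assumes, as the sample suggests, that lines starting with "#" or "!" are comments and that a "#" after a value starts a trailing note; how the program itself treats trailing notes is not stated, and the function names are invented for this example.

```python
import re
from typing import Dict, List


def parse_mask_styles(text: str) -> Dict[str, Dict[str, str]]:
    """Group 'maskN.key=value' lines into {maskN: {key: value}}."""
    masks: Dict[str, Dict[str, str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if not sep or "." not in key:
            continue  # not a maskN.key=value line
        mask, _, prop = key.strip().partition(".")
        # Assumption: '#' after the value begins a trailing note.
        value = value.split("#", 1)[0].strip()
        masks.setdefault(mask, {})[prop] = value  # later lines override earlier ones
    return masks


def style_tags(value: str) -> List[str]:
    """Split a style value on spaces or commas."""
    return [tag for tag in re.split(r"[,\s]+", value) if tag]


sample = """\
##[mask1]
mask1.description=a test style
mask1.style=bold uppercase box
mask1.repeatchar=.
mask1.fontcolor=0xff0000 # red
mask1.boxcolor= 0x0000ff # blue
mask2.style=italic,box,lowercase
"""
styles = parse_mask_styles(sample)
```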
Currently color values are stored as hexadecimal codes. This is stored as a 3-byte hex value of Red-Green-Blue (RGB) values. 0xFF0000 is red, 0x00FF00 is green, 0x0000FF is blue. Future versions of the program should include a color picker interface.
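To make the hex notation concrete, this small Python sketch splits a 0xRRGGBB value into its red, green and blue parts (0 to 255 each). parse_rgb is a hypothetical helper, not part of the program.

```python
from typing import Tuple


def parse_rgb(value: str) -> Tuple[int, int, int]:
    """Split a '0xRRGGBB' string into (red, green, blue)."""
    n = int(value, 16)  # int() accepts the leading '0x' when the base is 16
    return (n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF


light_blue = parse_rgb("0xE8F0FF")
```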
A few early users of this new version provided color amino selections that ship with
SeqPup. Here is one description.
Date: Mon, 7 Jun 1993 15:50:09 +0200
From: Heikki.Lehvaslaiho@Helsinki.FI (Heikki Lehvaslaiho)
Subject: aa colors

                    24-bit          Mac
COLOR       AA      R    G    B     R      G      B
---------------------------------------------------------
Magenta     AGPST   255  000  255   65535  0      65535
Black       BDENQZ  000  000  000
Red         C       225  000  000   57600  0      0
Blue        FWY     000  000  255   0      65535  65535
Light blue  HKR     000  192  192   0      49344  49344
Green       ILMV    000  192  000   0      49344  0
Gray        JOUX    145  145  145   37265  37265  37265
Internet Sequence Search and Fetch
New in version 0.7 are (a) a client for the Sequence Retrieval System (SRS) to look up and
fetch sequences from databanks like GenBank and EMBL, Swiss Protein and PIR, and (b) a
client for the NCBI-BLAST server.
SRS client
The SRS client lets you search Internet databanks for sequences based on key words in the
documentation, such as title, accession number, locus name, organism, author, and other
documentation. To learn more of the Sequence Retrieval System (SRS), see
http://srs.ebi.ac.uk:5000/, or http://iubio.bio.indiana.edu/srs/, or others listed at
http://srs.ebi.ac.uk:5000/srs5list.html.
Use the File/Open sequences from databank menu command (or the Fetch sequences button on About SeqPup) to access SRS servers. Type one or more key words to describe the sequences you want to view. Sequence titles are fetched for all matches (which may be hundreds or thousands) from the selected server, and displayed in an alignment document view. You then can fetch full data for specific sequences by active clicking the name, or choosing the Sequence/Edit command.
You can use boolean operators & (AND), | (OR), ! (NOT) to join several key words in a query to tailor your search. SRS servers offer searches by fields of data. The general field "all" searches all indexed fields; each databank offers a selection of fields such as organism, accession, title, comments, and so forth.
The current SRS client in SeqPup is fairly simple, and doesn't offer the rich range of
options you will find via an HTML browser, but it does offer the direct step of loading
sequence data from an Internet server into this sequence editor.
The Options/SRS setup dialog lets you set your preferred server, data libraries and data
fields for a query.
NCBI-BLAST client
The NCBI BLAST server performs a sequence similarity search of GenBank and/or other
sequence databanks, matching your sequence against published sequences. To learn more
of BLAST at NCBI, see http://www.ncbi.nlm.nih.gov/
The current BLAST client in SeqPup is also fairly simple. It doesn't offer any more than an HTML browser, except for the direct step of loading sequence data from your sequence editor to the analysis server.
To perform a BLAST search, select a sequence entry in a document, choose the
Sequence/BLAST@NCBI command which will open a sequence edit view with
BLAST option choices. You can edit the sequence here (without affecting the sequence in
your main document). You can select the results document file (in HTML format which
will be opened by your preferred HTML viewer). There is an Options drop-down dialog;
click the BLAST options triangle/arrow to open this section. Choose which
BLAST program and which data library to search. Both of these are sequence context
sensitive -- DNA and Amino sequences have different selections. The Do BLAST button
sends your sequence to the server at NCBI via HTTP, and saves results to the selected file
which will be displayed in your HTML viewer.
The current Externals menu has
The general design of child applications is taken to be data analysis programs that have a command-line user-interface, and that take input data from a file or from the system "standard input" file (stdin), and that write outputs to files and to two system standard files "standard output" (stdout) and "standard error" (stderr). This is how many existing analysis programs work, and it is straightforward to program this basic kind of interface.
The value of SeqPup joined with these kinds of programs is that SeqPup can concentrate on providing an easy-to-use interface for biologists, and the analysis application can concentrate on data analyses, without having to add a lot of software to provide a humanly usable interface.
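The stdin/stdout/stderr contract described above looks like this from the calling side. This is a Python sketch of the general pattern, not SeqPup's Java code; the "child" here is a stand-in one-liner that uppercases its input, so the example runs without any analysis program installed.

```python
import subprocess
import sys
from typing import List, Tuple


def run_child(argv: List[str], input_text: str) -> Tuple[int, str, str]:
    """Run a command-line program, feed it stdin, and collect stdout and stderr."""
    proc = subprocess.run(argv, input=input_text, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


# Stand-in child program: copies stdin to stdout in uppercase.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
code, out, err = run_child(child, ">seq1\nacgt\n")
```

A real child app would be started the same way, with its own command line in place of the stand-in.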
Many command-line biocomputing programs, including versions of Clustal, CAP, tacg,
primer, FastA, and so forth can be added as Child apps or BOP remote services.
Which child applications?
I hope this new ChildApp/BOP method is general enough to let you add almost any
command-line program. I'm still working on special cases like Phylip package that
requires a structured command-file instead of command-line options. If you add any
biocomputing programs that can be freely distributed with SeqPup, consider sending
them, or the command configuration file, back to IUBio archive for addition to the general
distribution.
On command-line systems, including Unix and MSDos/MSWin, you should be able to use any pre-compiled version of a program that runs in this command-line style. On Macintosh systems (command-line-less), you will need to compile a command-line program with the ChildAppJ.c main program source (see the data/methods folder). This allows SeqPup to send command line parameters using a file.
Configuring child applications
You can add new child apps to SeqPup by adding text files to the data/methods/ folder
with the suffix .command, that include the string "Content-type: biocompute/command" at
the top, and follow the syntax described below and given in example files. See especially
the clustalw.command file.
Newlines or ';' separate key=value pairs in a structure. Values that include white space
need to be quoted with "" or ''.
Use backslash to escape special characters in a string, mainly tabs, newlines and such. A
string can be continued on multiple lines using \ just before the line end. Enclose such a
string in quotes.
A structured value (with subfields) needs to be enclosed in curly brackets {}.
The order of fields in a structure does not matter. Some fields are required, some are
optional.
Strings in a string list in the value.list, menupath, resultsKinds and others can be
separated with tab or | (pipe) or comma characters.
Comment lines starting with # are ignored.
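The syntax rules above are enough to write a small reader. The Python sketch below is my own reading of those rules, not the logic of ReadCommand.java: it returns each structure as a list of (key, value) pairs so that repeated keys survive, leaves backslash escapes unprocessed, and treats "#" at the start of any token as a comment to end of line.

```python
import re

_TOKEN = re.compile(r"""[ \t\r]*(?:
      (?P<comment>\#[^\n]*)
    | (?P<brace>[{}])
    | (?P<sep>[;\n])
    | (?P<eq>=)
    | "(?P<dq>(?:\\[\s\S]|[^"\\])*)"
    | '(?P<sq>(?:\\[\s\S]|[^'\\])*)'
    | (?P<word>[^\s{}=;"']+)
    )""", re.VERBOSE)


def tokenize(text):
    """Turn text into (kind, value) tokens; kinds are 'str', '{', '}', 'eq', 'sep'."""
    tokens, pos = [], 0
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            if text[pos:].strip():
                raise ValueError(f"cannot tokenize at offset {pos}")
            break  # only trailing white space is left
        pos = m.end()
        kind = m.lastgroup
        if kind == "comment":
            continue
        if kind in ("dq", "sq", "word"):
            tokens.append(("str", m.group(kind)))
        elif kind == "brace":
            tokens.append((m.group(kind), None))
        else:
            tokens.append((kind, None))
    return tokens


def parse_pairs(tokens, i=0):
    """Parse key=value pairs from tokens[i:]; returns (pairs, next index)."""
    pairs = []
    while i < len(tokens):
        kind, value = tokens[i]
        if kind == "sep":
            i += 1
            continue
        if kind == "}":
            return pairs, i + 1  # end of the enclosing structure
        if kind != "str" or i + 2 >= len(tokens) or tokens[i + 1][0] != "eq":
            raise ValueError(f"expected key=value at token {i}")
        if tokens[i + 2][0] == "{":
            nested, i = parse_pairs(tokens, i + 3)
            pairs.append((value, nested))
        elif tokens[i + 2][0] == "str":
            pairs.append((value, tokens[i + 2][1]))
            i += 3
        else:
            raise ValueError(f"missing value for key {value!r}")
    return pairs, i


example = """# output description
value.data={ dataflow= output; datatype= text/plain;
 flavor= stdout; filename= clustalw-out.text; }"""
parsed, _ = parse_pairs(tokenize(example))
```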
The top level key is command = { various other key=value pairs }
Within a command, most of the fields are parameter lists (parlist = { list of pars } )
and parameters (par = { structure} ). All parameter values should include an id field, a
value field, and can include a label field for display.
See bopper2.idl and ReadCommand.java for current key words (these may change)
Key words match fields in the bopper.idl, and are case-insensitive.
At this writing the key words are
commandKeys = { "id", "transport", "action", "filepath", "parlist", "menu", "command" };
parameterKeys = { "id", "label", "value", "ifelseRules", "runSwitch" };
containerKeys = { "required", "parlist" };
choiceKeys = { "multiple", "minToShow", "parlist" };
dataKeys = { "datatype", "dataflow", "filename", "flavor", "data" };
ID values are case-sensitive, unique strings. You reference IDs and other variables with
dollar sign, as $ID. TITLE, INFO and HELP are special parameter ids.
The command key includes these subfields:
id = a unique string (required)
action = the command line to be executed, with runswitches of parameters to be
substituted (required)
transport = local: for use on the same computer, bop: for bopper. This may be optional,
and should be set by software
parlist = { list of parameters } (required)
resultKinds = string list of MimeTypes: text/plain, biosequence/fasta, ... (optional)
filepath = path to app on server (optional)
menupath = menu item name, with submenu path, e.g. "Utilities|Reformat" (optional)
The parameter key includes these subfields:
id = a unique string (required)
label = visible label (optional)
value.type = value (see below, required)
runSwitch = the command line string to be inserted in the action string. It is optional.
This often includes the term $value, which is the special variable signifying the parameter
value chosen by user. In the case of value.boolean types, this runswitch is set to null if the
value is false.
ifelseRules = string list of rules to enable the parameter, based on things like protein or
nucleic type of the input data; yet to be implemented (optional)
Labels and values of a parameter are shown to the user in a dialog form. The values can be changed by the user, depending on type of value. Other parts of this description are mostly for the server's use in determining how to run a command-line program, and how to get and return data.
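How runSwitch strings might be substituted into the action string can be sketched in Java as follows. This is only an illustration of the description above, not SeqPup's actual code: the class and method names are hypothetical, and the choices to replace unknown $names with nothing and to collapse leftover blanks are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; SeqPup's real substitution code may differ. */
final class ActionSketch {

    // Assumption: a variable name is a run of letters, digits or underscores.
    private static final Pattern VAR = Pattern.compile("\\$([A-Za-z0-9_]+)");

    /** Replaces each $NAME with vars.get(NAME); unknown names become empty. */
    static String substitute(String text, Map<String, String> vars) {
        Matcher m = VAR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String found = vars.get(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(found == null ? "" : found));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** runSwitch of a value-carrying parameter: $value becomes the chosen value. */
    static String valueSwitch(String runSwitch, String value) {
        Map<String, String> vars = new HashMap<String, String>();
        vars.put("value", value);
        return substitute(runSwitch, vars);
    }

    /** runSwitch of a value.boolean parameter: dropped when the value is false. */
    static String booleanSwitch(String runSwitch, boolean value) {
        return value ? runSwitch : "";
    }

    /** Fills the action string, then collapses the gaps left by empty switches. */
    static String buildCommandLine(String action, Map<String, String> switches) {
        return substitute(action, switches).trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        Map<String, String> switches = new HashMap<String, String>();
        switches.put("INFILE", valueSwitch("-infile=$value", "clustalw.pir"));
        switches.put("ALIGN", booleanSwitch("-align", true));
        switches.put("TREE", booleanSwitch("-tree", false));
        System.out.println(buildCommandLine("clustalw $INFILE $ALIGN $TREE", switches));
    }
}
```

With the values in main, the TREE switch is false and so contributes nothing, leaving a command line with only the -infile and -align switches.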
There are many variants of the value field. These are specified as value.boolean =,
value.integer =, value.string = , and so forth. These match the ValueUnion structure of the
bopper2.idl.
Primitive value types are value.boolean, value.integer, value.float, value.string.
value.title displays a non-editable string
value.url is an Internet URL
value.integerRange is a range of integers specified as "default,minimum,maximum,step
value".
value.floatRange is a range of real values specified as "default,minimum,maximum,step
value".
value.data specifies a data file with these subfields (all required?):
datatype = mime/type of data, e.g. text/plain, biosequence/fasta
dataflow = input or output
filename = string value of file, e.g., 5srna.fasta
flavor = file flavor, from the set stdin, stdout, stderr, input, output, serverlocal
A final subfield, data need not be specified and is used in the interface to pass actual
data. An example is
value.data={ dataflow= output; datatype= text/plain;
flavor= stdout; filename= clustalw-out.text; }
value.container = is a value that includes other parameters, and is displayed as a
container of options to the user. It may be a required or optional container. It has the
subfields:
required = true or false
parlist = { list of pars } (required)
value.choice = is a value that includes other parameters, often boolean options. It has
the subfields:
multipleChoices = true or false
minToShow = minimum number of choices to be displayed
parlist = { list of pars } (required)
value.list = a list of strings to select from, separated with pipe or tab chars, e.g.
value.list= "AatII|AccI|AceIII|AciI|AclI|AflII"
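As an illustration of the "default,minimum,maximum,step value" form used by value.integerRange, here is a minimal Java sketch of parsing such a string. The class name, the sample numbers, and the consistency check (minimum <= default <= maximum) are assumptions for illustration and are not taken from SeqPup's source.

```java
/** Hypothetical holder for a value.integerRange; not SeqPup's own class. */
final class IntegerRangeSketch {
    final int defaultValue;
    final int minimum;
    final int maximum;
    final int step;

    private IntegerRangeSketch(int defaultValue, int minimum, int maximum, int step) {
        this.defaultValue = defaultValue;
        this.minimum = minimum;
        this.maximum = maximum;
        this.step = step;
    }

    /** Parses "default,minimum,maximum,step"; blanks around numbers are allowed. */
    static IntegerRangeSketch parse(String spec) {
        String[] parts = spec.split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 comma-separated integers: " + spec);
        }
        int[] n = new int[4];
        for (int i = 0; i < 4; i++) {
            n[i] = Integer.parseInt(parts[i].trim());
        }
        // Assumption: a sensible range has minimum <= default <= maximum.
        if (n[1] > n[0] || n[0] > n[2]) {
            throw new IllegalArgumentException("default outside minimum..maximum: " + spec);
        }
        return new IntegerRangeSketch(n[0], n[1], n[2], n[3]);
    }

    /** True when a user-chosen value lies inside the range. */
    boolean accepts(int value) {
        return value >= minimum && value <= maximum;
    }

    public static void main(String[] args) {
        IntegerRangeSketch r = parse("1000,1,10000,100"); // made-up sample values
        System.out.println(r.defaultValue + " in " + r.minimum + ".." + r.maximum
                + " step " + r.step);
    }
}
```

A value.floatRange could be handled the same way with Double.parseDouble in place of Integer.parseInt.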
For MacOS, there is limited support for AppleScript commands when using the MRJ java
runtime. Use the word applescript as the action command:
action = "applescript text of script to run here"
Currently no objects are returned, but script results are printed to System.out.
An example command file, for Clustal W:
Content-type: biocompute/command
command = {
  id = clustalw
  menu = "Sequence Alignment|Clustal multiple alignment"
  filepath = data/methods
  ## server only config -- I think these values are set by
  ## software, but this may need fixing.
  transport = local:
  #transport = bop:
  action= "$filepath/clustalw \
    $INFILE $OUTFILE $ALIGN $TREE $QUICK $BOOT \
    $GAPEXT $GAPOPEN $200 $220 $100 $PAIRGAP $KTUP \
    $TOPDIAGS $WINDOW $PWGAPOPEN $PWGAPEXT $PWMATRIX \
    $300 $221 $211 $212 $213"
  parlist = {
    par = {
      id = TITLE
      value.title = "Clustal W Alignment"
    }
    par = {
      id = INFO
      label = "About Clustal W"
      value.title = "Clustal W - for multiple sequence alignment \
        by Des Higgins and colleagues. Clustal W is a general purpose multiple \
        alignment program for DNA or proteins."
    }
    par = {
      id = main
      label = "Clustal W - A multiple sequence alignment program"
      value.container = {
        required = true
        parlist = {
          par = {
            id = HELP
            label = "Help with Clustal W"
            value.url = file://$filepath/clustalw_help
          }
          par = {
            id = ALIGN
            label = "Do full multiple align"
            value.boolean = true
            runSwitch = -align
          }
        }
      }
    } # end main
    par = {
      id = IOfiles
      label = "Input/Output files"
      value.container = {
        required = false
        parlist = {
          par = {
            id = INFILE
            label = "Input sequences"
            value.data = {
              dataflow = input
              datatype = biosequence/nbrf
              filename = clustalw.pir
              flavor = input
            }
            runSwitch = "-infile=$value"
          }
          par = {
            id = OUTFILE
            label = "Output aligned sequences"
            value.data = {
              dataflow = output
              datatype = biosequence/gcg
              filename = clustalw.msf
              flavor = output
            }
            runSwitch = "-outfile=$value -output=GCG"
          }
          par={ id= STDOUT; label="Command output";
            value.data={dataflow= output; datatype= text/plain;
              flavor= stdout;
              filename= clustalw-out.text;}
          }
          par={ id= STDERR; label="Errors";
            value.data={dataflow= output;datatype= text/plain;
              flavor= stderr;
              filename= clustalw.err;}
          }
        }
      }
    }
    par = {
      id = treeoptions
      label = "Tree options"
      value.container = {
        required = false
        parlist = {
          par = {
            id= TREE
            label= "Calculate NJ tree"
            value.boolean = false
            runSwitch = -tree
          }
          par = {
            id= BOOT
            label= "Bootstrap NJ tree"
            value.boolean = false
            runSwitch = "-bootstrap=$BOOTVAL"
          }
          par = {
            id= BOOTVAL
            label= "No. of bootstraps"
            value.integer = 1000
          }
        }
      }
    }
    ## more option parameters here...
  } # end parameters
} # end command
Side note: Prior versions of SeqPup used an HTML FORMS syntax. This has been
replaced by a new syntax, with some misgivings, because programming effort to support
HTML was much more costly, and this new syntax can be extended more easily to include
features needed for biocomputing. The syntax evolved from this prior work and the GCG
SeqLab configurations. It will be extended in the future to more fully cover needs for
biocomputing programs. That may include adding back some of the HTML formatting
options.
Bopper and Internet biosequence analyses
An Internet method of using "child apps" is now available with SeqPup. This allows one
to run analysis programs on a remote computer, and interface with SeqPup's editor
platform (fairly) transparently, as for the local child apps. This is made possible with a
network protocol I've acronymed BOP (Biocomputing Office Protocol; obviously the
acronym came first). The first version of BOP written in 1996 was based directly on the
POP internet mail protocol. BOP2 (Bopper2) uses a CORBA-based interface, and replaces
the unfinished BOP1 methods.
Many command-line programs, including versions of Clustal, CAP, tacg, primer, FastA, BLAST, the Phylip series, fastDNAml, and so forth can be added as BOP services fairly simply.
One potentially popular use for this BOP interface may be to offer a simple-to-use client for Genetics Computer Group (GCG) command-line software. As of this writing, an example Bopper server for GCG software isn't quite ready, but will be soon.
If you are an administrator of GCG software for your institution and would like to test this
experimental version of Bopper2 with GCG at your site, please let me know.
The configuration of apps on a server computer is essentially the same now as
configuration of local child apps running from the SeqPup data/methods/ folder.
To install and configure a Bopper2 server, see the distribution software, in
ftp://iubio.bio.indiana.edu/molbio/java/source/bopper2.tar.gz
To provide BOP services to SeqPup or other clients, follow these steps:
-- Install Bopper2 on a server computer. The current Bopper2 is based on a CORBA Interface Definition (IDL). It is implemented in Java, using the free Omnibroker ORB. It will potentially run on any system with a Java runtime, but has only been tested on Unix. The bopper2 distribution should include all Java source and classes needed to run it, excluding a Java runtime and the command-line programs themselves.
-- Configure bopper2 to add command-line programs. The same .command file syntax is used for local and remote external commands with SeqPup. But one may need to modify file path and perhaps other information for each specific system. See the data/methods/ folder in SeqPup for example .command files.
-- Run the bopper server and publish its access URL. I hope to add some directory of bop
servers mechanism, but that currently isn't available. The URL for the test bop server at
IUBio archive is
iiop://iubio.bio.indiana.edu:7000/bop
Note the IIOP protocol specifier, which is a CORBA standard network protocol. "bop" is the name of this specific service. Other named services may be run at the same host:port.
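The parts of such a URL (protocol, host, port, service name) can be pulled apart with the standard java.net.URI class, as in this small sketch. It only illustrates the URL layout described above; it does not show how SeqPup or the OmniBroker ORB actually resolve the address, and the class and method names are hypothetical.

```java
import java.net.URI;

/** Hypothetical sketch: splits a BOP server URL into its parts. */
final class BopUrlSketch {

    /** Returns the service name, i.e. the URL path without its leading slash. */
    static String serviceName(URI uri) {
        String path = uri.getPath();
        return path.startsWith("/") ? path.substring(1) : path;
    }

    public static void main(String[] args) {
        URI uri = URI.create("iiop://iubio.bio.indiana.edu:7000/bop");
        System.out.println("protocol = " + uri.getScheme());
        System.out.println("host     = " + uri.getHost());
        System.out.println("port     = " + uri.getPort());
        System.out.println("service  = " + serviceName(uri));
    }
}
```

A second service at the same host:port would differ only in the last part, the service name.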
These four biocomputing applications now share about 60 - 70% of their code. Improvements in one lead to improvements in the other in many cases, and that holds for future applications written with this framework.
This framework called DCLAP was started with the NCBI toolkit, a cross platform C
toolkit on which Entrez, Sequin and other apps from NCBI are written (Thank you to
Jonathan Kans and colleagues at NCBI for this wonderful, free toolkit). On top of this
toolkit I wrote a C++ framework which is meant to handle much of the basic application
chores such as document opening, saving, doc and window management, menu and
command management, etc. With the advent of Java as a C++-like language that has
broad support and funding of tools from the commercial sector, but which also is available
in free form at its basics, it looked like a good underpinning for rapid, cross-platform app
development. However, neither NCBI toolkit nor Java nor other sources provide the kind
of application framework freely that makes it quick and easy to produce robust, easy to
use, full featured applications for the biosciences. As I write new applications, I aim at
improving such a framework so that the next application can be written more quickly than
the last. The source code for this framework, in C++ and now its beginnings in Java, is
available freely to others for scientific application development. The current Java version
of DCLAP is preliminary, and will change significantly over coming months as it is more
fully converted to use Java version 1.1. However I find it now very helpful in producing
new applications, and hope that other programmers may also find it useful.
Developers will find the source code for this application and others in the
iubio:/molbio/java/source folders.
Source code and DCLAP
This describes the pre-Java version of DCLAP (still available).
The C++/C source code for the prior version is at iubio:/util/dclap/source/
SeqPup is built on an object-oriented application framework, originally written in C++, called DCLAP. This framework is designed to speed the development of easy to use, complex programs with a rich user-interface. At this point, DCLAP is an unfinished framework. It is lacking in documentation. However, it is complete enough to build complex programs like SeqPup.
DCLAP includes the following segments
DClap/ -- basic application framework, including command, control, dialog, file, icon,
list, menu, display panel, table view, mouse tracker, child application,
window and view classes.
Drtf/ -- rich text display handlers, including RTF, HTML document, PICT and GIF
image format readers.
DNet/ -- Internet connection tools, including TCP/IP, SMTP, Gopher and
preliminary HTTP classes.
DBio/ -- Biocomputing methods, including biosequence, restriction enzyme, sequence
editor, seq. manipulator, seq. output classes.
New applications can be built to employ and reuse these classes fairly quickly. Variations
on the current methods are simple to add in the class derivation method of C++. For
instance, new document formats can be added on the Drtf display objects, and new
sequence manipulations can be added in the biosequence handlers, by building on current
methods.
DCLAP rests upon the NCBI toolkit, including the Vibrant GUI toolkit, which is designed for cross-platform functioning. The successful genome data browser Entrez is written with the NCBI toolkit.
All of this source is available without charge for non-profit use (see copyright). The NCBI
toolkit portion is further available for profit use, and such arrangements may be made for
use of DCLAP.
DCLAP will never compete with commercial programming frameworks, but it has the
virtue of being freely available and redistributable, and includes support specifically for
biocomputing applications. If you are undertaking a biocomputing project requiring a rich
user interface, and wish it to run on multiple computer platforms, this may be a worthwhile
choice, especially if you wish to redistribute your source code for the benefit of the
scientific community.
The DCLAP developer archive is at ftp://iubio.bio.indiana.edu/util/dclap/
Please contact Don Gilbert for further information on using this framework in other
applications.
Comments, Copyright, Bugs and History
Comments
With any bug reports, I would appreciate as much detail as is reasonable without putting you off from making the report. If you don't have time to send detailed descriptions of problems, please do send comments and reports, even if all you say is "Good" or "Bad" or "Ugly".
Please include mention of computer hardware, and operating system software, including version. Describe how the problem may be repeated, if it is repeatable. If it is sporadic or only seen once, please also describe actions leading up to it. Include copies of data if relevant.
If you need to use land mail, mail to
Don Gilbert
Biocomputing Office, Biology Department
Indiana University, Bloomington, IN 47405
Copyright
This SeqPup program is Copyright (C) 1990-1997 by D.G. Gilbert.
All Rights are reserved.
gilbertd@bio.indiana.edu
Biology Dept., Indiana University, Bloomington, IN 47405
You may use this program for your personal use, and/or to provide a non-profit service to
others. You may not use this program in a commercial product, nor to provide commercial
service, nor may you sell this code without express written permission of the author.
You may redistribute this program freely. If you wish to redistribute it as part of a
commercial collection or venture, you need to contact the author for permission.
The source code to this program is likewise copyrighted, and may be used, modified and
redistributed in a free manner. Commercial uses of it need prior permission of the author.
Any external applications that may be distributed with SeqPup are copyrighted by their
respective authors and subject to distribution provisions as described by those authors. At
present this includes ClustalW, by Des Higgins and colleagues; CAP, by Xiaoqiu Huang;
and FastDNAml, written by Joseph Felsenstein with modifications by Gary Olsen, Hideo
Matsuda and Ross Overbeek, which is copyrighted by the University of Washington and
Joseph Felsenstein.
Distribution of external analysis applications with this program is done as a convenience for users, and in no way modifies the original copyright. If there is a problem with this, instructions to users for obtaining and installing external applications will be substituted.
No warranty, express or implied, is provided with this software. The author is trying to produce a good quality program, and will incorporate corrections to problems reported by users of it.
Bugs
v0.8 [java] -- Known bugs and missing features:
General
- view size-sensitive window scroll bars are used in several windows. These may not
yet work fully and seamlessly. Views may be shifted above the scroll area, or scroll
bars may not show up as they should when views extend beyond the window.
Resizing the window will often cure these problems.
- drop down boxes are used extensively in the dialog windows, to hide/show
selected information. Currently when a box is dropped down by clicking its drop
arrow, the box isn't resized/displayed fully. One needs to resize the window with a
mouse drag to get it displayed fully.
- appMenuBar needs work -- menus not showing in new doc (mrj), other bugs
- context sensitive menus not always properly sensitive to context (disabled when they should
be enabled, or vice versa).
- undo/redo isn't working in some cases where it should, but does work in several cases.
Repeated undo/redo generally doesn't work, while first level undo often does.
- copy/cut/paste functions may not be working as smoothly in as many contexts yet as
they should be.
- clipboard use (via copy/paste) and display needs work; export of clipboard to other
apps not yet supported ( will happen when converted to jdk1.1)
- window menu doesn't always list items (java runtime/os variable)
- preference editing needs improved user interface
Sequence functions
- not yet ready : Restrict map, Dot plot, Nucleic & Amino codes pictures.
- Consensus overwrites first sequence in selection w/ cons as well as appending cons
seq at end of list
- mask menu items not always enabled when mask views are selected (context sens.
menu bug).
- find bases not ready; find ORF may be okay but needs testing
- sequence file reading and writing (readseq functions) still need testing and may well
have bugs. Interleaved formats NEXUS/Paup, Phylip are not debugged. New
formats await adding.
- single sequence editor is slow for long sequences
- sequence manipulation functions for single sequence window may not be ready.
- feature table parsing is preliminary; expect it to be improved in future releases.
-- saving feature/document info associated with sequence works now only for genbank
and embl formats; cross-saving is still problematic (embl->genbank and vice versa)
-- editing feature/doc info in the single sequence windows should work but needs
testing and may have bugs
-- using feature ranges to mark up masks and pretty prints, while now possible, is still
too awkward a process; I hope to make this essentially automatic in later releases.
- changes to prefs such as codon prefs, color and style prefs may not be stored (edit these
files w/ external text editor if need be).
Child/external applications
- v0.8 1st java version supporting these, with new interface. Undoubtedly there are
bugs and missing features.
- remote external app interface is CORBA designed, using OmniBroker ORB. Interface
def. will change. Current system doesn't well support user/password logins
(primarily server-end problem)
- file handling for child apps still a problem -- where to put temp/results files
- need more sequence input checking, handling of specific child app needs
- need if/else handling of dialog items w/ respect to input seqs (prot vs nucleic)
Internet functions
- SRS functions need more testing and debugging;
Macintosh specific:
- MRJ 2.0 seems to have problems that other JRE's don't (besides slowness).
-- The work around with several window display problems is to resize the window a
bit (grab size box and drag some) to get it to display properly.
-- Menus disappear when a new window is opened. The work-around is to select
another SeqPup window then switch back to the new window, and menus appear.
-- scrolling dialog windows don't update on scroll -- esp. ones w/ dialog items.
-- text document display is horribly slow for any text longer than 20-30 lines
MS Windows specific:
- window menu may be non-functional, or picks wrong window
- functions that depend on window list (close at least) not working properly
XWindows specific:
- the BLAST dialog seems to send only a few bases to NCBI server
Fixes in v 0.8a
xx- printing directly now supported (java 1.0 missing feature; will happen
when converted to 1.1). Still needs work (printing graphics works, printing a
TextDoc fails).
x?- mswin & xwin: menu command keys may now be supported (XWin test shows them
but non functional?)
xx- remote data analysis via biocomputing office protocol (BOP) now supported
xx- child apps now supported
xx- seqed/editable text areas now wrap
xx - mac - window location preferences are not used (seems fixed in MRJ2).
xx - save selection command fixed.
Fixes in v0.8b
xx - fix for mswindows file:/// url prefix ?
xx - improved readseq file format detection (I hope!)
xx - alignment editing of sequences enabled
Coming Features
Somewhat further on, I'd like to make SeqPup a bean-box, capable of incorporating new
functions using the JavaBeans technology. I hope, though I don't know whether it is feasible in
my programming time frame, that this bean interface will be simple enough that an average
biologist with interests could put together a data analysis function in Java and add it to
SeqPup w/o having to spend a lot of time learning programming and software development
methods. There are suggestions that Java will become a more ubiquitous and easy to use
language than the combination of C, C++ and Perl, which are often used now for various
biocomputing analyses.
History
SeqApp was started Sept. 1990 as a MacApp sequence editor/analysis platform on which
analysis programs from other authors, typically command line w/ weak user interfaces,
could be easily incorporated into a useable Mac interface.
January 1998: version 0.8 release
+ Update to Java version; C++ version no longer updated
+ Bopper2 remote/local interface to command-line applications added. This is based on
CORBA standard. It is experimental; the interface will likely change (improve I hope).
But it has the basic functionality needed to attach local or remote network command-line
style applications to this program. It is user-configurable (with help and better
documentation).
+ 1 Feb 98 - added color picker, background color command, base color prefs dialog,
sequence styles dialog. corrected several pretty print problems.
August 97: version 0.7b release
+ First java - based version
June/July 96: version 0.6d release
+ "bopper" Internet protocol for client/server use of command line programs such as the
GCG suite.
+ autoseq base calling app for reading ABI and SCF sequencer trace file data, plus
base/trace editing functions.
+ Started expanding maximum sequence limit to 2 megabases (from about 30Kb),
however most functions beyond viewing will still fail for >30Kb sequences.
+ Several bug fixes are included for mac, mswin, unix. Added background color in align
view, minimum ORF size pref, improved tracking of changed data, improved align
editing, save pretty print to PICT or text; fixed child app bugs; fixed mswin edit
truncation to 255 bases; editable data tables in selection dialogs
Jan. 96: version 0.5 of SeqPup.
fixed Save file in place -- now saves in proper folder, not in seqpup folder
improved seqpup folder path finding:
- MacOS: now should always find :tables:, :apps: if they exist w/in SeqPup folder, and
prefs paths are relative (e.g., apps=apps, tables=tables in .prefs)
- UnixOS: now can 'setenv SEQPUPHOME /path/to/seqpup/folder'
- MSDOS : ditto with 'setenv SEQPUPHOME c:\path\to\seqpup'
NOTE: must use "APPNAME"HOME, so if you change name of SeqPup to PeekUp,
you need to change env var to PEEKUPHOME.
added click-top-index-line to mask sequence column (only when sequence mask mode
1..4 is selected in main window popup)
added mask-to-selection, selection-to-mask commands -- mask-to-selection is not yet
useful because base DTableView selection methods need to be rewritten to allow
disjoint selections.
added seq-index display -- lists base number that mouse pointer is at
added mac file bundle rez & finder-open, finder-print
added save of pretty print in PICT format (mac), metafile (mswin - still buggy?!)
added variable position grey coloring in align display
added mswin/xwin sticky menubar window
fixed mswin mouse-shift commands
added mswin menu command keys
fixed mac/mswin text edit command keys: cut/copy/paste
many updates to mswin version for Microsoft win32/winnt/win95
updated fastdnaml child app to new version 1.1.1
added configurable child-app launch parameters
-- dialogs in HTML.form format; needs more work, additions
added dna distance/similarity matrix function
added child apps: DeSoete's LSADT, Felsenstein's DrawTree & DrawGram
July 95: Version 0.4 of SeqPup.
This includes most of the features of its ancestor SeqApp. Alignment window: shift &
slide sequences, copy/cut/paste/undo sequence entries among windows; Restriction maps
and pretty print output; useable child apps for mac, mswin, and unix.
v0.4 corrections:
- File/Open for non-sequence data (text, rtf, etc.) has alternate open menu, to
distinguish from sequence data. Added sequence append-open.
- Cut/copy/paste/undo for align-seq view now available
- Sequence menu items that are now ready: Consensus, Pretty print, Restriction
Map, nucleic & amino codes. Some of these need further work (pretty, remap options).
- Child apps usage improved, may need more work though.
- The Mac/68K, Mac/PPC, MSWin, Unix now do Child applications.
- Include ClustalW, CAP, FastDNAml, child apps
- Restriction map function is extensively revised and improved.
- FindORF and Find string functions added
- Printing for pretty print, r.e.map now functional on Mac (and maybe MSWin)
v0.4 Known bugs and missing features (see above Bugs section for fuller list):
- Character editing (unlocked text) in the alignment (main) window is not working
on Xwindow systems, and may be buggy in MSWindow and Mac systems.
- Single sequence editor (Sequence/Edit) is very slow for long sequences
(6,000bases)
- Sequence menu items not yet ready : Dot plot.
- Child Apps fail in various ways on MSWindows and Unix systems.
-- CAP seems most likely to succeed completely.
-- ClustalW and FastDNAml may be launched and run properly, but SeqPup will fail
to automatically open their results files.
- MSWindows and XWindows versions are less stable than Mac versions.
- XWindows versions reliably crash/core dump when Quit is chosen. This is an
annoyance but doesn't seem to impair use.
- Internet menu needs testing & reworking - I haven't tested any of the e-mail
services listed since last year.
- Nucleic codes picture shows PICT processing bug -- misplaced text, and an error
in biology -- complement of W is W, not S, and complement of S is S, not W.
- Repeated copy/cut/paste of the alignment window entries might cause problems.
Please let me know if you see this.
- There is no printing for X Window systems.
21 Mar 95: Second release of SeqPup, version 0.1.
This release has more parts of the SeqApp program put into it. This includes some
alignment view manipulations, limited use of child applications, some undo-able
commands, choosing data tables for colors, codon and r.enzymes. This release also
includes much of the basics of GopherPup, including display of RTF, HTML, PICT, GIF
document formats. However there is still some work to be done to let you open these w/o
interpreting them as sequence data.
This release has just a Mac PowerPC (SeqPup/PPC) and Mac 68000 processor
(SeqPup/68K) versions. When more of the basic bugs are worked out, I'll try Sun and
MSWindows versions.
v0.1 Known bugs/missing features:
- Use of character editing (unlocked text) in the alignment (main) window will lead
to a crash after a few windows have been opened/closed or other manipulations performed.
- File/Open for non-sequence data (text, rtf, etc.) may well mistakenly identify them
as sequence data. File/New is probably not doing anything useful, or bombing.
- Single sequence editor (Sequence/Edit) is very slow for long sequences
(6,000bases)
- Single seq. editor may be failing in various ways (I've not looked at it carefully
yet).
- No cut/copy/paste/undo for align-seq view yet (coming soon I hope).
- Internet menu needs reworking - I haven't tested any of the e-mail services listed
there since last year.
- Sequence menu items not yet ready : Consensus, Pretty print, Restriction Map,
Dot plot, nucleic & amino codes.
- Child apps usage needs more development to work smoothly.
- The Mac/68K version fails when using Child applications.
- Only the ClustalW child app is ready for distribution (may have FastDNAml,
CAP, and DNAml soon -- let me know of programs you would like to see here).
1 Mar 94: First public release of SeqPup, version -1.
It has plenty of bugs and missing features, including:
no Undo (this is a real bite to those used to it)
mostly no cut/copy/paste/clear
limited printing of documents or views
mostly no align-view manipulations (move,cut/copy,edit in place, shift, ...)
no pretty print views
no restriction maps
no dot plots
no ...
problems w/ window display & keeping track of active window (x,mswin)
I'll be adding back many of these features from the Macintosh SeqApp as time permits.
SeqApp 12+ June 93, version 1.9a157+
a semi-major update, and time extension release with various enhancements and
corrections. These include
-- lock/unlock indels (alignment gaps). Useful when sliding bases around
during hand alignment, to keep alignment fixed in some sections.
-- color amino (and nucleic) acids of your choice.
-- added support for more sequence file formats: MSF, PAUP, PIR. SeqApp now relies
on the current Readseq code for sequence reading & writing.
-- save selection option to save subset of bases to file.
-- addition of the useful contig assembly program CAP, written by Xiaoqiu Huang.
-- major revision of preference saving method (less buggy, I hope)
-- major revision of the underlying application framework, due to moving from MacApp 2
to MacApp 3.
-- fixed a bug that caused loss of data when alignment with a selection was saved to disk.
5 Oct 92, version 1.8a152+ -- a semi-major update with various enhancements and
corrections. These include
- corrections to the main alignment display,
- improvements to the help system,
- major changes to the sequence print-out options,
-- including addition of a dotplot display (courtesy of DottyPlot),
-- a phylogeny tree display (courtesy of TreeDraw Deck & J. Felsenstein's DrawTree),
-- improved Pretty Print, which now has a single sequence form and a better aligned
sequence form,
-- improved Restriction map display,
- addition and updating of several e-mail service links,
-- including Blast Search and Genbank Fetch via NCBI,
-- BLOCKS, Genmark, and Pythia services,
- updated Internet gopher client (equal to GopherApp),
- editable Child Tasks dialogs
- addition of links to Phylip applications as Child Tasks
- addition of Phylip interleaved format as sequence output option
11 June 92, version 1.6a35 is primarily a bug fix release. Several of the disastrous bugs have been squashed. This version now works on the Mac SE model, except for sendmail. No new features have been added.
7Jun92, v. 1.5a?? -- fixed several of the causes of mysterious bombs (mostly uninitialized handles), link b/n multiseq and 1-seq views is better now, folded in GopherApp updates, death date moved to Jan 93.
25Mar92, v1.5a32 (or later). First release to general public. Includes Internet Gopher client. Also released subset as GopherApp for non-biologists.
4Mar92, v 1.4a38 -- added base sliding in align view. Bases now slide something like beads on an abacus. Select a section with mouse, then grab section and shift left or right. Gaps are inserted/removed as needed. For use as contig aligner, still needs equivalent of GCG GelOverlap to automatically find contig/fragment overlaps.
Also added "Degap" menu item, to remove "." and "-". Fixed several small bugs including Align pretty print which again should display.
2Mar92, v 1.4a19 -- fixed several annoying bugs, see SeqApp.Help, section on bugs for their resolution. These include Complement/Reverse/Dna2Rna/ Translation which should work now in align view; Consensus menu item; entering sequence in align window now doesn't freeze after 30+ bases; pearson/fasta format reading; ...
10Feb92, v 1.4a6 -- fix for Mac System 6; add Internet service dialogs for Univ. Houston gene-server, Geneid @ BU, Grail @ ORNL; correct About Clustalv attribution.
5Feb92, v 1.4a4 -- limited release to network resource managers, clustalv authors, testers.
Vers 1.4, Dec91 - Feb92. Dropped multi-sequence picker window, made multi-align window the primary view (no need for both; extra confusion for users). added pretty print, restriction map, sequence conversions. Generalized "call clustal" to Hypercard-like, System 7 aware menu for calling external tasks. Fleshed out internet e-mail objects, added help objects, window menu, nucleic/amino help windows. Many major/minor revisions to all aspects to clean out bugs. Preliminary release to a limited set of testers (1.4a?)
Vers. 1.3, Sept - Dec91. Modified clustalv for use as external app (commandline file, background task, ...). Added basic Internet e-mail routines and a call clustal routine (preliminary child task). Many major/minor revisions to all aspects to clean out bugs.
Jun91-Aug91: overwork at other tasks kept SeqApp on back burner.
Mar91-Jun91: not much work on SeqApp, fleshed out TCP methods (UTCP, USMTP, UPOP).
Feb 1991, vers 1.2? made available to Indiana University biologists and NCBI biocomputists.
Vers. 1.1, Oct 1990, multiple sequence picker and multiple sequence alignment window, including colored bases, added to deal with alignment and common multi-sequence file formats.
Version 1, Sep 1990. Single sequence edit window + TextEdit window, from MacApp skeleton/example source + readseq.