>From @mitvma.mit.edu:MACRIDES@WFEB2.BITNET Fri Oct 16 15:06:59 1992
Received: from net.bio.net by sunflower.bio.indiana.edu
	(4.1/9.7jsm) id AA03499; Fri, 16 Oct 92 15:06:47 EST
Received: from MITVMA.MIT.EDU by net.bio.net (5.65/IG-2.0) with SMTP 
	id AA09873; Fri, 16 Oct 92 12:27:36 -0700
Received: from MITVMA.MIT.EDU by mitvma.mit.edu (IBM VM SMTP V2R2)
   with BSMTP id 2752; Fri, 16 Oct 92 15:27:41 EDT
Received: from WFEB2.BITNET (MACRIDES) by MITVMA.MIT.EDU (Mailer R2.08 R208004)
 with BSMTP id 5953; Fri, 16 Oct 92 15:27:35 EDT
Received: from WFEB2.BITNET by WFEB2.BITNET (PMDF #2704 ) id
 <01GQ0RSFLGF4000H93@WFEB2.BITNET>; Fri, 16 Oct 1992 15:22:31 EST
Date: 16 Oct 1992 15:22:30 -0500 (EST)
From: Foteos Macrides <MACRIDES%WFEB2.BITNET@net.bio.net>
Subject: NCBISHELLS.SHARE
To: software-sources@net.bio.net
Message-Id: <01GQ0RSFLQ2A000H93@WFEB2.BITNET>
X-Envelope-To: software-sources@net.bio.net
X-Vms-To: in%"software-sources@net.bio.net"
Mime-Version: 1.0
Content-Transfer-Encoding: 7BIT
Status: R

Path: wfeb2.bitnet!macrides
From: macrides@wfeb2.bitnet
Newsgroups: bionet.software.sources
Subject: NCBISHELLS.SHARE
Message-ID: <1992Oct16.152215.86@wfeb2>
Date: 16 Oct 92 15:22:15 EDT
Organization: Worcester Fndn. for Exptl. Biol.
News-Moderator: Approval required for posting to bionet.software.sources
Lines: 986

        NCBISHELLS.SHARE is a VMS_SHARE set of Steve_Clark/Erik_Sonnhammer-
style command procedures for users of the Wisconsin GCG package to send the
NCBI Email servers requests for BLAST searches, sequence documentation
searches (like GCG's STRINGSEARCH), and sequence retrievals (like GCG's
FETCH).  Installation instructions are included as comments at the tops of the
files.  The *.HLP files can be inserted in the GCG on-line help library.

Contents:

00README.TXT         -- This message.

BLASTNCBI.COM (v1.1) -- For blastp, tblastn, blastn, and blastx searches.
  BLASTNCBI.HLP
  TOFASTA.FOR (Steve Clark's GCG to FastA format converter)
  TOFASTA.HLP

SEARCHNCBI.COM       -- For documentation searches with (a) query term(s).
  SEARCHNCBI.HLP                Returns titles of hits.

DBNCBI.COM           -- For retrieving sequences identified via BLASTNCBI or
  DBNCBI.HLP                    SEARCHNCBI.
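
For orientation, the requests these procedures mail to NCBI are short
blocks of directives followed by BEGIN and the query.  The Python sketch
below assembles the same blocks; the directive names are copied from the
command procedures in this archive, while the function names, argument
names, and defaults are illustrative assumptions and not part of the
package.

```python
# Hypothetical helpers: only the directive names (PROGRAM, DATALIB, ...)
# come from BLASTNCBI.COM, DBNCBI.COM and SEARCHNCBI.COM.

def build_blast_request(fasta_text, program="blastn", database="nr",
                        descriptions=100, alignments=50,
                        histogram="yes", expect=10, cutoff=None):
    """Return a mail body shaped like the one BLASTNCBI.COM writes."""
    lines = [
        f"PROGRAM {program}",
        f"DATALIB {database}",
        f"DESCRIPTION {descriptions}",
        f"ALIGNMENTS {alignments}",
    ]
    # BLASTNCBI.COM leaves out HISTOGRAM when the program is blastx.
    if program != "blastx":
        lines.append(f"HISTOGRAM {histogram}")
    lines.append(f"EXPECT {expect}")
    # CUTOFF is written only when the user supplied a score.
    if cutoff is not None:
        lines.append(f"CUTOFF {cutoff}")
    lines.append("BEGIN")
    # The FastA-format query is appended after the BEGIN line.
    return "\n".join(lines) + "\n" + fasta_text


def build_retrieve_request(query, database="genbank", max_docs=1,
                           max_lines=2500, titles=False):
    """Return a mail body shaped like DBNCBI.COM / SEARCHNCBI.COM output."""
    lines = [
        f"DATALIB {database}",
        f"MAXDOCS {max_docs}",
        f"MAXLINES {max_lines}",
    ]
    # SEARCHNCBI.COM asks for titles only; DBNCBI.COM does not.
    if titles:
        lines.append("TITLES yes")
    lines.append("BEGIN")
    lines.append(query)
    return "\n".join(lines) + "\n"
```

A DBNCBI-style call would pass the locus name already wrapped in double
quotes, e.g. build_retrieve_request('"rata2ugldb_1"'), since the
procedures note that an unquoted underscore is treated as an OR.  A
SEARCHNCBI-style call would use max_docs=1000 and titles=True.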

=========================================================================
 Foteos Macrides           Worcester Foundation for Experimental Biology
 MACRIDES@WFEB2.BITNET     222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================
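
For readers inspecting the archive without a VMS system, the line
encoding used below the CUT HERE mark can be decoded by hand.  The Python
sketch below mirrors what the embedded TPU Unpacker procedure appears to
do: strip trailing spaces, start a new output line for each X line, join
each V line onto the previous one, skip anything between a line starting
with "+" and a line starting with "-+-+-+-+-+-+-+-+", and finally turn
each backquote followed by two hex digits into the corresponding
character.  The function name is hypothetical, blank input lines are
skipped as an assumption, and no checksum is verified.

```python
import re

def decode_share_lines(lines):
    """Decode the payload lines of one VMS_SHARE member into text lines."""
    out = []
    skipping = False
    for raw in lines:
        # The unpacker removes trailing spaces before anything else.
        line = raw.rstrip(" \r\n")
        if not line:
            continue  # assumption: blank lines carry no payload
        flag, rest = line[0], line[1:]
        if skipping:
            # A "-+-+-+-+-+-+-+-+" line ends a skipped region.
            if flag == "-" and rest.startswith("+-+-+-+-+-+-+-+"):
                skipping = False
            continue
        if flag == "X":
            out.append(rest)       # start of a new output line
        elif flag == "V" and out:
            out[-1] += rest        # continuation of the previous line
        elif flag == "+":
            skipping = True        # start of a region to discard
    # Escapes are expanded after joining, so one split across an X/V
    # boundary still decodes.
    def unescape(match):
        return chr(int(match.group(1), 16))
    return [re.sub(r"`([0-9A-F]{2})", unescape, text) for text in out]
```

For example, the pair of lines "Xcomments at the tops of t" and "Vhe"
decodes to one line ending in "the", and "`5Bdirectory`5D" decodes to
"[directory]".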

$! ------------------ CUT HERE -----------------------
$ v='f$verify(f$trnlnm("SHARE_VERIFY"))'
$!
$! This archive created by VMS_SHARE Version 7.2-007  22-FEB-1990
$!   On 16-OCT-1992 15:18:43.46   By user MACRIDES (Foteos Macrides)
$!
$! This VMS_SHARE Written by:
$!    Andy Harper, Kings College London UK
$!
$! Acknowledgements to:
$!    James Gray       - Original VMS_SHARE
$!    Michael Bednarek - Original Concept and implementation
$!
$! TO UNPACK THIS SHARE FILE, CONCATENATE ALL PARTS IN ORDER
$! AND EXECUTE AS A COMMAND PROCEDURE  (  @name  )
$!
$! THE FOLLOWING FILE(S) WILL BE CREATED AFTER UNPACKING:
$!       1. 00README.TXT;1
$!       2. BLASTNCBI.COM;1
$!       3. BLASTNCBI.HLP;1
$!       4. DBNCBI.COM;1
$!       5. DBNCBI.HLP;1
$!       6. SEARCHNCBI.COM;1
$!       7. SEARCHNCBI.HLP;1
$!       8. TOFASTA.FOR;1
$!       9. TOFASTA.HLP;1
$!
$set="set"
$set symbol/scope=(nolocal,noglobal)
$f=f$parse("SHARE_TEMP","SYS$SCRATCH:.TMP_"+f$getjpi("","PID"))
$e="write sys$error  ""%UNPACK"", "
$w="write sys$output ""%UNPACK"", "
$ if f$trnlnm("SHARE_LOG") then $ w = "!"
$ ve=f$getsyi("version")
$ if ve-f$extract(0,1,ve) .ges. "4.4" then $ goto START
$ e "-E-OLDVER, Must run at least VMS 4.4"
$ v=f$verify(v)
$ exit 44
$UNPACK: SUBROUTINE ! P1=filename, P2=checksum
$ if f$search(P1) .eqs. "" then $ goto file_absent
$ e "-W-EXISTS, File ''P1' exists. Skipped."
$ delete 'f'*
$ exit
$file_absent:
$ if f$parse(P1) .nes. "" then $ goto dirok
$ dn=f$parse(P1,,,"DIRECTORY")
$ w "-I-CREDIR, Creating directory ''dn'."
$ create/dir 'dn'
$ if $status then $ goto dirok
$ e "-E-CREDIRFAIL, Unable to create ''dn'. File skipped."
$ delete 'f'*
$ exit
$dirok:
$ w "-I-PROCESS, Processing file ''P1'."
$ if .not. f$verify() then $ define/user sys$output nl:
$ EDIT/TPU/NOSEC/NODIS/COM=SYS$INPUT 'f'/OUT='P1'
PROCEDURE Unpacker ON_ERROR ENDON_ERROR;SET(FACILITY_NAME,"UNPACK");SET(
SUCCESS,OFF);SET(INFORMATIONAL,OFF);f:=GET_INFO(COMMAND_LINE,"file_name");b:=
CREATE_BUFFER(f,f);p:=SPAN(" ")@r&LINE_END;POSITION(BEGINNING_OF(b));
LOOP EXITIF SEARCH(p,FORWARD)=0;POSITION(r);ERASE(r);ENDLOOP;POSITION(
BEGINNING_OF(b));g:=0;LOOP EXITIF MARK(NONE)=END_OF(b);x:=ERASE_CHARACTER(1);
IF g=0 THEN IF x="X" THEN MOVE_VERTICAL(1);ENDIF;IF x="V" THEN APPEND_LINE;
MOVE_HORIZONTAL(-CURRENT_OFFSET);MOVE_VERTICAL(1);ENDIF;IF x="+" THEN g:=1;
ERASE_LINE;ENDIF;ELSE IF x="-" THEN IF INDEX(CURRENT_LINE,"+-+-+-+-+-+-+-+")=
1 THEN g:=0;ENDIF;ENDIF;ERASE_LINE;ENDIF;ENDLOOP;t:="0123456789ABCDEF";
POSITION(BEGINNING_OF(b));LOOP r:=SEARCH("`",FORWARD);EXITIF r=0;POSITION(r);
ERASE(r);x1:=INDEX(t,ERASE_CHARACTER(1))-1;x2:=INDEX(t,ERASE_CHARACTER(1))-1;
COPY_TEXT(ASCII(16*x1+x2));ENDLOOP;WRITE_FILE(b,GET_INFO(COMMAND_LINE,
"output_file"));ENDPROCEDURE;Unpacker;QUIT;
$ delete/nolog 'f'*
$ CHECKSUM 'P1'
$ IF CHECKSUM$CHECKSUM .eqs. P2 THEN $ EXIT
$ e "-E-CHKSMFAIL, Checksum of ''P1' failed."
$ ENDSUBROUTINE
$START:
$ create 'f'
X`09NCBISHELLS.SHARE is a VMS_SHARE set of Steve_Clark/Erik_Sonnhammer-
Xstyle command procedures for users of the Wisconsin GCG package to send the
XNCBI Email servers requests for BLAST searches, sequence documentation
Xsearches (like GCG's STRINGSEARCH), and sequence retrievals (like GCG's
XFETCH).  Installation instructions are included as comments at the tops of t
Vhe
Xfiles.  The *.HLP files can be inserted in the GCG on-line help library.
X
XContents:
X
X00README.TXT         -- This message.
X
XBLASTNCBI.COM (v1.1) -- For blastp, tblastn, blastn, and blastx searches.
X  BLASTNCBI.HLP
X  TOFASTA.FOR (Steve Clark's GCG to FastA format converter)
X  TOFASTA.HLP
X `20
XSEARCHNCBI.COM`09     -- For documentation searches with (a) query term(s).
X  SEARCHNCBI.HLP                Returns titles of hits.
X `20
XDBNCBI.COM`09     -- For retrieving sequences identified via BLASTNCBI or
X  DCNCBI.HLP`09`09`09SEARCHNCBI.
X `20
X=========================================================================
X Foteos Macrides           Worcester Foundation for Experimental Biology
X MACRIDES@WFEB2.BITNET     222 Maple Avenue, Shrewsbury, MA 01545
X=========================================================================
$ CALL UNPACK 00README.TXT;1 589054760
$ create 'f'
X$ orig_veri = f$environment("VERIFY_PROCEDURE")
X$ v = f$verify(0) ! (BLASTNCBI turns off verification)
X$!
X$!                             BLASTNCBI.COM
X$!                             ------------
X$!
X$! Version 1.1
X$! Foteos Macrides (MACRIDES@WFEB2.BITNET), October 16, 1992
X$!
X$! Command procedure for users of the Wisconsin GCG package to send a
X$! sequence to NCBI for BLAST searches.  Modelled on Steve Clark's
X$! BLASTSEARCH.TXT and Erik Sonnhammer's BLASTMAIL.COM from the EMBL
X$! NETSERVer
X$!
X$! This procedure asks all the relevant questions, constructs a text file wi
Vth
X$! the sequence in native FastA format, and mails it to NCBI.  It accepts th
Ve
X$! name of the query sequence on the command line as P1, else prompts for it
V.
X$!
X$! Amgibuous sequences (e.g., ACTGAA) will be treated as nucleic, but can be
X$! forced to be treated as protein by specifying PROTEIN as P2, or as P1 if
X$! the sequence isn't entered on the command line.
X$!
X$! This script has been tested with GCG version 7.0 and VMS version 5.3-2
X$!
X$! Installation:
X$! -------------
X$! 1. The symbol SEARCH_ADDRESS below should be assigned the network address
X$!    for the NCBI Mail-BLAST service. This may have to be changed to
X$!    accomodate local gateways, etc.
X$!
X$! 2. Compile and Link ToFastA.For in the GCG environment:
X$!`09$ GCGSUPPORT
X$!`09$ FORTRAN/EXTEND TOFASTA
X$!`09$ GENLINK TOFASTA
X$!
X$! 3. Assign symbols (in the appropriate initializing GCG command procedure)
V:
X$!`09$ TOFASTA   :== $device:`5Bdirectory`5DTOFASTA
X$!`09$ BLASTNCBI :== $device:`5Bdirectory`5DBLASTNCBI
X$!
X$!--------------------------------------------------------------------------
V--
X$
X$`09on control_y then goto restore
X$`09bell`5B0,7`5D = 7
X$`09ws := "write sys$output"
X$`09iq := inquire/nopunctuation
X$
X$`09! Move PROTEIN to P2 if entered as P1
X$
X$`09IF(p1.EQS."PROTEIN")
X$`09 THEN
X$`09 p2 := "PROTEIN"
X$`09 p1 := ""
X$`09ENDIF
X$
X$`09! The Internet address for sending the search file is
X$`09! BLAST@ncbi.nlm.nih.gov
X$
X$`09search_address := """"IN%"""""BLAST@ncbi.nlm.nih.gov""""""
X$
X$`09ws ""
X$`09ws "This procedure initiates a BLAST search for similarity between"
X$`09ws "your query sequence and one of the databases maintained by NCBI."
X$`09ws "The information required for executing the search is sent to"
X$`09ws "NCBI via electronic mail and is executed by the NCBI people"
X$`09ws "themselves.  The results of the search will be returned to"
X$`09ws "you via e-mail."
X$
X$get_query:
X$
X$`09! Get query sequence if not specified as P1, so ToFastA won't
X$`09! issue its own prompt and confuse the user about what program
X$`09! is being used.
X$
X$`09ws ""
X$`09if(p1.EQS."") then iq p1 "NCBI BLAST with what query sequence? "
X$`09if(p1.EQS."") then goto get_query
X$
X$`09! ToFastA prompts for the sequence name (if not specified on the
X$`09! command line) and the region to search. It does all the error
X$`09! checking and returns all the relevant info to this procedure via
X$`09! global symbols.
X$
X$`09assign/usermode tt: sys$input
X$`09ToFastA/seqinfo/noreverse 'p1'
X$`09if(seqinfotype.EQS."NONE") then exit ! Error from within ToFastA
X$`09on control_y then goto terminate
X$`09if(p2.EQS."PROTEIN") then seqinfotype := "PROTEIN"
X$
X$get_program:
X$
X$`09! Find out which program to use.
X$
X$`09ws ""
X$`09ws "NCBI BLAST program to use:
X$`09ws ""
X$`09IF(seqinfotype.NES."PROTEIN")
X$`09 THEN
X$`09 ws " 1) blastn (your nucleic query vs. nucleic databases)"
X$`09 ws " 2) blastx (your nucleic query dynamically translated in all"
X$`09 ws "            reading frames vs. protein sequence databases)"
X$`09 ELSE
X$`09 ws " 1) blastp  (your protein query vs. protein or pre-translated"
X$`09 ws "             nucleic databases)"
X$`09 ws " 2) tblastn (your protein query vs. nucleic databases dynamically"
X$`09 ws "             translated in all reading frames)"
X$`09ENDIF
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09blprog := ""
X$`09IF(seqinfotype.NES."PROTEIN")
X$`09 THEN
X$`09 if(choice.EQS."1") then blprog := "blastn"
X$`09 if(choice.EQS."2") then blprog := "blastx"
X$`09 ELSE
X$`09 if(choice.EQS."1") then blprog := "blastp"
X$`09 if(choice.EQS."2") then blprog := "tblastn"
X$`09ENDIF
X$`09if(blprog.NES."") then goto get_database
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 2, inclusive."
X$`09goto get_program
X$
X$get_database:
X$
X$`09! Find out which database to search.  The default is the
X$`09! non-redundant database for DNA or proteins.
X$
X$`09ws ""
X$`09ws "Database to search:"
X$`09ws ""
X$`09if(blprog.EQS."blastp") then goto get_pepdatabase
X$`09if(blprog.EQS."blastx") then goto get_pepdatabase
X$`09ws " 1) nr:       Non-redundant database (includes GenBank, EMBL,
X$`09ws "                    and their cumulative updates)"
X$`09ws " 2) genbank:  GenBank database without updates"
X$`09ws " 3) gbupdate: GenBank cumulative daily updates"
X$`09ws " 4) embl:     EMBL database without updates"
X$`09ws " 5) emblu:    EMBL cumulative weekly updates"
X$`09ws " 6) vector:   Vector subset of GenBank"
X$`09ws " 7) dbest:    Database of Expressed Sequence Tags (ESTs)"
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09database := ""
X$`09if(choice.EQS."1") then database := "nr"
X$`09if(choice.EQS."2") then database := "genbank"
X$`09if(choice.EQS."3") then database := "gbupdate"
X$`09if(choice.EQS."4") then database := "embl"
X$`09if(choice.EQS."5") then database := "emblu"
X$`09if(choice.EQS."6") then database := "vector"
X$`09if(choice.EQS."7") then database := "dbest"
X$`09if(database.NES."") then goto set_program
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 7, inclusive."
X$`09goto get_database
X$
X$get_pepdatabase:
X$
X$`09ws " 1) nr:        Non-redundant protein database (includes SWISS-PROT,
X$`09ws "                    PIR, GenPept, and GenPept cumulative updates)"
X$`09ws " 2) swissprot: SWISS-PROT protein database"
X$`09ws " 3) pir:       PIR protein database"
X$`09ws " 4) genpept:   GenPept (translated GenBank)"
X$`09ws " 5) gpupdate:  GenPept cumulative daily updates"
X$`09ws " 6) tfd:       Transcription Factors Database"
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09database = ""
X$`09if(choice.EQS."1") then database := "nr"
X$`09if(choice.EQS."2") then database := "swissprot"
X$`09if(choice.EQS."3") then database := "pir"
X$`09if(choice.EQS."4") then database := "genpept"
X$`09if(choice.EQS."5") then database := "gpupdate"
X$`09if(choice.EQS."6") then database := "tfd"
X$`09if(database.NES."") then goto set_program
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 6, inclusive."
X$`09goto get_database
X$
X$set_program:
X$
X$`09! Set program parameters to NCBI defaults
X$
X$`09descrip := "100"
X$       alignmt := "50"
X$`09histogr := "yes"
X$       expect  := "10"
X$       cutoff  := "Calculate from expectation cutoff"
X$
X$show_search:
X$
X$`09ws ""
X$`09ws ""
X$`09ws "The following BLAST search will be executed:"
X$`09ws ""
X$`09ws " Query sequence: ''seqinfoiname' from ''seqinfostart' to ", -
X`09`09 "''seqinfoend' of ''seqinfolength' (''seqinfotype')"
X$`09ws " Program to run: ''blprog'"
X$`09ws " Database to be searched: ''database'"
X$`09ws ""
X$`09iq choice "Are these entries correct (* Yes *)? "
X$`09choice = f$extract(0, 1, choice)
X$`09if(choice.EQS."") then goto show_param
X$`09if(choice.EQS."Y") then goto show_param
X$
X$`09! Something is wrong. Give the chance to correct it, or give up.
X$
X$ask_search:
X$
X$`09ws ""
X$`09ws "Do you want to:"
X$`09ws ""
X$`09ws " 1) Start again"
X$`09ws " 2) Give up"
X$`09ws ""
X$`09iq choice "Please enter the number of your choice (* 1 *): "
X$`09p1 := ""
X$`09if(f$search("''seqinfooname'").NES."") then -
X`09`09delete/nolog 'seqinfooname';0
X$`09if(choice.eqs."") then goto get_query
X$`09if(choice.eqs."1") then goto get_query
X$`09if(choice.eqs."2") then goto terminate
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 2, inclusive."
X$`09goto ask_search
X$
X$show_param:
X$
X$`09ws ""
X$`09ws ""
X$`09ws "The following ''blprog' parameters will be used:"
X$`09ws ""
X$`09ws " Expectation cutoff: ''expect'"
X$`09ws " Cutoff score: ''cutoff'"
X$`09ws " Maximum short descriptions of matches: ''descrip'"
X$`09ws " Maximum high scoring segment pairs: ''alignmt'"
X$`09if(blprog.NES."blastx") then -
X        ws " Display histogram of scores: ''histogr'"
X$`09ws ""
X$`09iq choice "Do you wish to change any parameters (* no *)? "
X$`09choice = f$extract(0, 1, choice)
X$`09if(choice.EQS."") then goto do_it
X$`09if(choice.EQS."N") then goto do_it
X$
X$`09! Parameter change desired. Give the chance to make it,
X$`09! start all over, or give up.
X$
X$ask_param:
X$
X$`09ws ""
X$`09ws "Do you want to:"
X$`09ws ""
X$`09ws " 1) Change a ''blprog' parameter"
X$`09ws " 2) Start all over"
X$`09ws " 3) Give up"
X$`09ws ""
X$`09iq choice "Please enter the number of your choice (* 1 *): "
X$`09if(choice.eqs."") then choice := "1"
X$`09if(choice.eqs."1") then goto change_param
X$`09IF(choice.eqs."2")
X$`09 THEN
X$`09 p1 := ""
X$`09 if(f$search("''seqinfooname'").NES."") then -
X`09`09delete/nolog 'seqinfooname';0
X$`09 goto get_query
X$`09ENDIF
X$`09if(choice.eqs."3") then goto terminate
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 3, inclusive."
X$`09goto ask_param
X$
X$change_param:
X$
X$`09ws ""
X$`09ws "Parameter to change (current setting):"
X$`09ws ""
X$`09ws " 1) Expectation cutoff (''expect')"
X$`09ws " 2) Cutoff score (''cutoff')"
X$`09ws " 3) Maximum short descriptions of matches (''descrip')"
X$`09ws " 4) Maximum high scoring segment pairs (''alignmt')"
X$`09if(blprog.NES."blastx") then -
X        ws " 5) Display histogram of scores (''histogr')"
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09ws ""
X$`09IF(choice.EQS."1")
X$`09 THEN
X$`09 iq expect " Expectation cutoff (* 10 *): "
X$`09 if (expect.EQS."") then expect := "10"
X$`09 goto show_param
X$`09ENDIF
X$`09IF(choice.EQS."2")
X$`09 THEN
X$`09 iq cutoff "Cutoff score (* Calculate from expectation cutoff *): "
X$`09 if(cutoff.EQS."") then cutoff := "Calculate from expectation cutoff"
X$`09 if(f$locate(" ",cutoff).NE.f$length(cutoff)) then -
X         cutoff := "Calculate from expectation cutoff"
X$`09 goto show_param
X$`09ENDIF
X$`09IF(choice.EQS."3")
X$`09 THEN
X$`09 iq descrip "Maximum short descriptions of matches (* 100 *): "
X$`09 if (descrip.EQS."") then descrip := "100"
X$`09 goto show_param
X$`09ENDIF
X$`09IF(choice.EQS."4")
X$`09 THEN
X$`09 iq alignmt "Maximum high scoring segment pairs (* 50 *): "
X$`09 if(alignmt.EQS."") then alignmt := "50"
X$`09 goto show_param
X$`09ENDIF
X$`09IF(blprog.EQS."blastx")
X$`09 THEN
X$`09 ws ""
X$`09 ws "''bell'Valid responses are 1 - 4, inclusive."
X$`09 goto change_param
X$`09ENDIF
X$`09IF(choice.EQS."5")
X$`09 THEN
X$`09 iq histogr "Display histogram of scores (* yes *): "
X$`09 histogr = f$extract(0, 1, histogr)
X$`09 IF(histogr.EQS."N")
X$`09  THEN
X$`09  histogr := "no"
X$`09  ELSE
X$`09  histogr := "yes"
X$`09 ENDIF
X$`09 goto show_param
X$`09ENDIF
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 5, inclusive."
X$`09goto change_param
X$
X$do_it:
X$
X$`09! Write the text file that will be mailed to NCBI
X$
X$`09ws ""
X$`09ws "Creating the file to be mailed to NCBI..."
X$
X$`09open/write outfile tmp$.tmp$
X$`09wc := "write outfile"
X$`09wc "PROGRAM ''blprog'"
X$`09wc "DATALIB ''database'"
X$`09wc "DESCRIPTION ''descrip'"
X$`09wc "ALIGNMENTS ''alignmt'"
X$`09if(blprog.NES."blastx") then wc "HISTOGRAM ''histogr'"
X$`09wc "EXPECT ''expect'"
X$`09if(cutoff.NES."Calculate from expectation cutoff") then -
X        wc "CUTOFF ''cutoff'"
X$`09wc "BEGIN"
X$`09close outfile
X$`09convert/append 'seqinfooname' tmp$.tmp$
X$
X$`09! Mail the file away.
X$`09! NCBI BLAST doesn't acknowledge, so also mail to self.
X$
X$       ws ""
X$`09ws "The file ''seqinfooname' will be sent to NCBI."
X$       ws "    NCBI does not mail an acknowledgment copy,"
X$`09ws "    so a self-copy will be mailed to you now."
X$       ws ""
X$`09ws "Mailing the file to you and NCBI..."
X$
X$`09mail/noedit/self/subj="''seqinfooname'" tmp$.tmp$ 'search_address'
X$
X$       ws ""
X$`09ws "The file ''seqinfooname' has been sent to you and NCBI."
X$`09ws "    The results will be mailed back to you shortly."
X$`09ws "    You can retrieve sequences via Email from NCBI with"
X$`09ws "    the DBNCBI command.
X$`09ws ""
X$
X$terminate:
X$
X$`09if(f$search("''seqinfooname'").NES."") then -
X`09`09delete/nolog 'seqinfooname';0
X$`09if(f$search("tmp$.tmp$").NES."") then delete/nolog tmp$.tmp$;*
X$
X$restore:
X$
X$`09! Restore verification to the status quo ante
X$
X$`09v = f$verify(orig_veri)
X$`09exit
$ CALL UNPACK BLASTNCBI.COM;1 1667828007
$ create 'f'
X1 BLASTNCBI
X     BLASTNCBI initiates a BLAST search for similarity between your query
X     sequence and databases maintained by the NCBI server.  BLAST is much
X     faster than FastA.
X
X     BLASTNCBI will ask you the appropriate questions, create a BLAST
X     request protocol, and Email it for you.  A copy of the protocol will
X     also be Emailed to you.  NCBI will Email you the results of the
X     analysis.
X
X     You may use either a PROTEIN or NUCLEOTIDE query sequence, in a
X     GCG formatted file.  BLASTNCBI will reformat the sequence (or a
X     designated portion of the sequence) into native FastA (Pearson)
X     format and insert that query into the protocol.
X
X     When reading Email, use the EXTRACT command to make a copy of the
X     BLAST results to your account:
X
X     MAIL> EXTR/NOHEAD filename.ext
X
X     Use DBNCBI to retrieve a known database entry (sequence) from the
X     NCBI server.  After the sequence arrives, extract it to a temporary
X     file and convert it to GCG format (e.g., with FROMGENBANK for a
X     GenBank sequence).  Then save disk space by deleting the temporary
X     file.
$ CALL UNPACK BLASTNCBI.HLP;1 337446814
$ create 'f'
X$ orig_veri = f$environment("VERIFY_PROCEDURE")
X$ v = f$verify(0) ! (DBNCBI turns off verification)
X$!
X$!                              DBNCBI.COM
X$!                              ----------
X$!
X$! Version 1.0
X$! Foteos Macrides (MACRIDES@WFEB2.BITNET), August 20, 1992
X$!
X$! Modelled on Steve Clark's DBMAIL.COM.
X$!
X$! Command procedure to mail a request to NCBI for a database sequence.
X$! NCBI will return the sequence via email.  The sequence can be specified
X$! by either its locus name or accession number.  The sequence to be`20
X$! retrieved can be specified on the command line as P1.
X$!
X$! Installation:
X$! -------------
X$! 1. The symbol RETRIEVE_ADDRESS below should be assigned the network
X$!    address for the NCBI retrieval service.  This may have to be changed
X$!    to accomodate local gateways, etc.
X$!
X$! 2. Assign symbol (in the appropriate initializing GCG command procedure):
X$!`09$ DBNCBI  :== $device:`5Bdirectory`5DDBNCBI
X$!
X$!--------------------------------------------------------------------------
V--
X$
X$`09on control_y then goto restore
X$`09bell`5B0,7`5D = 7
X$`09ws := "write sys$output"
X$`09iq := inquire/nopunctuation
X$
X$`09! The Internet address for sending the retrieval request is
X$`09! retrieve@ncbi.nlm.nih.gov
X$
X$`09retrieve_address := """"IN%"""""retrieve@ncbi.nlm.nih.gov"""""
X$
X$`09ws ""
X$`09ws "This procedure retrieves from NCBI a single sequence via"
X$`09ws "electronic mail.  The sequence must be specified by its"
X$`09ws "LOCUS NAME or ACCESSION NUMBER (e.g., as indicated in the"
X$`09ws "Email files returned from BLAST searches with BLASTNCBI"
X$`09ws "or from string searches with SEARCHNCBI)."
X$
X$check_for_seqspec:
X$
X$`09seqspec := "''p1'"
X$`09if(seqspec.NES."") then goto get_database
X$
X$ask_seqspec:
X$
X$`09ws ""
X$`09iq seqspec "Sequence to retrieve: "
X$`09if(seqspec.EQS."") then goto ask_seqspec
X$
X$get_database:
X$
X$`09! Find out which database to search.
X$
X$`09ws ""
X$`09ws "Database to search:"
X$`09ws ""
X$`09ws "  1) gb:     GenBank database without updates"
X$`09ws "  2) gbu:    GenBank cumulative daily updates"
X$`09ws "  3) e:      EMBL database without updates"
X$`09ws "  4) eu:     EMBL cumulative weekly updates"
X$`09ws "  5) vector: Vector subset of GenBank"
X$`09ws "  6) dbest:  Database of Expressed Sequence Tags (ESTs)"
X$`09ws "  7) sp:     SWISS-PROT protein database"
X$`09ws "  8) pir:    PIR protein database"
X$`09ws "  9) gp:     GenPept (translated GenBank)"
X$`09ws " 10) gpu:    GenPept cumulative daily updates"
X$`09ws " 11) tfd:    Transcription Factors Database"
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09database := ""
X$`09if(choice.EQS. "1") then database := "genbank"
X$`09if(choice.EQS. "2") then database := "gbupdate"
X$`09if(choice.EQS. "3") then database := "embl"
X$`09if(choice.EQS. "4") then database := "emblu"
X$`09if(choice.EQS. "5") then database := "vector"
X$`09if(choice.EQS. "6") then database := "dbest"
X$`09if(choice.EQS. "7") then database := "swissprot"
X$`09if(choice.EQS. "8") then database := "pir"
X$`09if(choice.EQS. "9") then database := "genpept"
X$`09if(choice.EQS."10") then database := "gpupdate"
X$`09if(choice.EQS."11") then database := "tfd"
X$`09if(database.NES."") then goto do_it
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 11, inclusive."
X$`09goto get_database
X$
X$do_it:
X$
X$`09! Encase sequence specification in double quotes so that if it
X$`09! contains an underscore the server will not treat it as an OR.
X$
X$`09seqname := """''seqspec'"""
X$
X$`09ws ""
X$`09ws "Constructing message file..."
X$`09open/write outfile tmp$.tmp$
X$`09on control_y then goto terminate
X$`09wc := "write outfile"
X$`09wcs := "write/symbol outfile"
X$`09wc "DATALIB ''database'"
X$`09wc "MAXDOCS 1"
X$`09wc "MAXLINES 2500"
X$`09wc "BEGIN"
X$`09wcs seqname
X$`09close outfile
X$`09ws "Mailing the request to NCBI..."
X$`09mail/noedit/noself/subject="" tmp$.tmp$ 'retrieve_address'
X$`09ws ""
X$`09ws "NCBI will mail the sequence back to you shortly. When it arrives,"
X$`09ws "    you will have to EXTRACT it from mail into a temporary disk"
X$`09ws "    file (e.g., tmp.seq), convert it to the desired (e.g., GCG)"
X$`09ws "    format, then delete the temporary disk file.
X$`09ws ""
X$
X$terminate:
X$
X$`09if(f$search("tmp$.tmp$").NES."") then delete/nolog tmp$.tmp$;0
X$
X$restore:
X$
X$`09! Restore verification to the status quo ante
X$
X$ `09v = f$verify(orig_veri)
X$`09exit
$ CALL UNPACK DBNCBI.COM;1 1325615065
$ create 'f'
X1 DBNCBI
X     DBNCBI retreives a sequence via Email from the NCBI server.  Use it
X     after identifying a new sequence of interest with the BLASTNCBI or
X     SEARCHNCBI Email shells.
X
X     DBNCBI will ask you for the LOCUS NAME or ACCESSION NUMBER of the
X     sequence.  After the sequence arrives, extract it to a temporary
X     file and convert it to GCG format (e.g., with FROMGENBANK for a
X     GenBank sequence).  Then save disk space by deleting the temporary
X     file.
$ CALL UNPACK DBNCBI.HLP;1 2047179651
$ create 'f'
X$ orig_veri = f$environment("VERIFY_PROCEDURE")
X$ v = f$verify(0) ! (SearchNCBI turns off verification)
X$!
X$!                            SEARCHNCBI.COM
X$!                            --------------
X$!
X$! Version 1.0
X$! Foteos Macrides (MACRIDES@WFEB2.BITNET), August 20, 1992
X$!
X$!
X$! Command procedure to mail a request to NCBI for the titles of sequences
X$! whose definitions have matches to a search text (terms with Boolean
X$! connectors).  NCBI will return the titles via email.
X$!
X$! Installation:
X$! -------------
X$! 1. The symbol RETRIEVE_ADDRESS below should be assigned the network
X$!    address for the NCBI retrieval service.  This may have to be changed
X$!    to accomodate local gateways, etc.
X$!
X$! 2. Assign symbol (in the appropriate initializing GCG command procedure):
X$!`09$ SEARCHNCBI  :== $device:`5Bdirectory`5DSEARCHNCBI
X$!
X$!--------------------------------------------------------------------------
V--
X$
X$`09on control_y then goto restore
X$`09bell`5B0,7`5D = 7
X$`09ws := "write sys$output"
X$`09iq := inquire/nopunctuation
X$
X$`09! The Internet address for sending the titles request is
X$`09! retrieve@ncbi.nlm.nih.gov
X$
X$`09retrieve_address := """"IN%"""""retrieve@ncbi.nlm.nih.gov""""""
X$
X$`09ws ""
X$`09ws "This procedure finds sequences in the NCBI databases by searching"
X$`09ws "their documentation (records) for character patterns matching your"
X$`09ws "input search text, and returns via Email the titles of the first"
X$`09ws "up to 1000 hits.  It is like record searches with STRINGSEARCH"
X$`09WS "but the query search text has a different format, i.e., the text"
X$`09ws "must have one or more terms, and terms are separated by Boolean"
X$`09ws "connectors (AND, OR, NOT; e.g.:  cytochrome AND p450 NOT yeast )."
X$`09ws "Spaces and underscores between terms are treated as ORs.  If a"
X$`09ws "term contains an underscore (e.g.:  rata2ugldb_1 ) encase it in"
X$`09ws "double quotes (e.g., so it is not treated as  rata2ugldb OR 1 )."
X$`09ws "Double quotes can also be used to treat a series of words as one"
X$`09ws "term, e.g.,  "+"""cytochrome p450"""+" AND Smith  returns the"
X$`09ws "titles of sequences which have the term  cytochrome p450  (both"
X$`09ws "words in that order) and the term  Smith  in their records."
X$`09ws ""
X$
X$get_text:
X$
X$`09define/nolog/user sys$input sys$command
X$`09read/prompt="Text: " sys$input text
X$`09deassign/user sys$input
X$`09if(text.EQS."") then goto get_text
X$
X$get_database:
X$
X$`09! Find out which database to search.
X$
X$`09ws ""
X$`09ws "Database to search:"
X$`09ws ""
X$`09ws "  1) genbank:    GenBank database without updates"
X$`09ws "  2) gbupdate:   GenBank cumulative daily updates"
X$`09ws "  3) embl:       EMBL database without updates"
X$`09ws "  4) emblupdate: EMBL cumulative weekly updates"
X$`09ws "  5) vector:     Vector subset of GenBank"
X$`09ws "  6) dbest:      Database of Expressed Sequence Tags (ESTs)"
X$`09ws "  7) swissprot:  SWISS-PROT protein database"
X$`09ws "  8) pir:        PIR protein database"
X$`09ws "  9) genpept:    GenPept (translated GenBank)"
X$`09ws " 10) gpupdate:   GenPept cumulative daily updates"
X$`09ws " 11) tfd:        Transcription Factors Database"
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09database := ""
X$`09if(choice.EQS. "1") then database := "genbank"
X$`09if(choice.EQS. "2") then database := "gbupdate"
X$`09if(choice.EQS. "3") then database := "embl"
X$`09if(choice.EQS. "4") then database := "emblu"
X$`09if(choice.EQS. "5") then database := "vector"
X$`09if(choice.EQS. "6") then database := "dbest"
X$`09if(choice.EQS. "7") then database := "swissprot"
X$`09if(choice.EQS. "8") then database := "pir"
X$`09if(choice.EQS. "9") then database := "genpept"
X$`09if(choice.EQS."10") then database := "gpupdate"
X$`09if(choice.EQS."11") then database := "tfd"
X$`09if(database.NES."") then goto do_it
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 11, inclusive."
X$`09goto get_database
X$
X$do_it:
X$
X$`09ws ""
X$`09ws "Constructing message file..."
X$`09open/write outfile tmp$.tmp$
X$`09on control_y then goto terminate
X$`09wc := "write outfile"
X$`09wcs := "write/symbol outfile"
X$`09wc "DATALIB ''database'"
X$`09wc "MAXDOCS 1000"
X$`09wc "MAXLINES 2500"
X$`09wc "TITLES yes"
X$`09wc "BEGIN"
X$`09wcs text
X$`09close outfile
X$`09ws "Mailing the request to NCBI..."
X$`09mail/noedit/noself/subject="" tmp$.tmp$ 'retrieve_address'
X$`09ws ""
X$`09ws "NCBI will mail the results back to you shortly.  Use DBNCBI to"
X$`09ws "     retrieve sequences from NCBI databases.
X$`09ws ""
X$
X$terminate:
X$
X$`09if(f$search("tmp$.tmp$").NES."") then -
X`09`09delete/nolog tmp$.tmp$;0
X$
X$restore:
X$
X$`09! Restore verification to the status quo ante
X$
X$`09v = f$verify(orig_veri)
X$`09exit
$ CALL UNPACK SEARCHNCBI.COM;1 1216042638
$ create 'f'
X1 SEARCHNCBI
X     SEARCHNCBI retrieves sequence titles via Email from the NCBI server.
X     Use it like STRINGSEARCH to search the sequence documentation (records)
X     in databases at NCBI.  Then retrieve sequences of interest with DBNCBI,
X     using the LOCUS NAME or ACCESSION NUMBER in the title.
$ CALL UNPACK SEARCHNCBI.HLP;1 1847132937
$ create 'f'
X!***  TOFASTA ***********************************************************
X!*
X!* This program converts a standard GCG file sequence to FASTA format,
X!* as required by the GenBank BLAST server for database searching. With
X!* no command line switches, the program asks for the sequence filename,
X!* the regions to convert, and whether or not it should be reversed. The`20
X!* output filename is root.FASEQ.
X!*
X!* The following command line switches are available:
X!*
X!* /INfile = filename`09Suppresses the request for the input filename
X!* /NOREVerse`09`09Forces the top strand to be output
X!* /SEQINFO`09`09Sets symbols that can be used in command shells:
X!*`09`09SEQINFOINAME`09Input sequence filename
X!*`09`09SEQINFOONAME`09Output sequence filename
X!*`09`09SEQINFOTYPE`09"PROTEIN", "DNA", or "NONE" on error
X!*`09`09SEQINFOSTART
X!*`09`09SEQINFOEND
X!*`09`09SEQINFOLENGTH
X!*`09`09SEQINFOREV
X!*
X!* Written by Steve Clark September 7, 1991
X!*
X!* To install, initiate the GCG support environment with the command
X!* GCGSUPPORT, compile the program (FORTRAN/EXTEND TOFASTA) and link
X!* it (GENLINK TOFASTA).  Then define it as a foreign command:
X!*
X!* $ TOFASTA :== $device:`5Bdirectory`5DTOFASTA
X!*
X!*************************************************************************
X
X`09program tofasta
X
X`09implicit none
X
X`09integer infile, lseq, rpos, lpos, l, i
X`09integer inttostr, str_len, revseq, getstring
X
X`09character inname(256), outname(256), seq(100001), text(33)
X
X`09byte bytename(256)
X
X`09logical seqinfo, logstatus, reverse
X`09logical clnoarg, isprotein, dclsetsymbol, clgetoldfname
X
Xc Check for the command line switch /SEQINFO to see if the symbols should
Xc be set for using in a command shell.
X
X`09seqinfo = .false.
X`09if(clnoarg('SEQINFO')) then
X`09`09seqinfo = .true.
X`09`09logstatus = dclsetsymbol('seqinfotype', 'NONE')
X`09endif
X
X`09if(.not.seqinfo) then
X`09`09call writef(
X     & '\nTOFASTA converts a GCG format sequence to the native FastA format.
V\n')
X`09endif
X
Xc Look for the input filename on the command line. If not found, ask for it.
X
X`09if(.not.clgetoldfname('INfile', 1, inname)) then
X`09`09call writef('\nTOFASTA of what GCG sequence? ')
X`09`09if(getstring(inname).eq.0) stop ' '
X`09endif
X
Xc Open the file and read in the sequence.
X
X`09call openfile(infile, inname, 'rdb')
X`09call readseq(infile, seq, lseq)
X`09call closef(infile)
X
Xc Get the range and reverse if not prevented by the command line argument.
X
X`09call getrange(lpos, rpos, lseq)
X`09reverse = .false.
X`09if(.not.clnoarg('NOREVERSE')) call getreverse(reverse)
X
Xc We have all the info we need. Calculate the output filename.
X
X`09call strcopy(outname, inname)
X`09call newfiletype(outname, '.faseq')
X
Xc Set the SEQINFO symbols if required.
X
X`09if(seqinfo) then
X`09`09logstatus = dclsetsymbol('seqinfoiname', inname)
X`09`09logstatus = dclsetsymbol('seqinfooname', outname)
X`09`09l = inttostr(lpos, text)
X`09`09logstatus = dclsetsymbol('seqinfostart', text)
X`09`09l = inttostr(rpos, text)
X`09`09logstatus = dclsetsymbol('seqinfoend', text)
X`09`09l = inttostr(lseq, text)
X`09`09logstatus = dclsetsymbol('seqinfolength', text)
X`09`09logstatus = dclsetsymbol('seqinforev', 'FALSE')
X`09`09if(reverse) logstatus = dclsetsymbol('seqinforev', 'TRUE')
X`09`09logstatus = dclsetsymbol('seqinfotype', 'DNA')
X`09`09if(isprotein(seq))
X     &`09`09`09logstatus = dclsetsymbol('seqinfotype', 'PROTEIN')
X`09endif
X
Xc Open the output file and write the first line which consists of a ">" and
Xc the sequence name. This is followed by a space and the region that was
Xc included in the conversion.
X
X`09l = str_len(outname)
X`09do i=1, l
X`09`09bytename(i) = ichar(outname(i))
X`09enddo
X`09open (unit=1, file=bytename, type='new', carriagecontrol='list')
X`09if(.not.reverse) then
X`09`09write(1,1010) (outname(i), i=1, l), lpos, rpos
X1010`09`09format('>', <l>a1, ' From', i6, ' to', i6)
X`09else
X`09`09write(1,1011) (outname(i), i=1, l), lpos, rpos
X1011`09`09format('>',<l>a1,' From',i6,' to',i6,' Reverse orientation')
X`09`09l = revseq(seq, lpos, rpos)
X`09endif
X
Xc Now write out the sequence, 70 characters to a line with no spaces.
X
X`09do while (lpos.le.rpos)
X`09`09l = min(lpos+69, rpos)
X`09`09write(1,1020) (seq(i), i=lpos, l)
X1020`09`09format(70a1)
X`09`09lpos = l + 1
X`09enddo
X
X`09close (unit=1)
X`09if(.not.seqinfo) call writef('\nSequence written to %s.\n', outname)
X
X`09stop ' '
X`09end
$ CALL UNPACK TOFASTA.FOR;1 1729870502
$ create 'f'
X1 TOFASTA
X     TOFASTA converts a GCG sequence file into a file with the sequence
X     (or a designated portion of it) in native FastA (Pearson) format.
X
X     This is an enhancement from Stephen Clark (clark@salk.bitnet) for
X     use with programs that require native FastA (Pearson) formatted
X     sequences as input.
$ CALL UNPACK TOFASTA.HLP;1 1446846040
$ v=f$verify(v)
$ EXIT

