IUBio GIL .. BIOSCI/Bionet News .. Biosequences .. Software .. FTP

Staden scripting

simon andrews simon.abdrews at bbsrc.ac.uk
Mon Jul 2 05:12:50 EST 2001


Simon Andrews wrote:
> 
> jkb at mrc-lmb.cam.ac.uk wrote:
> >
> > In <sot*EyTZo at news.chiark.greenend.org.uk> Tim Cutts
>
> I'll have a play with this this morning.  I'll post the script I come up
> with back here for comments and suggested improvements if that's OK.

OK, having used your suggestions (and the example in the Staden
distribution), I've managed to put together something which works (like
a charm actually).  The speed of execution is OK too, the Staden stuff
takes far less time than the database searching.

I don't quite understand how the dual sh tsch script stuff works, but as
long as it does then I'm prepared to accept one line of "magic" in the
script.

I'm not actually doing any error checking within the Tcl script. The
input to the script is verified by the calling Perl script, but is there
anything else I should be checking that I can't catch externally?  I
noticed in the script output that some sequences were failing to be put
into the assembly.  I know there will be some duplicates, and these
appear to give a different error message.  The tests that I have run
have never caused the script to exit because there was more than one
contig.  Is this a result of the way I'm performing the assembly, or
have I just been lucky so far?

Any other comments are welcome.

Thanks again 

Simon.

####################################################################

#!/usr/bin/sh

# \
exec gap4sh -f "$0" ${@+"$@"} || exit 1

load_package gap

error_bell 0

set dbname [lindex $argv 0]
set fofn pregap.passed
set oldnew [lindex $argv 1]

set io [open_db -name $dbname -version 0 -access rq -create $oldnew]

assemble_shotgun \
        -io $io \
        -files [ListLoad $fofn files] \
        -min_match 24 \
        -max_pads 25 \
        -max_pmismatch 7

set num_contigs [db_info num_contigs $io]

if {$num_contigs > 1}   {

        puts "MORE THAN ONE CONTIG"
        close_db -io $io
        exit 1

}

set consensus_cutoff 0.01
set quality_cutoff 0
set consensus_mode 2
set seq [calc_consensus -io $io -contigs #1 ]

puts "CONSENSUS $seq"



close_db -io $io

exit

#################################################################




More information about the Staden mailing list

Send comments to us at archive@iubioarchive.bio.net