
Unresolved problem in SRS5

Jack Leunissen jackl at caos.kun.nl
Mon May 12 08:37:31 EST 1997

Dear chaps,

I don't want to spoil the overall enthusiasm over the availability
of SRS5, but it seems strange to me that nobody appears to care about
the fact that SRS5 fails to handle (GCG-) split entries correctly!!!

Just try the following on your server, if you use the EMBL and EMNEW
databases in GCG format: retrieve entry "CEY57G11" from EMNEW. This
entry is split into 5 separate entries by GCG, and should be joined
by SRS into one. This DOES indeed happen, but with an INCORRECT number
of bases, because the overlapping fragments are not removed!

I tried a few sites, searching for ID=CEY57G11, the original length 
of which is 486290 bases.

Site      SRS version   Format   ID returned   Length

EMBL      SRS5.04       EMBL     CEY57g11      486290 bases
EBI       SRS5.05       GCG      CEY57G11      526290 bases
UPPSALA   SRS5.03       GCG?     CEZK1127      n.a.

This was reported some time ago, but it hasn't been resolved even
with the current release! One solution would be to store the database
in its original EMBL format, but this is hardly acceptable, given the
size of the data.
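
To illustrate the arithmetic: the GCG-format length reported by EBI exceeds
the original by 526290 - 486290 = 40000 bases, which is what one would expect
from 4 uncorrected junctions between 5 fragments if each junction repeated
10000 bases. The sketch below is only an illustration in Python under that
assumption. The fixed overlap size and all function names are hypothetical
and are not taken from GCG or SRS.

```python
def split_with_overlap(seq, size, overlap):
    """Split seq into fragments of at most `size` bases, where each
    fragment after the first starts with the last `overlap` bases of
    the previous one (an assumed splitting scheme, for illustration)."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    fragments = []
    start = 0
    while True:
        fragments.append(seq[start:start + size])
        if start + size >= len(seq):
            break
        start += step
    return fragments


def naive_join(fragments):
    """Concatenate fragments as-is; overlaps are counted twice."""
    return "".join(fragments)


def join_fragments(fragments, overlap):
    """Concatenate fragments, dropping the leading `overlap` bases of
    every fragment after the first."""
    if not fragments:
        return ""
    return fragments[0] + "".join(f[overlap:] for f in fragments[1:])


def excess_length(n_fragments, overlap):
    """Extra bases reported when overlaps are not removed."""
    return (n_fragments - 1) * overlap


# Toy sequence: 20 bases split into 3 fragments of 8 with overlap 2.
toy = "ACGT" * 5
parts = split_with_overlap(toy, size=8, overlap=2)
wrong = naive_join(parts)
right = join_fragments(parts, overlap=2)
```

With the toy data, the naive join is longer than the input by exactly
excess_length(3, 2) bases, while the corrected join reproduces the input.
The same function gives excess_length(5, 10000) = 40000, matching the
difference between the two lengths in the table.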


   Jack A.M. Leunissen       | Email: jackl at caos.kun.nl
   CAOS/CAMM Center          | Tel. : +31 24 365 22 48
   University of Nijmegen    | Fax  : +31 24 365 29 77
   Nijmegen, The Netherlands | Www  : http://www.caos.kun.nl

More information about the Bio-srs mailing list
