
PIR entries give bad sequence formats

Peter Rice pmr at sanger.ac.uk
Thu Nov 4 12:11:52 EST 1999

Has anyone changed their PIR parser to produce 'correct' sequence
formats in SRS5 for the wgetz output?

With the default (PIR) format there are two headers - one for the
reference data and one for the sequence, so this is not acceptable PIR format.


With GCG format some entries are fine, but others (e.g. S10602) have
".." hiding in the reference part and SRS does not expand it.
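To illustrate why a stray ".." matters, here is a minimal Python sketch. It assumes a simplified reader that treats the first line containing ".." as the end of the header; real GCG readers (and EMBOSS) may be stricter. The function name split_gcg and both sample entries are hypothetical, not actual S10602 data.

```python
def split_gcg(text):
    """Split GCG-style text into (header_lines, sequence).

    Simplifying assumption: the first line containing '..' ends the header.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if ".." in line:
            header = lines[: i + 1]
            # Keep letters only: drops position numbers, spacing, punctuation.
            body = "".join(lines[i + 1:])
            seq = "".join(ch for ch in body if ch.isalpha())
            return header, seq
    raise ValueError("no '..' divider found")


# Hypothetical entries for illustration only.
good = (
    "Some reference text\n"
    "EXAMPLE  Length: 10  Check: 0  ..\n"
    "\n"
    "  1  MKTAYIAKQR\n"
)
bad = (
    "Pages 1..5 of some reference\n"  # stray '..' in the reference part
    "EXAMPLE  Length: 10  Check: 0  ..\n"
    "\n"
    "  1  MKTAYIAKQR\n"
)

print(split_gcg(good)[1])
print(split_gcg(bad)[1])
```

In the second entry the header is cut short at the stray "..", so the real divider line is read as if it were sequence data.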


I am trying to find a format that allows EMBOSS to read PIR entries
from an SRSWWW server.

With "-f seq -sf <format>" there is an extra header, but I suppose
"-f seq -sf gcg" will make the sequence readable (GCG format will
ignore the header, while SRS is no longer writing the ".." line) while
losing most other information. Hardly ideal.


Peter Rice                | Informatics Division, The Sanger Centre,
E-mail: pmr at sanger.ac.uk  | Wellcome Trust Genome Campus,
Tel: (44) 1223 494967     | Hinxton, Cambridge, CB10 1SA, England
Fax: (44) 1223 494919     | URL: http://www.sanger.ac.uk/Users/pmr/

More information about the Bio-srs mailing list
