Has anyone changed their PIR parser to produce 'correct' sequence
formats in SRS5 for the wgetz output?
With the default (PIR) format there are two headers, one for the
reference data and one for the sequence, so the output is not valid
PIR format.
http://www.sanger.ac.uk/srs5bin/cgi-bin/wgetz?-e+[PIR-ID:'S10602']
With GCG format some entries are fine, but others (e.g. S10602) have
".." hiding in the reference part and SRS does not expand it.
http://www.sanger.ac.uk/srs5bin/cgi-bin/wgetz?-e+[PIR-ID:'S10602']+-sf+gcg
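
To illustrate why a stray ".." matters, here is a minimal Python sketch. It assumes a reader that treats the first line containing ".." as the end of the GCG header and everything after it as sequence; real readers may be stricter. The function name split_gcg and both toy entries are invented for illustration and are not real PIR or SRS output.

```python
def split_gcg(text):
    """Split GCG-style text at the first line containing '..'.

    Returns (header_lines, sequence_string). Assumption: the first
    line containing '..' ends the header.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if ".." in line:
            header = lines[: i + 1]
            # Sequence lines carry position numbers and spaces; keep letters only.
            seq = "".join(ch for ch in "".join(lines[i + 1:]) if ch.isalpha())
            return header, seq
    raise ValueError("no '..' divider line found")


# Invented toy entries, not real PIR data.
clean = "Some title\nName  Length: 8  Check: 0  ..\n   1 MKVLAAGI\n"
stray = "Some title\nPages 10..12\nName  Length: 8  Check: 0  ..\n   1 MKVLAAGI\n"

clean_header, clean_seq = split_gcg(clean)
stray_header, stray_seq = split_gcg(stray)
```

In the second toy entry the ".." in the reference line is taken as the divider, so the real divider line is read as if it were sequence.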
I am trying to find a format that allows EMBOSS to read PIR entries
from an SRSWWW server.
With "-f seq -sf <format>" there is an extra header, but I suppose
"-f seq -sf gcg" will make the sequence readable (GCG format will
ignore the header, while SRS is no longer writing the ".." line) while
losing most other information. Hardly ideal.
http://www.sanger.ac.uk/srs5bin/cgi-bin/wgetz?[PIR-ID:'S10602']+-f+seq+-sf+gcg
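
For anyone who wants to try the variants side by side, the three URLs in this message can be put together with a small Python helper. This is only a sketch: the name wgetz_url is made up, it simply joins the arguments with "+" as in the URLs above, and it does not contact the server or validate the options.

```python
BASE = "http://www.sanger.ac.uk/srs5bin/cgi-bin/wgetz"


def wgetz_url(*args):
    """Join wgetz arguments with '+' after the base URL (hypothetical helper)."""
    return BASE + "?" + "+".join(args)


QUERY = "[PIR-ID:'S10602']"

# Whole entry, default (PIR) format.
entry_default = wgetz_url("-e", QUERY)
# Whole entry, GCG sequence format.
entry_gcg = wgetz_url("-e", QUERY, "-sf", "gcg")
# Sequence field only, GCG sequence format.
seq_gcg = wgetz_url(QUERY, "-f", "seq", "-sf", "gcg")

for url in (entry_default, entry_gcg, seq_gcg):
    print(url)
```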
--
----------------------------------------------------------------------
Peter Rice | Informatics Division, The Sanger Centre,
E-mail: pmr at sanger.ac.uk | Wellcome Trust Genome Campus,
Tel: (44) 1223 494967 | Hinxton, Cambridge, CB10 1SA, England
Fax: (44) 1223 494919 | URL: http://www.sanger.ac.uk/Users/pmr/