Swiss-Prot release 41 available

Elisabeth Gasteiger Elisabeth.Gasteiger at isb-sib.ch
Fri Mar 7 06:27:40 EST 2003


Name        : Swiss-Prot
Description : Protein knowledgebase.
Release     : 41.0 of February 2003
Statistics  : 122'564 fully annotated sequences, 44'986'459 amino acids,
               103'486 references.
Citation    : Boeckmann B., Bairoch A., Apweiler R., Blatter M.-C.,
               Estreicher A., Gasteiger E., Martin M.J., Michoud K.,
               O'Donovan C., Phan I., Pilbout S. and Schneider M.
               The Swiss-Prot protein knowledgebase and its supplement
               TrEMBL in 2003.
               Nucleic Acids Res. 31:365-370(2003).
Availability: FTP: ftp://ftp.expasy.org/databases/swiss-prot
               WWW: http://www.expasy.org/sprot/

Name        : ENZYME
Description : Enzyme nomenclature database.
Release     : 30.0 of March 2003
Statistics  : 4'136 enzymes described.
Citation    : Bairoch A.
               The ENZYME database in 2000.
               Nucleic Acids Res. 28:304-305(2000).
Availability: FTP: ftp://ftp.expasy.org/databases/enzyme
               WWW: http://www.expasy.org/enzyme/



Note: a much more complete description of the changes and future developments
is available from the release notes. The release notes can be accessed from the
WWW at the address:


or downloaded by FTP from:


A) Summary of the changes in Swiss-Prot release 41 and ENZYME release 30.

In Swiss-Prot:

- Release 41.0 of Swiss-Prot contains 122'564 sequence entries, comprising
   44'986'459 amino acids abstracted from 103'486 references.
   21'133 sequences have been added since release 40, the sequence data of
   3'251 existing entries has been updated and the annotations of
   57'525 entries have been revised. This represents an increase of 20% in the
   number of entries.
- In order to handle the large amount of "raw" data coming from the
   microbial genomic sequencing, the High-quality Automated and Manual
   Annotation of microbial Proteomes (HAMAP) project was initiated. It aims to
   annotate a significant percentage of proteins which originate from microbial
   genome sequencing projects. There are currently 118 complete proteomes in
   Swiss-Prot and TrEMBL. The HAMAP web site was enhanced with many new features:
- The Human Proteomics Initiative (HPI) project is progressing. There are
   currently 9'172 annotated human sequences in Swiss-Prot. Up-to-date detailed
   statistics concerning the HPI project, as well as detailed project information,
   are available at http://www.expasy.org/sprot/hpi/
- The term 'Nucleomorph' has been added to the defined terms in the OG
   (OrGanelle) line, indicating from which genome a gene for a protein
   originates. Until now, the defined terms in the OG line were 'Chloroplast',
   'Cyanelle', 'Mitochondrion' and 'Plasmid'. The term 'Nucleomorph' designates
   the residual nucleus of an algal endosymbiont that resides inside its host
- The RC (Reference Comment) line topic STRAIN and the CC line topic
   'CATALYTIC ACTIVITY' have been converted to mixed case.
- There can be more than one RP line per reference in a Swiss-Prot entry. The RP
   line describes the extent of the work carried out by the authors of the 
- We have added cross-references from Swiss-Prot to the following
   databases: GeneDB_SPombe, Genew, Gramene, HAMAP, PhosSite and TIGRFAMs.
   Links to CarbBank, GCRDb, Mendel and YEPD have been removed.
   In human protein sequence entries we have introduced explicit links to the
   Single Nucleotide Polymorphism database (dbSNP) from the feature description of
   FT VARIANT keys.
- The feature key 'SIMILAR' became obsolete and will not be used again.
- A version of Swiss-Prot and TrEMBL in XML format has been developed and is
   provided with this release. More information is available at
   http://www.ebi.ac.uk/swissprot/SP-ML and the data can be downloaded from
- The ExPASy WWW server was the target of many improvements that are all
   described at the address: http://www.expasy.org/history.html
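
As an illustration of the OG line change described above, here is a minimal
Python sketch that checks whether an OG line starts with one of the defined
terms, including the new 'Nucleomorph'. The function name og_term and the
sample lines are hypothetical, and the sketch assumes the usual flat-file
layout of a two-letter line code followed by the data; it is not part of any
official Swiss-Prot tool.

```python
# Defined OG (OrGanelle) terms listed in this announcement, including the
# newly added 'Nucleomorph'.
OG_TERMS = ("Chloroplast", "Cyanelle", "Mitochondrion", "Nucleomorph", "Plasmid")


def og_term(line):
    """Return the defined term an OG line starts with, or None.

    Assumption: the line begins with the two-letter code 'OG', followed by
    whitespace and then the data (for example 'OG   Chloroplast.').
    """
    if not line.startswith("OG"):
        return None
    value = line[2:].strip()
    for term in OG_TERMS:
        if value.startswith(term):
            return term
    return None


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    for sample in ("OG   Nucleomorph.", "OG   Plasmid pXO1.", "OS   Homo sapiens."):
        print(sample, "->", og_term(sample))
```

A parser written before this release that rejects unknown OG terms would
need 'Nucleomorph' added to its list in the same way.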


In ENZYME:

- Release 30 of the ENZYME database contains 4'136 entries. The description of
   many existing entries was updated.


In PROSITE:

- A new release of PROSITE will be announced in a few weeks.

B) Future developments

Here is what was announced as planned changes for release 41:
- From the next release on, the Swiss-Prot release notes will be available in
   a different format. Detailed information is available at
- We are planning to extend the mnemonic code for the protein name
   in the ID line from up to 4 characters to up to 5 characters.
- We will continue to convert Swiss-Prot entries from all 'UPPER CASE' to
   'MiXeD CaSe'. We are proceeding with the conversion of CC (Comment) lines
   and will start to convert the GN (Gene Name) lines, but any other line type
   might also be affected.
- We will allow multiple RC lines, in which one topic (PLASMID, SPECIES,
   STRAIN, TISSUE or TRANSPOSON) may span more than one line.
- We are gradually restructuring the CC (comment) line topic ALTERNATIVE
   PRODUCTS and introducing unique identifiers for each described isoform.
   A detailed description of the new format is available in the release notes.
- We will add cross-references (DR line) to the Gene Ontology (GO).
- The following modifications are planned in the feature table (FT):
   A new feature key 'CROSSLNK' will be introduced to describe bonds between
   amino acids, which are formed posttranslationally within a peptide or
   between peptides. This will also include the description of thioether bonds
   and thiolester bonds, and thus the feature keys 'THIOETH' and 'THIOLEST'
   will be removed.
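
Software that validates entry names may need adjusting for the longer protein
mnemonic planned above. The following Python sketch shows one way to make the
maximum length configurable. The function name is_valid_entry_name and the
sample names are hypothetical; the sketch assumes entry names of the form
PROTEIN_SPECIES, made of upper-case letters and digits, with a species code
of up to 5 characters. Only the change from 4 to 5 characters for the protein
mnemonic comes from this announcement.

```python
import re


def is_valid_entry_name(name, max_protein_len=4):
    """Check an entry name of the assumed form PROTEIN_SPECIES.

    max_protein_len is the maximum length of the protein mnemonic:
    4 under the current rule, 5 under the planned one.
    Assumption: both parts use upper-case letters and digits only, and the
    species code is at most 5 characters long.
    """
    pattern = r"[A-Z0-9]{1,%d}_[A-Z0-9]{1,5}" % max_protein_len
    return re.fullmatch(pattern, name) is not None


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(is_valid_entry_name("CYC_HUMAN"))                       # current rule
    print(is_valid_entry_name("ABCDE_HUMAN"))                     # current rule
    print(is_valid_entry_name("ABCDE_HUMAN", max_protein_len=5))  # planned rule
```

Keeping the limit as a parameter, rather than hard-coding 4, lets the same
check be used before and after the change takes effect.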

Of course the above list is far from being definitive, we await your

Swiss-Prot is copyright.  It is produced through a collaboration between the
Swiss Institute  of  Bioinformatics   and the EMBL Outstation - the European
Bioinformatics Institute. There are no restrictions on its use by non-profit
institutions as long as its  content is in no way modified. Usage by and for
commercial entities requires a license agreement.  For information about the
licensing  scheme  see: http://www.isb-sib.ch/announce/ or send  an email to
license at isb-sib.ch.
Elisabeth Gasteiger
Swiss Institute of Bioinformatics
CMU - 1, rue Michel Servet                Tel. (+41 22) 379 58 75
CH - 1211 Geneva 4 Switzerland            Fax  (+41 22) 379 58 58
Elisabeth.Gasteiger at isb-sib.ch            http://www.expasy.org/


