This is a general announcement of ProDom release 34, available at:
http://protein.toulouse.inra.fr/prodom.html
The Protein Domain database, ProDom, release 34, has been constructed
by clustering homologous segments derived from 53145 non-fragmentary
sequences present in SWISS-PROT 34. It can be retrieved by FTP at:
ftp://ftp.toulouse.inra.fr/pub/prodom/prodom34
It provides 18086 multiple alignments and consensus sequences for
homologous domain families. The vast majority of these families
have been generated automatically using MKDOM, a much improved
version of the DOMAINER program (Sonnhammer & Kahn, 1994, Protein
Science, 3:482-492; Gouzy, Eugene, Greene, Kahn & Corpet, in
preparation). Several steps have been taken towards improving the
quality of ProDom. In particular, all multiple alignments have been
recalculated ab initio using the MultAlin program (Corpet, 1988, Nucl.
Acids Res. 16:10881-10890; http://www.toulouse.inra.fr/multalin.html).
In addition, a new expert review procedure has been introduced to validate
some domain boundaries.
The Web user interface has also been considerably enhanced.
Links are provided to and from PROSITE and PDB. They have been
calculated with the help of the LASSAP program (http://alize.inria.fr/).
Domain families can be searched by keyword, and graphical
representations of domain arrangements can considerably facilitate the
structural interpretation of large protein families.
In addition, we now provide a sensitive homology search procedure
which scans all domain sequences in ProDom and retrieves matches with
only one sequence for each domain family, thus drastically reducing
output redundancy. The most significant matches are visualized
graphically to assist with interpretation. For long queries, the former
search on consensus sequences, which is less sensitive but faster, is also
provided. Users can choose between the classical NCBI BLAST 1.4.9 and
the new WU-BLAST 2.0a8, which allows gapped output
(http://blast.wustl.edu).
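The reduction of output redundancy described above, keeping only one match
per domain family, can be illustrated with a minimal sketch. This is not the
ProDom server code: the Hit record, its field names, the family and sequence
identifiers, and the scores are all hypothetical and serve only to show the
idea of retaining the best-scoring match for each family.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    family: str    # hypothetical domain family identifier
    sequence: str  # hypothetical identifier of the matched domain sequence
    score: float   # similarity score; higher means more significant


def best_hit_per_family(hits):
    """Keep only the highest-scoring hit for each domain family."""
    best = {}
    for hit in hits:
        current = best.get(hit.family)
        if current is None or hit.score > current.score:
            best[hit.family] = hit
    # Report the most significant matches first.
    return sorted(best.values(), key=lambda h: h.score, reverse=True)


# Made-up hits: three of them fall in the same family.
hits = [
    Hit("FAM_A", "seq1", 120.0),
    Hit("FAM_A", "seq2", 310.0),
    Hit("FAM_B", "seq3", 95.0),
    Hit("FAM_A", "seq4", 88.0),
]
reduced = best_hit_per_family(hits)
```

With these made-up hits, the three matches in FAM_A collapse to the single
best one, so the reduced list holds one entry per family.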
For any request, please mailto:proquest at toulouse.inra.fr
Jerome Gouzy
Daniel Kahn
Laboratoire de Biologie Moleculaire
des Relations Plantes-Microorganismes,
CNRS-INRA
Florence Corpet
Laboratoire de Genetique Cellulaire,
INRA
BP 27
31326 Castanet-Tolosan Cedex
France