THE C. elegans GENOME PROJECT FTP SITE
The cosmid sequences from the C. elegans genome
project are available by anonymous FTP from :-
The ftp site may also be accessed through our
This site not only provides an easy way of accessing
cosmid sequences which have already been submitted to the major
databases but also allows access to more preliminary sequences.
Sequences within the FTP site are not stored compressed. However,
the site has been configured so that if you request a compressed
sequence, i.e. append .Z or .gz to the file name, a compressed version
will be sent.
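The on-request compression described above could be used from a script roughly as follows. This is a minimal sketch in Python using the standard ftplib and gzip modules. The helper names (compressed_name, fetch_bytes, fetch_sequence_text) are illustrative only, and the caller is assumed to supply an already connected ftplib.FTP object and the file path; neither a host name nor a path is assumed here. Only the .gz form is unpacked, since the Python standard library has no reader for .Z (compress) files.

```python
import gzip
from ftplib import FTP


def compressed_name(remote_path: str, suffix: str = ".gz") -> str:
    """Return the name to request so the server sends a compressed copy."""
    if suffix not in (".gz", ".Z"):
        raise ValueError("suffix must be '.gz' or '.Z'")
    return remote_path + suffix


def fetch_bytes(ftp: FTP, remote_path: str) -> bytes:
    """Download one file in binary mode and return its raw bytes."""
    chunks = []
    ftp.retrbinary("RETR " + remote_path, chunks.append)
    return b"".join(chunks)


def fetch_sequence_text(ftp: FTP, remote_path: str) -> str:
    """Request the .gz version of remote_path and return the unpacked text."""
    raw = fetch_bytes(ftp, compressed_name(remote_path, ".gz"))
    return gzip.decompress(raw).decode("ascii")
```

A caller would typically create the connection with ftplib.FTP, log in anonymously with its login() method, and then pass the connection and a file path to fetch_sequence_text.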
The sequence data is currently divided into 3 directories.
1. EMBL_SEQUENCES: This directory contains sequences which
have been finished, annotated and
submitted to the public databases.
These files are in EMBL format.
2. FINISHED_SEQUENCES: This directory contains sequences which
have been finished, although final
checks and annotations may not have been
completed. Because of this please be
aware that some of these sequences may
still contain some errors and are
subject to change from day to day.
3. UNFINISHED_SEQUENCES: This part of the FTP service is
currently experimental. It contains
all the contiguous sequences (>1000bp)
from all the C. elegans cosmids being
worked on at the Sanger Centre. These
sequences are very preliminary and may
contain both E. coli and vector
contamination. However, they will be
of use for mapping purposes and for
Each file will contain the
contigs for a particular cosmid. The
contigs will be in FASTA format with
a unique identifier for each.
As these sequences are
actively being worked on it is
expected that there will be a great
deal of change in the contigs each
time the FTP site is updated.
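Since each file in UNFINISHED_SEQUENCES holds the contigs for one cosmid in FASTA format, a downloaded file could be split into individual contigs along these lines. This is a minimal sketch in Python; the function name read_contigs and the example identifiers are hypothetical, and it assumes only the generic FASTA convention that a header line begins with ">" and that its first whitespace-delimited word is the identifier.

```python
def read_contigs(fasta_text: str) -> dict:
    """Map each contig identifier to its sequence from FASTA-formatted text."""
    contigs = {}
    current_id = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: the first word after '>' is taken as the identifier.
            words = line[1:].split()
            if not words:
                raise ValueError("header line without an identifier")
            current_id = words[0]
            if current_id in contigs:
                raise ValueError("duplicate identifier: " + current_id)
            contigs[current_id] = []
        elif current_id is None:
            raise ValueError("sequence data before first header")
        else:
            # Sequence line: may be one of several wrapped lines per contig.
            contigs[current_id].append(line)
    return {name: "".join(parts) for name, parts in contigs.items()}


# Made-up two-contig example, not real cosmid data.
example = ">contigA example\nACGT\nAC\n>contigB\nGGTT\n"
lengths = {name: len(seq) for name, seq in read_contigs(example).items()}
print(lengths)
```

Because the contigs are expected to change between updates, keying on the identifier alone may not be stable across downloads; comparing lengths or sequences from one update to the next is one way to notice such changes.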
PLEASE NOTE: This FTP site now contains ALL C. elegans sequence data
existing within the Sanger Centre and will be automatically
comments and suggestions to: Steve Jones (sjj at sanger.ac.uk)