>>>>> "Charlie" == Charlie <cckim at stanford.edu> writes:
Charlie> This is an interesting problem, but as far as I know it
Charlie> has not been addressed computationally. The primary
Charlie> problem is that "pathogenicity islands" are not a
Charlie> precisely defined entity; rather, it is a somewhat
Charlie> arbitrarily defined region. I'm sure you already know
Charlie> some of the common characteristics, but I will list a few
Charlie> for the benefit of other readers of this post:
Charlie> 1) Flanked by tRNA sequences
Charlie> 2) %GC is low relative to the rest of the genome
Charlie> 3) Contain sequences that are unique to the organism
Charlie> 4) Often large (>20kb)
Charlie> The problem is that very few of the many designated
Charlie> pathogenicity islands out there actually have all of
Charlie> these features. My belief is that the term is used
Charlie> fairly loosely in order to spice up the data a bit.
As Charlie points out, how you might try to accomplish this depends on
what you mean by a "pathogenicity island". For example, if you were
looking for such regions in enteric pathogens, you might try BLAST searches
against a database of known "islands" or classes of genes (iron
uptake, type III/IV secretion systems, prophage) in addition to the
indicators he mentioned.
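
As a rough illustration of the %GC indicator Charlie lists, here is a
minimal Python sketch of a sliding-window scan. It is not what Artemis
does internally; the function names, the window and step sizes and the
"drop" threshold are all assumptions chosen for illustration, and
sensible values will depend on the genome.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def low_gc_windows(genome, window=20000, step=5000, drop=0.05):
    """Return (start, end, gc) for each window whose GC fraction is at
    least `drop` below the genome-wide GC fraction.

    Coordinates are 0-based, end-exclusive. The defaults are
    illustrative assumptions, not recommended values.
    """
    genome_gc = gc_fraction(genome)
    hits = []
    # Slide a fixed-size window along the sequence in steps of `step`.
    for start in range(0, max(len(genome) - window, 0) + 1, step):
        chunk = genome[start:start + window]
        gc = gc_fraction(chunk)
        if genome_gc - gc >= drop:
            hits.append((start, start + len(chunk), gc))
    return hits
```

Overlapping hits would need merging before use, and a low-%GC window on
its own is only a hint; it still has to be weighed against the other
indicators.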
Tools which we use to help us include:
tRNAscan-SE (Todd Lowe & Sean Eddy,
http://www.genetics.wustl.edu/eddy/software/#trnascan) for identifying
those tRNAs which may flank these regions.
Artemis (Kim Rutherford, http://www.sanger.ac.uk/Software/Artemis) for
visualising anomalous %GC / dinucleotide frequencies, tRNAs
(identified by tRNAscan-SE) and other genomic context.
ACT (Kim Rutherford, http://www.sanger.ac.uk/Software/ACT) for
visualising whole-genome comparisons (some of these types of
regions appear to be horizontally transferred).
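
If you want to check the "flanked by tRNA" criterion outside a viewer,
something like the following Python sketch might do. It assumes you
have already pulled tRNA coordinates out of your tRNAscan-SE results
into (name, start, end) tuples; that layout, the function names and the
max_dist default are hypothetical, not part of any of the tools above.

```python
def distance_to_point(lo, hi, point):
    """Distance from the interval [lo, hi] to a single coordinate."""
    if lo <= point <= hi:
        return 0
    return min(abs(lo - point), abs(hi - point))


def flanking_trnas(region, trnas, max_dist=1000):
    """Return names of tRNAs within max_dist bases of either region end.

    region: (start, end) of the candidate island.
    trnas:  list of (name, start, end) tuples (hypothetical layout).
    """
    r_start, r_end = region
    names = []
    for name, t_start, t_end in trnas:
        # Assumption: reverse-strand features may be listed end-first,
        # so normalise to lo <= hi before measuring distances.
        lo, hi = min(t_start, t_end), max(t_start, t_end)
        nearest = min(distance_to_point(lo, hi, r_start),
                      distance_to_point(lo, hi, r_end))
        if nearest <= max_dist:
            names.append(name)
    return names
```

For example, with a candidate region of (10000, 35000) and a 500 base
cutoff, a tRNA at 9800..9890 and one listed as 35400..35330 would both
be reported, while one at 50000..50085 would not.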
Artemis and ACT are Java applications which run on Unix/Linux, Mac and
Windows and can be downloaded for free from the above URLs. Hope this
is of some use.
-= Keith James - kdj at sanger.ac.uk - http://www.sanger.ac.uk/Users/kdj =-
Pathogen Sequencing Unit, Wellcome Trust Sanger Institute, Cambridge, UK