Can't agree that introns are just 'gaps': we are finding that conserved
non-coding regions (CNCRs) are highly correlated with regulation, and that
alternative splicing often involves differences in intron as well as
exon lengths and composition. I would propose that acedb classes be
defined which support CNCRs and alternative splicing. Please see our
Alternative Splicing Database,
Visualization Tools for Alignment (VISTA), http://www-gsd.lbl.gov/vista
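
To make this concrete, here is a rough Python sketch (not acedb code) of
introns as first-class objects that can carry their own annotations, plus
a check for the case raised in the quoted message below, where both exons
and introns are given and may disagree. The names Span, Intron,
introns_from_exons and is_consistent are hypothetical, and the 1-based
inclusive coordinates are an assumption for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Span:
    """A 1-based, inclusive interval (assumed coordinate convention)."""
    start: int
    end: int


@dataclass
class Intron:
    """An intron as an object in its own right, not just a gap."""
    span: Span
    # Hypothetical annotation slot: conserved non-coding regions in the intron.
    conserved_regions: List[Span] = field(default_factory=list)


def introns_from_exons(exons: List[Span]) -> List[Span]:
    """Derive intron spans as the gaps between consecutive sorted exons."""
    ordered = sorted(exons, key=lambda s: s.start)
    gaps = []
    for left, right in zip(ordered, ordered[1:]):
        if right.start <= left.end:
            raise ValueError(f"exons overlap: {left} and {right}")
        if right.start > left.end + 1:
            gaps.append(Span(left.end + 1, right.start - 1))
    return gaps


def is_consistent(exons: List[Span], introns: List[Intron]) -> bool:
    """True when the stored introns are exactly the gaps implied by the exons."""
    stored = sorted((i.span for i in introns), key=lambda s: s.start)
    return stored == introns_from_exons(exons)


exons = [Span(1, 100), Span(201, 300), Span(451, 500)]
introns = [
    Intron(Span(101, 200)),
    Intron(Span(301, 450), conserved_regions=[Span(320, 360)]),
]
print(is_consistent(exons, introns))
```

A loader could reject, or at least flag, any transcript for which this
check fails, rather than silently preferring one representation.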
Donn Davy, CBCG at LBNL
Tim Cutts wrote:
> In article <3BDEA54C.FD71A337 at sanger.ac.uk>,
> Richard Durbin <rd at sanger.ac.uk> wrote:
>
> >We have talked for some time about ways to support an alternative exon
> >implementation for acedb/fmap more in line with what is standard
> >elsewhere, i.e. having separate exon objects, with transcripts linking
> >to the exons that they contain. Time to have this discussion with
> >proposals in the open in the acedb newsgroup, I think.
>
> Sounds like a good idea to me!
>
> >Should we also support Intron objects? If so, would either exons or introns be
> >acceptable, and what happens if both are given and they are inconsistent?
>
> Storing introns as objects seems a slightly strange concept to me - my
> mental model is that introns are "gaps". Certainly technologies such as
> BioJava are modelling transcripts as lists of exon objects, and that's
> how we modelled them at Incyte too.
>
> >I hasten to add, for reasons of continuity, and because some people strongly like
> >all the exon structure information being explicit in the transcript-type objects,
> >I expect we will continue to support the current style.
>
> Well something like this in ?Sequence
>
> Source_exons Int Int ?Exon
>
> would be backward compatible with current databases. It works well in my
> mind because it separates properties of the transcript (CDS etc, which
> should indeed be properties of the ?Sequence object) from properties of
> the exon itself (which probably aren't that numerous!).
>
> I suppose you could add some XREFs so that exons could list which
> transcripts they belong to, although as a rule I avoid throwing XREFs
> around like confetti since they can drastically slow things down
> (parsing large numbers of DNA_homol lines, anyone?!)
> i.e. something like:
>
> // in ?Sequence
>
> Source_exons Int Int ?Exon XREF In_transcript
>
> ?Exon In_transcript ?Sequence
>
> Sorry, might have the syntax wrong there - I don't have a copy of ACeDB
> on my machine here at home.
>
> Thoughts? My idea is pretty simple, but it certainly would solve my
> needs, and maintains compatibility with current code.
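
For readers less familiar with XREFs, the two-way link in the quoted
proposal can be sketched in plain Python (this is not ACeDB code or its
API). The class ExonStore and its methods are hypothetical names; the
point is only that each write to the transcript side also updates the
exon side, which is the bookkeeping cost the quoted message warns about.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class ExonStore:
    """Toy in-memory stand-in for ?Sequence and ?Exon objects with a two-way link."""

    def __init__(self) -> None:
        # transcript name -> sorted (start, end, exon name) rows, like Source_exons
        self.source_exons: Dict[str, List[Tuple[int, int, str]]] = defaultdict(list)
        # exon name -> transcript names, like the In_transcript back-reference
        self.in_transcript: Dict[str, Set[str]] = defaultdict(set)

    def add_exon(self, transcript: str, start: int, end: int, exon: str) -> None:
        """Record an exon in a transcript and maintain the reverse link."""
        if start > end:
            raise ValueError("start must not exceed end")
        rows = self.source_exons[transcript]
        rows.append((start, end, exon))
        rows.sort()
        # The XREF-like step: every forward write also touches the exon side.
        self.in_transcript[exon].add(transcript)

    def transcripts_sharing(self, exon: str) -> List[str]:
        """List the transcripts that contain the given exon."""
        return sorted(self.in_transcript.get(exon, set()))


store = ExonStore()
store.add_exon("T1", 1, 100, "E1")
store.add_exon("T1", 201, 300, "E2")
store.add_exon("T2", 1, 100, "E1")
store.add_exon("T2", 401, 500, "E3")
print(store.transcripts_sharing("E1"))
```

Without the reverse map, finding every transcript that uses an exon means
scanning all transcripts; with it, each insert does extra work. That is
the trade-off between the two model variants quoted above.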