
big databases

Richard Durbin rd at sanger.ac.uk
Mon Oct 14 08:36:28 EST 2002


There shouldn't be any downsides, apart from the general undesirability of having
very big single Unix files.  You could try 200,000 (200 MB) for WormBase.  Excellent
if this helps.
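
For a rough feel of what the different sizes mean, here is a small sketch of the
arithmetic. It assumes the wspec size is counted in units of about 1 KB (which is
what "200,000 (200 MB)" implies) and uses the 2.5 GB figure quoted below; the
function name and constant are made up for illustration and are not part of ACEDB.

```python
import math

# Assumption: one wspec size unit is about 1 KB, so 200,000 units ~ 200 MB.
KB_PER_UNIT = 1


def files_needed(database_kb, units_per_file):
    """Number of block files needed to hold a database of the given size."""
    return math.ceil(database_kb / (units_per_file * KB_PER_UNIT))


if __name__ == "__main__":
    database_kb = 2_500_000  # about 2.5 GB, as reported for GrainGenes
    for units in (50_000, 100_000, 200_000):
        print(f"{units} units per file: {files_needed(database_kb, units)} files")
```

Fewer, larger files keep the number of open file handles down, at the cost of
bigger individual files on disk.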

Richard

Keith Bradnam wrote:
> 
> > I'm interested in trading war stories and tips about how to deal with
> > big ACEDB databases.  GrainGenes recently got big and we're feeling like
> > we're bumping our head.  database/block*.wrm is about 2.5 GB.  Mainly
> > due to 500K Sequence records.
> >
> > Today I added a small new .ace file and ran into a Unix "too many open files"
> > error, where the default limit is 64.  I increased the limit and loaded the
> > file, no problem.  But then I wondered if increasing the size of the
> > block*.wrm files wasn't a better solution.
> >
> > Fifty files are defined in the distribution (4_9i) wspec, 50000 KB each
> > (except the first two).  So I changed them to 100000 and reinitialized.
> >
> > The result to report is that the time for reloading the database improved
> > from 7.5 hr to 4.25.  (On a Sun Ultra10, 300MHz, 1 GB RAM, ca. 450 MB memory
> > used by the ACEDB process during loading.)
> 
> Any word from the acedb guys as to whether they fully endorse
> this? I.e. would there be any unforeseen side effects (slower access
> times maybe)?
> 
> I'll try to test this with our large database at some time and will report
> back any speed gains in loading time.
> 
> Keith
> 
> ~  Keith Bradnam - WormBase group: http://www.wormbase.org/
> ~
> ~  The Wellcome Trust Sanger Institute,
> ~  Hinxton, Cambridge, CB10 1SA, UK.  Tel: 01223 494922
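
On the "too many open files" error mentioned above: the per-process limit can be
inspected, and the soft limit raised up to the hard limit, before starting a load.
A minimal sketch using Python's standard resource module on Unix follows; the
helper name raise_open_file_limit is hypothetical, and this is only one way to do
what the shell's ulimit does, not something ACEDB itself provides.

```python
import resource


def raise_open_file_limit(wanted):
    """Raise the soft open-file limit to `wanted` if possible; return the soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Nothing to do if the limit is already unlimited or high enough.
    if soft == resource.RLIM_INFINITY or wanted <= soft:
        return soft
    # An unprivileged process cannot raise the soft limit past the hard limit.
    new_soft = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft limit now:", raise_open_file_limit(256))
```

The new limit applies to the calling process and to anything it starts afterwards,
so the loader would have to be launched from the same process or shell.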

-- 
---------------------------------------------------------------------
Richard Durbin                    The Wellcome Trust Sanger Institute
email: rd at sanger.ac.uk                   Wellcome Trust Genome Campus
http://www.sanger.ac.uk/Users/rd/                             Hinxton
phone: 01223 494978                                Cambridge CB10 1SA
fax: 01223 494919                                                  UK
---------------------------------------------------------------------




