On 21 Aug 2000, Kurt Stueber wrote:
> Dear colleagues,
>
> I would like to record the results of blastn searches in acedb.
> Looking at the models.wrm file I found the line
> in the model for "sequence":
>
> homol DNA_homol ?sequence XREF DNA_homol ?Method Float Int Int Int Int
>
> This looks like the place to store homologies. Is there
> a place where I can find out what the five numbers
> (one float and 4 integers) are designed to stand for?

The floating point number holds the blast score. Most blast
programs report two scores for how good a match is. The first is the
raw score, which is an integer. The second is a score measured in
'bits' (a unit of information content) and is a floating point number.
Because the bit score is normalised, it lets you compare blast results
from different datasets. The raw score on its own is not that useful
for such comparisons.
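As a rough illustration of how the two scores relate, the standard
conversion from a raw score S to bits is (lambda * S - ln K) / ln 2,
where lambda and K are the statistical parameters printed at the end of
each blast report. The sketch below is in Python; the function name and
the parameter values are made up for illustration and are not taken
from any particular blast run.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lam * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Hypothetical values for illustration only. The real lambda and K
# depend on the scoring scheme and are listed in the report itself.
lam, k = 1.3, 0.7
print(bit_score(100, lam, k))
```

Because lambda and K are folded into the result, two bit scores can be
compared even when the searches used different scoring schemes.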

You should probably use the score in bits for the 'Float' field of the DNA
homology info, but that's up to you. The remaining four numbers are two
pairs of coordinates: the start and end of the match on the query
sequence, followed by the start and end on the matched sequence.

E.g. consider the following (very short) ace file:

Sequence : "EMBL:AI618741"
DNA_homol "EMBL:AC009513" "BLASTN_EST" 547 4 279 47153 47428

This query sequence (AI618741) has a match to AC009513. The blast score
is 547, and the match lies between bases 4..279 on the query sequence
and 47,153..47,428 on the other sequence.
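If you have many hits to load, a small script can write records in this
layout for you. The following Python sketch is only one way to do it:
the Hit class and the to_ace function are hypothetical names, and it
assumes you have already pulled the score and coordinates out of your
blast output.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One blast match (hypothetical container, not part of acedb)."""
    query: str      # name of the query Sequence object
    subject: str    # name of the matched Sequence object
    method: str     # name of the Method object, e.g. "BLASTN_EST"
    score: float    # goes in the Float field (bit score suggested)
    q_start: int    # match start on the query
    q_end: int      # match end on the query
    s_start: int    # match start on the matched sequence
    s_end: int      # match end on the matched sequence


def to_ace(hit):
    """Format one hit as a two-line ace record like the example above."""
    return (
        f'Sequence : "{hit.query}"\n'
        f'DNA_homol "{hit.subject}" "{hit.method}" '
        f'{hit.score} {hit.q_start} {hit.q_end} {hit.s_start} {hit.s_end}\n'
    )


example = Hit("EMBL:AI618741", "EMBL:AC009513", "BLASTN_EST",
              547, 4, 279, 47153, 47428)
record = to_ace(example)
print(record)
```

When writing several records to one file, leave a blank line between
them so that each Sequence object is read separately.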

Hope this helps,

~ Keith Bradnam - Developer, Arabidopsis Genome Resource (AGR)
~ Nottingham Arabidopsis Stock Centre - http://nasc.nott.ac.uk/
~ University Park, University of Nottingham, NG7 2RD, UK
~ Tel: (0115) 951 3091