Greetings GenBank Users,
Several of you reported that some records in October's Release 132.0 had
COMMENTs or /notes with line lengths that exceeded the 79-character maximum
for the GenBank flatfile format. A total of thirteen records were affected,
in seven release files:
gbpln4.seq.gz
gbpln5.seq.gz
gbpri19.seq.gz
gbpri23.seq.gz
gbvrl1.seq.gz
gbvrl3.seq.gz
gbvrt2.seq.gz
Appended below are per-file lists of the accessions involved.
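
For readers who want to check a local copy of one of these files, the
following is a minimal sketch of a line-length scan. It is not the code used
by GenBank. The names find_long_lines and scan_release_file are hypothetical,
and the sketch assumes that each record begins with a LOCUS line and carries
its accession on an ACCESSION line that precedes the COMMENT and feature
lines.

```python
import gzip

MAX_LEN = 79  # maximum flatfile line length, as stated in this announcement


def find_long_lines(lines, max_len=MAX_LEN):
    """Yield (accession, line_number, length) for each over-long line.

    Assumption: records start with LOCUS and name their accession on an
    ACCESSION line before any COMMENT or /note text appears.
    """
    accession = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if line.startswith("LOCUS"):
            accession = None  # new record; its accession is not yet known
        elif line.startswith("ACCESSION"):
            fields = line.split()
            accession = fields[1] if len(fields) > 1 else None
        if len(line) > max_len:
            yield accession, number, len(line)


def scan_release_file(path):
    """Scan one gzip-compressed release file, e.g. gbpln4.seq.gz."""
    with gzip.open(path, "rt", encoding="ascii", errors="replace") as handle:
        return list(find_long_lines(handle))
```

Calling scan_release_file("gbpln4.seq.gz") on a downloaded file would return
one tuple per over-long line, which can then be compared against the
accession lists appended to this message.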
Our flatfile generator was recently enhanced to handle URLs embedded within
text strings in a smarter way. Unfortunately, the changes also introduced
a bug in the line-length calculation when those text strings contain tokens
of the form:
<.........>
For example:
"See <apehbb5ch>, <mnkhbb3rh>, <mnkhbb5rh>, <mnkhbb3ce>, <mnkhbb5ce>."
The bug was not detected by our usual QA procedures because there were
hundreds of thousands of COMMENT and /note diffs, due to unrelated code
changes that removed extraneous whitespace characters.
We have added safeguards to our release-generation procedures which are
designed to prevent such problems from occurring in the future.
Our apologies for the inconvenience that this formatting error has caused
some users.
Mark Cavanaugh
GenBank
NCBI/NLM/NIH
pln4
----
X13611
pln5
----
J01390
pri19
-----
J00326
J00329
pri23
-----
M93406
vrl1
----
J01917
J01966
J01969
V00005
X02996
X02998
vrl3
----
L00161
vrt2
----
X04804