
[Genbank-bb] GenBank Update Problem : 1016 : Corrupted con_nc.1016.flat.gz update file

Cavanaugh, Mark (NIH/NLM/NCBI) [E] via genbankb@net.bio.net (by cavanaug from ncbi.nlm.nih.gov)
Fri Oct 18 15:11:28 EST 2013


Greetings GenBank Users,

The flatfile version of the "CON division" GenBank Incremental Update
(GIU) product for October 16 2013 contained corrupted CONTIG lines.

The affected file was con_nc.1016.flat.gz :

ftp> pwd
257 "/genbank/daily-nc" is the current directory

ftp> dir con*1016*
227 Entering Passive Mode (130,14,29,30,201,94)
150 Opening ASCII mode data connection for file list
-r--r--r--   1 ftp      anonymous   372433 Oct 16 06:03 con_nc.1016.flat.gz


Here is a portion of one of the impacted records:

LOCUS       HF677448               31135 bp    DNA     linear   CON 14-OCT-2013
DEFINITION  Clostridium difficile T5 genomic scaffold, 1852_175, whole genome
            shotgun sequence.
ACCESSION   HF677448 CAMB01000000
VERSION     HF677448.1  GI:549396745
DBLINK      BioProject: PRJEB188
....
FEATURES             Location/Qualifiers
     source          1..31135
                     /organism="Clostridium difficile T5"
                     /mol_type="genomic DNA"
                     /strain="T5"
                     /db_xref="taxon:1215059"
                     /note="1852_175"
CONTIG      join(<C8>#.1:1..31135)
//


A total of 27 records were affected. Other examples of mangled
CONTIG lines are:

CONTIG      join(È#.1:1..31135)
CONTIG      join(È#.1:1..74315)
CONTIG      join(Ô,†â#36472546.1:1..1743)
CONTIG      join(D#.1:1..18776)
CONTIG      join(I from V034512:1..7477)
CONTIG      join(,†Ô#16940.55:1..1610)
CONTIG      join(NZ_ from EFR210166874.1:1..2732)
CONTIG      join(Ö›#25900.1:1..1769)
CONTIG      join(,†Ô#16940.55:1..8158)
CONTIG      join(NZ_ from EFR210166874.1:1..56047)
CONTIG      join(1,†éb#,†éc#1697416937.1:1..1641)
CONTIG      join(†ë‚#É,†ëƒ#7,†ë„#.1:1..31477)
CONTIG      join(÷#.1:1..29020)
CONTIG      join(û,†ÚÍ#ý,†ÚÎ#$,†ÚÏ#523011802.1:1..1118)
CONTIG      join(,†Ô#16940.55:1..9107)
CONTIG      join(È#.1:1..22095)
CONTIG      join(gi|549282866:1..7476)
CONTIG      join(1,†éb#,†éc#1697416937.1:1..7155)
CONTIG      join(I from V034512:1..23828)
CONTIG      join(Ô,†â#36472546.1:1..1432)
CONTIG      join(1,†éb#,†éc#1697416937.1:1..4334)
CONTIG      join(û,†ÚÍ#ý,†ÚÎ#$,†ÚÏ#523011802.1:1..26661)
CONTIG      join(ö#2066515663.1:1..29717)
CONTIG      join(,†Ô#16940.55:1..42966)
CONTIG      join(÷#.1:1..8802)
CONTIG      join(È#.1:1..1625)
CONTIG      join(1,†éb#,†éc#1697416937.1:1..628)
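Users who already loaded the 1016 flatfile may want to locate the affected
records themselves. The following is a minimal sketch of such a check, not
NCBI's own validation code. It assumes that every element of a CONTIG join()
is either an accession.version span, a complement() of such a span, or a
gap() placeholder, and it uses a deliberately simplified accession pattern.
The function names and the pattern are illustrative assumptions. Note that
a test for non-ASCII characters alone would not be enough, because several
of the mangled lines above consist entirely of ASCII characters.

```python
import re

# ASSUMPTION: simplified accession.version span, e.g. "AB000001.1:1..100" or a
# RefSeq-style "NZ_XXXX01000001.1:1..100". Real accession rules are richer.
_SPAN = r"(?:[A-Z]{2}_[A-Z]*|[A-Z]+)\d+\.\d+:\d+\.\.\d+"
_ELEMENT = re.compile(
    rf"(?:{_SPAN}|complement\({_SPAN}\)|gap\((?:unk)?\d+\))"
)


def contig_is_well_formed(value):
    """Check the text following the CONTIG keyword (continuation lines joined)."""
    value = "".join(value.split())  # drop all whitespace, including line wraps
    match = re.fullmatch(r"join\((.*)\)", value)
    if not match:
        return False
    return all(_ELEMENT.fullmatch(e) for e in match.group(1).split(","))


def find_bad_contigs(lines):
    """Yield (locus_name, contig_text) for records whose CONTIG fails the check."""
    locus, contig = None, None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("LOCUS"):
            locus, contig = line.split()[1], None
        elif line.startswith("CONTIG"):
            contig = line[len("CONTIG"):]
        elif contig is not None and line.startswith(" "):
            contig += line  # continuation of a wrapped CONTIG line
        elif contig is not None:
            # "//" or any other keyword ends the CONTIG block
            if not contig_is_well_formed(contig):
                yield locus, contig.strip()
            contig = None
    if contig is not None and not contig_is_well_formed(contig):
        yield locus, contig.strip()  # file ended without a "//" terminator
```

To scan a downloaded update file, the lines could be read with
gzip.open(path, "rt", encoding="latin-1"), so that arbitrary corrupted bytes
still decode, and passed to find_bad_contigs().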

The ASN.1 version of the 1016 CON-division GIU was not affected.

An unstable system, unmonitored during the recent United States
government shutdown, was responsible for this problem. To address
it, we ensured that today's CON-division products contain all of
the records that had been present in the October 16th CON-division GIU:

-r--r--r--   1 ftp      anonymous  1515351 Oct 18 05:36 con_nc.1018.flat.gz

We have confirmed that all of the CONTIG lines in that file are correct.

Users who have not yet processed con_nc.1016.flat.gz may therefore
safely skip it and proceed with the 1017 and 1018 data products.

To prevent problems for others who may not yet have obtained
the 1016 CON-division GIU products, we have just removed them
from our FTP site (both flatfile and ASN.1 versions).

We would like to thank GenBank users at Chemical Abstracts Service
(www.cas.org) for alerting us to this problem. We appreciate the
scrutiny of the GIU that our users provide, and we welcome problem
reports.

Our apologies for any inconvenience that this may have caused.

Mark Cavanaugh
GenBank
NCBI/NLM/NIH/HHS



