
MEDLIB to BibTeX awk Script

POSTMAST at GUNBRF.bitnet
Fri Feb 14 14:02:00 EST 1992


At the request of quite a few people, I am posting the awk script for converting
MEDLINE-format tables of contents into BibTeX entries.  It is not perfect and it
does not generate tags (citation keys), but it does all of the dirty work.
------------------------------------------------------------------------
                                 Dr. John S. Garavelli
                                 Database Coordinator
                                 Protein Identification Resource
                                 National Biomedical Research Foundation
                                 Washington, DC  20007
                                 POSTMASTER at GUNBRF.BITNET
------------------------------------------------------------------------
# General script for conversion of MEDLIB files to BibTeX
#  1-MAY-1991 by John S. Garavelli
# Example input record:
#AU Nakamura-H.  Katayanagi-K.  Morikawa-K.  Ikehara-M.
#TI Structural models of ribonuclease H domains in reverse
#   transcriptases from retroviruses.
#SO Nucleic-Acids-Res.  1991 Apr 25.  19(8).  P 1817-1824.
# Only continuation lines for title and author are expected.
# Known limitation: the SO line is read by whitespace-separated field
# position, so references missing the day of the month come out wrong.
# It needs to be fixed to split on the "." instead, for records like:
#AU Nussinov-R.
#TI Compositional variations in DNA sequences.
#SO Comput-Appl-Biosci.  1991 July.  7(3).  P 287-293.

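# Continuation lines start with two or more blanks: append them to
# whichever field (author or title) was opened last.
# awk strings are indexed from 1, so the test must start at position 1.
{ if (substr($0, 1, 2) == "  ") {
    if (authorflag) theauthor = theauthor substr($0, 3, 80)
    if (!authorflag) thetitle = thetitle substr($0, 3, 80)
  }
}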

{ if ($1 == "AU") {
    authorflag = 1
    theauthor = substr($0, 4, 80)
  }
}

{ if ($1 == "TI") {
    authorflag = 0
    thetitle = substr($0, 4, 80)
  }
}

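# The SO line closes a record: print the entry from the saved author and
# title plus the whitespace-separated fields of this line.
{ if ($1 == "SO") {
    thejournal = $2
    gsub(/-/, " ", thejournal)

    printf("@article{,\n  author = \"")
    # Authors are separated by a period and blanks; join them with "and".
    sub(/[ \t]+$/, "", theauthor)
    gsub(/\. +/, ". and ", theauthor)
    n = split(theauthor, authors)
    for (i = 1; i <= n; i++) {
      # Turn "Surname-Initials" into "Initials Surname"; "and" passes through.
      gsub(/-/, " ", authors[i])
      p = split(authors[i], parts)
      for (j = 2; j <= p; j++) printf("%s ", parts[j])
      printf("%s%s", parts[1], (i < n) ? " " : "")
    }
    printf("\",\n")
    printf("  title = \"%s\",\n", thetitle)
    printf("  journal = \"%s\",\n", thejournal)
    printf("  year = %s,\n", $3)
    printf("  month = \"%s\",\n", $4)
    # Field positions below assume the date has a day, e.g. "1991 Apr 25."
    thevolume = $6
    gsub(/\./, "", thevolume)
    printf("  volume = \"%s\",\n", thevolume)
    thepages = $8
    gsub(/\./, "", thepages)
    printf("  pages = \"%s\"\n}\n", thepages)
  }
}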
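
The header comments note that the SO line should be split on the "." so that
references without a day of the month still parse.  What follows is only a
minimal sketch of that idea and is not part of the original script.  It assumes
the parts of an SO line are always separated by a period followed by at least
two blanks.  The helper name parse_so and the "|"-separated output are made up
for illustration.  The awk code is wrapped in a small shell function so it can
be tried on a single line.

```sh
# Hypothetical helper (not in the original script): parse one MEDLINE "SO"
# line by splitting on a period followed by two or more blanks, instead of
# relying on whitespace-separated field positions.
parse_so() {
  printf '%s\n' "$1" | awk '
    $1 == "SO" {
      line = substr($0, 4)                  # text after the "SO " label
      sub(/\.[ \t]*$/, "", line)            # drop the final period
      split(line, part, /\.  +/)            # journal, date, volume(issue), pages
      split(part[2], when, " ")             # when[1] = year, when[2] = month
      journal = part[1] "."                 # restore the abbreviation period
      gsub(/-/, " ", journal)
      pages = part[4]
      sub(/^P /, "", pages)                 # strip the leading page marker
      printf "%s|%s|%s|%s|%s\n", journal, when[1], when[2], part[3], pages
    }'
}
```

With the second example from the header comments,

    parse_so 'SO Comput-Appl-Biosci.  1991 July.  7(3).  P 287-293.'

this prints "Comput Appl Biosci.|1991|July|7(3)|287-293", and the first example
(with "1991 Apr 25.") gives the same year, month, volume and page layout because
the day simply ends up in an unused third date field.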




