I have realized that my previous cryptic remark about "subtle errors" in
the genetic codes contained in NCBI's ASN.1 database and toolkits
was a poor choice of words and might be interpreted as a slam
on NCBI. It was not meant as one. Here's the full story:

Some codons appear never to be used in some organisms. In a few cases,
these codons appear to be "unassigned": no acceptor tRNA or release
factor can be found for them. It has also been shown in a few cases
that unassigned codons introduced artificially into messages cause
ribosomes to stall at that codon without releasing the nascent
polypeptide -- therefore they are truly different from stop codons.

Unassigned codons create a serious representational problem for
any programmer, and to my knowledge no genetic code implementation
deals with this (including my own!). They should definitely NOT be
listed as an amino acid (for they most definitely don't code for
one) and probably should not be listed as stop (as they are not true
stops). Two alternatives are to list them as "gap" or "unknown",
neither of which is entirely satisfactory. "Gap" is misleading because
it is not inconceivable that there exists somewhere in biology a codon
which (sometimes) is skipped entirely on the basis of that codon alone
(i.e. not due to neighboring information), and an unassigned codon is
not that. "Unknown" is not the best either, as we do know what that
codon codes for: nothing. A "null" is the best solution, but again no
code (to my knowledge) allows for this.
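
To make the "null" idea concrete, here is a minimal sketch in Python of
a codon table with three outcomes instead of two. Everything in it is
my own illustration and not taken from NCBI's toolkit or any other
package: the names CodonKind, build_table and translate are
hypothetical, and the demo table (standard code with AUA and AGA marked
unassigned) is an assumption made only to exercise the code, not a
curated table for any organism.

```python
from enum import Enum
from itertools import product


class CodonKind(Enum):
    AMINO_ACID = "amino acid"
    STOP = "stop"
    UNASSIGNED = "unassigned"  # the "null": no tRNA and no release factor


BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_CODE = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def build_table(residues=STANDARD_CODE, unassigned=()):
    """Map each codon to (kind, residue); residue is None unless coding."""
    unassigned = {c.upper().replace("U", "T") for c in unassigned}
    table = {}
    for bases, residue in zip(product(BASES, repeat=3), residues):
        codon = "".join(bases)
        if codon in unassigned:
            table[codon] = (CodonKind.UNASSIGNED, None)
        elif residue == "*":
            table[codon] = (CodonKind.STOP, None)
        else:
            table[codon] = (CodonKind.AMINO_ACID, residue)
    return table


def translate(message, table):
    """Translate until the first non-coding codon.

    Returns (peptide, kind), where kind is CodonKind.STOP,
    CodonKind.UNASSIGNED, or None if the message simply ran out.
    Codons with ambiguous bases raise KeyError in this sketch.
    """
    message = message.upper().replace("U", "T")
    peptide = []
    for i in range(0, len(message) - len(message) % 3, 3):
        kind, residue = table[message[i:i + 3]]
        if kind is not CodonKind.AMINO_ACID:
            return "".join(peptide), kind
        peptide.append(residue)
    return "".join(peptide), None


# Illustration only: standard code with AUA and AGA treated as unassigned.
demo = build_table(unassigned=["AUA", "AGA"])
```

With this table, translate("AUGGCUAGAGGU", demo) returns
("MA", CodonKind.UNASSIGNED), so the caller can tell a stalled message
apart from one that ended at a true stop, which would come back with
CodonKind.STOP.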

NCBI's documentation describes a genetic code table ("SGC3") for
use with "Mold mitochondria and mycoplasma", and this table lists
CGG as Arginine. CGG has been shown biochemically to be an
unassigned codon in Mycoplasma capricolum.

In the bacterium Micrococcus luteus, UUA (Leu), CUA (Leu), AUA (Ile),
GUA (Val), CAA (Gln), and AGA (Arg) are never found in messages (this
organism has a G+C content near the theoretical limit).
Biochemical tests have shown that the AUA and AGA codons are unassigned.
NCBI's toolkit does not include a table specifically for M. luteus.
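
For anyone who wants to check their own sequences for codons that are
"never found in messages", a rough sketch in Python follows. The
function name unused_codons is hypothetical, and the input is assumed
to be a list of in-frame coding sequences as plain strings. Note that
absence from a sample only suggests a codon may be unassigned; as
above, it takes biochemical tests to show it.

```python
from collections import Counter
from itertools import product


def unused_codons(coding_sequences):
    """Return, sorted, the codons that never occur in the given sequences.

    Each sequence is read in frame from its first base; a trailing
    partial codon is ignored. U is treated as T.
    """
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            counts[cds[i:i + 3]] += 1
    all_codons = {"".join(bases) for bases in product("TCAG", repeat=3)}
    return sorted(all_codons - set(counts))
```

On a real genome one would feed in every annotated coding sequence; a
codon missing from a handful of genes means very little.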

In mitochondria of the yeast Torulopsis glabrata, neither CGN (Arg) codons
nor tRNAs capable of translating them are found. CGN is probably
unassigned. NCBI's toolkit does not include a table specifically
for T. glabrata mitochondria.

In any case, the original post to which I had replied was explicitly
from someone trying to write code dealing with the genetic code, and
so I wished to emphasize that the poster should read Osawa et al.'s
excellent review, and not rely on tables from textbooks. On the
other hand, NCBI's manual does provide very nice summaries of various
codes, and so it is useful in conjunction with the Osawa et al. review.
I should have said this explicitly rather than using the careless
term "subtle errors" (it would have been better to say "there are
caveats which are not encapsulated in NCBI's tools").

1. Osawa, S., et al. 1992. Microbiological Reviews 56(1):229-264.
   Recent evidence for evolution of the genetic code.
2. Oba, T., et al. 1991. PNAS 88(3):921-925.
   CGG: an unassigned or nonsense codon in Mycoplasma capricolum.
3. Kano, A., et al. 1993. J. Mol. Biol. 229? 51-
   Unassigned or nonsense codons in Micrococcus luteus.
   (My apologies for the citation -- I photocopied only the first page
   and very badly and that's where the citation info was.)

Again, I did not mean to slight NCBI's work, just to emphasize that it
is important not to just follow published tables blindly.

Department of Cellular and Developmental Biology
Department of Genetics / HHMI
robison at biosun.harvard.edu