A perspective on molecular docking:
A killer application for drug design is emerging
by Jianquan Chen
Email: chen_jianquan at yahoo.com
As structural genomics initiatives deliver more and more 3D
macromolecular structures, the work needed to validate all these
potential targets, to demonstrate their therapeutic relevance, and to
find leads will depend heavily on high-throughput screening
technologies such as automated molecular docking.
More and more chemical compounds now have to be screened because a new
drug has to meet more, and stricter, requirements. High-throughput
screening (HTS) has therefore played an increasingly important role in
new drug development, and for many pharmaceutical companies HTS is now
an essential component of lead identification.
Molecular docking over a large-scale virtual compound database is a
kind of virtual HTS. Molecular docking predicts the bound conformation
of a ligand in its receptor. It connects structural bioinformatics
with drug design; put plainly, it is a tool for mining gold from
structural bioinformatics data. Docking methods fall into three
classes: rigid ligand/rigid receptor, flexible ligand/rigid receptor,
and flexible ligand/(partially) flexible receptor. Rigid ligand/rigid
receptor docking is very fast but not detailed. Flexible
ligand/(partially) flexible receptor docking is too complicated to be
used to screen a large-scale database. So I discuss only flexible
ligand/rigid receptor docking here.
A docking problem can be divided into a search algorithm and a scoring
function. The search algorithm should be efficient enough to find the
lowest-energy configuration or conformation. The scoring function
should be able to distinguish a correct binding mode from other
putative modes. I discuss only the search algorithm here, because
progress in search algorithms and in computers will soon remove the
hardware obstacle that keeps flexible ligand/rigid receptor docking
out of the medicinal chemistry lab.
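The split between search algorithm and scoring function can be shown
with a toy sketch. This is not the AEDock algorithm, which is not
described here. It assumes a ligand reduced to a list of torsion
angles, a made-up scoring function (toy_score) that is zero at a
made-up reference pose, and a simple greedy random-perturbation
search. A real docking search would also sample ligand translation
and rotation and would score against the receptor atoms.

```python
import math
import random

N_TORSIONS = 10  # chosen to match the 10-rotatable-bond example; otherwise arbitrary


def toy_score(torsions, reference):
    """Hypothetical scoring function: lower is better, 0 at the reference pose."""
    return sum(1.0 - math.cos(t - r) for t, r in zip(torsions, reference))


def stochastic_search(score, n_torsions, n_steps=5000, step=0.3, seed=0):
    """Greedy random-perturbation search over torsion angles (radians)."""
    rng = random.Random(seed)
    pose = [rng.uniform(-math.pi, math.pi) for _ in range(n_torsions)]
    start_score = score(pose)
    best_pose, best_score = pose, start_score
    for _ in range(n_steps):
        # Perturb every torsion of the best pose found so far.
        trial = [t + rng.gauss(0.0, step) for t in best_pose]
        trial_score = score(trial)
        if trial_score < best_score:  # keep only improvements
            best_pose, best_score = trial, trial_score
    return best_pose, best_score, start_score


# Made-up "correct" torsions, for illustration only.
reference = [0.5 * i - 2.0 for i in range(N_TORSIONS)]
best_pose, best_score, start_score = stochastic_search(
    lambda pose: toy_score(pose, reference), N_TORSIONS
)
print(f"start score {start_score:.3f} -> best score {best_score:.3f}")
```

The search only ever calls score(), so the two parts can be improved
independently, which is why the search algorithm can be discussed on
its own.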
AEDock1.1, a new release of AEDock that introduces a new technology,
can dock XK263, a bulky ligand with 10 rotatable bonds, into HIV-1
protease in 23.4 seconds, using a tolerance of 0.01 nm rmsd, with a
success rate of 92% on a PC with a 750 MHz AMD CPU and 256 MB SDRAM.
Please read the results of AEDock1.1 at
http://www20.brinkster.com/jianquan/aedock11/default.asp . It is a
WHOLE MOLECULE-BASED docking program and so avoids the disadvantages
of fragment-based docking approaches. Based on the rotatable bond
distribution table from Waszkowycz et al., it is estimated that two
PCs with 2.2 GHz Intel CPUs can screen 1M drug-like compounds with
0-10 rotatable bonds in 1 week. Please follow the above link to see
how this is estimated.
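As a rough check on what this estimate implies, the short calculation
below works out the average time budget per compound. It uses only the
figures quoted above; the function name is my own, and the "worst
case" line simply assumes every compound took as long as the XK263
run on the 750 MHz PC, which is an upper bound and not the estimate
given at the link.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600  # 604,800 s


def seconds_per_compound(n_compounds, n_machines, wall_seconds):
    """Average machine-seconds available per compound."""
    return n_machines * wall_seconds / n_compounds


# Two PCs, 1M compounds, one week of wall-clock time.
budget = seconds_per_compound(1_000_000, 2, SECONDS_PER_WEEK)
print(f"average budget: {budget:.2f} s per compound")  # 1.21 s

# Upper bound: every compound as slow as the 10-rotatable-bond example.
worst_case_weeks = 1_000_000 * 23.4 / SECONDS_PER_WEEK
print(f"worst case: {worst_case_weeks:.2f} machine-weeks")  # 38.69
```

The estimate therefore requires an average of about 1.2 seconds per
compound, so it depends on most compounds having far fewer than 10
rotatable bonds and on the faster CPUs.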
Comparison of molecular docking and real HTS
1. Molecular docking over a compound database is more productive and
cost-efficient than a real HTS.
The disadvantages of real HTS are:
1. HTS is no guarantee of success.
2. Establishing a robust assay for a new target takes time and money.
Hit rates against some receptors are reported to be very low,
necessitating the screening of very large numbers of compounds (tens
to hundreds of thousands). Collections of synthesized compounds or
natural products often contain far less chemical diversity than is
desired, are not bottomless resources, and are very time-consuming to
replenish.
3. Techniques such as combinatorial chemistry offer the potential for
synthesizing very large libraries of compounds, but in practice this
approach is time-consuming for drug-like compounds and may still
produce libraries of relatively restricted diversity.
4. Most important, a real HTS system and a real compound collection
are so expensive that only companies can afford them. An ordinary
medicinal chemistry lab cannot.
Waszkowycz et al. used a 64-processor SGI Origin2000 to dock 1.1
million drug-like compounds, and the job took 6 days. The compounds
were docked into the human estrogen receptor alpha. After docking and
filtering by some supplementary descriptors, a set of 37 commercially
available compounds was chosen to be assayed in a standard competitive
radioligand binding assay. Of these, 21 exhibited an inhibition
constant (Ki) of 300 nM (nanomolar) or less, with the best compounds
at 8 nM. Given the structural novelty of the hits (compounds known to
possess estrogenic activity or to be structurally similar to known
ligands were excluded from assay), this is a very positive result,
demonstrating that virtual screening can readily identify potent
ligands from a variety of structural classes.
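The figures in this study can be turned into two simple ratios. The
calculation below is only arithmetic on the numbers quoted above; the
CPU-time figure assumes all 64 processors were busy for the full 6
days, which the text does not state.

```python
# Hit rate among the assayed compounds: 21 of 37 with Ki <= 300 nM.
assayed = 37
hits = 21
hit_rate = hits / assayed
print(f"hit rate: {hit_rate:.1%}")  # 56.8%

# Average CPU time per docked compound, ASSUMING all 64 processors
# ran for the whole 6 days.
processors = 64
days = 6
compounds = 1_100_000
cpu_seconds_per_compound = processors * days * 24 * 3600 / compounds
print(f"about {cpu_seconds_per_compound:.1f} CPU-seconds per compound")  # 30.2
```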
You may say: "A 64-processor SGI Origin2000 is very expensive." But
with AEDock1.1 we only need to buy two PCs with 2.2 GHz Intel CPUs. A
Dell Dimension 8200 PC with a 2.2 GHz Pentium 4 CPU and 256 MB PC800
RDRAM (Rambus dynamic RAM) cost only $1,689 on Feb 13, 2002, so the
total for two is about $3,400. In several months they can be bought at
a lower price. I predict that by the end of 2002 a PC with a single
CPU will be able to screen 1M drug-like compounds with 0-10 rotatable
bonds in 1 week using a whole molecule-based flexible ligand/rigid
receptor docking approach, because the performance of both CPUs and
search algorithms keeps increasing. This means that the hardware
obstacle keeping flexible ligand/rigid receptor docking out of the
ordinary medicinal chemistry lab will soon disappear.
Comparison of molecular docking and pharmacophore queries or focused
2-D property profiles
Flexible ligand/rigid receptor docking is more detailed and more
objective than pharmacophore queries or focused 2-D property profiles.
Applying pharmacophore queries or focused 2-D property profiles may
significantly restrict the diversity of the compound subset because
they are biased by the properties of known ligands. In contrast, a
molecular docking program can process an entire chemical database with
minimal pre-filtering (e.g., to eliminate unstable or toxic moieties),
so that the final selection is based on the quality of the docked
models rather than on a subjective opinion of what properties are
expected in a ligand. This is a very promising route to structurally
novel ligands, which may make receptor interactions similar to those
of known ligands.
The roles molecular docking plays in drug design
1. Finding leads.
2. Providing groups for chemists to use in modifying the leads. When
compounds are docked into the receptor, the groups interacting with
the subsites of the receptor can be ranked. The highly ranked groups,
combined with the lead compounds, may point the way to final products.
(So far I have not found any paper on this role of molecular docking;
please tell me if you find one.)
So molecular docking not only provides lead compounds but also
suggests how to modify them. In that sense, molecular docking provides
a whole solution for drug discovery.
Molecular docking looks very promising and will be popular soon. So
why not invest your money or energy in it? Please contact me if you
are interested. Email: chen_jianquan at yahoo.com, Phone: (USA)
732-207-8147
In short, molecular docking will become a killer application for drug
design. It will enter ordinary medicinal chemistry labs, not only the
big pharmaceutical companies.
The obstacles to molecular docking entering the medicinal chemistry lab
1. The virtual compound database. Most scientists now use ACD-3D or
ACD-SC from MDL as the compound database for screening. A license for
ACD-3D and ACD-SC on a single "Datastation" costs about $15,000 per
year. MDL does allow you to share the ISIS databases with other
scientists, which lowers your cost, but you also need a license for an
Enterprise Oracle database, and an Oracle database is too difficult
for a chemist or molecular modeler to maintain. So the database costs
a lot of money. There is an alternative: the free NCI 3D compound
database can be stored in a free database such as MySQL. After
screening, the final set of compounds can be submitted to
www.chemexper.com to find out whether they are commercially available.
2. The available 3D macromolecular structures. The 3D structures of
most macromolecules are not yet available. Fortunately, structural
genomics initiatives are beginning to deliver more and more 3D
macromolecular structures, and computational protein folding may
provide some reliable structures as well.
3. The scoring functions and the accuracy of the 3D macromolecular
structures. Not all crystallographic complexes can be reproduced,
either because the scoring function cannot distinguish the correct
binding mode from other putative modes in some cases, or because some
crystallographic structures are not accurate enough. Please tell me if
you find any paper that tests a scoring function on hundreds of PDB
entries.
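For obstacle 1, the idea of keeping compounds and docking results in a
free database can be sketched as follows. This is only an
illustration: it uses SQLite from the Python standard library as a
stand-in for MySQL so that it runs without a server, and the table
layout, column names, compound ids, and scores are all placeholders of
my own, not the NCI database format or real results.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE compound ("
    " id TEXT PRIMARY KEY,"
    " rotatable_bonds INTEGER NOT NULL,"
    " dock_score REAL)"  # hypothetical score, lower is better
)

# Placeholder records for illustration only.
rows = [
    ("placeholder-001", 3, -7.5),
    ("placeholder-002", 12, -9.1),
    ("placeholder-003", 8, -8.2),
]
conn.executemany("INSERT INTO compound VALUES (?, ?, ?)", rows)

# Shortlist: at most 10 rotatable bonds, best (lowest) score first.
shortlist = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM compound"
        " WHERE rotatable_bonds <= 10"
        " ORDER BY dock_score ASC LIMIT 2"
    )
]
print(shortlist)  # ['placeholder-003', 'placeholder-001']
```

A shortlist selected this way is what would then be checked for
commercial availability.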
Please send comments to chen_jianquan at yahoo.com or visit
http://www20.brinkster.com/jianquan/aedock11/default.asp or my homepage
(http://www-ec.njit.edu/~jc26/).
Note: (1) Some content is drawn from
"Large-scale virtual screening for discovering leads in the postgenomic
era", Waszkowycz, B.; Perkins, T. D. J.; Sykes, R. A.; Li, J. IBM
Systems Journal, 2001, 40, 360-376.
(2) Please include the web address of this paper
(http://www20.brinkster.com/jianquan/aedock11/perspective.asp)
and my name if you want to cite the opinions in this paper.