to Molecular Interaction in Biological Systems
by Lukas K. Buehler
of Cellular Systems
of Molecular Recognition
of Molecular Surfaces
The Ligand-Receptor Complex and its Pharmacophore
Water As Solvent and Binding Energetics
The Structural Properties of Biological Macromolecules
Motion and Protein Stability
Self-assembly Systems and Protein Complexes
Allosteric Properties of Proteins
Enzymes and Their Inhibitors
of DNA & RNA
Antisense & Small Molecule Ligands
This page of the biotechnology and biochemistry web reader
contains short summaries of each chapter.
To read an extended version of a chapter, click on the links provided
or use the Search
option (note: searches the entire www.whatislife.com site).
The last fifty years have completely changed the way biological
and medical researchers can study and understand life, its development
from conception to death, susceptibility to infectious and inherited
diseases, in short, the molecular mechanisms of metabolic processes.
One reason that brought about this understanding lies in the ability
to access the information contained in biological macromolecules.
Information stored in the structure of molecules is a function of
their physical and chemical properties. A second and more important
reason is the ability to manipulate this information by virtue of
changing the structures of macromolecules - proteins, nucleic acids,
or polysaccharides. The advancement of molecular biology has been
the driving force behind these changes in the biomedical sciences.
But the functional manipulation of biological material could not
generate much of what is done today by the pharmaceutical industry,
if it were not for the preceding developments in physics and chemistry
during the 19th and 20th centuries including thermodynamics, statistical
mechanics, and the nature of the chemical bond. The reductionist's
approach - the study of chemistry and physics of life - created
an enormous wealth of biochemical and genetic data available for
the rational design of drugs and the manipulation of the genome.
The ensuing atomistic view shall be presented and discussed within
the context of molecular interactions in biological systems. While
DNA stores hereditary information, proteins and RNA are
its agents, accessing and executing the genetic programs. The mechanism
of protein function is simple: proteins accelerate chemical reactions
(as enzymes) by providing optimal binding to substrates, or drastically
improve the solubility and target-specific binding (as receptors) of
small ligand molecules. All catalytic activity and ligand binding
occur on solvent-accessible protein surfaces provided by preferential
molecular interactions. These molecular interactions are electrostatic
in nature. The strength of these interactions, the forces among
atoms, can be categorized according to their thermodynamic and kinetic
behavior and is defined as affinity. The conformational precision
of interaction leads to selective binding and is defined as specificity.
Both properties are directly dependent on the physico-chemical properties
of the solvent of life - water.
Principles of Molecular Recognition
The function of most proteins is controlled by small molecule ligands
that reversibly bind to proteins and either stimulate or inhibit
their activity. Because different areas of research have studied
different kinds of proteins, more than one nomenclature for these
small ligand molecules is in use. This is summarized in the table.
For enzymes, which catalyze chemical reactions, the natural ligands
are the substrates, which have to bind before they are chemically
processed into products. Catalytic reactions can be suppressed by
competitive inhibitors which bind to the same location on the surface
of an enzyme as the substrate. This location is called the active
site. Receptors (e.g. cell surface proteins, nuclear DNA-binding
proteins) bind ligands without chemically modifying them. Instead
binding induces a conformational change in the receptor protein
that can trigger a chemical reaction of a substrate bound somewhere
else on the same protein or affect the binding affinity of a second
molecule that interacts with the receptor. This process is known
as allosteric regulation. Ligands activating receptors are called
agonists, while competitive inhibitors of these ligands are called
antagonists.
Binding events are characterized as reversible chemical equilibria,
and the binding (and thus the effect) of both agonists and antagonists is
concentration dependent. The affinity of the ligand for the binding
site can be quantified by the equilibrium constant of binding (association
constant KA) or unbinding (dissociation constant KD)
over an effective concentration range, i.e., where an agonist induces
an effect, or an antagonist can block agonist activity. Affinity
tells us how strongly a ligand binds to its receptor and is related to
the Gibbs free energy of binding. Affinity is a macroscopic property
of binding, representing the averaged behavior of a very large number
of events that are the result of an often complex series of events
and molecular interactions. The latter are microscopic properties
of molecular structures and are described by non-covalent bond structures.
Bridging the qualitative difference between macroscopic properties
(thermodynamic quantities and kinetic data) and microscopic structural
information (chemical bonds, electron density maps) is the biggest
obstacle to predicting functional aspects of ligand-receptor interaction.
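The concentration dependence of binding can be made concrete with a small sketch. It assumes the simplest possible case, a single independent binding site at equilibrium, for which the fraction of occupied receptor is [L] / (KD + [L]) and KA = 1/KD. The function names and the 10 nM KD are hypothetical and chosen only for illustration.

```python
def association_constant(kd):
    """Association constant KA (1/M) from the dissociation constant KD (M)."""
    if kd <= 0:
        raise ValueError("KD must be positive")
    return 1.0 / kd


def fraction_bound(ligand_conc, kd):
    """Fraction of receptor sites occupied at free ligand concentration [L].

    Assumes one independent binding site at equilibrium:
    theta = [L] / (KD + [L]).
    """
    if ligand_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ligand_conc / (kd + ligand_conc)


KD = 1e-8  # hypothetical ligand with KD = 10 nM (illustrative value only)

print(f"KA = {association_constant(KD):.1e} 1/M")
for conc in (1e-10, 1e-9, 1e-8, 1e-7, 1e-6):
    # Occupancy rises from nearly 0 to nearly 1 over about four decades of [L]
    print(f"[L] = {conc:.0e} M -> fraction bound = {fraction_bound(conc, KD):.3f}")
```

When the free ligand concentration equals KD, half of the sites are occupied, which is why KD is a convenient measure of the effective concentration range of an agonist or antagonist.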
space and the structure of a ligand/receptor complex
The following discussion refers to the structure of a stable complex
between a ligand and its receptor which is a microscopic description.
The strength of an interaction (its affinity, which is a macroscopic
description) depends on the complementarity of the physico-chemical
properties of atoms that bind, i.e., protein surface and ligand
structure. Excluding catalytic mechanisms from the discussion, two
classes of molecular properties important for binding can readily
be distinguished:
- shape or volume
- surface potential
Talking about these properties, chemists refer to them as the molecular
similarity space, which can now be described more precisely as:
- atom pair matching function (shape or volume)
- charge matching function (surface potential)
It is intuitive to think that simple binding has to do with similarity
(or complementarity) of properties and structures such that the
higher the similarity, the more specific the recognition will be. Atom
pair matching function mostly refers to Van der Waals surfaces of
molecules and hydrogen bonds. Both interactions are weak and are
effective only over very short distances in terms of potential energy
function, i.e., they are short-range interactions that can be easily
broken, but help define the conformational specificity of an interaction.
This is particularly true for the hydrogen bond, which has a directional
quality related to its strength of interaction.
Recognition by a ligand of its receptor binding site can be envisaged
as a result of orientational and translational movement of the ligand
within the electric surface potential field of the receptor. A specific
interaction is encountered when the orientation of the ligand fits
complementary physical properties on the receptor surface.
[Table: Ligand - Receptor Complementarity. Physical properties (such as shape or volume) and the range of the corresponding interactions: short range (1/r^6), long range (1/r), short range (1/r^3).]
A combination of any of these four physical properties summarized
in the table above defines multivariate surfaces. This can obviously
lead to complex surface structures or binding motifs, especially
for large contact surfaces such as those found between proteins, where
one protein is the 'ligand' and the other the 'receptor'. Protein-protein
interaction is relevant for any enzyme or receptor complex, cytoskeletal
structures, or chromatin structures. In the present context, peptide
ligands (e.g. neuropeptides, insulin, etc.) are among those
agonists providing the largest variability in similarity space.
Since protein surfaces are determined by their amino acid residues,
binding surfaces can be mathematically described as sequence space.
An artificial peptide sequence space for the development of novel
antibiotics has recently been achieved using cyclic peptides that
self-assemble into peptide nanotubes.
It is the complementarity of these motifs between receptor and
ligand that determines the specificity of the interaction. The electrostatic
force between ligand and receptor helps define the affinity of the
interaction. There are long range and short range interactions.
Hydrophobic interactions and hydrogen bonds are short range interactions
based on induced-dipole and molecular dipole moments, respectively.
Electrostatic interactions are long range meaning that electric
fields can be sensed several angstroms away from the point charge.
The strength (and effective distance) of these interactions is a
function of the dielectric property of the environment. Water molecules
are able to shield local charges and dipoles, reducing the range
of their electric field forces. In a hydrophobic environment like
a cell membrane, charges are not shielded by the non-polar molecules
and can have an effect over distances covering entire proteins.
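The difference between long range and short range interactions follows directly from the distance dependences 1/r, 1/r^3 and 1/r^6. The sketch below only compares how fast each power law decays. The assignment of the exponents to charge-charge, dipole and induced-dipole interactions is an assumption based on the surrounding text, and the function name is hypothetical.

```python
def relative_energy(r, r0, exponent):
    """Interaction energy at distance r relative to its value at r0,
    for a potential that scales as 1 / r**exponent."""
    if r <= 0 or r0 <= 0:
        raise ValueError("distances must be positive")
    return (r0 / r) ** exponent


# Assumed mapping of interaction type to distance dependence (illustrative)
SCALING = {
    "charge-charge (1/r)": 1,
    "dipole (1/r^3)": 3,
    "induced dipole (1/r^6)": 6,
}

for name, exponent in SCALING.items():
    # Fraction of the interaction energy left when the distance is doubled
    remaining = relative_energy(2.0, 1.0, exponent)
    print(f"{name}: {remaining:.4f} of the energy remains at twice the distance")
```

Doubling the distance halves a 1/r interaction but leaves less than 2% of a 1/r^6 interaction, which is why only the former can be sensed far from the binding site.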
To find a good (drug-) ligand for a protein or DNA surface (a receptor),
one only has to study the structure and function of natural agonists
and antagonists, or the surface topology of the binding site on
the macromolecule. Using structural information for drug discovery
is referred to as rational drug design and makes use of the concepts
of chemical similarity and complementarity. Chemical similarity
is measured by identifying distances between atoms on a receptor
and a ligand. Based on the chemical properties of the interacting
atoms (or group of atoms = functional groups) small differences
in distance have a great influence on the 'reactivity' of a ligand.
Since proteins are fluid-like entities (albeit highly viscous ones),
their structures are very sensitive to disturbances at their
surface. For ligands that are similar but not identical, the disturbance
results in different molecular properties (antagonist or agonist;
inhibitor or activator).
Thus a local conformational change initiated by agonist, but
not antagonist, binding results in a destabilization of the protein
structure. This destabilization is not strong enough to denature
the protein, but results in a long range effect across the protein
affecting its active site several angstroms away from the ligand
binding site. This is known as an allosteric mechanism. As a rule,
agonists induce structure destabilization, while antagonists merely
bind but do not affect the protein structure (or trigger a conformational
change that locks a protein in its inactive position). One way to
visualize the action of ligands on receptors is to realize that
proteins constantly undergo conformational changes, which are best
described as an equilibrium between an active and an inactive state, or even
among multiple states, including desensitized states (different
types of inactive states). Agonists and antagonists shift this equilibrium
towards an active or inactive conformation, respectively.
In general, the surface topologies of a group of ligands that
all exhibit effector quality (agonist or antagonist) can be overlapped
and the contours of all molecules averaged into a union surface.
This union surface of a ligand is expected to be complementary to
the surface mold of the corresponding binding site on the receptor
or enzyme. In complex structures the number of distributions and combinations
of physical properties used to search for similarity (complementarity)
is large. Modeling structures of ligands in different ways and superimposing
different structures with similar affinity exposes the critical
fragment or overall similarity of these fragments. The critical
fragment of an antagonist/agonist structure is called the pharmacophore.
The classical ball-and-stick model of chemical structures allows one
to overlap the bond structures and identify these critical segments,
often single atoms, in ligands. In many cases ligand receptor interaction
is not necessarily mediated by the entire ligand structure, but
by ligand points or critical fragments, the pharmacophore. Thus
when analyzing existing data of antagonist and agonist structures,
it becomes clear why compounds belonging to very different classes
of chemicals so often act on the same target proteins.
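The idea of superimposing ligand points can be sketched with a very reduced comparison: two ligands are treated as candidates for the same pharmacophore if the distances between their critical points agree within a tolerance. This is only an illustration under stated assumptions. The coordinates, the 0.5 angstrom tolerance and the function names are hypothetical, and a sorted-distance comparison ignores the chemical type of each point and cannot distinguish mirror images.

```python
from itertools import combinations
from math import dist


def distance_signature(points):
    """Sorted pairwise distances (angstrom) between pharmacophore points.

    The signature does not change when the molecule is rotated or translated.
    """
    return sorted(dist(a, b) for a, b in combinations(points, 2))


def same_pharmacophore(points_a, points_b, tolerance=0.5):
    """True if both point sets have matching distance signatures."""
    sig_a = distance_signature(points_a)
    sig_b = distance_signature(points_b)
    if len(sig_a) != len(sig_b):
        return False
    return all(abs(x - y) <= tolerance for x, y in zip(sig_a, sig_b))


# Hypothetical three-point (triangle) pharmacophores, coordinates in angstrom
ligand_a = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
ligand_b = [(1.0, 1.0, 1.0), (1.0, 7.0, 1.0), (5.0, 4.0, 1.0)]  # same triangle, moved
ligand_c = [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0), (3.0, 4.0, 0.0)]  # one point displaced

print(same_pharmacophore(ligand_a, ligand_b))
print(same_pharmacophore(ligand_a, ligand_c))
```

Because only a few critical points enter the comparison, compounds with very different overall scaffolds can still match, which is the point made in the text above.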
Select Ligand Systems
Chlorpromazine is an antagonist of the dopamine receptor and contains a superimposable
Neurotoxins saxitoxin (STX) and tetrodotoxin (TTX) block voltage-gated
sodium channels. A solvent accessible surface area match shows
that the dissimilar structures have identical surface topology.
Sigma ligands (steroidal hormone receptor antagonists) show common
points or critical fragments, triangle representation of pharmacophore.
Benzodiazepine (GABA-A receptor modulator) and beta-carboline fit the same
surface mold based on the modeling of the solvent accessible binding
site topology, embedded ligand points and hydrophobic core.
Monooxygenase P450 substrate (camphor) and inhibitor (phenylimidazole).
Fragments can be designed for enzyme inhibitors that mimic the
structure, but are not hydrolysable. Examples are the HIV protease
inhibitors, in which commonly employed bioisosteres (non-hydrolysable)
replace the functional amide group (hydrolysable).
Ligands binding to cell surface receptors are amphipathic, hydrophilic, or charged
and often too large to cross the cell membrane. Intracellular receptors
are targets for small hydrophobic ligands, which easily diffuse across
membranes. A well studied model system includes the inhibitors of
bacterial dihydrofolate reductase (DHFR). This enzyme catalyzes
the reduction of dihydrofolate (DHF) to tetrahydrofolate (THF),
an important step in DNA synthesis. Comparing the sequence and structure
of DHFR from different organisms shows many similarities but also
explains why some inhibitors are selective against bacteria, while
having little effect on the enzyme of the host organism. Inhibitors
of human DHFR are also studied for their effectiveness as anticancer
drugs, because the need of tumors to synthesize DNA is much greater
than that of surrounding healthy cells. DHFR inhibitors therefore may
act as chemotherapeutic as well as antibacterial agents.
Pharmacophore of serotonin
receptor agonists and antagonists
Serotonin (5-hydroxytryptamine, or 5-HT) acts on any one of nine known serotonin receptors, which
play important roles in neuronal signaling in the central and peripheral
nervous system. The cytotoxic agents used in cancer chemotherapy
provoke the release of 5-HT from enterochromaffin cells, which acts on the peripheral
vagal afferent fibers of the gastrointestinal tract and initiates vomiting
reflexes (emesis). 5-HT3
receptor specific antagonists block this action and thereby
greatly reduce the number of emetic episodes that occur during cancer
chemotherapy.
5-HT3 receptor antagonists have been shown to produce beneficial
effects in animal models of cognitive and psychiatric disorders.
Whether 5-HT3 receptor antagonists may have similar profound effects
in the treatment of anxiety, depression or psychosis will be determined
by the outcome of ongoing clinical trials. However, it is in the
treatment of cancer chemotherapy induced emesis that 5-HT3 receptor
antagonists have had their greatest impact. The marked clinical
efficacy of 5-HT3 receptor antagonists such as ondansetron, granisetron
and tropisetron, together with their lack of adverse side effects,
has greatly improved the treatment of cancer chemotherapy induced
emesis.
5-HT3 receptors belong to the family of ligand-gated ion channels.
When activated, ions flowing through these channels depolarize the
cell membrane triggering action potentials and thus nerve conduction
events. 5-HT3 receptor-mediated ion currents evoked by the full
agonists 5-hydroxytryptamine (5-HT), quaternary 5-HT (5-HTQ), meta-chlorophenylbiguanide
(mCPBG) and the partial agonists dopamine and tryptamine in whole-cell
voltage clamp experiments can be used to characterize binding properties
of these ligands such as affinity and specificity. Ligand-gated
receptors typically switch into an inactive (desensitized) state
within seconds of activation. Both serotonin and its synthetic analogues
desensitize the 5-HT3 receptor completely with a steep concentration
dependence and a potency order of: mCPBG > 5-HTQ >> 5-HT
>> tryptamine > dopamine. The time course of recovery from
desensitization depends on the agonist used.
A quantitative molecular pharmacophore model was derived to predict
drug affinities for 5-hydroxytryptamine (5-HT3) receptors. The model
was based on the molecular characteristics of a learning set of
40 pharmacological agents that had been analyzed previously in radioligand
binding studies. Molecules were analyzed for various structural
features, i.e., the presence of a benzenoid ring and nitrogen atom,
substitutions on the benzenoid ring, the location of the substitutions
on the nitrogen, and the molecular characteristics of the most direct
pathway from the benzenoid ring to the nitrogen. Weighting factors,
based on published 5-HT3 receptor affinity data, were then assigned
to each of 10 molecular characteristics.
The following nine rules have been established for the 5HT-3A receptor
structure (from Schmidt et al., 1989, Molecular Pharmacology):
1. Contains an aromatic ring structure (lower half of molecule;
   consistent with the hypothesis of Lloyd and Andrews, which states
   that all central nervous system active drugs contain an aromatic
   ring; J.Med. Chem, 1986, 29:453-1093).
2. A tropane ring embedded nitrogen is present (see upper half
   of molecule shown above) and is located not more than seven
   atoms from the aromatic ring.
3. When aligning the tropane ring nitrogen in the same plane
   as the aromatic ring (torsion angle flexibility), the distance
   between the nitrogen and the aromatic ring center is 6.0 to ...
4. Chemical substitutions of no more than 3 atoms are allowed
   at the nitrogen.
5. The tropane ring structure itself does not tolerate substitutions
   larger than methyl groups (-CH3); larger groups significantly ...
6. The linker structure between aromatic ring and ring nitrogen
   contains steric similarities that reduce flexibility (carbonyls
   or C=O bonds).
7. The first and second atom from the aromatic ring in the linker
   is never a tetrahedral carbon (no torsion angle flexibility;
   see point 6).
8. The third atom from the aromatic ring may be a tetrahedral carbon.
9. Substitutions on the aromatic ring must be able to adopt a ...
A naturally occurring drug, atropine, demonstrates how a 'small'
but significant violation of the 'rules' in its structure explains
its very low affinity for the serotonin receptor 3A subtype. The
first atom from the aromatic ring is a tetrahedral carbon and thus
allows torsion angle flexibility (see rules 6 and 7), which explains
why the structure can deviate from being co-planar with the aromatic
ring. These two atoms are the main difference as compared to the
highly specific antagonist ICS 205-930. The molecular weight and
chemical formula are nearly identical, although the affinity of atropine
for the 5HT-3A receptor is 1000-fold lower than that of ICS 205-930.
Atropine is a natural antagonist of cholinergic receptors (acetylcholine
receptors) and is used as an antidote to nerve gas (sarin), which
inhibits acetylcholinesterase. The toxic effect of prolonged acetylcholine
stimulation is thus reversed by blocking the acetylcholine receptor.
The case of atropine and the 5HT-3A pharmacophore structure requirements
demonstrate the influence of molecular motion on binding. The flexibility
of the atropine molecule between its aromatic and tropane ring structures
essentially reduces the chance in atropine of superimposing the
tropane ring and aromatic ring, a gross-structure requirement for
5HT-3A antagonist binding. Further structure-activity relationship
(SAR) studies show that all known receptor antagonists exhibit
at least one degree of freedom. This hinders potential screening
of ligand-receptor topology which is best achieved with a rigid
molecule as template for rational drug design. This importance of
molecular motion in ligands (and receptors) is further demonstrated
in thermodynamic analysis of drug binding.
Water as Solvent
Dipole structure of water. Water is composed of one oxygen and
two hydrogen atoms forming two O-H single bonds of 0.96 angstrom
(Å) in length and a bond angle of 104.5° between them.
Based on the asymmetric distribution of electrons in this triatomic
molecule, with the electrons attracted to the oxygen nucleus, the
water molecule exhibits a molecular dipole moment of 1.84 Debye.
A dipole moment μ is defined by two opposite point charges q separated by
a distance r; μ = qr [C·m]. The value of the dipole moment depends
on the difference of the electronegativity of atoms sharing a covalent
bond structure. The electronegativity series of biologically important
atoms (with increasing affinity for electrons) is: H < C <<
N < O.
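As a worked example of the dipole relation (dipole moment = charge times distance), the sketch below estimates the partial charge on each hydrogen of water from the dipole moment of 1.84 Debye, an O-H bond length of 0.96 Å and the bond angle of 104.5° quoted in this section. It assumes a simple point-charge picture with +q on each hydrogen and -2q on the oxygen, which is a modeling assumption and not part of the original text. The function name is hypothetical.

```python
from math import cos, radians

DEBYE = 3.33564e-30          # C*m per Debye
E_CHARGE = 1.602176634e-19   # elementary charge in C
ANGSTROM = 1e-10             # m per angstrom


def partial_charge_on_h(dipole_debye, bond_length_angstrom, bond_angle_deg):
    """Partial charge on each H (in units of e) in a point-charge model.

    The two O-H bond dipoles add along the bisector of the H-O-H angle,
    so mu = 2 * q * r * cos(angle / 2).
    """
    mu = dipole_debye * DEBYE
    lever_arm = 2.0 * bond_length_angstrom * ANGSTROM * cos(radians(bond_angle_deg / 2.0))
    return mu / lever_arm / E_CHARGE


q = partial_charge_on_h(1.84, 0.96, 104.5)
print(f"partial charge per hydrogen: about {q:.2f} e")
```

In this picture each hydrogen carries roughly a third of an elementary charge, which gives a feel for how polar the O-H bond is compared with a fully ionic bond.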
Dielectric constant: Molecular dipoles experience either an attractive
or repulsive force and react to external electric fields. This property
is known as polarizability of the medium and expressed as dielectric
constant D (or ε) of the solvent. The dielectric constant determines
the polarity of a solvent and thus the solubility of molecules.
Polarizable solvents (solutes) are polar or hydrophilic (liking
water; water is a polar solvent), while non-polarizable solvents
(solutes) are non-polar or hydrophobic. As a general rule, hydrophilic
solvents mix well with hydrophilic solvents (solutes), and hydrophobic
solvents with hydrophobic solvents (solutes).
Hydrogen bond: The dominant electrostatic interaction in water,
based on its permanent molecular dipole moment, is the hydrogen
bond (H-bond). The hydrogen bond is stronger than an induced dipole-dipole
interaction. The latter is known as Van der Waals interaction, a
small electrostatic attraction. The hydrogen bond, however, is weaker
than a covalent bond. The relative strength of these three types
of bonds can be directly assessed by comparing the length of each
bond; O-H covalent bond = 0.96 Å (strong), O-H hydrogen bond
= 1.8 Å (medium), and O-H Van der Waals bond = 2.6 Å
(weak). Based on the tetrahedral bond architecture and the orientation
of the two lone electron pairs on the oxygen atom, water molecules
can form as many as four (4) hydrogen bonds with each other. This
maximal extent of hydrogen bonding, or saturated hydrogen bond network,
is achieved in water's solid state - ice crystals. Liquid water
has an average of 2.3 hydrogen bonds per molecule. The system is
highly dynamic, the lifetime of a hydrogen bond is very short,
and as a consequence there is no discernible structure in liquid
water. Hydrogen bonds can also be formed by amine groups containing
N-H single bonds or carbonyl bonds (C=O). The ability of water molecules
to form hydrogen bonds with themselves and biological (macro-) molecules
is the single most important parameter to understand structure,
function, and regulation of enzymes, genes, and biological membranes.
Ions and mobile charges: Water is rarely a pure solvent, but contains
a multitude of salts, which all exist in the form of dissociated,
charged ions. Table salt NaCl, for example, quickly and spontaneously
dissociates into Na+ and Cl- ions. This process is driven by a change
in heat capacity of the system, an enthalpic reaction, and at the
molecular level is stabilized by the formation of hydration shells.
Hydration shells are semi-stable structures of water molecules that
interact with their dipole moment to the central point charge more
strongly than they do with themselves. The positive and negative
point charges function as external field with the electric field
of the dipole moments reorienting against the charge field to minimize
the free energy of the system. The strength of electrostatic interactions
between ion pairs like NaCl is described by Coulomb's law, which
says that the force holding two equal, but opposing charged ions
together is a function of the charge itself, the inverse square
distance between the charges and the dielectric constant of the
medium. The kinetic energy prevents the durable formation of hydration
shells. The ease of solubilization depends on the polarizability
of the solvent molecule, a parameter that is a function of the dipole
moment as well as the mass and rotational and lateral mobility of the
molecule.
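Coulomb's law can be used to illustrate how strongly the dielectric constant of the medium changes the interaction between two ions. The sketch below computes the Coulomb energy (which scales as 1/r, while the force scales as 1/r^2) for a Na+/Cl- pair. The relative dielectric constants of about 80 for water and about 2 for a hydrocarbon-like environment are approximate textbook values used here as assumptions, and the 5 Å separation and the function name are hypothetical.

```python
from math import pi

E_CHARGE = 1.602176634e-19   # elementary charge in C
EPS0 = 8.8541878128e-12      # vacuum permittivity in F/m
AVOGADRO = 6.02214076e23     # particles per mol


def coulomb_energy_kj_per_mol(z1, z2, r_angstrom, eps_r):
    """Coulomb energy of two point charges z1*e and z2*e (kJ/mol).

    E = z1 * z2 * e**2 / (4 * pi * eps0 * eps_r * r)
    Negative values mean attraction.
    """
    if r_angstrom <= 0 or eps_r <= 0:
        raise ValueError("distance and dielectric constant must be positive")
    r = r_angstrom * 1e-10
    energy_joule = z1 * z2 * E_CHARGE**2 / (4.0 * pi * EPS0 * eps_r * r)
    return energy_joule * AVOGADRO / 1000.0


# Assumed, approximate dielectric constants
for medium, eps_r in (("water", 80.0), ("hydrocarbon-like interior", 2.0)):
    energy = coulomb_energy_kj_per_mol(+1, -1, 5.0, eps_r)
    print(f"Na+/Cl- at 5 angstrom in {medium}: {energy:.1f} kJ/mol")
```

With these assumed values the attraction in water is a few kJ/mol, comparable to thermal energy, while in the non-polar medium it is forty times stronger, which is consistent with the shielding effect of water described above.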
Hydrophobic effect: Many biologically relevant molecules are partially
hydrophobic, meaning that they are not easily soluble in water,
because they lack the ability to form strong electrostatic interactions
or hydrogen bonds with the solvent. Hydrophobic interactions are
typically based on Van der Waals forces (induced dipoles). Their
inability to form energetically favorable interactions with water
molecules (hydrogen bonds) induces phase separation. Water molecules
preferentially interact with each other through hydrogen bonds.
Since hydrophobic molecules must form contact with water molecules,
but can do so only through Van der Waals forces (weak forces), every
water-solute interaction is thermodynamically less stable than corresponding
hydrogen bonds (strong forces) among water molecules themselves.
The reduced number of potential hydrogen bonds found on hydrophobic
surfaces reduces the degree of freedom of water molecules at these
interfaces. Rotational and lateral movements are restricted and
stable water structures are formed at interface boundaries. Reducing
the total area of the hydrophobic surface is energetically favorable.
Such a reduction is achieved by clustering hydrophobic solutes into
large aggregates. The release of the large number of water molecules no longer
needed to form less favorable Van der Waals bonds with the hydrophobic
solutes increases the entropy of the system. The entropy of the
liquid water phase (rotational, vibrational, translational degrees
of freedom) dominates the thermodynamics of the system. The increase
in entropy of the water phase is much larger than the loss in entropy
of the aggregated (structured) hydrophobic particles. This entropy
driven aggregation of hydrophobic molecules in aqueous solutions
is called the hydrophobic effect. It is the major stabilizing force
in biological systems determining such wide ranging processes as
protein folding, ligand binding, and cell membrane formation.
In the search for new and effective agonists and antagonists,
computer modeling has become an invaluable tool because powerful
processors readily calculate the properties necessary to define
a chemical similarity space. Not only can they be used to design
new structures, or modify known structures of agonists/antagonists,
they are also useful to screen existing compound libraries for structural
and chemical similarity. As it turns out, molecular modeling tools
are better at simulating specificity (conformation) than affinity
(energy) of interactions. There is a simple reason for it and it
has to do with the solvent. Molecular modeling (static or dynamic)
is usually performed 'in vacuum' drastically reducing the number
of calculations by excluding solvent-solvent and solvent-solute
interaction. Missing from theoretical analysis of molecular interaction
is the role that surface bound water molecules play during the formation
of a ligand-receptor complex. The role of these water molecules
helps explain experimentally observed affinities for naturally occurring
ligand-receptor systems that cannot be explained by analyzing the
non-covalent interaction between ligand and receptor surface in
the complex alone. The challenge here is to understand the binding
energy components enthalpy and entropy during complex formation
for both ligand-receptor binding and solvent displacement.
$R + L \rightleftharpoons RL$
$K_a = \frac{[RL]}{[R][L]}$
Both ligand (L) and receptor (R) come in hydrated form and ligand-receptor
(RL) complex formation requires replacement of surface bound water
molecules from both the ligand and receptor binding site (partial
dehydration). Often, surface bound water is more structured than
liquid bulk water. Thus, the release of many water molecules upon
complex formation into the bulk phase increases the entropy of the
entire system (protein-ligand solution). Such an entropy driven
process is well described for hydrophobic and amphipathic solutes
and is known as the hydrophobic effect.
While it has been found that there is little correlation between
the change in Gibbs free energy (ΔG)
of binding and the change in solvent accessible surface area (Bogan
and Thorn, 1998, J.Mol.Biol. 280:1-9), experimental observations
show that despite the overall small change in Gibbs free energy
of binding, both its enthalpic and entropic components can be large,
yet in opposing directions. Unfavorable enthalpic components of dehydration
and ligand-receptor binding can be offset by favorable entropic
components stabilizing the ligand-receptor complex.
$\Delta G = \Delta H - T\Delta S = -RT \ln K_a$
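The relation between the association constant and the free energy of binding can be turned into a small numerical sketch. It uses only the equation above; the association constant of 10^9 per molar, the binding enthalpy of -80 kJ/mol and the function names are hypothetical values chosen for illustration, not measured data.

```python
from math import log

R = 8.314462618  # molar gas constant in J/(mol*K)


def delta_g_kj(ka, temperature=298.15):
    """Gibbs free energy of binding (kJ/mol) from dG = -R*T*ln(Ka)."""
    if ka <= 0 or temperature <= 0:
        raise ValueError("Ka and temperature must be positive")
    return -R * temperature * log(ka) / 1000.0


def entropic_term_kj(delta_g, delta_h):
    """Entropic contribution -T*dS (kJ/mol) from dG = dH - T*dS."""
    return delta_g - delta_h


ka = 1e9          # hypothetical association constant in 1/M (KD = 1 nM)
delta_h = -80.0   # hypothetical binding enthalpy in kJ/mol

dg = delta_g_kj(ka)
minus_t_ds = entropic_term_kj(dg, delta_h)
print(f"dG    = {dg:.1f} kJ/mol")
print(f"dH    = {delta_h:.1f} kJ/mol")
print(f"-T*dS = {minus_t_ds:+.1f} kJ/mol")
```

In this made-up case a strongly favorable enthalpy is partly cancelled by an unfavorable entropic term, so the net free energy is much smaller than either component, which is the pattern described in the text.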
Generally, increased bonding in a bimolecular interaction will
produce a more negative enthalpy change, ΔH,
but this will come at the expense of increased order associated
with a more negative entropy change, ΔS < 0.
The inverse relationship observed between enthalpy and entropy changes
in binding interactions is known as enthalpy-entropy compensation.
Overall, favorable entropy terms of partial dehydration of ligand
and receptor binding site offset the unfavorable entropic term of
the more ordered ligand-receptor complex. This generally explains
how living organisms can form and maintain ordered structures - create
order out of chaos - at the expense of environmental energy.
In terms of rational drug design the enthalpy-entropy compensation
is a difficult challenge that must be overcome to significantly
improve the prediction of binding affinity of novel drugs to novel
targets (for an excellent review see Holdgate, G. A., 2001, BioTechniques
31:164-186). Nevertheless, successful design of drugs on enzymes
with deep binding pockets occluding bulk water (i.e. ordered water
structure in binding pocket, a favorable condition for high affinity
binding) has been achieved. Examples of drugs are Nelfinavir (Agouron
Pharmaceuticals' Viracept or AG-1343, an inhibitor of HIV-1 protease)
and Ro 46-6240 (Hoffmann-La Roche's inhibitor of thrombin). Non-peptidic,
small-molecule mimics as inhibitors of protein-protein interaction
have proven more difficult to design. Much of the solvent occlusion
of peptide inhibitors is provided by the main-chain and Cβ
atoms (amino acid side chain carbon) adjacent to binding hot spots,
which explains why side-chain modifications have little effect.
The equilibrium constant Ka used in thermodynamic analysis is a
ratio of product over substrate concentration. However, this is
an approximation valid only in ideal solutions, which assume no
molecular interactions (sic!), and is equivalent to extrapolating
experimental measurements to zero concentration (for review see
Ellis, Trends in Biochem. Sci., 2001, 26:597-604). Concentrations
of proteins in cells, however, can be as high as 300 to 400 mg/ml.
The correct solution to thermodynamic equilibrium therefore uses
a correction called the activity coefficient γ.
Multiplying the concentration c by the activity coefficient to
correct for real size molecules with real interactions gives the
effective concentration or thermodynamic activity, a = γc.
For example, the association constant for protein dimerization in
bacterial cytoplasm is increased 8 to 40 fold as compared to an
ideal solution, while the association constant, i.e., the affinity,
for a tetrameric protein is increased 10^3 to 10^5 fold.
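How activity coefficients change an apparent association constant can be sketched as follows. Writing the thermodynamic constant in terms of activities instead of concentrations gives an apparent, concentration-based constant equal to the ideal constant multiplied by the activity coefficients of the reactants and divided by that of the complex. The activity coefficients used below are hypothetical numbers picked only to show the arithmetic; they are not taken from the cited review, and the function names are illustrative.

```python
def activity(gamma, conc):
    """Thermodynamic activity (effective concentration): a = gamma * c."""
    if gamma <= 0 or conc < 0:
        raise ValueError("gamma must be > 0 and concentration >= 0")
    return gamma * conc


def apparent_association_constant(k_ideal, gamma_r, gamma_l, gamma_rl):
    """Concentration-based association constant in a non-ideal solution.

    From K_ideal = a_RL / (a_R * a_L) with a = gamma * c it follows that
    [RL] / ([R][L]) = K_ideal * gamma_R * gamma_L / gamma_RL.
    """
    if min(k_ideal, gamma_r, gamma_l, gamma_rl) <= 0:
        raise ValueError("all arguments must be positive")
    return k_ideal * gamma_r * gamma_l / gamma_rl


# Hypothetical activity coefficients for a crowded solution (illustration only):
# monomers are more strongly affected by crowding than the compact dimer.
gamma_monomer = 10.0
gamma_dimer = 5.0

factor = apparent_association_constant(1.0, gamma_monomer, gamma_monomer, gamma_dimer)
print(f"dimerization constant increased {factor:.0f}-fold relative to ideal solution")
```

The design choice here is to keep the ideal constant fixed and put all non-ideality into the activity coefficients, so the same function can be reused for any assumed set of coefficients.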
Molecular crowding favors association, protein folding, and ligand-receptor
(or substrate-enzyme) complex formation. Binding, however, is also a function
of diffusion in solution. Thus, small molecules have favorable diffusion
rates even in crowded solutions, while macromolecules experience
a drastic drop in diffusion (consider the exclusion volume as in
a gel matrix) reducing the positive effect of crowding on affinity.
The Structural Properties of Biological Macromolecules
Three major types of macromolecules are found in biological systems:
proteins, nucleic acids, and carbohydrates (polysaccharides). All
play important roles in the physiology and structure of organisms.
Catalytic and regulatory functions are mainly performed by proteins,
although some ribozymes, RNA-based catalytic units, act as enzymes.
All three types can function as receptors. Examples are cell surface
receptors (proteins) such as hormone or neurotransmitter receptors, transcription
factors as regulatory elements of gene expression, or glycolipids
and glycoproteins as a cell surface matrix that is usually cell type
or organism specific (pathogenic microorganisms).
The role of proteins in cells is threefold: catalyzing chemical
reactions (enzymes); promoting structural stability and mobility
(structural proteins and molecular motors); and transporting molecules
and signals across biological membranes and along filamentous protein
structures (e.g. cytoskeleton). Most drug targets are proteins because
of their functional importance.
Protein structure: Proteins
are linear polymers of amino acids. There are 20
different amino acids, distinguished by the chemical and
physical properties of their side chains. Besides the side chain, every amino acid contains
an amino group (NH2), a carboxyl group (COOH), and a hydrogen
atom linked through the central alpha carbon (Cα).
In a protein, the acid-base property of the amino and carboxyl groups is not important
except at the N- and C-terminal ends, which are always charged
at physiological pH values (pH = 7 to 7.5). In the linear polymer,
the amino and carboxyl groups are covalently linked to form peptide
bonds. Every amino acid residue (except the terminal units) lies
at the center of two peptide bond structures (amide planes) linked
by two single covalent bonds. Rotation about these two bonds provides
flexibility (degrees of freedom) and is described by torsion angles. The
amino acid sequence of a protein is referred to as primary structure
(1°D) and largely determines the three dimensional structure
(tertiary structure or 3°D) of a protein. The tertiary structure
contains repetitive elements dubbed secondary structures (2°D).
These secondary structures are recurrent elements in proteins and
can be classified according to the particular polypeptide backbone
fold and measured by their torsion angle values. The two most widely
found secondary structures are the right handed alpha helix (α-helix)
and the beta strand (β-strand). Most active proteins are found in
complex with other proteins. The structure of these multi-subunit
protein complexes is referred to as quaternary structure (4°D).
Protein complexes give the cell an extraordinary functional variability
and control over catalytic processes. The complexity of living organisms
is achieved not only by the number of different proteins or molecules
in general, but by their use as small multi-subunit complexes. Thus
the expression of one gene is in many ways dependent on the expression
of other genes.
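The torsion angles mentioned above can be computed from the coordinates of four consecutive backbone atoms. The sketch below is one common way to do this with plain Python tuples; the function names and the test coordinates are illustrative assumptions. It uses the usual convention in which a cis arrangement gives 0 degrees and a trans arrangement gives 180 degrees.

```python
import math

def sub(a, b):
    """Vector difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    """Scalar product of two 3-D vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_angle(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond
    for four consecutive atoms p0, p1, p2, p3."""
    b0 = sub(p1, p0)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    n1 = cross(b0, b1)  # normal of the plane p0, p1, p2
    n2 = cross(b1, b2)  # normal of the plane p1, p2, p3
    x = dot(n1, n2)
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: planar cis and trans arrangements
cis = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
trans = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
print(cis, trans)
```

For a protein backbone, the angle phi would be obtained from the atoms C(i-1), N, Cα, C and the angle psi from N, Cα, C, N(i+1).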
and Visualization of Proteins
Structure determination techniques:
To obtain information about the structure of molecules at a resolution
where the position of individual atoms can be seen, two techniques
are currently used: X-ray diffraction from protein or DNA crystals,
or nuclear magnetic resonance (NMR) of proteins or nucleic acid
in solution. Although both techniques are used to obtain the same
high resolution structure information, they measure completely different
physical properties of molecules. X-ray diffraction yields information
about the electron distribution in a crystal lattice, while NMR
measures the magnetic spin resonance of selected isotopes (¹H,
¹³C, ¹⁵N). High resolution structures contain detailed information
of atoms separated by 1.5 to 3 Å, roughly the length of a
hydrogen bond, or twice the length of a covalent bond. NMR: Because
of the wide distribution of hydrogen in biological macromolecules, proton-NMR
(¹H-NMR) is used to determine the atomic neighborhood of protons
in macromolecules. Currently, NMR solution structures are limited
to molecules with a molecular weight smaller than 20 to 25 kDa (for
a protein this is about 200 amino acids). NMR is the technique of
choice if the dynamic aspects of structures have to be determined,
because the time scale of obtaining the data is very short, on
the order of milliseconds. X-ray crystallography: X-ray diffraction
measures the electron distribution in crystals that contain millions
of units in an ordered structure (crystal lattice and unit cell).
The regularity of the crystal lattice determines the level of resolution
(the more ordered, the higher the resolution). The sampling of
the diffraction data requires much longer times than NMR measurements.
Crystal structures are therefore also known as frozen structures.
Highly flexible protein domains (disordered regions) often create
a 'blank stretch' in the elucidated structure.
The many ways to look at a protein structure: There exists no unique
way to represent the structure of a protein. The 'structure' we
choose to describe a molecule depends on the quality or property
of the molecule to be studied. For the same reasons that we have
physical, chemical, or biological sciences, we have reasons to focus
on different structural aspects of proteins. To mention a few, we
can choose the simple space-filled model representing the 'shape'
or form (Van der Waals surface) of a protein including every atom
in the structure, or we can strip the protein to the reduced information
of its polypeptide backbone conformation, the most often used representation
of protein structures. We can mark regions of different secondary
structures by symbols (cartoon representation) to quickly allow
an overview of motifs and domain organization. To obtain an understanding
of the structure-function relationship of a protein, we need to
include the physical and chemical properties of its surface and
represent the protein in a way that 'makes sense from the point
of view of another molecular entity', in other words its electrostatic
behavior. This includes clusters of fixed charges, distribution
of permanent and induced dipoles, molecular orbitals of aromatic
amino acids, the distribution of polar and non-polar amino acids,
and the flexibility of surface structures. There are four major
representations of molecular surfaces: the Van der Waals surface,
surface potentials, the solvent accessible surface, and the union
DNA as carrier of genetic information may be a target for drug
interaction because of the ability to interfere with transcription
(gene expression preceding protein synthesis) and DNA replication,
a major process in cell growth and division (see DHFR inhibitors
above). DNA replication is central to tumorigenesis and pathogenesis.
Nucleic acids are not commonly used as drug targets except antisense
drugs (RNA) and antimicrobial or anticancer drugs that readily damage
DNA strands or prevent regulatory proteins from binding.
DNA structure: DNA is a linear
polymer made of four different types of nucleotides. Nucleotides
are complex structures of a cyclic aromatic base, a ribose sugar
unit (deoxyribose in DNA), and one, two, or three phosphate groups. They are named after
their different bases, the variable components of nucleic acids,
which come in two basic versions - single ring forms called pyrimidines,
and double ring forms called purines. The pyrimidines include cytosine
(C) and thymine (T; uracil (U) in RNA), the purines include adenine
(A) and guanine (G). The stable form of DNA is a dimer and its tertiary
structure is the right handed double helix, called B-DNA, or Watson-Crick
double helix, named after its co-discoverers. The stability of
the B-DNA is provided by the base stacking of flat aromatic ring
structures, as well as the hydrogen bonding in base pairs. In B-DNA
only two base pair combinations are found - AT pairs with two hydrogen
bonds and GC pairs with three hydrogen bonds. The number of hydrogen
bonds and thus the thermodynamic stability of a DNA double helix
is directly related to its GC content, i.e., the percentage of GC
pairs in DNA. Chromosomal regions with high GC content correlate
with the presence of functional genes.
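The relation between GC content and the number of base-pair hydrogen bonds can be illustrated with a few lines of Python. This is a minimal sketch that only counts Watson-Crick hydrogen bonds (two per AT pair, three per GC pair) for one strand of a duplex; the function names and the example sequence are made up for illustration, and base stacking is not taken into account.

```python
def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq or any(base not in "ACGT" for base in seq):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)

def hydrogen_bonds(seq):
    """Number of Watson-Crick hydrogen bonds in the duplex formed by seq
    and its complement: 2 per AT pair, 3 per GC pair."""
    seq = seq.upper()
    if any(base not in "ACGT" for base in seq):
        raise ValueError("sequence must contain only A, C, G, T")
    return sum(3 if base in "GC" else 2 for base in seq)

example = "ATGCGC"  # hypothetical sequence
print(gc_content(example), hydrogen_bonds(example))
```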
There are three principally different ways of drug-binding. First,
through control of transcription factors and polymerases. Here,
the drugs interact with the proteins that bind to DNA. Second, through
RNA binding to DNA double helices to form nucleic acid triple helical
structures or RNA hybridization (sequence specific binding) to exposed
DNA single strand regions forming DNA-RNA hybrids that may interfere
with transcriptional activity. Third, small aromatic ligand molecules
bind to DNA double helical structures either by (i) intercalating
between stacked base pairs, thereby distorting the DNA backbone conformation
and interfering with DNA-protein interaction, or (ii) binding in the minor
groove. Minor groove binders cause little distortion of the DNA backbone.
Both work through non-covalent interaction.
Although carbohydrates or polysaccharides play a major role in
cell surface recognition, they are not commonly drug targets because
they have no enzymatic function but serve as structural components
between cells of multicellular organisms and between pathogenic organisms
and their hosts. As is the case for nucleic acids, antimicrobial
activity of potential novel drugs may include those interfering
with binding of pathogenic microorganisms to host cell surfaces.
Both host and microorganism have carbohydrate coated surfaces. The
potential of polysaccharide targets for novel drugs is supported
by the observation that both microbial DNA and polysaccharides are
immunogenic. Thus novel drugs may be developed that mimic an immune
response to develop vaccines or produce competitive inhibitors that
can interfere with binding.
Recognition of Proteins
Molecular Motion and
Proteins would not function if their structures were not flexible.
The flexibility originates from the thermal motion of their atoms, the
result of their kinetic energy. In living cells and organisms, macromolecular
structures are not rigid entities, but resemble highly viscous fluids.
As a result, protein activity is temperature sensitive, with too
low or too high temperatures causing inactivation. While low temperatures
inactivate proteins because their structure gets frozen or adopts
a crystalline state, elevated temperatures cause proteins to unfold
or denature. Both conditions compromise the structural integrity
of active sites and binding sites and thus reduce activity.
Molecular dynamics: Protein flexibility
is extremely difficult to study experimentally. The time scale of
molecular motions ranges from femto- (10^-15) to microseconds (10^-6),
with 'longer' time scales correlating with larger structures and
longer distances involved (e.g. protein folding). Modern day computational
power enables simulations of thermodynamic flexibility of atoms
of larger and larger molecular structures. These computer-based simulations
are known as molecular dynamics simulations.
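The core of such a simulation is a numerical integrator that advances atomic positions and velocities in small time steps. The sketch below uses the velocity Verlet scheme, a widely used integrator, on a single particle attached to a harmonic spring. It is a toy model in reduced units (mass, force constant, and time step are arbitrary assumptions), not the setup of any particular simulation package.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of size dt
    using the velocity Verlet integration scheme."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
    return x, v

# Toy system in reduced units (assumed values): a harmonic 'bond'
K_SPRING = 1.0
MASS = 1.0

def spring_force(x):
    return -K_SPRING * x

def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x

x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, spring_force, MASS, dt=0.01, n_steps=10_000)
print(total_energy(x0, v0), total_energy(x1, v1))
```

With a time step on the order of a femtosecond, which is a typical assumption for atomistic simulations, covering one microsecond requires about 10^9 integration steps. This is why the longer time scales mentioned above are computationally demanding.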
Protein folding: Proteins
are synthesized in cells in a linear fashion (on ribosomes) and
have to fold into a native, active conformation (tertiary or quaternary
structure). This conformation is largely determined by the amino
acid sequence and particularly by the distribution pattern of hydrophilic
and hydrophobic amino acid residues. As a general rule, hydrophobic
residues are buried inside the protein core during the folding process,
driven by the hydrophobic effect. The folding process is temperature
sensitive and promoted by the molecular crowding conditions inside cells.
Induced fit binding: It has long been established that enzymatic
reactions proceed through a series of steps called reaction intermediates.
An early intermediate in every reaction is the stabilization of
the enzyme-substrate complex, called the transition state. The transition
state mechanistically explains the ability of enzymes to lower the
activation energy of a reaction, thus greatly increasing its catalytic
rate. For non-enzymatic reactions, i.e., ligand binding events on
receptors, an analogous enhanced interaction between protein and
ligand is found. Here, the initial binding event induces a small
conformational change, which increases the molecular closeness between
protein surface and ligand, and thus strengthens the interaction.
This mechanism is called the induced fit model of ligand binding.
At low temperature, i.e. with rigid protein structures, ligands lose
their affinity for the receptor.
Protein folding can be extended to the study of functional changes
in protein structures that bring a protein from an inactive to
an active state (see also allosteric regulation below). Sometimes
these conformational changes can be substantial as is the case for
the calcium sensing protein calmodulin (troponin C in muscle). Calmodulin
undergoes a reorientation of its two domains upon binding of four
calcium ions that results in the exposure of a hydrophobic cavity
which allows calmodulin to bind hydrophobic target peptides. These
target peptides are surface loops on proteins that are inactivated
by these loops (self-inhibition). Calmodulin binding releases the
inhibition thus activating the targeted protein/enzyme, often kinases
and the calcium pump, a plasma membrane protein responsible for
the excretion of cytoplasmic calcium after a physiological signaling
and Protein Complex Formation
Structural proteins and molecular motors are typically found as
large protein complexes and the conformation of the supra-molecular
structure is a function of the strength of interaction between protein
surfaces. A typical example is the cell-cell adhesion mediated by
protein interaction. The neuronal cell adhesion molecules (NCAM)
mediate cell contact in a supramolecular complex called cadherin
zippers. The protein-protein interaction is mediated by immunoglobulin
like domains and depends on calcium as stabilizing co-factor. These
protein-protein interactions are sensitive to environmental conditions
like salt concentration, pH, hydrophobicity, temperature or pressure.
Molecular motors are protein complexes undergoing controlled changes
in their supra-molecular (quaternary) organization (rotation, lateral
movement, contraction) due to local changes in cellular electrochemical
Membrane proteins mediate substrate transport or signaling across
cell membranes. Transport is mediated by ion
channels, transporters, and pumps. The latter are distinguished
kinetically (transport rate) from ion channels, which promote a fast,
diffusion controlled flux, while pumps drive an energy dependent
'uphill' transport. Pumps regenerate the chemical potential stored
by biological membranes and dissipated by ion channels. In the process
of dissipation, the stored energy is converted into chemical (ATP synthase)
or mechanical energy (molecular motors). Although metabolic and
membrane transport processes occur under non-equilibrium conditions,
they are studied experimentally at chemical equilibrium. The first
step in a catalysis, complex formation, or transport process is
a binding event, which is quantified by an equilibrium constant
known as the dissociation constant, KD, which measures the affinity
of a substrate for an enzyme, a ligand for a receptor, or a permeant
for a transporter.
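For a simple one-site binding equilibrium, the dissociation constant translates directly into the fraction of occupied binding sites, θ = [L] / (KD + [L]). The short sketch below evaluates this relation; the function name and the nanomolar example values are illustrative assumptions, and the free ligand concentration is assumed to be known.

```python
def fraction_bound(ligand_conc, kd):
    """Fraction of binding sites occupied at free ligand concentration
    ligand_conc for a single-site equilibrium with dissociation constant kd.
    Both values must be given in the same units."""
    return ligand_conc / (kd + ligand_conc)

KD = 1e-9  # hypothetical dissociation constant of 1 nM
for conc in (0.1e-9, 1e-9, 9e-9):
    print(f"[L] = {conc:.1e} M -> occupancy {fraction_bound(conc, KD):.2f}")
```

When the free ligand concentration equals KD, half of the sites are occupied, which is why a low KD corresponds to a high affinity.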
Membrane proteins cannot be understood without an understanding
of the structure and function of biological membranes, also known
as phospholipid bilayers. Membranes are an example of complex self-assembly
systems. Complexity and self-assembly have become important
paradigms in modern biology and are thus discussed here in some more
detail. The complexity of cellular structures is obtained by arranging
molecular components in regular, repetitive arrays. The determining
factor of this assembly is the hydrophobic effect, or more generally,
phase separation behavior. The surface structure and shape of the
unit molecule define the overall architecture of the supramolecular
structure: its size, shape, and number of units. Unlike macromolecules,
which are true polymers and linked by covalent bond formation among
units, supra-molecular structures are stabilized by non-covalent interactions.
Examples are manifold. A very important and intriguing biological
supra-molecular structure is the cell
membrane - a double layer (bilayer) of phospholipids. Phospholipids,
the building blocks of cell membranes, are amphipathic molecules,
i.e., they are not entirely hydrophobic. They contain a hydrophilic
and/or charged headgroup linked to two fatty acid tails. Membranes
are stable in water because the hydrophobic fatty acids are protected
inside the membrane bilayer by the hydrophilic surface of the tightly
packed headgroups. Cell membranes are perfectly water soluble, and
they provide a hydrophobic barrier for small polar/charged molecules.
Cell membranes allow compartmentalization of cellular processes.
The hydrophobic barrier, which is essentially an electrical insulator
(capacitor), is regulated by membrane proteins that promote transport
processes: ion channels for passive diffusion of small ionic species,
facilitators and transporters for specific, passive transport, which
may be coupled to symport or antiport of a second molecular species
(flux coupling), and finally pumps, active transporters that utilize
chemical energy in the form of ATP hydrolysis or light (photosynthesis).
Thus biological membranes not only form specialized cellular compartments
for various metabolic purposes, but also function as storage devices
for electrochemical potentials (ion gradients).
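The electrical potential that balances a given ion gradient can be estimated with the Nernst equation, E = (RT / zF) ln(c_out / c_in). The sketch below evaluates it for a single ion species. The function name, the temperature of 37 °C, and the potassium concentrations (5 mM outside, 140 mM inside, typical textbook values) are assumptions chosen for illustration.

```python
import math

R = 8.314462618   # gas constant in J/(mol*K)
F = 96485.33212   # Faraday constant in C/mol

def nernst_potential_mv(c_out, c_in, z=1, temp_k=310.15):
    """Equilibrium (Nernst) potential in millivolts for an ion of charge z
    with outside concentration c_out and inside concentration c_in
    (same units). temp_k = 310.15 K (37 C) is an assumed temperature."""
    return 1000.0 * R * temp_k / (z * F) * math.log(c_out / c_in)

# Assumed textbook-like potassium gradient: 5 mM outside, 140 mM inside
print(f"{nernst_potential_mv(5.0, 140.0):.1f} mV")
```

Equal concentrations on both sides give a potential of zero, i.e. no energy is stored in the gradient of that ion.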
Other examples of biological self-assembly complexes include ribosomes
and chromosomes, large multi-subunit particles of proteins and nucleic
acids, and the cytoskeletal fibers - microfilaments made of actin and
microtubules made of tubulin. These elongated fibers have two functions.
They determine the shape of mammalian cells and they are dynamic
systems providing a way of intracellular transport by means of subunit
shuffling between fiber ends. Cells are not a homogeneous solution
of molecules, but highly organized compartments. These properties
are particularly apparent during embryogenesis, where cells or cell
ensembles gain precise polarity, a functional asymmetric distribution
of cellular components necessary for proper cell growth and differentiation.
The self-assembly properties of small, amphipathic molecules are
utilized to design novel, supra-molecular structures with defined
functional properties. The goal is to produce molecular scale structures,
molecular motors, fibers, conducting elements etc. in the nanometer
range. The technology of producing these tiny assemblies is commonly
referred to as nanotechnology. Nanotechnology is an attempt to control
the crystallization process of small molecules of varying size,
shape and solubility properties. Cylindrical structures
called nanotubes form microscopic channels and molecular sieves
controlling transport processes across biological membranes. These
nanotubes may prove useful for the design of drug delivery devices.
Their specificity for what is transported, and across the membranes of
which cell types, could be used to deliver molecules to tumor cells
only and not to healthy tissue.
Allosteric properties are the result of conformational changes induced
in macromolecules by molecular interactions with both small ligands
and other proteins. The conformational changes induced
by binding are the essence of regulating the activity of proteins
by shifting these macromolecules between functional and non-functional states.
Allostery and cooperativity.
Often proteins (or protein complexes) contain more than one binding site for
more than one type of ligand or substrate. Proteins have the ability
to coordinate what is going on at those different binding sites
in such a way that the binding of one ligand alters the affinity
for another ligand on the same protein (or complex). This is an allosteric
mechanism. The 'interacting' ligands are not identical. For identical
ligands, the allosteric mechanism is called a cooperative effect.
Examples of allosteric mechanisms are the ligand binding induced
changes in conformation of cell surface receptors (signal transduction)
or ion channels (action potential), or the ligand
binding induced dimerization of transcription factors (nuclear receptors).
The latter are nuclear proteins, which undergo a change in DNA binding
affinity as a function of dimer formation. The DNA binding event
activates or suppresses gene expression or replication activity.
A well known cooperative effect is the binding induced increase
in oxygen affinity on the four identical binding sites of hemoglobin.
When hemoglobin is completely devoid of ligand, its affinity for molecular oxygen
is very low. Hemoglobin, which is a tetrameric protein complex with
four identical heme binding sites, undergoes a substantial conformational
change after binding one molecule of O2. The conformational
change drastically increases the oxygen affinity for the remaining
three binding sites, a positive cooperativity between the four binding sites.
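Positive cooperativity of this kind is often summarized with the empirical Hill equation, θ = p^n / (P50^n + p^n), where n greater than 1 indicates positive cooperativity. The sketch below compares a cooperative and a non-cooperative binding curve. The function name and the numerical values (P50 of 26 mmHg and a Hill coefficient of 2.8, commonly quoted textbook figures for hemoglobin) are assumptions for illustration and do not come from this text.

```python
def hill_saturation(p_o2, p50, n):
    """Fractional saturation from the Hill equation.
    p_o2: oxygen partial pressure, p50: pressure at half saturation
    (same units), n: Hill coefficient (n = 1 means no cooperativity)."""
    return p_o2 ** n / (p50 ** n + p_o2 ** n)

P50 = 26.0  # mmHg, assumed textbook value
for p in (10.0, 26.0, 40.0, 100.0):
    coop = hill_saturation(p, P50, 2.8)     # assumed Hill coefficient
    non_coop = hill_saturation(p, P50, 1.0)
    print(f"pO2 = {p:5.1f} mmHg  cooperative {coop:.2f}  non-cooperative {non_coop:.2f}")
```

The cooperative curve is sigmoidal: saturation stays low at low oxygen pressure and rises steeply around P50, which is the behavior described above for the first and the subsequent binding events.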
The scope of drug targets is as large and broad as the proteins
and nucleic acids found in cellular metabolism. There are, however,
preferred targets and they are mostly located at the extracellular
surface of cells. Similar to hormones, neurotransmitters, growth
factors and natural toxins, many drugs bind to membrane proteins
that belong to two major classes - G-protein coupled receptors (GPCR;
metabotropic) and ion channels (ionotropic). The latter are sensitive
to local anesthetics,
which directly bind to ion channel proteins, while both iono- and
metabotropic receptors are sensitive to general anesthetics, which
are believed to function through modification of the physical properties
of cell membranes. A special group of membrane interacting antibiotics
are pore forming peptides like alamethicin, gramicidin A, and melittin.
They kill cells by perforating membranes, dissipating their electrochemical
gradients and depleting their energy storage.
An estimated 30% of all currently approved drugs bind to G-protein
coupled receptors or GPCRs. Although no high resolution structure
for this class of receptors is yet available, structural models
based on their homology to bacteriorhodopsin, a bacterial proton
pump, and bovine rhodopsin, a light sensitive G-protein coupled
protein, were used for rational drug design for GPCR ligands. GPCRs
have a simple generic structure with a large N-terminal domain
facing the extracellular side of the cell membrane followed by seven
transmembrane spanning (TM) domains (alpha helices) and a C-terminal,
cytoplasmic domain of variable length. The C-terminal domain and
the loop connecting TM5 and 6 interact with G-proteins which are
activated by GTP binding and receptor-ligand interaction in what
is known as a ternary complex. P2Y, a purine receptor, has recently
been modeled for its structure-activity relationship of ligand docking.
Both the receptor binding site and the ligand pharmacophore have
been characterized. P2Y1-ATP complex models describe three binding
sites: a meta-binding site I, meta-binding site II, and the principal
transmembrane spanning segment (TM) binding site. The meta binding
sites are provided by extracellular loop structures. A thermodynamic
modeling of binding energy indicates that meta-binding site I is
almost as strong as the principal TM binding site. This ligand-receptor
docking model includes neither G-protein binding nor molecular
motion. In fact, both agonist and antagonist binding are modeled
to the same receptor structure.
Recent studies confirm an increasing complexity in receptor activation
by dimerization which can lead to changes in ligand specificity
and affinity as compared to monomeric receptors. Ligands span
a variety of chemical structures, ranging from small amino acid derivatives
(e.g. dopamine) to larger peptides (e.g. opioid peptides). Over 90% of ligands
belong to the peptide class.
The following receptor-assisted G-protein activation cycle has
been proposed as a two step model. First, the active receptor ternary
complex consists of a ligand-bound receptor with a GDP bound G-protein, denoted
HRG(GDP). GDP interaction with the G-protein is the weakest
interaction, leading to GDP dissociation. Second, GTP replaces the
diphosphonucleotide on the ligand-receptor-G-protein complex. The
HRG(GTP) ternary complex is the transition state of the
active cycle (consider the ligand-receptor complex as an enzyme
that catalyzes GTP loading on G-proteins). G(GTP) then dissociates
from the ligand-receptor complex, leaving behind an HR complex that
can bind a new G(GDP) unit and stimulate nucleotide exchange,
while the newly released G(GTP) functions as an effector
module on kinases, lipases, ion channels, or adenylyl cyclase. The
GPCR system essentially recharges inactive GTPases by accelerating
replacement of non-hydrolysable GDP with hydrolysable GTP on Gα
Ligand-gated Ion Channels
Unlike the G-protein coupled receptors, ligand gated ion channels
open an ion selective pore allowing the flow of ions in or out of
the cell, depending on the actual membrane potential and ion gradient.
These channels serve as receptors for neurotransmitters like glutamate,
GABA, glycine, serotonin, histamine and acetylcholine. The receptor
for the latter, the nicotinic acetylcholine receptor (nAChR) has
been studied pharmacologically, electrophysiologically, and biochemically
since the late 1960s. The kinetics of channel activation and inactivation
are well understood and have served as one of the model systems
to study allosteric regulation. In this channel, two acetylcholine
molecules bind, one to each of the two alpha subunits, causing the opening
of a gate within the membrane spanning portion of the receptor.
The ligand binding sites and channel gate are separated by about 25
Angstrom, and the recent high resolution structural analysis has
helped to understand this gating or allosteric mechanism
(Miyazawa et al., Structure and gating mechanism of the acetylcholine
receptor pore, 2003, Nature 423, 949-955). The gating mechanism
of this receptor complex is a nice demonstration of internal structural
changes (conformational changes) in response to ligand binding.
Here, acetylcholine binding changes hydrogen bond networks within
the alpha subunits relaxing some conformational stress within the
binding site. Upon breaking internal hydrogen bonds, the alpha subunit
can undergo a conformational relaxation which can affect subunit
contact sites tens of Angstroms apart, i.e., the membrane embedded
The fact that the channel is a pentameric protein complex underlines
the observation that protein-protein interaction within enzymatic
complexes allows fine tuning and precise control over the activity
pattern of proteins. Observations from the nAChR also show that protein
complexes spontaneously switch between conformational states - active
and inactive - even in the absence of ligand. Thus, the ability
of proteins to occasionally adopt active structures even in the
absence of the agonist (positive stimulus) corroborates the idea
that proteins are essentially fluid entities and that ligand binding
and covalent modifications (e.g. phosphorylation and other post-translational
modifications) simply stabilize proteins in one of at least two
thermodynamically stable conformations.
The change between an active and inactive state can thus be described
by a chemical equilibrium that switches between a state T (tense)
and state R (relaxed). Both states are equally stable (under the
right circumstances) and this similarity in stability can easily
be explained by the number of subunit contact sites in each state.
The transition from one to the other state requires the breaking
of some of these contact sites, but an equal number of similar non-covalent
bonds are formed, 'trapping' the protein complex in either one of the two states.
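The equilibrium between the T and R states can be expressed through the free energy difference between them: K = [R]/[T] = exp(-ΔG / RT), so the fraction of complexes in the R state is K / (1 + K). The sketch below evaluates this two-state relation. The function name, the temperature of 37 °C, and the example free energy values are assumptions for illustration. A ligand that stabilizes the R state is represented here simply by a more negative ΔG.

```python
import math

R_GAS = 8.314462618  # gas constant in J/(mol*K)

def fraction_relaxed(delta_g_kj, temp_k=310.15):
    """Fraction of complexes in the R state for a two-state T <-> R
    equilibrium, where delta_g_kj = G(R) - G(T) in kJ/mol.
    temp_k = 310.15 K (37 C) is an assumed temperature."""
    k_eq = math.exp(-delta_g_kj * 1000.0 / (R_GAS * temp_k))  # K = [R]/[T]
    return k_eq / (1.0 + k_eq)

# Hypothetical free energy differences in kJ/mol
for dg in (5.0, 0.0, -5.0):
    print(f"dG = {dg:+.1f} kJ/mol -> fraction in R state {fraction_relaxed(dg):.2f}")
```

When both states are equally stable (ΔG = 0), half of the complexes are in each state, and a stabilization of only a few kJ/mol is enough to shift most of the population to one side.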
Enzymes and Their Inhibitors
Enzymes catalyze the conversion of substrate(s) into product(s).
This process can be measured kinetically (how fast a product is
formed) and thermodynamically (in which direction the catalysis
proceeds). All metabolic reactions are reversible and are defined
by their chemical equilibrium where the net formation of a product
is zero. Enzyme catalyzed reactions in vivo are usually not at their
chemical equilibrium. They have a preferred direction determined
by substrate availability as well as being coupled to large energy
releasing reactions (exergonic reactions). The latter makes a catalytic
reaction de facto irreversible, a common feature of metabolic pathways.
Although enzymes are often involved in the chemical reaction mechanism
(covalent bond formation between substrate and enzyme), they are
not chemically modified at the end of the process. Enzyme catalyzed
rates are several orders of magnitude higher than the corresponding
spontaneous reaction in aqueous solution.
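The connection between a lowered activation energy and a higher rate can be made quantitative with the Arrhenius-type relation k_cat / k_uncat = exp(ΔΔG‡ / RT), where ΔΔG‡ is the amount by which the enzyme lowers the activation barrier. The sketch below assumes that the pre-exponential factor is the same for both reactions, which is a simplification; the function name, the temperature of 25 °C, and the barrier values are illustrative assumptions.

```python
import math

R_GAS = 8.314462618  # gas constant in J/(mol*K)

def rate_enhancement(barrier_reduction_kj, temp_k=298.15):
    """Ratio of catalyzed to uncatalyzed rate constant when the activation
    barrier is lowered by barrier_reduction_kj (kJ/mol), assuming an
    unchanged pre-exponential factor. temp_k = 298.15 K is assumed."""
    return math.exp(barrier_reduction_kj * 1000.0 / (R_GAS * temp_k))

# Hypothetical barrier reductions in kJ/mol
for ddg in (5.7, 17.1, 34.2):
    print(f"barrier lowered by {ddg:4.1f} kJ/mol -> rate x {rate_enhancement(ddg):.3g}")
```

At room temperature each reduction of roughly 5.7 kJ/mol corresponds to one order of magnitude in rate, so an enhancement of several orders of magnitude requires a barrier reduction of a few tens of kJ/mol.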
Proteases are enzymes that cleave,
or cut, or degrade other proteins by hydrolyzing peptide bonds.
This may sound like an uninteresting topic, but protein degradation
plays a major role in cellular processes. Proteases are involved
in cellular control mechanisms such as the removal of old and unused
proteins (an essential part of the turnover pathway of all proteins
in the cell, which affects cell growth), the degradation of proteins and
peptides for nutritional purposes, defense mechanisms against intrusive
proteins and peptides, and the control of protein activity.
Being enzymes, proteases can be characterized by their substrate
affinity and the catalytic rate of the reaction. Using proteases
to study the effects of single amino acid substitutions (mutations)
on catalytic rate and substrate affinity demonstrated that these
two properties are linked and that this linkage can be explained
by analyzing the conformation of the catalytic or active site of
the enzyme. This analysis showed that four major functional groups
are found in the catalytic site of proteases. Chymotrypsin, trypsin,
and elastase are three members of the family of chymotrypsin proteases.
They all are serine proteases because the amino acid residue at
the catalytic site responsible for the transition state stabilization
is a serine. The active site of a serine protease can be divided
into four essential structural features required for the catalytic
action of serine proteases:
Active site structure of a protease
Structural element of active site | Function
Main chain substrate binding | Non-specific binding of polypeptide segment
Specificity pocket | Specific binding of side chains, sequence specificity
Oxyanion hole | Stabilizes transition state S* over S in enzyme
Catalytic triad (Asp-His-Ser) | Forms tetrahedral intermediate (transition state; stabilizes S* over S); hydrolyzes peptide bond
Proteases have preferential cleavage sites in the sequence of a
protein substrate. The specificity pocket provides a small binding
pocket consisting of 3 amino acid residues that determine the local
polarity and electrostatic potential profile for the interaction
with residue n-1 of the substrate, on the N-terminal side of the scissile
bond. For the chymotrypsin family of serine proteases we find the
following sequence specificities: chymotrypsin binds bulky, aromatic
residues, trypsin binds positively charged residues, and elastase, the
extracellular matrix protease, binds small, non-charged amino acid residues.
The 3-D folds of these three proteases are very similar; their
sequences are not identical, but they are evolutionarily
related. Another serine protease family, the subtilisin family of serine
proteases, consists of products of Bacillus species. Their overall native
fold is very different from that of the chymotrypsin type proteases,
but the catalytic triad conformation is identical. The 3-D fold
of subtilisin has an α/β motif (instead
of the β-barrel motif of chymotrypsin
domains) with five parallel β-strands surrounded by 4 α-helices.
The comparison of the two families of serine proteases tells us
two different things. First, it has been reasoned to be an example
of convergent evolution, where the formation of a catalytic site
has evolved twice, with each serine protease family exhibiting a
different overall 3-D structure. Second, the differences in the
3-D structure give us an idea of the different cellular locations
of the corresponding protein families: the catalytic site of an
enzyme is conserved over evolutionary time, while the overall structure
is adapted to provide structural stability for optimal activity
of the protein in any given environment. We need only understand
that very different sequences can provide similar 3-D structures
because water solubility depends only on the distribution of hydrophilic
and hydrophobic residues, but not on other chemical properties.
Overall structural features thus reflect the location of the protein:
whether it is intracellular, extracellular, or a cell membrane
protein, and whether it is resistant to temperature changes or sensitive
to proton or calcium concentrations.
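The sequence specificities described above for the chymotrypsin family can be illustrated with a small Python sketch. It only checks the residue on the N-terminal side of each peptide bond against a preferred residue class. The residue sets, the function name and the test peptide are assumptions chosen for illustration, not measured specificities.

```python
# Residue classes accepted by each specificity pocket (one-letter codes).
# The exact membership of each class is an assumption for illustration.
P1_PREFERENCES = {
    "chymotrypsin": set("FWY"),  # bulky, aromatic
    "trypsin": set("KR"),        # positively charged
    "elastase": set("AGS"),      # small, non-charged
}


def cleavage_sites(sequence, protease):
    """Return 1-based positions of residues after which the protease may cut."""
    preferred = P1_PREFERENCES[protease]
    # The last residue has no peptide bond on its C-terminal side.
    return [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in preferred]


peptide = "GAKFSRWLY"  # hypothetical substrate
for name in P1_PREFERENCES:
    print(name, cleavage_sites(peptide, name))
```

Real cleavage also depends on neighbouring residues and on the accessibility of the bond in the folded substrate, which this sketch ignores.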
Trypsin (protease) Inhibitor
Cellular control of proteases is carried out by protease inhibitors.
These are small peptides or proteins that can bind to the active
site of the protease (competitive inhibitor) but which are not hydrolyzed,
thereby blocking the access of substrates, e.g. protecting tissue
proteins. One example of a protease inhibitor is the bovine pancreatic
trypsin inhibitor (BPTI), a small protein of 58 amino acids. Its
structure has been determined by X-ray crystallography and the protein
has been widely used for folding studies of proteins. BPTI binds
to trypsin through hydrogen bonding forming a tightly packed interface
between inhibitor and enzyme. Because BPTI is not turned over, its
binding is described by a dissociation (inhibition) constant rather than
a Michaelis-Menten constant: Ki = 10^-13 M. The lysine at position 15
binds to the specificity pocket and is followed by an alanine. The
reaction is blocked at the formation of the transition state intermediate.
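To get a feeling for how tight this binding is, the constant of 10^-13 M quoted for BPTI can be converted into a standard free energy of binding with ΔG = RT ln Kd. The sketch below assumes that the quoted value is a dissociation constant and that the temperature is 25 °C; the function name is hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard free energy of binding in kJ/mol from a dissociation constant in M."""
    return R * temperature_k * math.log(kd_molar) / 1000.0


# Assumption: the 1e-13 M value is a dissociation constant at 25 degrees C.
dg = binding_free_energy(1e-13)
print(f"{dg:.1f} kJ/mol")
```

Under these assumptions the result is roughly -74 kJ/mol, which is very tight for a non-covalent protein-protein complex.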
Understanding the structure-function relationship of proteins can give
vital information for the development of drugs that interact with
proteins in a host-pathogen environment. A recent example of rational
drug design has been the development of an anti-HIV drug, a protease
inhibitor. What is important is that the structure of a protein
which is essential for the life cycle of the virus has been
elucidated by X-ray crystallography, and that functional studies
on related proteases, the aspartate family of proteases, have provided
insight into the ligand-enzyme interaction. Thus, an HIV protease
inhibitor has been designed by predicting a structure that binds
to the viral protease with an affinity several orders of magnitude
higher than to related host proteases. Consequently, virus replication
can be inhibited without interfering with the host metabolism.
The human immunodeficiency virus encodes an aspartate protease
(HIV PR). This protease is essential for proper virion assembly
and maturation. Inactivation of this protease has therefore been
identified as a therapeutic approach to suppress virus replication;
it complements already existing drugs interfering with HIV reverse
transcriptase.
Essential for the rational design of a protease inhibitor was the
successful crystallization of the protease with and without bound
substrate. The HIV protease is a member of the family of aspartate
proteases and is related to the pepsin family of proteases. It is
inhibited by pepstatin, the natural inhibitor of pepsin. The structure of
pepsin and its binding of pepstatin are known, and this information
formed the basis for the successful design of an HIV protease inhibitor
by using computer models to identify the best possible inhibitor.
The active site of aspartate proteases contains a pair of aspartate
residues in close proximity with a water molecule hydrogen bonded
and oriented optimally to attack the scissile bond of the substrate.
The aspartate pair is located at the domain interface in pepsin,
a monomeric protein, and at the subunit interface in HIV protease,
a homodimer. The catalytic sites in viral and cellular aspartic proteases
are very similar, but the important distinction lies in minute differences
in symmetry relations at the interface of domains in pepsin and
subunits in HIV protease.
HIV protease inhibitor
On the basis of the difference in symmetry at the active sites of
pepsin and HIV protease, inhibitors have been designed that show
a much higher affinity for the viral protein than for the host protease.
Because a higher affinity corresponds to a smaller inhibition constant:
Ki(HIV) << Ki(Pepsin)
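The practical meaning of a large difference in inhibition constants can be sketched with a simple 1:1 binding equilibrium, in which the fraction of enzyme occupied by inhibitor is [I] / ([I] + Ki) and a smaller Ki means tighter binding. All numbers below are hypothetical and chosen only for illustration; competition with substrate is ignored.

```python
def fraction_inhibited(inhibitor_conc, ki):
    """Fraction of enzyme bound by inhibitor in a 1:1 equilibrium (no substrate)."""
    return inhibitor_conc / (inhibitor_conc + ki)


# Hypothetical values in mol/L, not measured constants.
KI_HIV_PR = 1e-9   # tight binding to the viral protease
KI_PEPSIN = 1e-5   # weak binding to the host protease
DOSE = 1e-8        # free inhibitor concentration

selectivity = KI_PEPSIN / KI_HIV_PR
occ_viral = fraction_inhibited(DOSE, KI_HIV_PR)
occ_host = fraction_inhibited(DOSE, KI_PEPSIN)
print(f"selectivity {selectivity:.0f}x, viral {occ_viral:.3f}, host {occ_host:.4f}")
```

With these assumed values about 91% of the viral protease is blocked while only about 0.1% of the host protease is affected, which is the behaviour a selective inhibitor is designed for.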
What happens when an asymmetric substrate (peptide) interacts with
a symmetric enzyme? The subunits in the homodimer of the HIV protease
are able to distinguish whether they interact with the N- or C-terminal
end of the substrate/inhibitor. Like serine proteases, HIV PR contains
a specificity pocket for the substrate sequence -P2-P1-P1'-P2'-
with residue P1 being either Q or E and residue P2 any hydrophobic
amino acid. A good inhibitor exhibits a high affinity for the specificity
pocket and contains a non-hydrolysable 'scissile' bond P1-P1'.
Substrate analog inhibitors have therefore been designed that function
as peptidomimetics. The scissile amide bond of a peptide substrate
is replaced by non-hydrolysable isosteres with tetrahedral geometry
(mimicking the tetrahedral geometry of the reaction intermediate of
a peptide substrate). The binding of a hydroxyethylene peptide mimetic
is stabilized by the hydrogen bond formation of the hydroxyl of
the backbone with the aspartates in the active site of the protease.
The development of protease inhibitors has been accelerated by
successfully using the concept that the best inhibitors are those
that mimic the transition state structure of the substrate of proteases.
Recognition of DNA
DNA binding molecules
Drugs affecting gene expression inhibit the action of hormone regulated
nuclear receptors. These are DNA binding proteins which either activate
or suppress transcriptional activity. These so-called transcription
factors are regulated by dimerization induced by ligand binding.
Transcription factor inhibitors either block agonist binding or
prevent dimerization. Examples include the nuclear receptors for
estrogen, thyroxine, glucocorticoids, and the morphogen retinoic
acid. In plants the ripening process of harvested fruits can be
delayed by inhibiting the expression of a gene involved in the production
of ethylene, the causative agent of ripening. Other drugs affecting
gene expression either bind directly to DNA (non-covalently) or chemically
modify DNA by cross-linking or strand cleavage. Non-covalent interaction
by small non-peptidic molecules is mediated by base intercalation.
Flat aromatic molecules insert themselves between the base pair
stacks, changing the conformation of the double helix. Examples include
the antibiotics actinomycin D and proflavin. Cross-linking agents
form covalent bonds, mostly to nitrogen groups on guanine bases, changing
the surface structure of DNA and thus blocking protein binding.
Examples include aflatoxin and cis-platin. Other anti-fungal and
antibacterial agents induce DNA strand cleavage, such as bleomycin,
anthramycin, and tomaymycin, all of which are antibacterial and
Nuclear receptors are transcription factors controlling gene expression
activity as activators or repressors. They form a superfamily of
currently 69 members made up of seven families. Based on their ligand
specificity, they are split into two groups: type I receptors that bind
sterol based ligands (e.g. estrogen, glucocorticoid), and type II
receptors that bind non-sterol based ligands (e.g. thyroxine,
9-cis-retinoic acid, vitamin D). Type I receptors form homodimers, while
type II receptors form heterodimers, usually involving one retinoid X
receptor (RXR) subunit, homodimers, or monomers (steroidogenic factor,
nerve growth factor induced gene B). Many novel nuclear receptors being
discovered are orphan receptors of type II, meaning that their natural
ligand has not been identified yet, although it must be a non-steroid
structure based on receptor family classification. Type I and II
receptors are activated in different ways. Steroid hormone receptors
in the absence of ligand are found in the cytoplasmic compartment
complexed with heat shock protein subunits like hsp90, 70, or 56.
Ligand binding causes dissociation of the heat shock subunits, dimerization
of the receptor, and transport of the ligand-receptor complex into
the nucleus. Type II receptors are localized exclusively in the
nuclear compartment and often function as silencers in the absence
of ligand by recruiting corepressors. Ligand binding releases the
corepressor, activating transcription.
Thyroxine is a thyroid hormone that affects metabolic rate, temperature
adaptation in warm-blooded vertebrates, water and ion transport
across membranes, cholesterol metabolism, and nitrogen
secretion. It controls the growth rate of mammalian and amphibian cells,
is involved in the maturation of the central nervous system, controls
amphibian metamorphosis, and regulates some mitochondrial enzymes
important in energy metabolism.
Thyroxine is synthesized in the thyroid gland, secreted, and transported
by the blood plasma proteins thyroxine-binding globulin (TBG), albumin,
or transthyretin (TTR). Inside cells, thyroxine is bound to cytoplasmic
binding proteins (CBPs) such as myocardial myoglobin or thyroxine
peroxidases which catabolize the hormone after it is no longer used.
The nuclear thyroid hormone receptors (TR) are encoded by two genes
(alpha and beta) and differ in ligand recognition and in the effects of
ligand on the binding of coactivators and corepressors. The ligand binding
difference is caused by a single amino acid substitution in the binding
pocket (Asn in alpha, Ser in beta) of each receptor subtype.
Structure and function (thyroxine binding) of TTR are well characterized.
The protein forms a tetrameric complex and binds one thyroxine molecule
in a central channel formed of beta sheets. High resolution structures
allowed the elucidation of the ligand-protein interaction. The most
likely binding modes are two isosteric conformations. Antagonists
to TTR can modulate abnormal growth conditions controlled by the
thyroid gland. TTR is also an amyloidogenic protein. The human amyloid
disorders familial amyloid polyneuropathy and cardiomyopathy, and
senile systemic amyloidosis, are caused by insoluble TTR fibrils
which deposit in peripheral nerves and heart tissue. Nonsteroidal
anti-inflammatory drugs have been found to strongly inhibit fibril
formation in vitro. The protein-drug interaction stabilizes the
native tetrameric TTR conformation.
The availability of at least three different natural receptors
for thyroxine (T4) allows for a comparative study of agonist and
antagonist binding. Usually, different ligand structures are available,
but only one receptor structure.
A class of drugs not involving any protein interactions is short
synthetic oligonucleotides called antisense DNA or RNA strands.
They bind to either DNA or RNA stretches on chromosomes or
RNA, blocking gene expression and/or translation. Interestingly,
antisense drugs have been improved by combining short oligonucleotides
with polycyclic intercalating residues on each end (3' and 5'), drastically
increasing the affinity of intercalating binding mechanisms while at
the same time targeting these intercalating agents to short, gene
Small molecule ligand-DNA
The small ligand drug approach offers a simple solution. It relies on
the synthesis and screening of synthetic compounds that do not exist in
nature, work much like pharmacological ligands for cell surface receptors
in excitable tissue, and appear to be more readily delivered to
cellular targets than large RNA or protein ligands. The lack of
sequence specificity for intercalating molecules, however, does
not allow specific genes to be targeted, but rather certain cellular
states or physiological and pathological conditions, like rapid
cell growth and division, which can be selectively suppressed as compared
to non-growing or slowly growing healthy tissue.
The following properties have been identified as important for
the successful modeling of ligand-DNA interaction:
- degrees of freedom
- role of base pair sequence
- counter ion effects
- role of solvent
This problem is analogous to that of protein ligand interaction.
The major requirement for intercalating agents is the planar aromatic
ring structure. This structure fits between two adjacent base pair
planes and can have some, although much restricted, rotational freedom
within the plane of the ring. The ligand itself may have flexibility
of structural parts outside the DNA binding site and may contain
more than one intercalating sidechain:
The structure of the antibiotic triostin A shows the presence of
two quinoxaline (double aromatic rings) units linked through a cyclic
peptide structure which is stabilized at its center by a cysteine
pair (disulfide covalent bond).
Triostin A belongs to a family of antibiotics which are characterized
by cross-linked octapeptide rings bearing two quinoxaline chromophores.
Since the spacing between the chromophores is 3.5 Å, the intercalation
process sandwiches two base pairs between the two quinoxalines.
This phenomenon is called bis-intercalation and was first described
for echinomycin by showing that bis-intercalating drugs cause twice
the DNA helix extension and unwinding seen with a single
intercalating molecule like ethidium. The latter is a chromophore
which is activated by UV light and is used by molecular biologists
to label nucleic acids in gel electrophoresis or ion gradient centrifugation.
Role of base pair sequence
Experimental evidence suggests that base pair sequence does not play
a large role in the specific nature of most intercalating complexes.
As the structure of triostin A suggests, however, the linker peptide
structure may well promote specific interaction with the DNA surface.
The major groove specific readout sequence of H-bond donors and
acceptors could be involved in triostin A binding. The table below shows
the direct readout of the DNA base sequence on a double helical
structure. The following characteristics of non-covalent bond formation
are associated with the binding sites indicated above (readout sequence
of minor (S) and major groove (W) side as they are available for
protein interaction):
GC base pair | AT base pair
C-H, weak hydrophobic | CH3, strong hydrophobic
While the interaction on the major groove side is distinct for
the direction of the base pair (e.g. AT vs. TA), there is no directionality
at the minor groove side. Minor groove interaction can, however,
distinguish GC content (e.g. TATA box binding protein recognizing
AT rich sequences for RNA polymerase initiation complex).
The molecular basis of specific recognition between echinomycin
and DNA is due to the hydrogen bonding between the ligand alanine
carbonyl groups and the 2-amino group of guanine. This is consistent
with the observation that the preferred binding site is the sequence
Counter ion effect
DNA is a negatively charged polyanion attracting counter ions, positively
charged Na+, or Ca++ and Mg++ ions, as well as basic residues of
proteins. The presence of small counter ions affects drug binding,
since the counter ions can screen and shield the negative backbone
surface, allowing non-electrolytes as well as positively charged ligands
to interact more strongly with the DNA target. High ionic strength,
however, reduces non-covalent interaction mediated by hydrogen bonds
and electrostatic interactions.
Role of solvent in ligand-receptor binding
There are three general classes of interactions that must be considered
in solvated ligand-receptor binding: (a) ligand-solvent interaction
(e.g. hydration shell), (b) receptor-solvent interaction, and (c)
interaction of the ligand-DNA complex with solvent. The three classes
basically describe the sequence of events of free ligand interacting
with its receptor and the change in overall solvent interaction before
and after binding. We have seen that the hydrophobic effect is completely
described by this system and that the entropy of free bulk water is
the major driving force of hydrophobic ligand-receptor interaction.
This type of interaction is found in intercalating substrates because
the hydrophobic, aromatic side chains interact favorably with
the aromatic environment of the base pair stacking. The total amount
of surface bound water is reduced after complex formation.
Rationale for drug design
When a compound intercalates into nucleic acids, there are changes which
occur in both the DNA and the compound during complex formation that
can be used to study the ligand-DNA interaction. The binding is of
course an equilibrium process because no covalent bond formation is
involved. The binding constant can be determined by measuring the free
and DNA bound form of the ligand. Since many of the intercalating
substrates are aromatic chromophores, this can be done
spectroscopically. Also, DNA double helix structures are found to be
more stable with intercalating agents present and show reduced heat
denaturation. Correlating these biophysical parameters with cytotoxicity
is used to support the antitumor activity of these drugs as based on
their ability to intercalate in DNA double helices.
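The spectroscopic determination of a binding constant mentioned above can be sketched as follows. The example assumes a two-state system (ligand either free or bound), a single class of independent binding sites, and known absorbances for the fully free and fully bound ligand. All names and numbers are hypothetical.

```python
def fraction_bound(a_obs, a_free, a_bound):
    """Fraction of ligand bound, from absorbance at one wavelength (two-state model)."""
    return (a_free - a_obs) / (a_free - a_bound)


def association_constant(ligand_total, sites_total, frac_bound):
    """K = [bound] / ([free ligand] * [free sites]) for independent, identical sites."""
    bound = frac_bound * ligand_total
    free_ligand = ligand_total - bound
    free_sites = sites_total - bound
    return bound / (free_ligand * free_sites)


# Hypothetical titration point; concentrations in mol/L.
f = fraction_bound(a_obs=0.40, a_free=0.50, a_bound=0.30)
K = association_constant(ligand_total=1e-5, sites_total=5e-5, frac_bound=f)
print(f"fraction bound {f:.2f}, K = {K:.3g} per M")
```

In practice several titration points are fitted together, and the number of base pairs excluded per bound ligand has to be taken into account; the single-point calculation only shows the principle.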
Improvement of anticancer drugs based on intercalating activity
is not only focused on DNA-ligand interaction, but also on tissue
distribution and toxic side effects on the heart (cardiac toxicity)
due to reduction of the aromatic rings and subsequent free
radical formation. Free radical species are thought to induce destructive
cellular events such as enzyme inactivation, DNA strand cleavage,
and membrane lipid peroxidation.
Modeling DNA-ligand interaction of minor groove binders
Hairpin minor groove binding molecules have been identified and
synthesized that bind to GC rich nucleotide sequences. Hairpin polyamides
are linked systems that exploit a set of simple recognition rules
for DNA base pairs through specific orientation of imidazole (Im)
and pyrrole (Py) rings. The hairpin polyamides originated from the
discovery of the three-ring Im-Py-Py molecule that binds in the minor
groove of DNA as an antiparallel side-by-side dimer.
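The recognition rules themselves are not spelled out in this summary. The sketch below assumes the commonly cited pairing code for side-by-side rings (Im/Py reads G·C, Py/Im reads C·G, Py/Py reads A·T or T·A) and assumes that the two strands overlap fully. The function and variable names are hypothetical.

```python
# Assumed pairing code for rings stacked side by side in the minor groove.
PAIR_RULES = {
    ("Im", "Py"): "G.C",
    ("Py", "Im"): "C.G",
    ("Py", "Py"): "A.T or T.A",
}


def read_polyamide(top_strand, bottom_strand):
    """Translate two antiparallel ring strands into the base pairs they target."""
    if len(top_strand) != len(bottom_strand):
        raise ValueError("strands must have equal length")
    # The bottom strand runs antiparallel, so reverse it to align ring pairs.
    pairs = zip(top_strand, reversed(bottom_strand))
    return [PAIR_RULES.get(pair, "unspecified") for pair in pairs]


# Im-Py-Py bound as an antiparallel side-by-side dimer.
monomer = ["Im", "Py", "Py"]
targets = read_polyamide(monomer, monomer)
print(targets)
```

Under these assumptions the dimer reads a G·C pair, an A·T or T·A pair, and a C·G pair in succession. Linking the two strands into a hairpin fixes this alignment in a single molecule.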
The ultimate goal of polyamide ligand design has been reached by
finding structures able to recognize DNA sequences of specific genes.
The structure shown above inhibits the expression of 5S RNA in fibroblast
cells (skin cancer cells) by interfering with the transcription
factor IIIA-binding site.
A new strategy of rational drug design exploits the combination
of polyamides with bis-intercalating structures. WP631 is a dimeric
analog of the clinically proven anthracycline antibiotic daunorubicin.
This new synthetic compound shows an affinity of 10 pM and was also
shown to be resistant to the multidrug resistance mechanisms often
encountered in antitumor therapy. Multidrug resistance is a phenomenon
where small aromatic compounds are efficiently expelled from the
cell by cell membrane transport proteins commonly referred to as
ABC transporters (or ATP Binding Cassette Proteins).
Drugs that form covalent bonds with DNA targets
Drugs that interfere with DNA function by chemically modifying
specific nucleotides are Mitomycin C, Cisplatin, and Anthramycin.
Mitomycin C is a well characterized antitumor antibiotic which
forms a covalent bond with DNA after reductive activation.
The activated antibiotic forms a cross-link between
guanine bases on adjacent strands of DNA, thereby inhibiting single
strand formation (this is essential for mRNA transcription and DNA
replication).
Anthramycin is an antitumor antibiotic which binds covalently to
N-2 of guanine located in the minor groove of DNA. Anthramycin has
a preference for purine-G-purine sequences (purines are adenine and
guanine) with bonding to the middle G.
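Candidate anthramycin sites on one strand can be located with a short pattern search for purine-G-purine triplets. This is only a sequence scan under the preference stated above; the function name and the test sequence are hypothetical, and the complementary strand would have to be scanned separately.

```python
import re


def purine_g_purine_sites(sequence):
    """Return 1-based positions of the central G in every purine-G-purine triplet."""
    seq = sequence.upper()
    # A lookahead keeps overlapping triplets such as those inside GGGA.
    return [m.start() + 2 for m in re.finditer(r"(?=[AG]G[AG])", seq)]


sites = purine_g_purine_sites("TAGACGGGATC")  # hypothetical sequence
print(sites)
```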
Cisplatin is a transition metal complex, cis-diamminedichloroplatinum,
and is used clinically as an anticancer drug. The effect of the drug is
due to its ability to platinate the N-7 of guanine on the major
groove side of the DNA double helix. Through this chemical modification,
the platinum atom cross-links two adjacent guanines on the same DNA
strand, interfering with the mobility of DNA polymerases.