QSAR (Quantitative Structure-Activity Relationship)
Structure-activity relationships and quantitative structure-activity
relationships, collectively referred to as QSARs, are theoretical models that can be
used to predict the physicochemical and biological properties of molecules. A structure-activity relationship (SAR)
is a (qualitative) association between a chemical substructure and the
potential of a chemical containing the substructure to exhibit a certain
biological effect. A quantitative structure-activity relationship (QSAR) is a
mathematical model that relates a quantitative measure of chemical structure
(e.g. a physicochemical property) to a physical property or to a biological
effect (e.g. a toxicological endpoint).
This approach attempts to identify and
quantify the physicochemical properties of a drug and to see whether any of
these properties has an effect on the drug's biological activity. If such a
relationship holds true, an equation can be drawn up which quantifies the
relationship and allows the medicinal chemist to say with some confidence that
the property has an important role in the distribution or mechanism of the
drug. By quantifying physicochemical properties, it should be possible to
calculate in advance what the biological activity of a novel analogue might be.
There are many practical purposes of a QSAR, and these techniques are utilized
widely in many situations. The purpose of in silico studies, therefore, includes
the following:
To predict biological activity and physico-chemical properties by rational means.
To comprehend and rationalize the mechanisms of action within a series of
compounds.
Given these aims, the reasons for wishing to develop these models include:
Reductions in the cost of product development (e.g. in the pharmaceutical,
pesticide, personal products, etc. areas).
Predictions could reduce the requirement for lengthy and expensive animal tests.
Reduction (and even, in some cases, replacement) of animal tests, thus reducing
animal use and obviously pain and discomfort to animals.
Other areas of promoting green and greener chemistry to increase efficiency and
eliminate waste by not following leads unlikely to be successful.
Graphs and equations
A range of compounds is synthesized in
order to vary one physicochemical property (e.g. log P) and to test how this
affects the biological activity (log 1/C). A graph is then drawn with the
biological activity on the y-axis and the physicochemical property on the
x-axis. It is necessary to draw the best possible line through the data points
on the graph. This is done by a procedure known as 'linear regression analysis
by the least squares method'. The best line will be the one closest to the data
points. To measure how close the data points are, vertical lines are drawn from
each point to the line. These verticals are measured and then squared in order
to eliminate the negative values. The squares are then added up to give a
total. The best line through the points will be the line where this total is a
minimum.
The equation of the straight line will be
y = k1x + k2, where k1 and k2 are constants. By varying k1 and k2, different
equations are obtained until the best line is found. This whole process can
be done speedily by a computer program. The significance of the equation is
given by a term known as the regression (or correlation) coefficient (r). This
coefficient can again be calculated by computer. For a perfect fit r² = 1. Good
fits generally have r² values of 0.95 or above.
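The least squares procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name and the data points are hypothetical, and r² is computed here as one minus the ratio of the summed squared verticals to the total variation in y.

```python
def least_squares_line(xs, ys):
    """Fit y = k1*x + k2 by ordinary least squares.

    Returns (k1, k2, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k1 = sxy / sxx                # slope that minimizes the squared verticals
    k2 = mean_y - k1 * mean_x     # intercept
    # Sum of the squared vertical distances from each point to the line
    ss_res = sum((y - (k1 * x + k2)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return k1, k2, r_squared


# Hypothetical data for five analogues (not measured values)
log_p_values = [0.5, 1.0, 1.5, 2.0, 2.5]
log_inv_c_values = [1.1, 1.6, 1.9, 2.6, 2.9]

k1, k2, r2 = least_squares_line(log_p_values, log_inv_c_values)
print(f"log(1/C) = {k1:.2f} log P + {k2:.2f}, r^2 = {r2:.3f}")
```

In practice a statistics library would normally be used for the fit; the hand-written version is shown only to make the "sum of squared verticals" idea explicit.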
There are many physical, structural and
chemical properties which have been studied by the QSAR approach, but the most
commonly studied are hydrophobic, electronic and steric properties. This is
because it is possible to quantify these effects relatively easily.
The hydrophobic character of a drug is
crucial to how easily it crosses cell membranes and may also be important
in receptor interactions. Changing substituents on a drug may well have
significant effects on its hydrophobic character and hence its biological
activity. Therefore it is important to have a means of predicting this.
Partition coefficient (P)
The hydrophobic character of a drug
can be measured experimentally by testing the drug’s relative distribution in
an octanol/water mixture. Hydrophobic molecules will prefer to dissolve in the
octanol layer of this two-phase system, whereas hydrophilic molecules will
prefer the aqueous layer. The relative distribution is known as the partition
coefficient and is obtained from the following equation:
P = Concentration of drug in octanol / Concentration of drug in aqueous solution
Hydrophobic compounds will have a high P value, whereas hydrophilic compounds
will have a low P value.
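As a small worked illustration of the definition above, the following Python sketch computes P and log P from the two measured concentrations. The function names and the concentrations are hypothetical; both concentrations are assumed to be in the same units.

```python
import math


def partition_coefficient(conc_octanol, conc_aqueous):
    """P = concentration in octanol / concentration in the aqueous layer."""
    if conc_octanol <= 0 or conc_aqueous <= 0:
        raise ValueError("concentrations must be positive")
    return conc_octanol / conc_aqueous


def log_partition_coefficient(conc_octanol, conc_aqueous):
    """log P, the form normally used in QSAR equations."""
    return math.log10(partition_coefficient(conc_octanol, conc_aqueous))


# Hypothetical measurements (same concentration units in both layers)
p_hydrophobic = partition_coefficient(50.0, 0.5)    # mostly in octanol
p_hydrophilic = partition_coefficient(0.5, 50.0)    # mostly in water
print(p_hydrophobic, log_partition_coefficient(50.0, 0.5))
print(p_hydrophilic, log_partition_coefficient(0.5, 50.0))
```

A hydrophobic compound gives P greater than 1 and a positive log P, while a hydrophilic compound gives P less than 1 and a negative log P.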
If a graph is drawn by plotting log (1/C) versus log P and a straight line is
obtained, this shows that there is a relationship between hydrophobicity and
biological activity. Such a line would have the following equation:
log (1/C) = k1 log P + k2
The electronic effect of various
substituents will clearly have an effect on a drug’s ionization or polarity.
This in turn may have an effect on how easily a drug can pass through cell
membranes or how strongly it can bind to a receptor.
Hammett Substitution Constant (σ)
The Hammett substitution constant (1940) is a measure of the electron-withdrawing
or electron-donating effects exerted by the substituents on the reaction center.
Electron-withdrawing groups (e.g. Cl, CN, CF3) stabilize the carboxylate ion,
giving a larger Kx and positive σ values.
Electron-donating groups (e.g. alkyl) shift the equilibrium to the left
(favouring the protonated acid), giving a lower Kx and negative σ values.
The Hammett constant takes into account both resonance and inductive effects;
thus, the value depends on whether the substituent is para or meta to the
reaction center. The ortho position is not measured because of steric effects.
In some positions only inductive effects operate, whereas in others both
resonance and inductive effects play a part. Electronic substituent constants
are also available for aliphatic groups.
[Figure: resonance forms that stabilize the negatively charged carboxylate]
Because they are derived from equilibrium constants, σ values can be related to
the free energy of ionization via the van 't Hoff relationship (in this case σ
would correspond to log K for the equilibrium). This is why the Hammett
relationship is also referred to as a linear free energy relationship (LFER).
Uses: there is only one known example where Hammett constants alone effectively
predict activity (insecticides, diethyl phenyl phosphates; these drugs do not
have to pass into or through a cell membrane to have activity):
log (1/C) = 2.282 σ - 0.348
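The following Python sketch shows how a σ value could be derived and then used in the diethyl phenyl phosphate equation above. It assumes the usual definition σ = log (Kx/KH), where Kx and KH are the ionization constants of the substituted and unsubstituted benzoic acid; that definition is not stated in the text, and the pKa values below are illustrative only, not authoritative data.

```python
import math


def hammett_sigma(ka_substituted, ka_unsubstituted):
    """sigma = log10(Kx / KH): positive for electron-withdrawing groups."""
    return math.log10(ka_substituted / ka_unsubstituted)


def sigma_from_pka(pka_substituted, pka_unsubstituted):
    """Same definition expressed with pKa values: sigma = pKa(H) - pKa(X)."""
    return pka_unsubstituted - pka_substituted


def predicted_log_inv_c(sigma):
    """Equation quoted in the text for diethyl phenyl phosphates."""
    return 2.282 * sigma - 0.348


# Illustrative pKa values (assumptions for this sketch)
PKA_BENZOIC = 4.20
sigma_withdrawing = sigma_from_pka(3.44, PKA_BENZOIC)  # stronger acid
sigma_donating = sigma_from_pka(4.36, PKA_BENZOIC)     # weaker acid

for label, sigma in [("withdrawing", sigma_withdrawing),
                     ("donating", sigma_donating)]:
    print(label, round(sigma, 2), round(predicted_log_inv_c(sigma), 3))
```

Because the coefficient of σ in the equation is positive, the sketch predicts higher activity for electron-withdrawing substituents than for electron-donating ones.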
The steric effect is much harder to quantify. Examples are:
Taft's steric factor (Es) (~1956): an experimental value based on rate constants.
Molar refractivity (MR): a measure of the volume occupied by an atom or group;
the equation includes the MW, density, and the index of refraction.
Verloop steric parameter: a computer program uses bond angles, van der Waals
radii, and bond lengths.
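Of the three parameters listed above, molar refractivity is the easiest to illustrate numerically. The sketch below assumes the Lorentz-Lorenz form, MR = (n² - 1)/(n² + 2) × MW/d, which is not written out in the text; the input values are approximate figures for benzene used only as an example.

```python
def molar_refractivity(refractive_index, mol_weight, density):
    """Lorentz-Lorenz molar refractivity in cm^3/mol.

    refractive_index: dimensionless index of refraction (n)
    mol_weight: molecular weight in g/mol
    density: density in g/cm^3
    """
    n_sq = refractive_index ** 2
    return (n_sq - 1.0) / (n_sq + 2.0) * mol_weight / density


# Approximate values for benzene (illustrative inputs)
mr_benzene = molar_refractivity(1.501, 78.11, 0.8765)
print(f"MR = {mr_benzene:.1f} cm^3/mol")
```

The MW/d term is a molar volume, and the refractive index term corrects it for how easily the substituent is polarized.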
Hansch proposed that drug action could be divided into 2 stages: 1) transport
and 2) binding.
log (1/C) = k1 log P - k2 (log P)² + k3 σ + k4 Es + k5
Hansch analysis looks at the size and sign of each component of the equation.
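A minimal Python sketch of how such an equation might be evaluated follows. It assumes the Hansch equation has the form log (1/C) = k1 log P - k2 (log P)² + k3 σ + k4 Es + k5; all coefficient and descriptor values are hypothetical and chosen only to show the arithmetic. With a negative squared term the activity passes through a maximum at log P = k1 / (2 k2).

```python
def hansch_log_inv_c(log_p, sigma, es, k1, k2, k3, k4, k5):
    """Evaluate the assumed Hansch form:
    log(1/C) = k1*logP - k2*logP^2 + k3*sigma + k4*Es + k5
    """
    return k1 * log_p - k2 * log_p ** 2 + k3 * sigma + k4 * es + k5


def optimum_log_p(k1, k2):
    """log P at which the parabolic hydrophobic term is at its maximum."""
    return k1 / (2.0 * k2)


# Hypothetical coefficients (not taken from any published equation)
coeffs = dict(k1=1.2, k2=0.3, k3=0.9, k4=0.4, k5=1.0)

best_log_p = optimum_log_p(coeffs["k1"], coeffs["k2"])
activity = hansch_log_inv_c(best_log_p, sigma=0.23, es=-0.5, **coeffs)
print(best_log_p, round(activity, 3))
```

Reading the signs: here a positive k3 means electron-withdrawing substituents raise the predicted activity, and a positive k4 combined with a negative Es means bulkier groups lower it.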
values of r 0. Pa and Pi are the estimates of probability for the compound to be
active and inactive respectively for each type of activity from the biological
activity spectrum. Their values vary from 0.000 to 1.000. It is reasonable to
expect that the compound may reveal only those types of activity for which
Pa > Pi, and so only these are put into the biological activity spectrum.
If Pa > 0.7 the compound
is very likely to reveal this activity in experiments, but in this case the
chance that the compound is an analogue of a known pharmaceutical agent is
also high.
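The selection rules described above (keep activities with Pa > Pi, and treat Pa > 0.7 as very likely) can be sketched as a simple filter. The function name, the data layout, and the activity names and probabilities are all hypothetical; this is not the interface of any particular prediction program.

```python
def activity_spectrum(predictions, likely_threshold=0.7):
    """Keep only activities with Pa > Pi.

    predictions: dict mapping activity name -> (Pa, Pi)
    Returns a list of (name, Pa, Pi, very_likely) sorted by descending Pa.
    """
    spectrum = []
    for name, (p_a, p_i) in predictions.items():
        if not (0.0 <= p_a <= 1.0 and 0.0 <= p_i <= 1.0):
            raise ValueError(f"probabilities out of range for {name!r}")
        if p_a > p_i:
            spectrum.append((name, p_a, p_i, p_a > likely_threshold))
    spectrum.sort(key=lambda row: row[1], reverse=True)
    return spectrum


# Hypothetical predictions for one compound
example = {
    "activity A": (0.82, 0.01),
    "activity B": (0.45, 0.10),
    "activity C": (0.20, 0.35),  # Pa < Pi, so it is excluded
}

for name, p_a, p_i, very_likely in activity_spectrum(example):
    print(name, p_a, p_i, "very likely" if very_likely else "possible")
```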