Molecular descriptors

From DrugPedia: A Wikipedia for Drug discovery


Jagat Chauhan (Talk | contribs)

Revision as of 04:04, 18 September 2008

A molecular descriptor is any molecular property used to characterize a molecule, for example to search a database or to calculate another molecular property.

"The molecular descriptor is the final result of a logic and mathematical procedure which transforms chemical information encoded within a symbolic representation of a molecule into a useful number or the result of some standardized experiment."

INTRODUCTION

Biologically active substances interact, in most cases, with biomolecules, triggering specific molecular mechanisms such as activation of an enzyme cascade or opening of an ion channel, ultimately leading to a certain biological response. Quantitative structure-activity relationships (QSARs) correlate this response with molecular properties of the compounds of interest. Because the response depends both on the concentration of the active substance at the site of action and on the strength of its interaction with the biological macromolecule, both of these aspects must be modeled quantitatively by QSAR. By now, a variety of descriptors of molecular properties have been developed. Computational approaches to lipophilicity are nearly as diverse as the QSAR methods themselves. Electronic properties, expressed as point charges or molecular electrostatic potentials, can be evaluated by quantum-chemical ab initio methods for molecules of up to 50 atoms. With semiempirical methods such as AM1 or PM3, these properties can be calculated even for larger systems. Steric descriptors range from molecular surface and volume, through connectivity and topological indices, to Verloop parameters.

The descriptors (independent variables) are correlated with the biological activity (dependent variable) by means of statistical methods. Multiple linear regression (MLR) is used most commonly, but partial least squares (PLS) and neural networks are also applied. Some QSAR approaches employ genetic algorithms to identify the relevant descriptors: a population of models is created and, step by step, models with a better "fitness score" (i.e. with better predictivity) are generated by "genetic operations" such as cross-over, point mutation or selection. In "classical" QSAR, both 2D (e.g. topological indices, indicator variables) and 3D (surface, volume, electronic properties) descriptors are correlated with the biological activity.
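As an illustration of correlating descriptors with activity by linear regression, the following is a minimal sketch in plain Python. The function name fit_mlr and the toy numbers are assumptions made for this example, not data from any real QSAR study; the activities were generated from the relation activity = 1.0 + 0.5*d1 - 0.2*d2 so that the fit can be checked by eye.

```python
def fit_mlr(descriptors, activities):
    """Ordinary least squares fit of activity = b0 + b1*d1 + b2*d2 + ...

    Solves the normal equations (A^T A) b = A^T y by Gauss-Jordan
    elimination. Returns [b0, b1, ..., bp] with b0 the intercept.
    """
    # Prepend a constant 1.0 to every row so that b0 acts as the intercept.
    rows = [[1.0] + [float(v) for v in row] for row in descriptors]
    p = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * y for r, y in zip(rows, activities))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, p), key=lambda k: abs(m[k][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("descriptors are collinear; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(p):
            if k != col:
                factor = m[k][col] / m[col][col]
                m[k] = [a - factor * b for a, b in zip(m[k], m[col])]
    # The matrix is now diagonal, so each coefficient is a simple ratio.
    return [m[i][p] / m[i][i] for i in range(p)]


# Synthetic toy data: two hypothetical descriptors per compound.
toy_descriptors = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
toy_activities = [1.1, 1.8, 1.7, 2.4, 2.5]

coefficients = fit_mlr(toy_descriptors, toy_activities)
print([round(c, 3) for c in coefficients])
```

In practice a statistics library would normally be used, and the model would be validated on compounds not used for fitting; the sketch only shows the descriptor-to-activity mapping itself.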

  • Molecular descriptors are numerical values that characterize properties of molecules
  • Examples:

o Physicochemical properties (empirical)
o Values from algorithms, such as 2D fingerprints

  • Vary in complexity of encoded information and in compute time

Descriptors for Large Data Sets

  • Descriptors representing properties of complete molecules

o Examples: LogP, Molar Refractivity

  • Descriptors calculated from 2D graphs

o Examples: Topological indices, 2D fingerprints

  • Descriptors requiring 3D representations

o Example: Pharmacophore descriptors
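To make the idea of a descriptor calculated from a 2D graph concrete, here is a small sketch that computes the Wiener index, one well-known topological index, defined as the sum of shortest-path bond counts over all pairs of atoms. The choice of this particular index, the function name wiener_index, and the adjacency-list encoding of the hydrogen-suppressed carbon skeletons are assumptions made for illustration.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path distances (in bonds) over all unordered atom pairs."""
    total = 0
    for start in adjacency:
        # Breadth-first search gives bond-count distances from `start`.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in distance:
                    distance[neighbour] = distance[atom] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    # Every pair was counted once from each end.
    return total // 2


# Carbon skeletons only (hydrogens suppressed), atoms numbered from 0.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

The two isomers have the same atom count but different index values, which shows how a graph-based descriptor can encode branching that simple counts miss.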

DESCRIPTORS CALCULATED FROM 2D STRUCTURES

  • Simple counts of features

o Lipinski Rule of Five (H bonds, MW, etc.)
o Number of ring systems
o Number of rotatable bonds

  • Not likely to discriminate sufficiently when used alone
  • Combined with other descriptors for best effect
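The Lipinski Rule of Five mentioned above can be sketched as a simple count of threshold violations. The sketch assumes the four descriptor values have already been computed elsewhere; the function name lipinski_violations and the example compound values are hypothetical. The thresholds are the commonly quoted ones: molecular weight at most 500, LogP at most 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count how many of the four Rule of Five thresholds are exceeded."""
    checks = [
        mol_weight > 500,   # molecular weight in daltons
        logp > 5,           # octanol/water partition coefficient (log10)
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ]
    return sum(checks)


# Hypothetical descriptor values for two made-up compounds.
small_compound = lipinski_violations(mol_weight=320.4, logp=2.1, h_donors=2, h_acceptors=5)
large_compound = lipinski_violations(mol_weight=712.9, logp=6.3, h_donors=3, h_acceptors=12)

print(small_compound)  # 0
print(large_compound)  # 3
```

A count like this is a coarse filter, consistent with the point above that simple feature counts are best combined with other descriptors.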

Physicochemical Properties

  • Hydrophobicity

o LogP – the logarithm of the partition coefficient between n-octanol and water

  • ClogP (Leo and Hansch) – based on a small set of values from a small set of simple molecules
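The LogP definition given above can be written out as a short calculation: the base-10 logarithm of the ratio of the solute concentration in n-octanol to that in water at equilibrium. The function name log_p and the concentrations are assumptions for illustration; this is the experimental definition, not the ClogP estimation method.

```python
import math


def log_p(conc_octanol, conc_water):
    """log10 of the n-octanol/water partition coefficient.

    Both concentrations must be in the same units and greater than zero.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical equilibrium concentrations (same units in both phases).
hydrophobic = log_p(conc_octanol=100.0, conc_water=1.0)
hydrophilic = log_p(conc_octanol=1.0, conc_water=10.0)

print(round(hydrophobic, 2))  # 2.0
print(round(hydrophilic, 2))  # -1.0
```

Positive values indicate a preference for the octanol phase (more hydrophobic), negative values a preference for water.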