Measuring Structural Diversity for Screening Compound Libraries

27 July 2022
Andrew Golub
Group Leader, Molecular Design

Chemical diversity is an important consideration when prioritizing screening compound libraries or sub-libraries for experimental evaluation. In hit identification projects, it is preferable to select collections of chemically diverse structures to increase the probability of finding new scaffolds that can become leads or prototypes for specific biological targets [1]. Chemoinformatics tools for evaluating diversity also guide the creation of new small molecules, mainly in diversity-oriented synthesis (DOS) campaigns [2,3].

Different methods can be employed to assess the diversity and complexity of molecules in a chemical library, primarily depending on the data under investigation and the study goals. The molecular representation is another critical aspect of diversity and complexity analysis, in addition to the metrics used [4,5]. Molecular descriptors (including physicochemical properties and molecular fingerprints) and chemical scaffolds are two of the most common ways to represent molecules in chemoinformatics applications [6].

The methods routinely used to determine chemical diversity are summarized in Table 1.

Table 1. Most commonly used methods of molecular diversity analysis [7].

Fingerprint-based methods
  • Small molecules are compared by the presence or absence of a set of substructural or fingerprint features derived from molecular graph representations.
  • Atom connectivity is taken into account; the approach is widely used in virtual screening.

Shape-based methods
  • Information on molecular conformational features (internal distances or external molecular properties) is encoded and then used to compare molecules in terms of these properties.
  • Differences between molecules are measured by molecular shape, which makes the approach scaffold-independent.

Pharmacophore-based methods
  • Molecular similarity is evaluated by the presence or absence of pharmacophoric features, which are often represented as fingerprints.
  • The analysis focuses on pharmacophoric points rather than the entire molecular surface; it evaluates the presence or absence of predefined substructural features but not the connectivity among them.

Scaffold-based methods (Fig. 1)
  • The presence of various molecular skeletons is considered, without information about the whole molecule.
  • Diversity is assessed by scaffold count and frequency analysis, using scaffold trees and Shannon entropy.

Bioactivity-based methods
  • In silico bioactivity profiles are applied, mapping chemical structural space into predicted bioactivity against numerous protein targets.
  • Compound diversity selection tasks are often solved by evaluating diversity in bioactivity space.
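The scaffold count and frequency analysis with Shannon entropy listed in Table 1 can be illustrated with a minimal sketch. It assumes each compound has already been assigned a scaffold identifier; the scaffold strings below are made-up placeholders, and the function names `shannon_entropy` and `scaled_shannon_entropy` are hypothetical. The scaled form divides by log2 of the number of distinct scaffolds, so a value of 1.0 means compounds are spread evenly across scaffolds, while values near 0 mean a few scaffolds dominate.

```python
import math
from collections import Counter


def shannon_entropy(scaffolds):
    """Shannon entropy (in bits) of the scaffold frequency distribution."""
    counts = Counter(scaffolds)
    total = sum(counts.values())
    entropy = 0.0
    for count in counts.values():
        p = count / total  # fraction of compounds carrying this scaffold
        entropy -= p * math.log2(p)
    return entropy


def scaled_shannon_entropy(scaffolds):
    """Entropy divided by its maximum, log2(number of distinct scaffolds)."""
    n_distinct = len(set(scaffolds))
    if n_distinct < 2:
        return 0.0  # zero or one scaffold carries no diversity
    return shannon_entropy(scaffolds) / math.log2(n_distinct)


# Hypothetical scaffold assignments for two small libraries (illustration only).
even_library = ["c1ccccc1", "c1ccncc1", "C1CCNCC1", "c1ccoc1"]
skewed_library = ["c1ccccc1"] * 9 + ["c1ccncc1"]

print(scaled_shannon_entropy(even_library))    # evenly spread scaffolds
print(scaled_shannon_entropy(skewed_library))  # one dominant scaffold
```

Two libraries with the same number of distinct scaffolds can differ sharply in this measure, which is why frequency analysis is reported alongside plain scaffold counts.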

Fig. 1. Rose maps for (a) the total numbers of the Scaffold Tree for the 12 datasets (including Life Chemicals) and (b) the non-duplicated numbers of the Scaffold Tree for the same 12 datasets. Source:

At Life Chemicals, we successfully use various approaches to analyze compound diversity and design compound libraries. These include fingerprint-based methods (employing circular fingerprints), scaffold diversity assessment, and compound clustering.
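As a rough illustration of fingerprint-based diversity assessment, the sketch below scores a library by its mean pairwise distance. It is not the workflow used at Life Chemicals: the Tanimoto coefficient is assumed here as the similarity measure because it is a common choice for fingerprints, the function names are hypothetical, and each fingerprint is represented simply as a set of made-up feature indices. In practice, a cheminformatics toolkit would generate the circular fingerprints.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen here: two empty fingerprints are identical
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def mean_pairwise_distance(fingerprints):
    """Average of (1 - Tanimoto) over all pairs; higher means more diverse."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


# Made-up fingerprints: each set lists the indices of features that are present.
library = [
    {1, 4, 7, 9},
    {1, 4, 7, 12},
    {2, 5, 8, 20},
]

print(tanimoto(library[0], library[1]))   # first two compounds share 3 of 5 features
print(mean_pairwise_distance(library))    # library-level diversity score
```

The all-pairs average is quadratic in library size, so for large collections it would typically be estimated on a random sample.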

Our selections include, in particular, structural diversity-oriented screening libraries:

  • Pre-plated Diversity Sets: 50,000 unique drug-like screening compounds selected by dissimilarity search from the Life Chemicals HTS Compound Collection of newly synthesized molecules.
  • Diversity subsets of Fragment Libraries: small libraries of structurally diverse fragments that conveniently provide the most promising drug-like screening compounds for fragment-based drug discovery.
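The text does not specify which dissimilarity search algorithm is used for these sets. One common greedy approach is MaxMin selection, sketched below as an assumption rather than a description of the actual procedure. Fingerprints are again made-up sets of feature indices, and `maxmin_pick` and `tanimoto_distance` are hypothetical names.

```python
def tanimoto_distance(fp_a, fp_b):
    """1 - Tanimoto similarity for fingerprints stored as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0
    return 1.0 - len(fp_a & fp_b) / union


def maxmin_pick(fingerprints, n_pick, seed_index=0):
    """Greedy MaxMin: repeatedly add the compound farthest from those already picked."""
    if not fingerprints or n_pick <= 0:
        return []
    picked = [seed_index]
    # Distance from every compound to its nearest picked compound so far.
    nearest = [tanimoto_distance(fp, fingerprints[seed_index]) for fp in fingerprints]
    while len(picked) < min(n_pick, len(fingerprints)):
        candidate = max(
            (i for i in range(len(fingerprints)) if i not in picked),
            key=lambda i: nearest[i],
        )
        picked.append(candidate)
        for i, fp in enumerate(fingerprints):
            nearest[i] = min(nearest[i], tanimoto_distance(fp, fingerprints[candidate]))
    return picked


# Made-up pool: indices 0 and 1 are near-duplicates, as are 2 and 3.
pool = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {10, 11, 12, 13},
    {10, 11, 12, 14},
    {20, 21, 22, 23},
]

print(maxmin_pick(pool, 3))  # → [0, 2, 4]
```

The selection skips the near-duplicates at indices 1 and 3 and returns one representative from each distinct group. The result depends on the seed compound, so a real selection would fix or randomize it deliberately.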

We also offer Computational Chemistry Services for any specific target provided by the customer.

Further reading: Diversity-based Screening of Compound Libraries in Drug Discovery.

Please contact us for any additional information and price quotations.

Visit our website for a detailed product description, or place an order online on our e-commerce website.

Download SD files with compound structures directly in our Downloads section.

Custom compound selection based on specific parameters can be performed on request, with competitive pricing and the most convenient terms provided.


  1. J.L. Medina-Franco, Chemoinformatic Characterization of the Chemical Space and Molecular Diversity of Compound Libraries, in: Diversity-Oriented Synthesis, 2013, pp. 325-352.
  2. E. Lenci, A. Guarna, A. Trabocchi, Diversity-oriented synthesis as a tool for chemical genetics, Molecules 19 (2014) 16506-16528.
  3. S.L. Schreiber, Target-oriented and diversity-oriented organic synthesis in drug discovery, Science 287 (2000) 1964-1969.
  4. J.L. Medina-Franco, G.M. Maggiora, Molecular similarity analysis, in: J. Bajorath (Ed.), Chemoinformatics for Drug Discovery, John Wiley & Sons, Inc., Hoboken, NJ, 2013, pp. 343-399.
  5. R. Sheridan, Why do we need so many chemical similarity search methods? Drug Discov. Today 7 (2002) 903-911.
  6. N. Singh, R. Guha, M.A. Giulianotti, C. Pinilla, R.A. Houghten, J.L. Medina-Franco, Chemoinformatic analysis of combinatorial libraries, drugs, natural products, and molecular libraries small molecule repository, J. Chem. Inf. Model. 49 (2009) 1010-1024.
  7. A. Koutsoukas, S. Paricharak, W.R. Galloway, D.R. Spring, A.P. IJzerman, R.C. Glen, D. Marcus, A. Bender, How diverse are diversity assessment methods? A comparative analysis and benchmarking of molecular descriptor space, J. Chem. Inf. Model. 54 (2014) 230-242.
