Natural Product-like Compound Libraries

Natural products have been used to treat disease since ancient times and remain an inspiration for modern drug discovery and development. Approximately 40 % of the drugs approved by the FDA in recent decades were natural products, their derivatives, or synthetic mimetics related to natural products [1].

As many as 20 % of natural products lie in chemical space beyond Lipinski’s “Rule of Five” (Ro5) and are typically discarded during lead optimization. Nevertheless, many such natural product-based drugs still show potential to cure life-threatening diseases (for example, they are applied as HIV protease inhibitors, anticancer agents, and heart stimulators) [2-3]. Compared with typical synthetic drug-like small molecules, natural products tend to have more sp3-hybridized bridgehead atoms, more chiral centers, a higher oxygen content but a lower nitrogen content, and more aliphatic than aromatic rings [4-5].

The remarkable structural diversity and drug-likeness of the molecular scaffolds identified in natural compounds provide a basis for designing novel natural product-derived compound libraries within chemical space that is attractive for drug discovery and lead optimization (Fig. 1) [6-7].

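Life Chemicals has developed a proprietary collection of dedicated Screening Libraries comprising over 7,700 synthetic compounds similar to natural ones for modern drug discovery, using the following two approaches:

  1. Similarity search (about 4,400 compounds)
  2. Chemoinformatics and substructure search (about 3,300 compounds)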

These screening compound collections have already been recognized as extremely useful tools for high-throughput screening (HTS) and high-content screening (HCS) programs.

Fig. 1. Small-molecule approved drugs 01JAN81 to 30SEP19; n = 1394 (Picture source: J. Nat. Prod. 2020, 83, 3, 770-803).

Natural Product-like Compound Library by Similarity Search

This Screening Library was designed by 2D fingerprint similarity filtering against natural compound scaffolds. The commercial databases of Specnet, TARGETMOL, SELECKCHEM, ICC, AnalytiCon Discovery, and TimTec were used as reference sets. Applying a Tanimoto similarity cut-off of 85 % yielded about 4,400 structurally diverse compounds, available from the Life Chemicals HTS Compound Collection.
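The similarity filter described here can be illustrated with a minimal sketch. It assumes each 2D fingerprint is represented as a set of on-bit indices; the compound names, the toy fingerprints, and the helper functions `tanimoto` and `select_similar` are hypothetical and are not taken from the Life Chemicals workflow.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def select_similar(candidates: dict, references: list, cutoff: float = 0.85) -> list:
    """Keep candidates whose best similarity to any reference meets the cut-off."""
    selected = []
    for name, fingerprint in candidates.items():
        best = max((tanimoto(fingerprint, ref) for ref in references), default=0.0)
        if best >= cutoff:
            selected.append(name)
    return selected


# Hypothetical toy fingerprints (on-bit indices), not real compound data.
references = [frozenset(range(0, 20))]
candidates = {
    "cmpd_A": frozenset(range(0, 18)),  # 18 shared bits / 20 in the union = 0.90
    "cmpd_B": frozenset(range(5, 25)),  # 15 shared bits / 25 in the union = 0.60
}

print(select_similar(candidates, references))
```

With the 0.85 cut-off, only the candidate that shares at least 85 % of the combined on-bits with a reference fingerprint is retained.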

Natural Product-like Compound Library by Chemoinformatics and Substructure Search

The Life Chemicals HTS Compound Collection was analyzed by two different methods:

  1. Chemical descriptor calculation (7,300 compounds selected)
  2. Natural-likeness scoring (9,700 compounds selected).

Overlapping the two resulting small-molecule screening compound sets gave about 3,300 natural product-like molecules that showed excellent characteristics under both approaches [6,8].
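The overlap step amounts to a set intersection of the compound identifiers selected by each method. The sketch below assumes identifiers are plain strings; the IDs and the helper `consensus` are hypothetical.

```python
def consensus(*selections: set) -> set:
    """Return the identifiers present in every selection."""
    if not selections:
        return set()
    return set.intersection(*map(set, selections))


# Hypothetical compound IDs, for illustration only.
descriptor_hits = {"F0001", "F0002", "F0003", "F0004"}
scoring_hits = {"F0003", "F0004", "F0005"}

overlap = consensus(descriptor_hits, scoring_hits)
print(sorted(overlap))
```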

Chemical descriptor-based selection method

The selection was made in two steps:

  1. Substructure search for natural-like scaffolds and the most relevant groups in the Life Chemicals HTS Compound Collection (Fig. 2) [9]:

     coumarins, flavonoids, aurones, alkaloids (aloperine, cytisine, lupinine, colchicine), bile acids, aryl benzothiazole, arylpiperazine, arylpiperidine, benzofuran, benzoxazole, benzodiazepine, benzothiophene, benzylpiperidine, indole, indoline, indolizine, isoquinoline, purine, quinazolinone, quinoline, quinoxaline, steroid, tetrahydroisoquinoline, tetrahydroquinoline

     About 60,100 samples were selected from about 509,970 compounds.

  2. Validation of the method and calculation of the parameters listed in Table 1 for Pure Natural Products (PNP, MNP), NPs and Derivatives/Analogs (SNP), NP-based Combinatorial Compounds (NatDiv), and LC Derivatives/Analogs (LC).

Fig. 2. Compound distribution by the presence of the natural-like scaffolds in their chemical structure within the Natural Product-like Compound Library.

Natural-likeness scoring

Evaluating the natural product-likeness of a compound is an important asset in the selection and optimization of natural product-like drugs and synthetic bioactive compounds. Natural-likeness scoring, based on summing the frequencies of certain molecular fragments among natural products (NPs) and small molecules (SMs), was performed with the natural product-likeness calculator [10-11]. The results, presented in Fig. 3 and Table 1, show the distribution of real natural products and natural-like compounds by score.

Fig. 3. Natural product-likeness scorer. Distribution of real natural products.
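A fragment-frequency score of this kind can be sketched as follows. This is not the calculator cited above: it assumes one common formulation in which each fragment contributes the logarithm of the ratio of its frequency among NPs to its frequency among SMs, and the molecule score is the sum of these contributions. The fragment names, the counts, and the pseudocount used to avoid division by zero are all hypothetical.

```python
import math


def fragment_weight(np_count, sm_count, np_total, sm_total, pseudo=1.0):
    """Log ratio of a fragment's frequency in NPs to its frequency in SMs.

    The pseudocount is an assumption of this sketch; it keeps the ratio
    finite when a fragment is absent from one of the two sets.
    """
    np_freq = (np_count + pseudo) / (np_total + pseudo)
    sm_freq = (sm_count + pseudo) / (sm_total + pseudo)
    return math.log(np_freq / sm_freq)


def np_likeness(fragments, counts, np_total, sm_total):
    """Sum the weights of a molecule's fragments; unknown fragments are skipped."""
    return sum(
        fragment_weight(*counts[frag], np_total, sm_total)
        for frag in fragments
        if frag in counts
    )


# Hypothetical fragment statistics: (occurrences in NPs, occurrences in SMs).
counts = {
    "lactone": (300, 30),
    "sulfonamide": (5, 200),
    "phenyl": (400, 400),
}
np_total = sm_total = 1000

natural_like = np_likeness(["lactone", "phenyl"], counts, np_total, sm_total)
synthetic_like = np_likeness(["sulfonamide", "phenyl"], counts, np_total, sm_total)
print(round(natural_like, 2), round(synthetic_like, 2))
```

Under these assumptions, a positive score indicates fragments that are enriched among natural products, and a negative score indicates fragments that are more typical of synthetic small molecules.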

Table 1. Mean values of the descriptors that play the most important role in the characterization of natural products.

Descriptor                PNP     MNP     SNP     NatDiv  LC
MW                        393.9   503.6   409.2   441.3   389.2
HAC                       28.2    34.6    29.1    31.1    27.7
ClogP                     2.3     3.9     3.7     2.1     3.6
H-donors                  2.7     2.6     1.4     2.3     1.4
H-acceptors               6.6     7.4     6.4     8.0     4.2
TPSA                      98.9    108.9   83.2    104.7   79.8
Ring count                3.6     2.9     3.5     4.0     3.9
Aromatic rings            5.1     3.5     11.8    9.5     15.3
Rotatable bonds           5.2     11.5    6.1     5.3     5.0
Number of N atoms         0.7     1.2     2.1     3.6     2.6
Number of O atoms         5.9     6.1     4.3     4.4     3.1
Number of chiral atoms    5.5     6.3     1.4     2.3     1.3
LipViol≥2                 18%     30%     10%     8%      2%
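The LipViol≥2 row can be read as the share of compounds that break at least two Ro5 criteria; that reading is an assumption of this sketch, as are the `Descriptors` class and the toy values. The thresholds are the standard Rule of Five limits (MW above 500, logP above 5, more than 5 H-bond donors, more than 10 H-bond acceptors).

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mw: float
    clogp: float
    h_donors: int
    h_acceptors: int


def ro5_violations(d: Descriptors) -> int:
    """Count how many of the four Rule of Five criteria a compound violates."""
    return sum([d.mw > 500, d.clogp > 5, d.h_donors > 5, d.h_acceptors > 10])


def fraction_with_violations(compounds: list, threshold: int = 2) -> float:
    """Share of compounds with at least `threshold` Ro5 violations."""
    if not compounds:
        return 0.0
    flagged = sum(1 for d in compounds if ro5_violations(d) >= threshold)
    return flagged / len(compounds)


# Hypothetical descriptor values, not taken from Table 1.
library = [
    Descriptors(mw=389.2, clogp=3.6, h_donors=1, h_acceptors=4),   # 0 violations
    Descriptors(mw=812.0, clogp=2.1, h_donors=7, h_acceptors=14),  # 3 violations
    Descriptors(mw=520.5, clogp=4.0, h_donors=2, h_acceptors=8),   # 1 violation
    Descriptors(mw=441.3, clogp=2.1, h_donors=2, h_acceptors=8),   # 0 violations
]

print(f"{fraction_with_violations(library):.0%}")
```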


