Protein-protein interactions (PPIs) are involved in many important biological processes in living organisms, and their regulation can be crucial for the treatment of numerous diseases. PPIs are therefore an attractive emerging class of molecular targets that play a critical role in the progression of many disease states [1,2]. Inhibiting PPIs with small molecules is an important diagnostic and therapeutic strategy that may lead to prolonged remissions and even curative therapies for a number of diseases.
Keeping pace with the growing interest in PPIs, Life Chemicals has prepared the following libraries of potential PPI inhibitors using ligand-based approaches:
PPI Focused Library by Machine Learning (Decision Tree) Method
To predict compounds that can affect PPIs, a machine learning method (decision tree, DT) was used. This method is recognized as a useful tool for identifying PPI inhibitors. The DT model was built with a cross-validation protocol that balances enrichment, sensitivity, and specificity on the learning data set.
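The three quantities balanced during cross-validation can be computed from a confusion matrix for each fold. The sketch below is illustrative only: the `Confusion` class, the example counts, and the definition of enrichment as precision divided by the fraction of actives in the set are assumptions, not details taken from the library design itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts for one cross-validation fold (hypothetical container)."""
    tp: int  # PPI inhibitors predicted as PPI inhibitors
    fp: int  # non-PPI compounds predicted as PPI inhibitors
    tn: int  # non-PPI compounds predicted as non-PPI
    fn: int  # PPI inhibitors predicted as non-PPI


def sensitivity(c: Confusion) -> float:
    """Fraction of true PPI inhibitors that the model recovers."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    """Fraction of non-PPI compounds that the model correctly rejects."""
    return c.tn / (c.tn + c.fp)


def enrichment(c: Confusion) -> float:
    """Precision relative to the base rate of actives (assumed definition)."""
    precision = c.tp / (c.tp + c.fp)
    base_rate = (c.tp + c.fn) / (c.tp + c.fp + c.tn + c.fn)
    return precision / base_rate


# Made-up counts for a single fold, for illustration only.
fold = Confusion(tp=30, fp=20, tn=930, fn=20)
print(round(sensitivity(fold), 3), round(specificity(fold), 3), round(enrichment(fold), 1))
```

A model that maximizes only one of these numbers is easy to build (for example, predicting every compound as active gives perfect sensitivity), which is why the three are considered together.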
Comparison of the physicochemical features of PPI and non-PPI inhibitors revealed several descriptors whose values fall within a characteristic range for PPI inhibitors:
- RDF 070m (≤ 3.31) - a shape-based descriptor that defines a radial distribution function of an ensemble of atoms in a spherical volume with a radius of 7 Å
- UI (> 4.13) - an unsaturation index directly linked to the number of multiple bonds, including double, triple and aromatic bonds
- SHP2 (≤ 0.30) - an average shape profile index of order 2 deduced from the distance distribution of the geometry matrix
- Mor11m (> -0.1) - a descriptor calculated by summing atom weights viewed by a different angular scattering function (signal 11, weighted by atomic masses)
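The descriptor ranges above can be written down as a small rule table. This is a minimal sketch that assumes the descriptor values have already been calculated elsewhere and that all four conditions must hold at once; the actual decision tree may combine them differently. The dictionary keys and the example values are hypothetical.

```python
import operator

# Thresholds copied from the list above; keys are hypothetical column names.
DESCRIPTOR_RULES = {
    "RDF070m": (operator.le, 3.31),
    "UI": (operator.gt, 4.13),
    "SHP2": (operator.le, 0.30),
    "Mor11m": (operator.gt, -0.1),
}


def passes_descriptor_rules(descriptors: dict) -> bool:
    """Return True if every descriptor lies in its PPI-like range.

    Assumption: the four ranges are combined as a simple conjunction.
    """
    return all(
        compare(descriptors[name], limit)
        for name, (compare, limit) in DESCRIPTOR_RULES.items()
    )


# Made-up descriptor values for two hypothetical compounds.
candidate = {"RDF070m": 2.8, "UI": 4.5, "SHP2": 0.25, "Mor11m": 0.4}
rejected = dict(candidate, UI=3.9)  # unsaturation index below the cut-off

print(passes_descriptor_rules(candidate), passes_descriptor_rules(rejected))
```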
Additional filters were applied to the entire Life Chemicals HTS Compound Collection:
- ClogP = 1.5 – 4.5
- TPSA = 75 – 120
- MW ≤ 475
- HBD = 0 – 4
- HBA = 4 – 9
- PAINS filters
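As an illustration, the numeric property windows can be checked with a short function that reports which properties fall outside their range. The sketch assumes precomputed property values, inclusive bounds, and hypothetical key names; the PAINS substructure filter is not modelled here because it requires a cheminformatics toolkit.

```python
# (low, high) windows copied from the list above; None means "no bound".
PROPERTY_WINDOWS = {
    "clogp": (1.5, 4.5),
    "tpsa": (75.0, 120.0),
    "mw": (None, 475.0),
    "hbd": (0, 4),
    "hba": (4, 9),
}


def failed_properties(props: dict) -> list:
    """Return the names of properties outside their window (empty list = pass).

    Assumption: both ends of each window are inclusive.
    """
    failures = []
    for name, (low, high) in PROPERTY_WINDOWS.items():
        value = props[name]
        too_low = low is not None and value < low
        too_high = high is not None and value > high
        if too_low or too_high:
            failures.append(name)
    return failures


# Made-up property values for two hypothetical compounds.
good = {"clogp": 3.2, "tpsa": 90.0, "mw": 410.5, "hbd": 2, "hba": 6}
bad = dict(good, tpsa=60.0, mw=520.0)

print(failed_properties(good))
print(failed_properties(bad))
```

Returning the failing property names rather than a bare True/False makes it easier to see which filter removes the most compounds from a collection.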
The resulting compounds were included in the Life Chemicals PPI Machine Learning Method Library, which comprises almost 6,900 molecules (Fig. 1). PAINS, toxic and reactive compounds are excluded from the library.
Figure 1. Principal component analysis (PCA) showing the accumulation of compounds best matching our parameters.
PPI Focused Library by 2D Similarity Search vs Timbal DB
Using a reference database of 18,936 compounds, almost 2,400 compounds were extracted from the Life Chemicals HTS Compound Collection by a 2D fingerprint similarity search against TimbalDB with a Tanimoto threshold of 85%. The library includes inhibitors of the following protein-protein complexes:
- Annexin A2/S100-A10
- Bcl-2 and Bcl-XL with BAX; BAK and BID
- BetaCatenin/Tcf4 & Tcf3
- CD80/CD28 (or CTLA-4)
- Clathrin/adaptor & accessory proteins
- SOD1 dimer
- TNFa trimer or TNFa/TNFR
- Transthyretin tetramer
- Tubulin dimer
- UL30(Pol)/UL42 subunits of HSV type 1 DNA polymerase
- XIAP/Caspase9 or SMAC (BIR3 domain)
All PAINS, toxic, reactive, and inactive compounds are excluded from the library.
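The similarity selection described for this library can be sketched in a few lines. The example below assumes each fingerprint is represented as a set of "on" bit positions and that a compound is kept when its best match against any reference compound reaches the threshold; the function names, compound identifiers, and toy fingerprints are hypothetical, and real 2D fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def select_similar(collection: dict, references: dict, threshold: float = 0.85) -> dict:
    """Keep compounds whose best similarity to any reference is >= threshold."""
    hits = {}
    for compound_id, fingerprint in collection.items():
        best = max(
            (tanimoto(fingerprint, ref) for ref in references.values()),
            default=0.0,
        )
        if best >= threshold:
            hits[compound_id] = best
    return hits


# Toy fingerprints: sets of bit indices, for illustration only.
references = {"ref-1": frozenset(range(0, 20))}
collection = {
    "cmpd-A": frozenset(range(0, 18)),  # 18 shared bits, 20 in the union
    "cmpd-B": frozenset(range(5, 25)),  # 15 shared bits, 25 in the union
}

print(select_similar(collection, references))
```

With these toy inputs only `cmpd-A` passes the 85% threshold. Raising `threshold` makes the selection stricter and typically returns fewer, closer analogues of the reference compounds.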
PPI Focused Library by 2P2I and iPPIDB dataset and the Rule of Four
This library was created on the basis of the study by X. Morelli et al. The analysis of the 2P2I (http://2p2idb.cnrs-mrs.fr) and iPPIDB (https://ippidb.pasteur.fr/) datasets identified a group of structural and chemical features that became known as the Rule of Four (Fig. 2). The rule was used as a filter to accelerate the identification of potential PPI inhibitors, and its application resulted in a library containing over 4,600 compounds. All the compounds were passed through PAINS filters.
Figure 2. A group of proteins presented in the study that illustrates the concept of the Rule of Four.
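A Rule of Four filter could look like the sketch below. The thresholds are not stated in the text above; they are given here as reported in the cited 2P2I publication (molecular weight above 400 Da, AlogP above 4, more than 4 rings, more than 4 hydrogen bond acceptors) and should be verified against that source. The key names and example values are hypothetical, and the property values are assumed to be precomputed.

```python
# Assumed thresholds, quoted from the cited 2P2I study rather than from this page.
RULE_OF_FOUR = {
    "mw": 400.0,   # molecular weight, Da
    "alogp": 4.0,  # calculated lipophilicity
    "rings": 4,    # number of rings
    "hba": 4,      # hydrogen bond acceptors
}


def satisfies_rule_of_four(props: dict) -> bool:
    """Return True if every property is strictly above its threshold."""
    return all(props[name] > limit for name, limit in RULE_OF_FOUR.items())


# Made-up property values for two hypothetical compounds.
ppi_like = {"mw": 520.3, "alogp": 4.8, "rings": 5, "hba": 6}
fragment_like = {"mw": 230.1, "alogp": 1.9, "rings": 2, "hba": 3}

print(satisfies_rule_of_four(ppi_like), satisfies_rule_of_four(fragment_like))
```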
PPI Focused Library by 2D Similarity Search vs Binding DB, Pubmed DB
Using Binding DB and Pubmed DB (BioAssays - 1,817), a reference set of small organic molecules with activity in 28 PPI assays towards the following 7 targets was collected [5,6]:
- Toll-like receptor 4
- Hepatitis C virus core protein (dimerization inhibition)
- Tyrosine-protein kinase TYRO11
- runt-related transcription factor 1 isoform AML1c
- core-binding factor beta subunit isoform 1
- mitogen-activated protein kinase 2 (MAP2)
- mitogen-activated protein kinase 3 (MAP3)
After filtering and merging the activity data, 10,000 unique compounds were obtained and used as the basis for the library design. A similarity search with MDL public keys and a Tanimoto cut-off of 90% was then run against the Life Chemicals HTS Compound Collection, which yielded almost 19,900 compounds.
- Mabonga L, Kappo AP. Protein-protein interaction modulators: advances, successes and remaining challenges. Biophys Rev. 2019;11(4):559-581.
- Macalino SJY, Basith S, Clavio NAB, Chang H, Kang S, Choi S. Evolution of In Silico Strategies for Protein-Protein Interaction Drug Discovery. Molecules. 2018;23(8):1963.
- Reynès C, Host H, Camproux AC, et al. Designing focused chemical libraries enriched in protein-protein interaction inhibitors using machine learning methods. PLoS Comput Biol. 2010;6(3):e1000695.
- Higueruelo AP, Jubb H, Blundell TL. TIMBAL v2: update of a database holding small molecules modulating protein-protein interactions. Database (Oxford). 2013;2013:bat039.
- Morelli X, Bourgeas R, Roche P. Chemical and structural lessons from recent successes in protein-protein interaction inhibition (2P2I). Curr Opin Chem Biol. 2011;15:475-481.
- Ran X, Gestwicki JE. Inhibitors of Protein-Protein Interactions (PPIs): An Analysis of Scaffold Choices and Buried Surface Area. Curr Opin Chem Biol. 2018;44:75-86.