Protein-protein interactions (PPI) are involved in many important biological processes in living organisms, and their regulation can be crucial for the treatment of numerous diseases. Low-molecular-weight PPI inhibitors that selectively and potently modulate protein-protein interactions have recently reached clinical trials.
Keeping pace with the growing interest in various aspects of PPI, Life Chemicals has prepared the following libraries of potential PPI inhibitors using a ligand-based approach:
- PPI Focused Library by Machine Learning (Decision Tree) Method
- PPI Focused Library by 2D Similarity Search vs Timbal DB
- PPI Focused Library by 2D Similarity Search vs Binding DB and Pubmed DB
- PPI Focused Library by the “Rule of Four”
PPI Focused Library by Machine Learning (Decision Tree) Method
To predict compounds that can affect PPI, a machine learning method (decision tree, DT) was used. This method is recognized as a useful tool for identifying PPI inhibitors. The DT method relies on a cross-validation protocol to balance enrichment, sensitivity and specificity on the learning data set. Comparing the physicochemical features of PPI and non-PPI inhibitors revealed several descriptors whose values fall within a specific range for PPI inhibitors:
- RDF070m (≤ 3.31): a shape-based descriptor that defines the radial distribution function of an ensemble of atoms in a spherical volume with a radius of 7 Å
- UI (> 4.13): an unsaturation index directly linked to the number of multiple bonds, including double, triple and aromatic bonds
- SHP2 (≤ 0.30): an average shape profile index of order 2 deduced from the distance distribution of the geometry matrix
- Mor11m (> -0.1): a descriptor calculated by summing atom weights viewed by a different angular scattering function (signal 11, weighted by atomic masses)
Other filters applied to the entire Life Chemicals Stock Collection:
- ClogP = 1.5 – 4.5
- TPSA = 75 – 120
- MW ≤ 475
- HBD = 0 – 4
- HBA = 4 – 9
- PAINS filters
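The descriptor thresholds and property windows listed above can be read as one combined filter. The following is a minimal sketch, assuming that every condition must hold at once and that range bounds are inclusive (the text states neither), and that descriptor values have already been computed by some external tool. The dictionary keys, `CHECKS`, `failed_checks` and `passes_ml_filters` are hypothetical names chosen for illustration, and the PAINS substructure step is not modeled here.

```python
# Each entry maps a descriptor name to a predicate on its value.
# Thresholds are taken from the lists above; treating them as a strict
# conjunction with inclusive range bounds is an assumption.
CHECKS = {
    "RDF070m": lambda v: v <= 3.31,
    "UI": lambda v: v > 4.13,
    "SHP2": lambda v: v <= 0.30,
    "Mor11m": lambda v: v > -0.1,
    "ClogP": lambda v: 1.5 <= v <= 4.5,
    "TPSA": lambda v: 75 <= v <= 120,
    "MW": lambda v: v <= 475,
    "HBD": lambda v: 0 <= v <= 4,
    "HBA": lambda v: 4 <= v <= 9,
}


def failed_checks(descriptors):
    """Return the names of all checks a compound fails.

    `descriptors` is a dict of precomputed values; a missing value
    counts as a failure so that incomplete records are never accepted.
    """
    return [
        name
        for name, is_ok in CHECKS.items()
        if name not in descriptors or not is_ok(descriptors[name])
    ]


def passes_ml_filters(descriptors):
    """True if the compound satisfies every descriptor and property check."""
    return not failed_checks(descriptors)


# Made-up descriptor values, for illustration only.
example = {
    "RDF070m": 2.0, "UI": 4.5, "SHP2": 0.2, "Mor11m": 0.3,
    "ClogP": 3.0, "TPSA": 90, "MW": 420, "HBD": 2, "HBA": 6,
}
print(passes_ml_filters(example))
print(failed_checks(dict(example, MW=500, UI=4.0)))
```

Returning the list of failed checks, rather than a bare yes/no, makes it easier to see which criterion removes most of a collection.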
The resulting compounds were included in the Life Chemicals PPI Machine Learning Method Library, which comprises about 2,600 molecules (Fig. 1).
Fig. 1. A. Principal component analysis (PCA) showing the accumulation of compounds best matching our parameters (see the list of parameters above). B. Distribution of compounds within the allowed descriptor values. The plot was built to validate the method: red points correspond to compounds from the Timbal database with a molecular weight below 450, and green points to the Life Chemicals PPI Inhibitors Machine Learning Method Library. All descriptors were calculated with PyChem.
PPI Focused Library by 2D Similarity Search vs Timbal DB
About 1,500 compounds were extracted from the Life Chemicals Stock HTS Collection by a 2D fingerprint similarity search against Timbal DB with a Tanimoto similarity threshold of 85 %.
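A threshold-based similarity search of this kind can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which 2D fingerprint was used against Timbal DB, so fingerprints are represented here as plain sets of on-bit indices, and `tanimoto`, `select_similar` and the toy data are hypothetical names and values.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints share nothing; define their similarity as 0.
    return len(a & b) / union if union else 0.0


def select_similar(library, references, threshold=0.85):
    """Keep library compounds whose best similarity to any reference
    compound reaches the threshold.

    Both arguments are dicts mapping a compound ID to its fingerprint.
    Returns a dict of compound ID -> best similarity found.
    """
    hits = {}
    for compound_id, fingerprint in library.items():
        best = max(
            (tanimoto(fingerprint, ref) for ref in references.values()),
            default=0.0,
        )
        if best >= threshold:
            hits[compound_id] = best
    return hits


# Toy fingerprints, for illustration only.
references = {"ref-1": set(range(1, 11))}   # bits 1..10
library = {
    "cmpd-A": set(range(1, 10)),            # shares 9 of 10 bits
    "cmpd-B": set(range(1, 6)),             # shares 5 of 10 bits
}
print(select_similar(library, references, threshold=0.85))
```

The same function covers stricter cut-offs by passing a different `threshold`, for example 0.90.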
PPI Focused Library by 2D Similarity Search vs Binding DB and Pubmed DB
Using Binding DB and Pubmed DB, a reference set of small organic molecules with activity in 28 PPI assays against the following 7 targets was collected: toll-like receptor 4; hepatitis C virus core protein (dimerization inhibition); tyrosine-protein kinase TYRO11; runt-related transcription factor 1 isoform AML1c; core-binding factor beta subunit isoform 1; mitogen-activated protein kinase 2 (MAP2); mitogen-activated protein kinase 3 (MAP3) [3,4]. After filtering and merging the activity data, 10,000 unique compounds were obtained and used as the basis for the library design. A similarity search using MDL public keys and a Tanimoto similarity cut-off of 90 % was then applied to the Life Chemicals Stock Collection, which yielded almost 23,000 compounds.
PPI Focused Library by the “Rule of Four”
This library was created on the basis of the study by X. Morelli et al. The analysis of the 2P2I dataset (http://2p2idb.cnrs-mrs.fr) identified a group of structural and chemical features that became known as the “Rule of Four” (Fig. 2). It has been shown that a specific range of values for AlogP/ClogP (ALOGP/CLOGP > 4), molecular weight (MW > 400), number of hydrogen bond acceptors (HBA > 4) and number of rings (Ring > 4) defines the properties of a PPI inhibitor. The rule was used as a filter to accelerate the identification of potential PPI inhibitors, and its application resulted in a library containing 4,300 compounds (Fig. 3). All the compounds were passed through PAINS filters.
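As a rough illustration, the “Rule of Four” can be written as a single predicate over four precomputed properties. This is a sketch, not the procedure actually used to build the library: the `Compound` record, its field names and the example values are hypothetical, and the property values are assumed to come from an external calculator.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record holding precomputed properties of one compound."""
    compound_id: str
    mw: float      # molecular weight
    logp: float    # AlogP or ClogP
    hba: int       # number of hydrogen bond acceptors
    rings: int     # number of rings


def satisfies_rule_of_four(c):
    """True if all four criteria hold: MW > 400, logP > 4, HBA > 4, rings > 4."""
    return c.mw > 400 and c.logp > 4 and c.hba > 4 and c.rings > 4


def filter_rule_of_four(compounds):
    """Return only the compounds that meet all four criteria."""
    return [c for c in compounds if satisfies_rule_of_four(c)]


# Made-up property values, for illustration only.
collection = [
    Compound("cmpd-1", mw=520.6, logp=5.2, hba=6, rings=5),
    Compound("cmpd-2", mw=450.0, logp=3.5, hba=6, rings=5),  # logP too low
]
print([c.compound_id for c in filter_rule_of_four(collection)])
```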
Fig. 2. A group of proteins presented in the study that illustrates the concept of the “Rule of Four”.
Fig. 3. Scatter plots prepared with SYBYL-X software showing the distribution of compounds from the Library according to the “Rule of Four” descriptor values.
- Reynès C, Host H, Camproux AC et al.: Designing focused chemical libraries enriched in protein-protein interaction inhibitors using machine learning methods. PLoS Comput Biol 2010, 6(3):e1000695. doi: 10.1371/journal.pcbi.1000695
- Morelli X, Bourgeas R, Roche P: Chemical and structural lessons from recent successes in protein-protein interaction inhibition (2P2I). Curr Opin Chem Biol 2011,15:475-481