Compound solubility prediction in medicinal chemistry and drug discovery

8 May 2023

Svitlana Kondovych

Senior Researcher

Compound solubility is a crucial factor in medicinal chemistry and drug discovery because it significantly influences the absorption, distribution, metabolism, excretion, and toxicity (ADMET) of potential drug candidates. Poor solubility can limit a drug's bioavailability and ultimately lead to its failure in clinical trials. Accurate prediction of compound solubility is therefore vital for successful drug discovery [1-3].

In response to this challenge, computational, or in silico, methods [4] have emerged as important tools for predicting and calculating compound solubility, offering a cost- and time-efficient alternative to experimental methods. They can be broadly divided into analytical [5,6] and numerical [7] methods, which can be combined to enhance the reliability of predictions [8].

Computational methods are based on a range of statistical and thermodynamic approaches (Fig. 1). The most widely used analytical methods for solubility prediction are quantitative structure-property relationships (QSPR) and thermodynamics-based methods, which calculate the solvation free energy and solve the corresponding equations. Numerical methods, in turn, comprise molecular dynamics (MD) simulations, quantum mechanics-based models, and machine learning.

Figure 1. Basic computational methods for solubility prediction.

QSPR models are built on the analysis of a large set of compounds with known solubility. In these models, mathematical equations show the relationship between the structural properties of the compounds and their solubility [5,9]. QSPR models have been applied to predict the solubility of various classes of compounds, including small molecules, peptides, and polymers. However, the accuracy of QSPR models depends on the quality and size of the training set used to develop the model.
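
As a minimal sketch of the idea, the snippet below fits the simplest possible QSPR equation, log S = a + b × descriptor, by ordinary least squares. It is not the model from [5] or [9]: the single descriptor, the function names, and the training data are assumptions made for illustration, and the data are synthetic, generated from a hypothetical linear rule so that the fit can be checked.

```python
# Single-descriptor QSPR sketch: fit log S = intercept + slope * descriptor.
# All numbers below are synthetic and serve only as an illustration.

def fit_linear_qspr(descriptors, log_s):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(descriptors)
    if n < 2 or n != len(log_s):
        raise ValueError("need at least two (descriptor, log S) pairs")
    mean_x = sum(descriptors) / n
    mean_y = sum(log_s) / n
    sxx = sum((x - mean_x) ** 2 for x in descriptors)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(descriptors, log_s))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_log_s(model, descriptor):
    """Apply a fitted (intercept, slope) model to a new descriptor value."""
    intercept, slope = model
    return intercept + slope * descriptor


# Synthetic training set built from the hypothetical rule log S = 0.5 - 1.0 * x.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.5 - 1.0 * x for x in train_x]

model = fit_linear_qspr(train_x, train_y)
print("intercept, slope:", model)
print("predicted log S at x = 2.5:", predict_log_s(model, 2.5))
```

Real QSPR models use many descriptors and far larger training sets, but the workflow is the same: fit the equation on compounds with known solubility, then apply it to new structures.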

Another analytical approach uses the General Solubility Equation (GSE) and related thermodynamics-based methods [6]. The GSE rests on the thermodynamics of dissolution: a crystalline solute must first leave its crystal lattice and then mix with water. It estimates the aqueous solubility of organic compounds from just two properties: the melting point, which accounts for the crystal lattice contribution, and the octanol-water partition coefficient (log P), which accounts for the compound's affinity for water.
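
The GSE in the form given in [6] is log S = 0.5 − 0.01 (MP − 25) − log P, where S is the molar aqueous solubility, MP is the melting point in °C, and log P is the octanol-water partition coefficient. A minimal sketch of it follows; the function name and the example compound are hypothetical.

```python
def gse_log_s(melting_point_c, log_p):
    """Estimate log10 of the molar aqueous solubility with the GSE.

    log S = 0.5 - 0.01 * (MP - 25) - log P
    MP is the melting point in degrees Celsius. For compounds that are
    liquid at 25 C the melting-point term is set to zero.
    """
    melting_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * melting_term - log_p


# Hypothetical compound: melting point 125 C, log P = 2.0.
print(gse_log_s(125.0, 2.0))

# Hypothetical liquid: melting point below 25 C, so only log P matters.
print(gse_log_s(-10.0, 1.0))
```

Because both inputs can be measured or estimated independently, the equation needs no fitted training set, which is its main practical appeal.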

Molecular mechanics-based methods, such as the generalized Born solvation model [10], are popular in drug discovery due to their simplicity and speed. These methods use empirical force fields to describe the interactions between atoms and molecules and to calculate the energy of a molecule in a given solvent environment, allowing for the prediction of the solvation free energy and, therefore, the solubility.
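
The step from a free energy to a solubility can be sketched with the standard relation S = c° · exp(−ΔG_solution / RT). The snippet below assumes a 1 mol/L standard state (c°), ideal dilute behaviour, and free energies already expressed relative to that standard state. For a crystalline solid it also assumes the usual thermodynamic cycle in which ΔG_solution is the sum of a sublimation term and a solvation term, so the solvation free energy alone is not sufficient. Function names and inputs are illustrative.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def solution_free_energy(dg_sublimation_kj, dg_solvation_kj):
    """Thermodynamic cycle for a crystalline solute: crystal -> gas -> solution."""
    return dg_sublimation_kj + dg_solvation_kj


def solubility_mol_per_l(dg_solution_kj, temperature_k=298.15):
    """Solubility in mol/L from the solution free energy in kJ/mol.

    Assumes a 1 mol/L standard state and an ideal dilute solution.
    """
    return math.exp(-dg_solution_kj / (R_KJ_PER_MOL_K * temperature_k))


# Hypothetical inputs in kJ/mol, chosen only to show the arithmetic.
dg_solution = solution_free_energy(dg_sublimation_kj=60.0, dg_solvation_kj=-45.0)
print("solution free energy (kJ/mol):", dg_solution)
print("solubility (mol/L):", solubility_mol_per_l(dg_solution))
```

Because the free energy sits in an exponent, an error of about 5.7 kJ/mol at room temperature changes the predicted solubility by a factor of ten, which is why accurate energies matter so much here.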

MD simulations analyze the molecular interactions between the solute and solvent molecules by modeling the time evolution of the molecular system [11]. MD simulations can provide detailed information on the solubility of a compound, including the thermodynamic properties of the solvation process. However, MD simulations require significant computational resources and expertise, making them less accessible to most medicinal chemistry research groups.
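
To illustrate what "modeling the time evolution" means in practice, the sketch below propagates a single particle with the velocity Verlet scheme, a standard integrator in MD codes. It is a toy: one particle in one dimension on a harmonic spring, in arbitrary units, with hypothetical names and parameters. A production simulation applies the same update to thousands of solute and solvent atoms with a full force field.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle in 1D and return the list of (x, v) states."""
    trajectory = [(x, v)]
    a = force(x) / mass
    for _ in range(n_steps):
        v_half = v + 0.5 * a * dt      # half-step velocity update
        x = x + v_half * dt            # full-step position update
        a = force(x) / mass            # force at the new position
        v = v_half + 0.5 * a * dt      # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


# Toy system in arbitrary units: a particle on a harmonic spring.
K_SPRING = 1.0
MASS = 1.0


def harmonic_force(x):
    return -K_SPRING * x


def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x


trajectory = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                             mass=MASS, dt=0.01, n_steps=1000)
print("energy at start:", total_energy(*trajectory[0]))
print("energy at end:  ", total_energy(*trajectory[-1]))
```

Comparing the total energy at the start and end of the run is a quick sanity check: a well-chosen time step keeps it nearly constant.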

Quantum mechanics-based methods [7], such as density functional theory, quantum Monte Carlo, or the polarizable continuum model, offer a more accurate approach by treating the electronic structure of the solute explicitly and representing the solvent either as individual molecules or as a continuum. However, these methods are computationally expensive and may not be practical for large-scale screening of compound libraries.

Machine learning-based methods [12,13], such as support vector machines, random forests, and deep learning algorithms, are gaining popularity in drug discovery due to their ability to handle large datasets and provide accurate predictions with high throughput. These methods require a training set of experimentally measured solubility data to determine the relationship between molecular descriptors and solubility, which can then be applied to predict the solubility of new compounds.
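
The train-then-predict workflow can be sketched with a k-nearest-neighbours regressor, used here only as a simple stand-in for the algorithms named above because it needs nothing beyond the Python standard library. The descriptor vectors and log S values are synthetic, generated from a hypothetical rule, and all names are illustrative.

```python
import math


def knn_predict_log_s(train_descriptors, train_log_s, query, k=3):
    """Predict log S as the mean over the k nearest training compounds."""
    if not 1 <= k <= len(train_descriptors):
        raise ValueError("k must be between 1 and the training set size")
    # Pair each training compound's distance to the query with its log S.
    by_distance = sorted(
        (math.dist(descriptors, query), log_s)
        for descriptors, log_s in zip(train_descriptors, train_log_s)
    )
    nearest = by_distance[:k]
    return sum(log_s for _, log_s in nearest) / k


# Synthetic training set: two descriptors per compound, with log S values
# generated from the hypothetical rule log S = -d1 - 0.5 * d2.
train_descriptors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 5.0)]
train_log_s = [-d1 - 0.5 * d2 for d1, d2 in train_descriptors]

print(knn_predict_log_s(train_descriptors, train_log_s, (0.9, 0.1), k=1))
print(knn_predict_log_s(train_descriptors, train_log_s, (0.9, 0.1), k=3))
```

The prediction depends entirely on which training compounds lie near the query, which is a small-scale picture of why the coverage of the training set matters.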

Overall, despite the promise of computational methods for predicting compound solubility, many issues concerning their reliability and accuracy remain to be addressed. Chief among them is the need for accurate and diverse training sets of solubility data, since the quality and representativeness of these sets directly affect the predictions. Moreover, factors such as crystal packing, polymorphism, and solubility-enhancing excipients can complicate solubility prediction, which makes careful validation of computational methods against experimental data a top priority.

At Life Chemicals, we successfully apply both thermodynamic and kinetic HTS solubility measurement methods. This service is available on request together with an array of complementary in vitro ADMET tests and customizable quality assurance services.

Additionally, we offer an off-the-shelf collection of soluble fragment-like molecules (Fig. 2):

Fragment Library with Experimental Solubility: 22,500 in-stock fragments with experimentally confirmed solubility in DMSO and PBS
High Solubility Fragment Subset: 7,000 fragments with minimum experimentally confirmed solubility in PBS at 1 mM and in DMSO at 200 mM, measured by the thermodynamic method using HPLC
Diversity Screening Subsets of Soluble Fragments: screening pools of 1,280, 960 and 320 drug-like low molecular weight fragments, also available in the pre-plated format
Pre-plated Soluble Fluorine-containing Fragment Set: 1,350 fluorine-containing fragments with experimentally assured solubility at 200 mM in the DMSO solution
Fluorine Fragment Cocktails: 130 in-stock sets of 10 drug-like fluorine-containing fragments each (1,300 screening compounds in total) with the most diverse ¹⁹F chemical shifts to facilitate the interpretation of screening results

Please contact us at marketing@lifechemicals.com for any additional information and price quotations.

Visit our Website for a detailed product description.

Download SD files with compound structures directly from our Downloads section

Custom compound selection based on specific parameters can be performed on request, with competitive pricing and the most convenient terms provided.

References

1. Savjani, K. T., Gajjar, A. K., & Savjani, J. K. (2012). Drug solubility: importance and enhancement techniques. ISRN Pharmaceutics, 2012, 195727. DOI: 10.5402/2012/195727
2. Di, L., Fish, P. V., & Mano, T. (2012). Bridging solubility between drug discovery and development. Drug Discovery Today, 17(9–10), 486-495. DOI: 10.1016/j.drudis.2011.11.007
3. Coltescu, A. R., Butnariu, M., & Sarac, I. (2020). The Importance of Solubility for New Drug Molecules. Biomed Pharmacol J, 13(2). DOI: 10.13005/bpj/1920
4. Das, T., Mehta, C. H., & Nayak, U. Y. (2020). Multiple approaches for achieving drug solubility: an in silico perspective. Drug Discovery Today, 25(7), 1206-1212. DOI: 10.1016/j.drudis.2020.04.016
5. Gao, H., Shanmugasundaram, V., & Lee, P. (2002). Estimation of aqueous solubility of organic compounds with QSPR approach. Pharm Res, 19(4), 497-503. DOI: 10.1023/a:1015103914543
6. Ran, Y., Jain, N., & Yalkowsky, S. H. (2001). Prediction of Aqueous Solubility of Organic Compounds by the General Solubility Equation (GSE). Journal of Chemical Information and Computer Sciences, 41(5), 1208-1217. DOI: 10.1021/ci010287z
7. Palmer, D. S., McDonagh, J. L., Mitchell, J. B. O., van Mourik, T., & Fedorov, M. V. (2012). First-Principles Calculation of the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules. Journal of Chemical Theory and Computation, 8(9), 3322–3337. DOI: 10.1021/ct300345m
8. McDonagh, J. L., Nath, N., De Ferrari, L., van Mourik, T., & Mitchell, J. B. O. (2014). Uniting Cheminformatics and Chemical Theory To Predict the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules. Journal of Chemical Information and Modeling, 54(3), 844–856. DOI: 10.1021/ci4005805
9. Cheng, A., & Merz, K. M. (2003). Prediction of aqueous solubility of a diverse set of compounds using quantitative structure-property relationships. Journal of Medicinal Chemistry, 46(17), 3572-3580. DOI: 10.1021/jm020266b
10. Tsui, V., & Case, D. A. (2000). Theory and applications of the generalized Born solvation model in macromolecular simulations. Biopolymers: Original Research on Biomolecules, 56(4), 275-291. DOI: 10.1002/1097-0282(2000)56:4<275::AID-BIP10024>3.0.CO;2-E
11. Hossain, S., Kabedev, A., Parrow, A., Bergström, C. A., & Larsson, P. (2019). Molecular simulation as a computational pharmaceutics tool to predict drug solubility, solubilization processes, and partitioning. European Journal of Pharmaceutics and Biopharmaceutics, 137, 46-55. DOI: 10.1016/j.ejpb.2019.02.007
12. Ye, Z., & Ouyang, D. (2021). Prediction of small-molecule compound solubility in organic solvents by machine learning algorithms. J Cheminform, 13, 98. DOI: 10.1186/s13321-021-00575-3
13. Boobier, S., Hose, D. R. J., Blacker, A. J., et al. (2020). Machine learning with physicochemical relationships: solubility prediction in organic solvents and water. Nat Commun, 11, 5753. DOI: 10.1038/s41467-020-19594-z

8 May 2023, 14:14 Svitlana Kondovych Computational Chemistry
