Title: Computational Prediction of Binding Affinity for CDK2-ligand Complexes. A Protein Target for Cancer Drug Discovery
Volume: 29
Issue: 14
Author(s): Martina Veit-Acosta and Walter Filgueira de Azevedo Junior*
Affiliation:
- Pontifical Catholic University of Rio Grande do Sul (PUCRS). Av. Ipiranga, 6681 Porto Alegre/RS 90619-900 Brazil
- Specialization Program in Bioinformatics. Pontifical Catholic University of Rio Grande do Sul
(PUCRS). Av. Ipiranga, 6681 Porto Alegre/RS 90619-900 Brazil
Keywords:
chemical space, physical modeling, CDK2, scoring function space, drug design, crystal structure, machine learning
Abstract:
Background: CDK2 participates in the control of eukaryotic cell-cycle progression. Due to
the great interest in CDK2 for drug development and the relative ease of crystallizing this enzyme,
over 400 structural studies have focused on this protein target. These structural data are the basis for
the development of computational models to estimate CDK2-ligand binding affinity.
Objective: This work focuses on recent developments in the application of supervised machine
learning modeling to develop scoring functions that predict CDK2-ligand binding affinity.
Method: We employed the structures available at the Protein Data Bank and the ligand information
from BindingDB, Binding MOAD, and PDBbind to evaluate the predictive performance of
machine learning techniques combined with physical modeling used to calculate binding affinity. We
compared this hybrid methodology with classical scoring functions available in docking programs.
Results: Our comparative analysis of previously published models indicated that a model created using
a combination of a mass-spring system and cross-validated Elastic Net to predict the binding affinity of
CDK2-inhibitor complexes outperformed classical scoring functions available in AutoDock4 and AutoDock
Vina.
Conclusion: All studies reviewed here suggest that targeted machine learning models are superior to
classical scoring functions for calculating binding affinities. Specifically for CDK2, the combination
of physical modeling with supervised machine learning techniques exhibits improved predictive
performance in calculating protein-ligand binding affinity. These results find theoretical support
in the application of the concept of scoring function space.
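
Illustrative sketch: The hybrid approach summarized in the Results (a mass-spring description of the complex whose terms are weighted by a cross-validated Elastic Net) can be illustrated with a small, self-contained Python example. This is only a sketch under stated assumptions, not the reviewed authors' implementation: the abstract does not give the mass-spring formulation, so the code assumes one descriptor per protein-ligand atom-pair type, computed as a sum of squared deviations from a reference distance. The function names, the reference distances, the penalty grid, and the data (which are synthetic) are all hypothetical.

```python
import random

def spring_terms(distances_by_pair_type, d0_by_pair_type):
    # Assumed descriptor: for each atom-pair type, sum (d - d0)^2 over all pairs,
    # i.e. a harmonic (spring-like) term without its force constant.
    return [sum((d - d0) ** 2 for d in ds)
            for ds, d0 in zip(distances_by_pair_type, d0_by_pair_type)]

def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_elastic_net(X, y, alpha, l1_ratio=0.5, n_iter=200):
    # Coordinate descent on (1/2n)*||y - Xw - b||^2
    #   + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1 - l1_ratio)*||w||^2.
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered features
    residual = [yi - y_mean for yi in y]
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            z = sum(Xc[i][j] ** 2 for i in range(n)) / n
            if z == 0.0:
                continue
            rho = sum(Xc[i][j] * (residual[i] + Xc[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
            delta = new_w - w[j]
            for i in range(n):
                residual[i] -= Xc[i][j] * delta
            w[j] = new_w
    intercept = y_mean - sum(wj * mj for wj, mj in zip(w, x_mean))
    return w, intercept

def predict(X, w, intercept):
    return [intercept + sum(wj * xj for wj, xj in zip(w, row)) for row in X]

def cross_validated_alpha(X, y, alphas, k=5):
    # k-fold cross-validation: keep the penalty with the lowest held-out squared error.
    n = len(X)
    best_alpha, best_mse = None, float("inf")
    for alpha in alphas:
        sq_err = 0.0
        for fold in range(k):
            test_idx = set(range(fold, n, k))
            X_train = [X[i] for i in range(n) if i not in test_idx]
            y_train = [y[i] for i in range(n) if i not in test_idx]
            w, b = fit_elastic_net(X_train, y_train, alpha)
            for i in test_idx:
                sq_err += (predict([X[i]], w, b)[0] - y[i]) ** 2
        mse = sq_err / n
        if mse < best_mse:
            best_alpha, best_mse = alpha, mse
    return best_alpha

# Synthetic data only: 60 hypothetical complexes, 3 atom-pair types, 5 pairs each.
rng = random.Random(0)
reference_d0 = [3.0, 3.5, 4.0]       # hypothetical reference distances (angstroms)
true_weights = [-0.8, -0.3, 0.0]     # third descriptor is deliberately irrelevant
X, y = [], []
for _ in range(60):
    distances = [[d0 + rng.uniform(-1.0, 1.0) for _ in range(5)] for d0 in reference_d0]
    features = spring_terms(distances, reference_d0)
    X.append(features)
    y.append(6.0 + sum(t * f for t, f in zip(true_weights, features)) + rng.gauss(0.0, 0.05))

alphas = [0.001, 0.01, 0.1, 1.0]
best_alpha = cross_validated_alpha(X, y, alphas)
weights, intercept = fit_elastic_net(X, y, best_alpha)
print("selected alpha:", best_alpha)
print("fitted weights:", [round(w, 3) for w in weights])
```

In this toy setting the learned weights play the role of target-specific coefficients for the physical terms, which is the sense in which a scoring function is tuned to a single protein such as CDK2. A real study would replace the synthetic descriptors with values computed from crystallographic structures and the synthetic response with experimental affinities.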