Can deep learning models help accelerate electrostatics-driven protein pKa predictions?


Target audience

  • Structural Biologists

  • Computational Biophysicists

  • Researchers interested in MD simulations and Machine Learning

Webinar programme (1h)

  • Presentation (~30 min)

  • Q&A (~30 min)

About the presentation

pH is a crucial physicochemical property that affects protein structure, folding, stability, and function. Many computational methods have been developed to calculate pKa values. In the highly accurate, but slow, Poisson–Boltzmann (PB)-based methods, proteins are represented by point charges in a low-dielectric medium surrounded by an implicit solvent (high dielectric). Empirical methods rely on statistically fitting parameters to large datasets of experimental pKa values. These are much faster than the physics-based methods, although at the cost of less microscopic insight and unknown predictive power on mutations and on proteins dissimilar to those in the training set.
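As a reminder of why pKa values matter here, the short Python sketch below applies the Henderson-Hasselbalch relation to show how a site's pKa determines its protonated fraction at a given pH. It assumes a single, isolated titratable site; the function name `protonated_fraction` and the pKa of 6.5 are illustrative choices, not values from the presentation. Titratable sites in real proteins interact with each other, which is one reason methods such as PB are needed in the first place.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Protonated fraction of one isolated site (Henderson-Hasselbalch).

    Assumption: the site does not interact with other titratable sites.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Illustrative values only: a hypothetical site with pKa 6.5
for ph in (5.5, 6.5, 7.5):
    print(f"pH {ph}: {protonated_fraction(ph, 6.5):.2f} protonated")
```

At pH equal to the pKa the site is half protonated, so an error of a few tenths of a pK unit near the working pH can noticeably change the predicted charge state.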

Here, I will present a novel strategy to combine the best features of PB models – accuracy and interpretability – with the speed of classical empirical methods. The resulting deep learning pKa predictors were trained on a database of 3M theoretical pKa values estimated from 50k structures using a PB method. With this approach, we can reproduce the physics-based predictions with an average error below 0.4 pK units while being up to 1000x faster.


Miguel Machuqueiro, PhD

BioISI – Instituto de Biosistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa

Dr Miguel Machuqueiro received his PhD in 2003. Since then, he has been working on molecular modeling and simulation, in particular on the development of new in silico methods to deal with pH effects in biomolecules. Since starting his own group at FCUL in 2009, Miguel has worked on extending Constant-pH MD (CpHMD) methods to allow the correct modeling of pH effects in MD simulations. Currently, his group is one of the very few in the world that can apply CpHMD methodologies to complex systems involving lipid bilayers with peptides, proteins, and even drugs.

Organized by

Dr Yin Chen, EGI Foundation