Machine learning collective variable discovery & enhanced sampling

The time scales accessible to molecular dynamics simulations are limited by the short integration time steps required for numerical stability. This frustrates the comprehensive sampling of configurational phase space needed to compute converged thermodynamic averages, surmount free energy barriers, and simulate rare events. We have maintained a long-standing interest in enhanced sampling techniques and, in particular, in methods for the data-driven discovery of collective variables (CVs) and the acceleration of sampling along them. We developed Molecular Enhanced Sampling with Autoencoders (MESA) to interleave successive rounds of CV discovery and enhanced sampling in high-variance CVs, and the Girsanov Reweighting Enhanced Sampling Technique (GREST) to do the same for slow CVs by appealing to a dynamical reweighting formalism. We developed State-free Reversible VAMPnets (SRVs) to learn slow CVs from simulation trajectories, and Latent Space Simulators (LSS) that use three back-to-back deep learning networks to (i) learn the slow CVs of the system, (ii) propagate the dynamics within this slow latent space, and (iii) generatively reconstruct molecular configurations. The result is a dynamically coarse-grained kinetic model that can generate ultra-long simulation trajectories at up to six orders of magnitude lower cost than molecular dynamics. We have made our methods freely available by contributing them to enhanced sampling libraries such as SSAGES and PLUMED.
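The distinction between high-variance and slow CVs can be illustrated with a small sketch. The code below is not MESA, GREST, or an SRV. It is a linear stand-in, time-lagged independent component analysis (TICA), applied to a synthetic two-dimensional trajectory in which the slow coordinate has the smaller variance. The function names `tica` and `toy_trajectory`, the toy dynamics, and the choice of lag time are all assumptions made for illustration.

```python
import numpy as np


def tica(traj, lag):
    """Linear slow-mode estimate: solve C_tau v = lambda C_0 v (hypothetical helper)."""
    x = traj - traj.mean(axis=0)
    x0, xt = x[:-lag], x[lag:]
    n = len(x0)
    # Instantaneous and time-lagged covariances, symmetrized to impose reversibility.
    c0 = (x0.T @ x0 + xt.T @ xt) / (2 * n)
    ct = (x0.T @ xt + xt.T @ x0) / (2 * n)
    # Whiten with C_0^(-1/2) so the generalized problem becomes an ordinary symmetric one.
    s, u = np.linalg.eigh(c0)
    w = u @ np.diag(s ** -0.5) @ u.T
    evals, evecs = np.linalg.eigh(w @ ct @ w)
    order = np.argsort(evals)[::-1]  # largest autocorrelation (slowest mode) first
    return evals[order], (w @ evecs)[:, order]


def toy_trajectory(n_steps=20000, seed=0):
    """Synthetic data: a slow, low-variance coordinate mixed with fast, high-variance noise."""
    rng = np.random.default_rng(seed)
    slow = np.zeros(n_steps)
    for t in range(1, n_steps):
        slow[t] = 0.99 * slow[t - 1] + 0.1 * rng.standard_normal()
    fast = rng.standard_normal(n_steps)
    theta = 0.6  # arbitrary rotation so neither observed axis is purely slow or fast
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    return slow, np.column_stack([slow, fast]) @ rot.T


slow, traj = toy_trajectory()
centered = traj - traj.mean(axis=0)

# Slowest linear mode: leading TICA component at a lag of 10 steps.
evals, modes = tica(traj, lag=10)
slow_cv = centered @ modes[:, 0]

# Highest-variance linear mode: leading principal component, for comparison.
_, pcs = np.linalg.eigh(np.cov(centered.T))
high_var_cv = centered @ pcs[:, -1]

corr_slow = abs(np.corrcoef(slow_cv, slow)[0, 1])
corr_high_var = abs(np.corrcoef(high_var_cv, slow)[0, 1])
print(f"|corr| with true slow coordinate: TICA {corr_slow:.2f}, PCA {corr_high_var:.2f}")
```

On this toy system the leading principal component follows the fast, high-variance noise, whereas the leading TICA component should track the hidden slow coordinate. SRVs address the same variational problem with basis functions learned by a neural network rather than fixed linear ones.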

We are pursuing the following projects in this theme:

  • Application of latent space simulators to large and multi-molecular systems
  • Data-driven CV discovery and enhanced sampling for condensed phase systems
  • Backmapping of configurationally or dynamically coarse-grained systems to restore atomic resolution using denoising diffusion probabilistic models (DDPMs)
  • Incorporation of memory effects / non-Markovian dynamics into dynamical propagators
  • Adaptive sampling techniques for efficient model parameterization

Representative Publications

107.   M.S. Jones, Z.A. McDargh, R.P. Wiewiora, J.A. Izaguirre, H. Xu, and A.L. Ferguson* “Molecular latent space simulators for distributed and multi-molecular trajectories” J. Phys. Chem. A (accepted, 2023)

103.  K. Shmilovich and A.L. Ferguson* “Girsanov Reweighting Enhanced Sampling Technique (GREST): On-the-fly data-driven discovery of and enhanced sampling in slow collective variables” J. Phys. Chem. A 127 15 3497-3517 (2023) [ https://doi.org/10.1021/acs.jpca.3c00505 ]

→ Invited article for “Pablo G. Debenedetti Festschrift” virtual special issue

74.     H. Sidky, W. Chen, and A.L. Ferguson* “Molecular latent space simulators” Chem. Sci. 11 9459 (2020) [ http://dx.doi.org/10.1039/D0SC03635H ]

→ Selected for 2020 Chemical Science HOT Article Collection

66.     H. Sidky, W. Chen, and A.L. Ferguson* “Machine learning for collective variable discovery and enhanced sampling in biomolecular simulation” Molecular Physics 118 5 e1737742 (2020) [ https://doi.org/10.1080/00268976.2020.1737742 ]

61.     H. Sidky, W. Chen, and A.L. Ferguson* “High-resolution Markov state models for the dynamics of Trp-cage miniprotein constructed over slow folding modes identified by state-free reversible VAMPnets” J. Phys. Chem. B 123 38 7999-8009 (2019) [ http://dx.doi.org/10.1021/acs.jpcb.9b05578 ]

60.    W. Chen, H. Sidky, and A.L. Ferguson* “Capabilities and limitations of time-lagged autoencoders for slow mode discovery in dynamical systems” J. Chem. Phys. 151 064123 (2019) [ https://doi.org/10.1063/1.5112048 ]

56.     W. Chen, H. Sidky, and A.L. Ferguson* “Nonlinear discovery of slow molecular modes using state-free reversible VAMPnets” J. Chem. Phys. 150 214114 (2019) [ https://doi.org/10.1063/1.5092521 ]

→ Selected as J. Chem. Phys. “Editor’s Pick”