Using Molecular Dynamics to Validate PPI Interface Stability Predictions

Why Static Structures Are Insufficient for PPI Interface Characterization

Computational screening of PPI interfaces begins with structural data — PDB coordinates, AlphaFold2 predictions, or homology models — and produces ranked compound lists. The entire pipeline, from hot-spot identification through docking and scoring, operates on structural inputs. But structural biology, whether experimental or predicted, captures a snapshot: the lowest-energy conformation under specific conditions, often including crystallographic packing forces or cryo-preservation artifacts that have no physiological counterpart.

PPI interfaces are among the most dynamic regions of protein structures. They evolved to mediate regulated, transient protein-protein contacts — not to remain permanently associated. The interface surface is exposed to solvent in the unbound state and must rearrange to accommodate partner binding. In the bound state, conformational fluctuations at the interface can transiently open or close sub-pockets, change the accessibility of hot-spot residues, and modify the energetics of small-molecule engagement. A docking calculation performed against a single crystal structure samples one point in this conformational ensemble. If that point is not representative of the ensemble accessible under physiological conditions, the docking predictions will be systematically biased.

Molecular dynamics simulation addresses this problem by propagating the structural model through time under a defined force field and solvent environment, generating a conformational ensemble that reflects the dynamics of the interface at a chosen temperature and ionic strength. For PPI interface characterization, the key outputs of MD are: the distribution of hot-spot sub-pocket volumes and shapes over the trajectory, the stability of hot-spot residue side chain orientations, and the frequency with which productive binding geometries are accessible versus occluded.

MD Simulation Protocol for PPI Interfaces

Our standard protocol for PPI interface stability validation uses GROMACS 2023 with the CHARMM36m force field. CHARMM36m was selected over earlier AMBER variants because its parameterization was specifically optimized for intrinsically disordered protein regions and protein-protein interfaces — precisely the structural contexts where earlier force fields showed well-documented artifacts in backbone sampling. For PPI interfaces that involve disordered peptide segments (p53 transactivation domain, BH3 domain peptides), force field choice meaningfully affects the results.

System preparation follows a standardized workflow: the PPI complex structure is protonated at pH 7.4 using PROPKA, placed in a cubic TIP3P water box with periodic boundary conditions and a 12 Å buffer distance from the protein surface, and neutralized with KCl to 150 mM (physiological ionic strength). Energy minimization is performed with the steepest descent algorithm until convergence. Equilibration proceeds in two stages: 100 ps NVT with positional restraints on heavy atoms, followed by 1 ns NPT equilibration with restraints released. Production trajectories run for a minimum of 200 ns, with longer runs (500 ns–1 μs) for targets where slower conformational rearrangements are expected based on experimental B-factor distributions or NMR relaxation data.

For the validation studies we run on well-characterized targets — MDM2-p53 being the primary reference — we compare the MD-derived conformational ensemble against available crystallographic data including alternative conformers where present. RMSD to the crystal structure and RMSF per residue are calculated over the production trajectory to characterize overall structural stability and per-residue flexibility.

What MD Tells Us About PPI Hot-Spot Sub-Pockets

The most directly actionable output from MD interface validation is the pocket volume trajectory. We calculate the hot-spot sub-pocket volume at each MD frame using a grid-based cavity detection method (similar to the approach implemented in POVME and MDpocket) applied to the interface region. The resulting pocket volume distribution answers a practical question: is the sub-pocket a structurally stable feature of the interface, or does it open and close on timescales relevant to ligand binding?

For the MDM2 F19/W23 sub-pocket, our MD analysis shows a relatively stable pocket volume over the production trajectory (standard deviation approximately 15–20% of the mean volume), consistent with the crystallographic observation that this sub-pocket is well-defined across multiple crystal structures with diverse bound ligands. This stability is expected — MDM2's evolved function as a p53-binding protein has locked in a structurally stable hydrophobic sub-pocket that is present in the apo state and does not require induced fit for peptide or small-molecule engagement.

Contrast this with certain BCL-2 family interfaces, where MD analysis reveals substantially larger pocket volume fluctuations and more frequent transitions between open and closed sub-pocket geometries. For these targets, docking against a single crystal structure misrepresents the accessible binding geometry. An ensemble docking approach — selecting representative frames from the MD trajectory that span the accessible pocket geometry distribution — produces meaningfully different compound rankings than single-structure docking, and the ensemble-derived rankings correlate better with available experimental data.

Validating Predicted Hot-Spot Contributions by MD

Beyond pocket geometry, MD trajectories allow validation of predicted hot-spot residue contributions in a way that complements alanine scanning. The per-residue non-bonded interaction energy between the two PPI partners can be decomposed over the MD trajectory, providing a time-averaged picture of which residues contribute most to the binding free energy — and how stable those contributions are across the conformational ensemble.

For residues where the Rosetta alanine scanning ΔΔG prediction and the MD interaction energy decomposition agree — both flagging the residue as a major contributor — confidence in the hot-spot assignment is high. For residues where there is disagreement — Rosetta predicts a hot spot but MD shows weak time-averaged interaction energy, or vice versa — we flag the discrepancy and investigate the source. Typical sources of discrepancy include: force field treatment of aromatic interactions (the W23-MDM2 contact involves a tryptophan indole ring, and force field parameterization of π-stacking interactions varies), protonation state sensitivity in histidine-containing contacts, and crystal vs. solution conformation differences in loop regions adjacent to the interface.

OpenMM for Extended Timescale and Enhanced Sampling

For targets where key interface transitions occur on timescales longer than accessible to conventional MD (typically above ~1 μs), we supplement GROMACS simulations with OpenMM-based enhanced sampling protocols. OpenMM's Python API enables implementation of replica exchange with solute tempering (REST2) and metadynamics protocols that can access slow conformational transitions at reduced computational cost compared to brute-force long timescale simulation.

This is particularly relevant for targets where the hot-spot sub-pocket is cryptic in the apo structure but opens upon ligand binding — the so-called "cryptic pocket" problem. Standard MD from an apo structure may not observe the open pocket within accessible simulation times; REST2 or metadynamics seeded with pocket volume as a collective variable can access the open pocket state and characterize its frequency in the ensemble. Once a cryptic pocket is identified in the enhanced sampling ensemble, it becomes a valid target for structure-based drug design even though it was not apparent in the original structural input.

Integrating MD Output into the Screening Workflow

The output of MD interface validation feeds into the subsequent docking and scoring stages in two ways. First, the pocket volume distribution is used to select a representative structural ensemble for docking — typically 5–10 structures spanning the accessible pocket geometry, weighted by their frequency in the MD trajectory. Second, MD-derived hot-spot residue stability data updates the confidence weights on the hot-spot contact score: a hot-spot residue that is stable across the MD ensemble receives full weight; one with high RMSF or whose side chain samples multiple orientations receives reduced weight in the disruption score.

We're not saying MD validation is required for every compound in a large virtual screen — running production MD on every docked pose is computationally intractable. The MD step is a validation of the target-level interface model, not of individual compound poses. Once the interface model has been MD-validated and the ensemble characterized, that model is used as the basis for screening potentially thousands of compounds. The per-compound docking and scoring is fast; the MD investment is made once per target and is amortized across the screening campaign.

The practical value of the MD validation step is most evident when the static structure gives a misleading picture of the interface. In our characterization of a Wnt pathway PPI target (details in a separate post), the initial hot-spot prediction from the crystal structure identified a hot-spot residue with a well-positioned side chain in the crystal. MD analysis revealed that this side chain is highly mobile in solution, sampling multiple rotameric states, and that the hot-spot contribution in the crystal was partly an artifact of crystal packing contacts that constrained the side chain to one rotamer. Correcting for this changed the pharmacophore hypothesis and resulted in different compound prioritization. Without MD validation, that discrepancy would not have been detected before synthesis.