Hydrogen-deuterium exchange mass spectrometry (HDX-MS) and cryo-EM are the gold standards for epitope mapping. When an HDX-MS experiment works cleanly on your target, you get peptide-level resolution of protected regions within days. When a cryo-EM reconstruction resolves the CDR-antigen contacts, you know exactly which residues matter. Neither method is cheap, neither is fast in the full campaign context, and both require a purified, stable antibody-antigen complex, which you don't have at the hit identification stage.
The question we've been working through at Genolux is a narrower one: not whether computational epitope mapping can replace HDX or cryo-EM (it cannot), but whether it can deliver enough information to make consequential design decisions before those structures exist. The answer, qualified carefully, is yes — and this post describes the approach and its limits.
What "Epitope Mapping" Means Computationally
In experimental work, epitope mapping produces a defined set of antigen residues that, when mutated or protected, reduce or eliminate antibody binding. Computationally, we're predicting contact residues from a modeled complex structure and ranking them by predicted energetic contribution to binding.
These are not the same thing. Experimental epitope mapping identifies residues that are functionally important to binding; computational contact prediction identifies residues that are geometrically close in a docked pose. The overlap is substantial but imperfect. A contact residue in a model can be an irrelevant bystander if it's not making energetically significant interactions. Conversely, a functional epitope residue may be in the second shell — not directly contacting the antibody but critically important for the conformational integrity of the binding surface.
Understanding this distinction shapes how we use computational maps. We don't treat computational contact lists as ground-truth epitope definitions. We treat them as hypotheses about which antigen regions to probe in focused mutagenesis or alanine scanning experiments — a prioritization tool, not a replacement for experimental validation.
The Pipeline: From Sequence to Contact Map
Our computational mapping workflow starts from sequence, not structure. For a new antibody hit (typically a VH/VL sequence from phage display or a computational design campaign), the first step is Fv structure prediction. We use ABodyBuilder2 as our primary method for the Fv, which we've validated on a held-out SAbDab set. For the antigen, we use AlphaFold2 if a monomer structure isn't already available from the PDB, accepting the caveat that AlphaFold2 occasionally gets domain-relative positioning wrong for multi-domain proteins.
Docking is performed with RosettaAntibodyDock (previously called SnugDock), which handles the coupled CDR loop flexibility and rigid-body docking simultaneously. We generate 2,000 decoys per antibody-antigen pair. The top 50 decoys by total score are then clustered by CDR-H3 RMSD (1.5 Å threshold) and by antigen contact surface RMSD (2.0 Å threshold). The largest cluster's centroid is our primary predicted complex.
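The cluster-selection step can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: it assumes the two pairwise RMSD matrices over the top-scoring decoys have already been computed, and it assumes a simple neighbor-count rule (the decoy with the most neighbors under both thresholds is taken as the center of the largest cluster). The function name `largest_cluster` is hypothetical, and the post does not state which clustering algorithm is used.

```python
def largest_cluster(h3_rmsd, epi_rmsd, h3_cut=1.5, epi_cut=2.0):
    """Return (center_index, member_indices) of the largest cluster.

    h3_rmsd and epi_rmsd are symmetric square matrices (lists of lists)
    of pairwise RMSDs in angstroms between decoys: CDR-H3 RMSD and
    antigen contact surface RMSD respectively.
    """
    n = len(h3_rmsd)
    # Two decoys count as neighbors only if they agree on BOTH criteria.
    neighbors = [
        {j for j in range(n)
         if h3_rmsd[i][j] <= h3_cut and epi_rmsd[i][j] <= epi_cut}
        for i in range(n)
    ]
    # Assumption: the decoy with the most neighbors defines the largest
    # cluster and serves as its representative ("centroid") structure.
    center = max(range(n), key=lambda i: len(neighbors[i]))
    return center, sorted(neighbors[center])
```

A small cluster relative to the number of decoys considered is itself a signal that the docking has not converged.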
Contact residues are defined as antigen residues within 4.5 Å of any antibody heavy atom in the centroid structure. We also compute a per-residue buried surface area contribution using the Shrake-Rupley algorithm, which is often more informative than raw distance: a residue 4.2 Å away that buries 60 Å² of surface area matters more than one at 3.8 Å that buries only 8 Å².
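The distance criterion is simple enough to show directly. The sketch below assumes heavy-atom coordinates have already been parsed out of the centroid structure into plain tuples; the input layout and the name `contact_residues` are illustrative assumptions, and the buried surface area calculation is not shown.

```python
import math

def contact_residues(antigen_atoms, antibody_atoms, cutoff=4.5):
    """Find antigen residues within `cutoff` angstroms of the antibody.

    antigen_atoms:  list of (residue_id, (x, y, z)) for antigen heavy atoms
    antibody_atoms: list of (x, y, z) for antibody heavy atoms
    Returns {residue_id: minimum heavy-atom distance} for contact residues.
    """
    contacts = {}
    for res_id, xyz in antigen_atoms:
        # Closest approach of this antigen atom to any antibody atom.
        d = min(math.dist(xyz, ab) for ab in antibody_atoms)
        if d <= cutoff:
            # Keep the shortest distance seen for this residue.
            contacts[res_id] = min(d, contacts.get(res_id, d))
    return contacts
```

This brute-force loop is quadratic in atom count, which is fine for a single Fv-antigen interface; a spatial index would be the usual choice when scanning many decoys.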
Validation Against Known Epitopes
Before using this approach on novel targets, we validated it on a set of 47 antibody-antigen pairs from SAbDab where both crystal-derived contact residues and experimental mutagenesis data were available. This is a smaller set than we'd like — the intersection of "has crystal structure" and "has published mutagenesis data" is narrow — but it provides a useful reference.
The key metric we used was functional epitope recall: what fraction of alanine-scanning hotspot residues (defined as positions where alanine mutation reduces binding ≥10-fold) appear in our predicted contact set. Across the 47 pairs, our approach achieved 71% recall of functional hotspot residues with a 4.5 Å contact cutoff, rising to 78% when we expanded to include second-shell residues within 6.5 Å of CDR atoms.
Precision was lower: on average, the predicted contact set contained 2.3 times as many residues as the actual functional hotspot set. This is expected, since not every geometric contact is a functional one. It means our computational maps narrow the space of candidates but don't definitively identify which residues matter. The practical upshot is that we typically recommend alanine scanning on the top 8–12 predicted contact residues rather than attempting to interpret the entire predicted interface.
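For readers who want the two metrics spelled out, here is a minimal sketch of functional epitope recall and precision for a single antibody-antigen pair. The function name and the residue labels are made up for illustration and are not data from the validation set.

```python
def recall_and_precision(predicted, hotspots):
    """Compare a predicted contact set against experimental hotspots.

    recall    = fraction of hotspot residues found in the predicted set
    precision = fraction of predicted residues that are hotspots
    """
    predicted, hotspots = set(predicted), set(hotspots)
    hits = predicted & hotspots
    recall = len(hits) / len(hotspots) if hotspots else 0.0
    precision = len(hits) / len(predicted) if predicted else 0.0
    return recall, precision

# Toy example with hypothetical residue labels.
predicted = {"K45", "D47", "R90", "Y12", "E101", "S33"}
hotspots = {"K45", "D47", "W60"}
recall, precision = recall_and_precision(predicted, hotspots)
print(f"recall={recall:.2f} precision={precision:.2f}")
# prints: recall=0.67 precision=0.33
```

As a rough consistency check, assuming the predicted set is 2.3 times the size of the hotspot set and recall is 0.71, per-pair precision would be about 0.71 / 2.3 ≈ 0.31. Averages over many pairs need not combine exactly this way, so treat that figure as an order-of-magnitude estimate only.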
Where the Approach Works Well — and Where It Doesn't
Computational paratope-epitope mapping works most reliably when the antigen is a rigid, globular protein with a well-defined surface and the antibody binds a concave or flat epitope. In these cases, docking poses converge well (tight clusters), and the contact set is stable across the top-ranked decoys. When multiple independent decoys agree on which antigen residues are contacted, that consensus is meaningful.
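One way to quantify that consensus is to count how often each antigen residue appears in the contact sets of the top-ranked decoys. The sketch below assumes those per-decoy contact sets are already available; the 0.8 agreement threshold and the name `contact_consensus` are illustrative assumptions, not values from the validation work.

```python
from collections import Counter

def contact_consensus(decoy_contacts, min_fraction=0.8):
    """Summarize agreement between decoys on contacted antigen residues.

    decoy_contacts: list of iterables of residue ids, one per decoy
    Returns (freq, stable): per-residue contact frequency, and the sorted
    residues contacted in at least `min_fraction` of the decoys.
    """
    n = len(decoy_contacts)
    # set() guards against a residue being listed twice within one decoy.
    counts = Counter(res for contacts in decoy_contacts for res in set(contacts))
    freq = {res: c / n for res, c in counts.items()}
    stable = sorted(res for res, f in freq.items() if f >= min_fraction)
    return freq, stable
```

Residues with low frequency are the ones to treat with the most suspicion when building a mutagenesis panel.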
It performs poorly on flexible antigens, such as intrinsically disordered proteins (IDPs) and antigens that change conformation upon antibody binding, where a single "correct" structure for docking doesn't exist because the antigen adopts multiple conformations in solution. It also performs poorly when the antibody binds a discontinuous epitope assembled from non-consecutive sequence regions, because the docked pose may capture one patch of the epitope but miss the others. Discontinuous epitopes are common in functionally important regions (receptor binding sites, quaternary structure interfaces), so this is not a rare case.
A practical example: we worked with a target where the antigen was a cytokine receptor ectodomain with a known flexible hinge between domain 1 and domain 2. Our initial docking runs using the AlphaFold2 monomer structure produced high-confidence contact predictions in domain 1. When HDX data came back (run at month 6 of the program), domain 1 protection was confirmed — but domain 2 showed equally significant protection that our model hadn't predicted because we were docking against the AlphaFold2 closed-hinge conformation. The lesson was to run docking against multiple AlphaFold2 multimer configurations and PDB-derived conformers before treating any single contact map as reliable.
Paratope Side: Which CDR Residues Are Doing the Work
The paratope mapping is, in some ways, the more tractable side of this problem. CDR loops are what they are: their positions are defined by the antibody structure, and the question is simply which CDR residues are in contact with the antigen. Our approach scores each CDR residue by its contribution to the interface using a per-residue ΔΔG decomposition in Rosetta.
Residues with per-residue ΔΔG contributions more negative than −0.5 REU are marked as active paratope positions; these are the ones we prioritize in CDR optimization campaigns. Residues between −0.5 and 0 REU are passive contacts — they may be structurally important without being primary energetic drivers. Residues with near-zero or positive contributions are structural CDR positions that happen to be geometrically near the interface but aren't contributing to binding.
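The three categories map onto a simple threshold rule. In the sketch below, the −0.5 REU cutoff comes from the text, but the `near_zero` tolerance separating passive contacts from structural positions is an assumed value (the text does not give a number for "near-zero"), and the function name is hypothetical.

```python
def classify_paratope(ddg_by_residue, active_cut=-0.5, near_zero=0.1):
    """Label CDR residues from per-residue energy contributions (REU).

    active:     more negative than active_cut
    passive:    between active_cut and -near_zero
    structural: within near_zero of zero, or positive
    near_zero is an illustrative assumption, not a published threshold.
    """
    labels = {}
    for res, ddg in ddg_by_residue.items():
        if ddg < active_cut:
            labels[res] = "active"
        elif ddg < -near_zero:
            labels[res] = "passive"
        else:
            labels[res] = "structural"
    return labels
```

Because Rosetta energies are noisy near the thresholds, residues that sit close to a boundary are worth checking across several decoys before committing to a label.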
This decomposition directly informs how we plan CDR scanning. Positions with large negative contributions are candidates for conservative substitutions (maintaining the interaction type) rather than extensive diversity. Positions with near-zero contributions are candidates for more aggressive diversification, since the affinity cost of mutations there is likely low.
Integrating Computational Maps into a Design Cycle
The most useful framing for computational epitope maps is as an input to experimental prioritization — not as a stand-alone answer. The decision point where they have the most leverage is designing the initial alanine scanning panel. If you're going to scan 20–30 antigen positions anyway, starting with the computationally predicted contacts (ranked by buried surface area contribution) is more efficient than evenly sampling across the surface.
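Building that initial panel amounts to a sort and a cut. The sketch below assumes a mapping from predicted contact residue to buried surface area in Å² is already in hand; the function name and the default panel size (taken from the upper end of the 8–12 range mentioned earlier) are illustrative choices.

```python
def alanine_scan_panel(bsa_by_residue, panel_size=12):
    """Pick the top predicted contacts for an alanine scanning panel.

    bsa_by_residue: {residue_id: buried surface area in square angstroms}
    Residues are ranked by buried surface area, largest first; ties are
    broken by residue id so the output is deterministic.
    """
    ranked = sorted(bsa_by_residue.items(), key=lambda kv: (-kv[1], kv[0]))
    return [res for res, _ in ranked[:panel_size]]
```

In practice you would likely also drop positions that are already alanine and review glycine or proline positions by hand, since mutating those can perturb the backbone rather than probe the contact.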
We've also found computational paratope maps useful for identifying positional redundancy in the CDR set. When the ΔΔG decomposition shows that 80% of the paratope contribution comes from CDR-H3 and CDR-H2, that tells you where optimization effort should go and where it won't pay off. Spending rounds of directed evolution on CDR-L1 positions that each contribute less than 0.3 REU in magnitude is unlikely to move the needle on affinity.
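The per-CDR breakdown can be sketched as below. This assumes each residue's contribution has already been tagged with its CDR, and it counts only favorable (negative) contributions toward the total, which is one reasonable convention rather than a confirmed detail of the pipeline. The function name is hypothetical.

```python
def cdr_contribution_fractions(residue_ddg):
    """Fraction of favorable interface energy carried by each CDR.

    residue_ddg: list of (cdr_name, ddg_in_REU) tuples, one per residue
    Assumption: only negative (favorable) contributions are summed.
    Returns {cdr_name: fraction of total favorable energy}.
    """
    totals = {}
    for cdr, ddg in residue_ddg:
        if ddg < 0:
            totals[cdr] = totals.get(cdr, 0.0) - ddg
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cdr: t / grand_total for cdr, t in totals.items()}
```

A CDR that is missing from the result, or that carries only a few percent of the total, is a weak candidate for further optimization rounds.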
The goal of this approach isn't to eliminate experimental epitope characterization from a program. It's to make the experimental work more targeted by the time you get to it — fewer constructs, fewer misses, faster convergence on the residues that actually matter.