Computational Antibody Papers

    • Benchmarking of pretrained protein, antibody, and nanobody language model representations on a comprehensive suite of nanobody-specific tasks.
    • The authors introduce eight tasks spanning variable-region annotation, CDR infilling, antigen binding prediction, paratope prediction, affinity prediction, polyreactivity, thermostability, and nanobody type classification (e.g. VHH, VNAR, conventional antibody chains).
    • They evaluate generic protein LMs, antibody-specific LMs, and nanobody-specific LMs under a unified and standardized benchmark.
    • All backbone models are kept frozen, with task-specific lightweight heads trained on top to isolate representational quality.
    • No single model consistently outperforms others across all tasks, showing that nanobody-specific pretraining alone does not guarantee superior performance over antibody-specific or generic protein language models.
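The frozen-backbone protocol above can be sketched as a linear probe: representations stay fixed and only a lightweight head is trained, so any performance difference reflects representational quality. A minimal sketch, using synthetic embeddings and labels as placeholders for real language-model outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder embeddings standing in for frozen LM representations
# (e.g. mean-pooled per-sequence vectors); real use would extract
# these from a pretrained protein/antibody/nanobody model.
X = rng.normal(size=(200, 64))          # 200 sequences, 64-dim embeddings
w_true = rng.normal(size=64)
y = (X @ w_true > 0).astype(float)      # synthetic binary task labels

# Lightweight task head: logistic regression trained on top of the
# frozen embeddings; only the head's weights are updated.
w = np.zeros(64)
b = 0.0
lr = 0.1
for _ in range(500):
    z = np.clip(X @ w + b, -30, 30)
    p = 1.0 / (1.0 + np.exp(-z))        # sigmoid predictions
    w -= lr * (X.T @ (p - y) / len(y))  # gradient of the log-loss
    b -= lr * float(np.mean(p - y))

z = np.clip(X @ w + b, -30, 30)
acc = float(np.mean(((1.0 / (1.0 + np.exp(-z))) > 0.5) == y))
```

Swapping in a different backbone only changes how `X` is produced; the head and training loop stay identical, which is what makes the comparison fair.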
    • FLAb2 substantially expands existing antibody benchmarks, introducing the largest public dataset to date with a strong focus on developability rather than binding alone.
    • A broad spectrum of models is evaluated, including generic protein language models, antibody-specific models, structure-aware predictors, and simple physics-based baselines such as charge and pI calculations.
    • Zero-shot predictions from pretrained protein models are generally weak and unreliable for antibody developability. Surprisingly, simple charge-based features often outperform large models for properties such as aggregation, polyreactivity, and pharmacokinetics.
    • Intrinsic properties (e.g. thermostability, expression) are substantially easier to predict than extrinsic or context-dependent properties such as polyreactivity, pharmacokinetics, or immunogenicity.
    • Few-shot learning improves performance, but even the best models typically achieve only moderate correlations (ρ ≈ 0.4–0.6) on statistically robust datasets, highlighting the difficulty of the task.
    • Incorporating structural information improves predictions, particularly in the zero-shot setting, and helps reduce biases present in sequence-only models.
    • Many pretrained models primarily capture evolutionary signal, effectively measuring distance from germline rather than true developability. Encouragingly, this germline bias largely disappears once models are fine-tuned in a few-shot setting.
    • Scaling model size alone provides limited benefit. Given sufficient training data, simple one-hot encodings paired with small neural networks can match or outperform billion-parameter protein language models, emphasizing that data quality and quantity matter more than model scale.
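The charge and pI baselines mentioned above are simple to reproduce. A minimal sketch using Henderson–Hasselbalch net charge with the commonly used EMBOSS pKa set; the test sequence is a made-up example, not from the paper:

```python
# Side-chain and terminal pKa values (EMBOSS set).
PKA_POS = {"K": 10.8, "R": 12.5, "H": 6.5}           # basic residues
PKA_NEG = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}  # acidic/ionizable residues
PKA_NTERM, PKA_CTERM = 8.6, 3.6                      # free termini

def net_charge(seq: str, ph: float) -> float:
    """Henderson-Hasselbalch net charge of a sequence at a given pH."""
    pos = sum(seq.count(aa) / (1 + 10 ** (ph - pka)) for aa, pka in PKA_POS.items())
    neg = sum(seq.count(aa) / (1 + 10 ** (pka - ph)) for aa, pka in PKA_NEG.items())
    pos += 1 / (1 + 10 ** (ph - PKA_NTERM))
    neg += 1 / (1 + 10 ** (PKA_CTERM - ph))
    return pos - neg

def isoelectric_point(seq: str, tol: float = 1e-3) -> float:
    """Bisection on pH: net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

pi = isoelectric_point("EVQLVESGGGLVQPGGSLRLSCAAS")
```

That a few lines like these can outperform billion-parameter models on aggregation or polyreactivity is the benchmark's most striking result.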
    • All-atom, zero-shot generative model that designs antibody sequence and structure directly in complex with a target from epitope-conditioned prompts.
    • The user specifies the target, epitope, and antibody modality, and the algorithm produces designs.
    • They tested 4–24 designs per target, achieving 50% target-level success, producing VHHs and scFvs with pico- to nanomolar affinities (best ≈ 26 pM).
    • Wet-lab validation shows the designed antibodies have therapeutic-grade developability (expression, aggregation, hydrophobicity, polyreactivity, stability) without any optimization.
    • Human PBMC assays (10 donors) show no detectable immunogenicity for representative de novo nanobodies.
    • Large-scale benchmarking of structural, energetic, and confidence metrics to distinguish protein binders from non-binders.
    • Curated 3,766 experimentally tested de novo binders across 15 targets from independent campaigns.
    • Of these, 436 were confirmed binders, the remainder non-binders.
    • Each design was re-modelled using AF2 (initial guess + ColabFold), Boltz-1, and AF3.
    • From these predictions they computed 200+ structural and confidence descriptors.
    • AF3-derived confidence scores (especially ipSAE_min) were the best single discriminators, although per-target precision still ranged widely (0.1–1.0), underscoring strong target dependence.
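The core evaluation above reduces to ranking designs by a single confidence descriptor and measuring precision among the top-ranked. A toy sketch with made-up scores and labels (ipSAE itself is an AF-derived interface confidence metric computed from model outputs):

```python
def precision_at_k(scores, labels, k):
    """Precision of the k designs with the highest confidence score."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k

scores = [0.91, 0.40, 0.75, 0.10, 0.88]   # e.g. ipSAE-style confidences
labels = [1, 0, 0, 0, 1]                  # 1 = confirmed binder
p = precision_at_k(scores, labels, k=2)
```

Computing this per target rather than pooled is what exposes the wide 0.1–1.0 precision spread the authors report.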
  • 2025-12-12

    CDR Conformation Aware Antibody Sequence Design with ConformAb

    • generative methods
    • structure prediction
    • ConformAb is a guided discrete-diffusion method for antibody lead optimization that preserves the seed binder’s CDR backbone conformation while introducing sequence diversity.
    • Structural preservation is enforced by steering the diffusion process to match the seed’s canonical CDR class probabilities, ensuring generated sequences retain the same canonical backbone geometry.
    • Canonical classes are assigned by folding SabDab and pOAS sequences with ABB2 and labeling them using the Kelow et al. dihedral-based canonical clustering scheme; ConformAb learns to predict these classes from sequence.
    • During generation, a KL-based guidance signal constrains mutations so that each CDR remains in the seed’s canonical class, enabling safe exploration of sequence space around the functional binder.
    • Although ConformAb does not model affinity directly, its structure-preserving diversification enables zero-shot affinity maturation: some variants emerge with improved binding despite using no antigen structure, no repertoire data, and no affinity labels.
    • The method was experimentally validated: generated sequences were expressed and tested by SPR on EGFR, IL-6, and a third target, achieving 15–60% binding rates and, for two targets, producing binders with 3–5x higher affinity than the seed.
    • Crystal structures of top EGFR and IL-6 binders confirmed that, despite substantial and non-conservative mutations, the CDR backbone conformations were preserved, validating the model’s structural guidance in wet-lab experiments.
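The KL-based guidance signal described above can be sketched as a penalty that grows when a candidate's predicted canonical-class distribution drifts from the seed's. The class probabilities below are hypothetical, not from the paper:

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Seed CDR's canonical-class probabilities (hypothetical, 4 classes).
seed_probs = [0.90, 0.05, 0.03, 0.02]

def guidance_penalty(candidate_probs, weight=1.0):
    """Larger when the candidate leaves the seed's canonical class."""
    return weight * kl_divergence(seed_probs, candidate_probs)

in_class = guidance_penalty([0.85, 0.08, 0.04, 0.03])   # stays in class: small
off_class = guidance_penalty([0.10, 0.80, 0.05, 0.05])  # switches class: large
```

Steering the diffusion denoiser against such a penalty is what keeps sequence exploration inside the seed's canonical backbone geometry.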
    • Novel inverse folding algorithm for antibodies with experimental validation.
    • It uses an atom-level graph MPNN, a structured transformer, novel scoring, and AF3 filtering, unlike ProteinMPNN/AbMPNN/AntiFold, which operate at residue level and lack downstream optimization.
    • Only experimental antibody structures (free antibodies plus complexes) were used for training.
    • AntiBMPNN uses a distinct training dataset, whereas AbMPNN and AntiFold rely heavily on modeled structures; the moderate gains in residue retrieval might therefore simply reflect a slightly larger dataset.
    • Unlike most models they actually performed experimental validation: ELISA assays on huJ3 (single-points, CDR1, CDR3) and D6 (CDR2), with multiple variants improving binding over wild type.
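"Residue retrieval" here is native sequence recovery, the standard inverse-folding metric. A minimal sketch with made-up sequences:

```python
def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of aligned positions where the designed residue matches the native one."""
    if len(designed) != len(native):
        raise ValueError("sequences must be the same length")
    return sum(d == n for d, n in zip(designed, native)) / len(native)

r = sequence_recovery("QVQLVQSG", "QVKLVESG")   # 6 of 8 positions match
```

Because recovery rises with training-set size and redundancy, dataset differences alone can move this number, which is the caveat raised in the last bullet.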
    • Benchmarking of all-atom biomolecular structure prediction methods, including AlphaFold 3 and open-source reproductions such as Boltz-1, Chai-1, HelixFold 3 and Protenix.
    • They introduced a large low-homology benchmark spanning nine tasks, including protein monomers, protein–protein, protein–ligand, nucleic acid systems, and antibody–antigen complexes.
    • Success is defined using DockQ ≥ 0.23 for protein–protein and antibody–antigen interfaces. This is a low bar: barely acceptable quality, corresponding to roughly 4–6 Å interface RMSD.
    • Antibody–antigen complexes remain particularly challenging, with AlphaFold 3 achieving only ~45–48% success and other methods performing substantially worse.
    • With AF3, antibodies have a much lower proportion of high-quality DockQ predictions (~13%) than nanobodies (33%).
    • AlphaFold 3 consistently outperforms competing methods by roughly ten percentage points on antibody–antigen docking.
    • Increased sampling improves AlphaFold 3 predictions, whereas other methods show unstable or degrading performance, underscoring the importance of robust ranking and confidence calibration rather than sampling alone.
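The success criterion and the "high-quality" distinction above both come from the standard DockQ quality bins. A small sketch with toy scores:

```python
def dockq_quality(score: float) -> str:
    """Standard DockQ tiers: incorrect / acceptable / medium / high."""
    if score < 0.23:
        return "incorrect"
    if score < 0.49:
        return "acceptable"
    if score < 0.80:
        return "medium"
    return "high"

def success_rate(scores, threshold=0.23):
    """Fraction of complexes reaching the (low-bar) DockQ cutoff."""
    return sum(s >= threshold for s in scores) / len(scores)

rate = success_rate([0.10, 0.30, 0.55, 0.85])   # toy benchmark scores
```

Counting only `"high"` predictions instead of everything above 0.23 is what separates the ~13% antibody figure from the ~45–48% headline success rate.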
    • Novel algorithm (ITsFlexible) for predicting conformational flexibility of antibody and TCR CDR3 loops and CDR3-like loops.
    • Dataset of 1.2M loops extracted from all antiparallel β-strand motifs in the PDB, including antibody and TCR CDR3s, representing the same secondary-structure pattern.
    • Flexibility defined by structural clustering: multiple conformations with pairwise Cα RMSD > 1.25 Å yield a “flexible” label.
    • Model is a three-layer equivariant GNN (EGNN) trained as a binary classifier (rigid vs flexible).
    • ITsFlexible is better than random, but it only marginally outperforms simple baselines such as loop length, solvent exposure, pLDDT, RMSPE, and AF2 MSA subsampling. The best results are obtained with crystal structures rather than predicted ones, showing that structure modeling is still the roadblock for predictability; the gain over strong but simple baselines such as loop length is therefore very moderate.
    • They performed cryo-EM validation, a huge positive of the paper: three antibodies were experimentally solved; two predictions matched the data, one did not (likely due to antigen-binding-induced rigidification).
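The flexibility label described above is straightforward to compute: a loop is "flexible" if any two of its solved conformations differ by more than 1.25 Å Cα RMSD. The coordinates below are toy values; a real pipeline would superpose the conformations (e.g. Kabsch) before computing RMSD:

```python
import numpy as np

def ca_rmsd(a: np.ndarray, b: np.ndarray) -> float:
    """RMSD between two (N, 3) Calpha coordinate arrays."""
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

def is_flexible(conformations, cutoff=1.25):
    """Flexible if any conformation pair exceeds the RMSD cutoff."""
    n = len(conformations)
    return any(
        ca_rmsd(conformations[i], conformations[j]) > cutoff
        for i in range(n) for j in range(i + 1, n)
    )

loop = np.zeros((8, 3))                      # 8 Calpha atoms at the origin
shifted = loop + np.array([2.0, 0.0, 0.0])   # rigid 2 A shift
flex = is_flexible([loop, loop.copy(), shifted])
```

Note the label depends on which conformations happen to be deposited in the PDB, one reason a cutoff-based binary target is noisy to begin with.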
    • JAM-2 is a novel method for de novo design of biologics that are then experimentally validated and show strong developability (expression, hydrophobicity, polyspecificity, monomericity). More than 57% of all designs pass all core developability criteria straight from the computer.
    • JAM-2 is a generative model, but the details are not revealed.
    • The most promising candidates (thousands per target in epitope-tiling mode, ~45 per format in target-level mode) are tested for binding by yeast display (epitope mode) or BLI (target mode); the entire discovery timeline is ≈ 1 month, with 2–3 days of fully computational design upfront.
    • Hit rates across 16 completely unseen targets: 39% average for VHH-Fcs and 18% for mAbs, with 100% of targets producing at least one binder. These are all double-digit success rates from only ~45 designs per format.
    • VHHs have higher hit rates but generally weaker affinities.
    • A panel of several hundred antibodies was assessed for hydrophobicity, self-association (polyspecificity), expression titer, monomericity, and thermostability. More than half (57%) met all pass criteria simultaneously, and over 80% passed individual criteria such as expression or hydrophobicity. These molecules were not optimized; this was the first pass from the model.
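The "pass all criteria simultaneously" statistic is a conjunction over assays. A sketch where the assay names and thresholds are illustrative placeholders, not the paper's actual criteria:

```python
# Hypothetical developability criteria (names and cutoffs assumed).
CRITERIA = {
    "expression_mg_l": lambda v: v >= 50,    # titer threshold (assumed)
    "hydrophobicity":  lambda v: v <= 11.5,  # e.g. HIC retention (assumed)
    "tm_celsius":      lambda v: v >= 65,    # thermostability (assumed)
    "polyspecificity": lambda v: v <= 0.3,   # self-association (assumed)
}

def pass_all(molecule: dict) -> bool:
    """True only if every criterion is met simultaneously."""
    return all(check(molecule[name]) for name, check in CRITERIA.items())

panel = [
    {"expression_mg_l": 80, "hydrophobicity": 10.0, "tm_celsius": 70, "polyspecificity": 0.1},
    {"expression_mg_l": 30, "hydrophobicity": 12.5, "tm_celsius": 60, "polyspecificity": 0.5},
]
rate = sum(pass_all(m) for m in panel) / len(panel)
```

The gap between per-criterion pass rates (80%+) and the simultaneous rate (57%) is expected: a conjunction of four ~80–90% events passes less often than any single one.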
  • 2025-11-26

    Drug-like antibody design against challenging targets with atomic precision

    • protein design
    • generative methods
    • developability
    • Update on earlier Chai-2 results adding developability and structural validations.
    • Previously generated scFv hits were reformatted into full-length IgGs; ~93% retained binding.
    • Developability was assessed using NanoDSF (Tm), HIC-HPLC, BVP ELISA, and AC-SINS, with Jain-style green-flag thresholds.
    • Most reformatted IgGs passed ≥3 of 4 developability flags, indicating good biophysical properties without further optimization.
    • Newly designed antibodies were generated for the GPCR benchmarks, showing successful in silico design against challenging targets.
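The "passed ≥3 of 4 developability flags" criterion above amounts to counting green flags per molecule across the four assays. A sketch where the cutoffs are placeholders in the spirit of Jain-style thresholds, not the paper's exact values:

```python
# Four assays with assumed green-flag cutoffs (placeholders).
FLAGS = {
    "tm_nanodsf": lambda v: v >= 65,    # thermal stability, deg C (assumed)
    "hic_hplc":   lambda v: v <= 11.7,  # hydrophobicity, retention min (assumed)
    "bvp_elisa":  lambda v: v <= 4.3,   # polyreactivity, fold-signal (assumed)
    "ac_sins":    lambda v: v <= 11.8,  # self-association, nm shift (assumed)
}

def green_flag_count(molecule: dict) -> int:
    """Number of assays on the 'green' side of their threshold."""
    return sum(check(molecule[name]) for name, check in FLAGS.items())

mol = {"tm_nanodsf": 70, "hic_hplc": 10.0, "bvp_elisa": 5.0, "ac_sins": 8.0}
ok = green_flag_count(mol) >= 3   # passes on 3 of 4 flags
```

Requiring 3 of 4 rather than all 4 tolerates a single marginal assay, which matches how such flag panels are typically read.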