Computational Antibody Papers

    • Prompt-based, in-context prediction of antibody developability properties using large language models, rather than training separate predictors per property.
    • As a baseline, they evaluate TxGemma, a therapeutics-specific multimodal LLM that supports task switching via prompts and is fine-tuned using LoRA.
    • The study relies on a very large antibody dataset (~876k heavy chains) with in-silico–computed biophysical developability properties, combining sequence-based and structure-based predictors.
    • Models are trained and evaluated using prompts that include antibody sequences together with partially observed property/value pairs, asking the model to infer a missing property for a query sequence.
    • To prevent shortcut learning, where the model ignores the context and relies only on the sequence, the authors introduce AB-context-aware training, which applies a random latent transformation jointly to context properties and targets during training, forcing explicit use of contextual information (see the sketch after this list).
    • By simulating batch effects, they show that standard fine-tuned TxGemma degrades sharply as batch bias increases (from ~0.99 Spearman ρ with no bias to ~0.95 with moderate bias and ~0.58 with strong bias), whereas context-aware training remains robust even under strong batch effects.
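A minimal sketch of what such an in-context prompt and the AB-context-aware augmentation might look like; the text template and the affine transform family are illustrative assumptions, not the paper's exact implementation:

```python
import random

def build_prompt(query_seq, context, target_property):
    """Assemble an in-context prompt from (sequence, property, value) triples,
    plus a query whose property value the model must infer."""
    lines = [f"Sequence: {seq} | {prop} = {val:.3f}" for seq, prop, val in context]
    lines.append(f"Sequence: {query_seq} | {target_property} = ?")
    return "\n".join(lines)

def context_aware_augment(context, target_value):
    """Apply one random transformation jointly to the context values and the
    target label, so the label is only recoverable from the (transformed)
    context -- the model cannot shortcut by ignoring the in-context examples."""
    a, b = random.uniform(0.5, 2.0), random.uniform(-1.0, 1.0)  # random affine map
    new_context = [(s, p, a * v + b) for s, p, v in context]
    return new_context, a * target_value + b
```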
    • De novo platform for epitope-specific antibody design against “zero-prior” targets, i.e. antigen sites with no known antibody–antigen or protein–protein complex structures and limited homology to previously solved interfaces.
    • The method combines three tightly integrated components: AbsciDiff, an all-atom diffusion model fine-tuned from Boltz-1 to generate epitope-conditioned antibody–antigen complex structures; IgDesign2, a structure-conditioned paired heavy–light CDR sequence design model; and AbsciBind, a modified AF-Unmasked / AlphaFold-Multimer scoring protocol that uses ipTM-derived interface confidence to rank and filter designs (see the sketch after this list).
    • The platform was evaluated on 10 zero-prior protein targets, with fewer than 100 antibody designs per target advanced to experimental testing; specific binders were successfully identified for 4 targets (COL6A3, AZGP1, CHI3L2, IL36RA).
    • Experimental validation demonstrated both structural and functional accuracy, including cryo-EM confirmation at near-atomic resolution (DockQ 0.73–0.83) for two targets and AI-guided affinity maturation yielding a functional IL36RA antagonist with ~100 nM potency.
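The final rank-and-filter step might look roughly like the following; the field name, cutoff, and budget are assumptions for illustration (the paper advanced fewer than 100 designs per target):

```python
def rank_designs(designs, iptm_cutoff=0.8, budget=100):
    """Keep designs whose ipTM-derived interface confidence clears a cutoff,
    then advance at most `budget` of the highest-confidence ones to testing."""
    passing = [d for d in designs if d["interface_iptm"] >= iptm_cutoff]
    passing.sort(key=lambda d: d["interface_iptm"], reverse=True)
    return passing[:budget]
```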
    • Novel framework that identifies high-affinity leads using data from only a single round of FACS, significantly reducing the labor and reagents required for traditional multi-round affinity maturation campaigns.
    • Models were trained on log enrichment ratios (continuous) or binary labels (enriched vs. depleted), calculated by normalizing post-sorting FACS abundance against pre-sorting MACS abundance to account for expression biases (see the sketch after this list).
    • They benchmarked linear/logistic regression and CNNs against a semi-supervised ESM2-MLP approach; notably, the linear models often outperformed deeper architectures in ranking validated substitutions and offered superior interpretability for identifying confounding signals like polyreactivity.
    • By generalizing information across all sequences, ML models effectively separated "affinity-driving" mutations from "passenger" substitutions, identifying sub-nanomolar binders that were not prioritized by traditional, more laborious raw sequencing count analysis.
    • The best-performing models were leveraged within a Gibbs sampling protocol to design novel sequences unseen in the original experiment, ultimately yielding multiple improved binders with up to a ~2500-fold affinity increase over the wild-type.
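A sketch of the two computational steps above, under stated assumptions: a pseudocount-regularized log-enrichment score (exact normalization and log base are guesses) and a minimal Gibbs-style sampler around a trained scoring model (not the paper's exact protocol):

```python
import math
import random
import numpy as np

def log_enrichment(post_counts, pre_counts, pseudocount=1.0):
    """Normalize post-sort FACS abundance by pre-sort MACS abundance (as
    frequencies, not raw counts) to correct for expression bias."""
    post = (post_counts + pseudocount) / (post_counts + pseudocount).sum()
    pre = (pre_counts + pseudocount) / (pre_counts + pseudocount).sum()
    return np.log2(post / pre)  # > 0: enriched, < 0: depleted

def gibbs_design(seq, score_fn, alphabet="ACDEFGHIKLMNPQRSTVWY",
                 n_steps=1000, temperature=0.5):
    """Iteratively resample one position at a time from the conditional
    distribution implied by the trained model's scores."""
    seq = list(seq)
    for _ in range(n_steps):
        i = random.randrange(len(seq))
        scores = [score_fn("".join(seq[:i] + [aa] + seq[i + 1:])) for aa in alphabet]
        weights = [math.exp(s / temperature) for s in scores]
        seq[i] = random.choices(alphabet, weights=weights)[0]
    return "".join(seq)
```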
    • Faster, free-to-use version of NetMHCIIpan for deimmunization.
    • The model uses a small neural network (MLP) trained on one-hot encoded 15-mer peptides. To improve accuracy, it identifies the strongest 9-residue ‘binding core’ within those peptides and aligns them before scoring (see the sketch after this list).
    • The training data is not directly experimental but is distilled from NetMHCIIpan-4.3: the authors took 75,000 peptides, ran them through the original tool, and weighted the results by the North American frequencies of 97 DRB1 alleles to create a single risk score.
    • By predicting the final risk score in one pass, rather than calculating 97 individual allele bindings, it runs 300,000x faster while keeping a 95% correlation with the original tool's results.
    • To prove it works on real drugs, they tested it against MAPPs data (physical peptide presentation) from vatreptacog alfa, a drug that failed clinical trials due to immune reactions. It successfully flagged the same high-risk mutations as the much slower original software.
    • Its main value over NetMHCIIpan is speed and differentiability, so it can sit inside generative AI pipelines, letting designers screen millions of protein variants for "self" vs. "non-self" peptides in minutes rather than weeks.
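A sketch of the scoring and distillation recipe described above; the encoding and core scan are reconstructed from the bullet points, and `core_scorer`, `netmhciipan_scores`, and `allele_freqs` are hypothetical stand-ins:

```python
import numpy as np

AA = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(peptide):
    """One-hot encode a peptide over the 20 canonical amino acids."""
    x = np.zeros((len(peptide), len(AA)))
    for i, aa in enumerate(peptide):
        x[i, AA.index(aa)] = 1.0
    return x.ravel()

def best_core_score(pep15, core_scorer):
    """Scan every 9-mer window of a 15-mer and score the strongest 'binding
    core' -- the alignment step that improves accuracy."""
    return max(core_scorer(one_hot(pep15[i:i + 9])) for i in range(len(pep15) - 8))

def distilled_risk(peptide, netmhciipan_scores, allele_freqs):
    """Training target: NetMHCIIpan-4.3 predictions across 97 DRB1 alleles,
    weighted by their North American frequencies, collapsed to one risk score."""
    return sum(allele_freqs[a] * netmhciipan_scores[peptide][a] for a in allele_freqs)
```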
    • Benchmarking of pretrained protein, antibody, and nanobody language model representations on a comprehensive suite of nanobody-specific tasks.
    • The authors introduce eight tasks spanning variable-region annotation, CDR infilling, antigen binding prediction, paratope prediction, affinity prediction, polyreactivity, thermostability, and nanobody type classification (e.g. VHH, VNAR, conventional antibody chains).
    • They evaluate generic protein LMs, antibody-specific LMs, and nanobody-specific LMs under a unified and standardized benchmark.
    • All backbone models are kept frozen, with task-specific lightweight heads trained on top to isolate representational quality (see the sketch after this list).
    • No single model consistently outperforms others across all tasks, showing that nanobody-specific pretraining alone does not guarantee superior performance over antibody-specific or generic protein language models.
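The probing setup might look like the following PyTorch sketch, where the backbone stays frozen and only a lightweight head is trained; the pooling choice and head size are assumptions:

```python
import torch
import torch.nn as nn

class ProbeHead(nn.Module):
    """Lightweight task-specific head trained on top of frozen embeddings,
    isolating the quality of the backbone's representations."""
    def __init__(self, embed_dim, n_outputs):
        super().__init__()
        self.linear = nn.Linear(embed_dim, n_outputs)

    def forward(self, residue_embeddings):          # (batch, length, embed_dim)
        pooled = residue_embeddings.mean(dim=1)     # mean-pool over residues
        return self.linear(pooled)

def embed_frozen(backbone, tokens):
    """Run the backbone without tracking gradients so it is never updated."""
    backbone.eval()
    with torch.no_grad():
        return backbone(tokens)  # assumed to return per-residue embeddings
```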
    • FLAb2 substantially expands existing antibody benchmarks, introducing the largest public dataset to date with a strong focus on developability rather than binding alone.
    • A broad spectrum of models is evaluated, including generic protein language models, antibody-specific models, structure-aware predictors, and simple physics-based baselines such as charge and pI calculations.
    • Zero-shot predictions from pretrained protein models are generally weak and unreliable for antibody developability. Surprisingly, simple charge-based features often outperform large models for properties such as aggregation, polyreactivity, and pharmacokinetics (a minimal charge baseline is sketched after this list).
    • Intrinsic properties (e.g. thermostability, expression) are substantially easier to predict than extrinsic or context-dependent properties such as polyreactivity, pharmacokinetics, or immunogenicity.
    • Few-shot learning improves performance, but even the best models typically achieve only moderate correlations (ρ ≈ 0.4–0.6) on statistically robust datasets, highlighting the difficulty of the task.
    • Incorporating structural information improves predictions, particularly in the zero-shot setting, and helps reduce biases present in sequence-only models.
    • Many pretrained models primarily capture evolutionary signal, effectively measuring distance from germline rather than true developability. Encouragingly, this germline bias largely disappears once models are fine-tuned in a few-shot setting.
    • Scaling model size alone provides limited benefit. Given sufficient training data, simple one-hot encodings paired with small neural networks can match or outperform billion-parameter protein language models, emphasizing that data quality and quantity matter more than model scale.
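As an example of the kind of physics baseline such benchmarks report, here is a minimal net-charge calculation at a given pH via the Henderson-Hasselbalch equation, using textbook side-chain pKa values; FLAb2's exact feature definitions may differ:

```python
# Textbook side-chain pKa values; acidic groups contribute negative charge,
# basic groups positive charge.
PKA = {"D": 3.65, "E": 4.25, "C": 8.30, "Y": 10.07,   # acidic
       "H": 6.00, "K": 10.53, "R": 12.48}             # basic
ACIDIC = {"D", "E", "C", "Y"}

def net_charge(seq, ph=7.4, nterm_pka=9.0, cterm_pka=2.0):
    """Net charge of a sequence at the given pH (Henderson-Hasselbalch)."""
    charge = 1 / (1 + 10 ** (ph - nterm_pka))    # protonated N-terminus (+)
    charge -= 1 / (1 + 10 ** (cterm_pka - ph))   # deprotonated C-terminus (-)
    for aa in seq:
        if aa in ACIDIC:
            charge -= 1 / (1 + 10 ** (PKA[aa] - ph))
        elif aa in PKA:
            charge += 1 / (1 + 10 ** (ph - PKA[aa]))
    return charge
```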
    • All-atom, zero-shot generative model that designs antibody sequence and structure directly in complex with a target from epitope-conditioned prompts.
    • One specifies the target, epitope, and modality, and the algorithm produces designs.
    • They tested 4–24 designs per target, achieving 50% target-level success, producing VHHs and scFvs with pico- to nanomolar affinities (best ≈ 26 pM).
    • Wet-lab validation shows that the designed antibodies have therapeutic-grade developability (expression, aggregation, hydrophobicity, polyreactivity, stability) without further optimization.
    • Human PBMC assays (10 donors) show no detectable immunogenicity for representative de novo nanobodies.
    • Large-scale benchmarking of structural, energetic, and confidence metrics to distinguish protein binders from non-binders.
    • Curated 3,766 experimentally tested de novo binders across 15 targets from independent campaigns.
    • Of these, 436 were confirmed binders, the remainder non-binders.
    • Each design was re-modelled using AF2 (initial guess + ColabFold), Boltz-1, and AF3.
    • From these predictions they computed 200+ structural and confidence descriptors.
    • AF3-derived confidence scores (especially ipSAE_min) were the best single discriminators, although per-target precision still ranged widely (0.1–1.0), underscoring strong target dependence.
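The per-target evaluation reduces to ranking designs by a confidence metric and checking how many of the top-ranked ones actually bind; a minimal sketch (not the paper's code):

```python
import numpy as np

def precision_at_k(scores, is_binder, k=10):
    """Precision of the top-k designs for one target, ranked by a confidence
    descriptor such as an AF3-derived score."""
    top = np.argsort(scores)[::-1][:k]
    return np.asarray(is_binder, dtype=float)[top].mean()
```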
  • 2025-12-12

    CDR Conformation Aware Antibody Sequence Design with ConformAb

    • generative methods
    • structure prediction
    • ConformAb is a guided discrete-diffusion method for antibody lead optimization that preserves the seed binder’s CDR backbone conformation while introducing sequence diversity.
    • Structural preservation is enforced by steering the diffusion process to match the seed’s canonical CDR class probabilities, ensuring generated sequences retain the same canonical backbone geometry.
    • Canonical classes are assigned by folding SAbDab and pOAS sequences with ABB2 and labeling them using the Kelow et al. dihedral-based canonical clustering scheme; ConformAb learns to predict these classes from sequence.
    • During generation, a KL-based guidance signal constrains mutations so that each CDR remains in the seed’s canonical class, enabling safe exploration of sequence space around the functional binder (see the sketch after this list).
    • Although ConformAb does not model affinity directly, its structure-preserving diversification enables zero-shot affinity maturation: some variants emerge with improved binding despite using no antigen structure, no repertoire data, and no affinity labels.
    • The method was experimentally validated: generated sequences were expressed and tested by SPR on EGFR, IL-6, and a third target, achieving 15–60% binding rates and, for two targets, producing binders with 3–5x higher affinity than the seed.
    • Crystal structures of top EGFR and IL-6 binders confirmed that, despite substantial and non-conservative mutations, the CDR backbone conformations were preserved, validating the model’s structural guidance in wet-lab experiments.
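A rough sketch of the KL-based steering idea, under stated assumptions: a differentiable canonical-class predictor takes a relaxed (softmax) sequence, and the KL divergence from the seed's class distribution is pushed back into the diffusion logits by a gradient step. The gradient route and strength schedule are guesses, not ConformAb's exact procedure:

```python
import torch
import torch.nn.functional as F

def kl_guided_logits(seq_logits, class_predictor, seed_class_probs, strength=1.0):
    """One guidance step: nudge per-position sequence logits so the predicted
    canonical CDR class distribution stays close to the seed's."""
    seq_logits = seq_logits.detach().requires_grad_(True)
    seq_probs = F.softmax(seq_logits, dim=-1)        # relaxed sequence
    pred = class_predictor(seq_probs)                # canonical-class probabilities
    kl = F.kl_div(pred.clamp_min(1e-8).log(), seed_class_probs,
                  reduction="batchmean")
    (grad,) = torch.autograd.grad(kl, seq_logits)
    return (seq_logits - strength * grad).detach()   # steer toward seed class
```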
    • Novel inverse folding algorithm for antibodies with experimental validation.
    • It uses an atom-level graph MPNN, a structured transformer, novel scoring, and AF3 filtering, unlike ProteinMPNN/AbMPNN/AntiFold, which operate at residue level and lack downstream optimization.
    • Only experimental antibody structures (free antibodies + complexes) were used for training.
    • AntiBMPNN uses a distinct dataset: unlike AbMPNN and AntiFold, it does not rely heavily on modeled structures. The moderate gains in residue retrieval might therefore simply be due to a slightly bigger dataset.
    • Unlike most models, they actually performed experimental validation: ELISA assays on huJ3 (single-point variants, CDR1, CDR3) and D6 (CDR2), with multiple variants improving binding over wild type.