Computational Antibody Papers

Title | Key points

2025-09-30
Efficient generation of epitope-targeted de novo antibodies with Germinal
- generative methods
- nanobodies
- protein design
- Novel open nanobody design method with experimental validation.
- On the surface it might appear like a lot of methods stitched together. The magic sauce appears to be in the joint, gradient-based co-optimization: AF-Multimer and IgLM gradients are merged through a 3-phase schedule (logits → softmax → semi-greedy), with CDR-masking/framework bias and custom losses that force CDR-mediated, loop-like interfaces; then AbMPNN edits only non-contact CDR residues, and designs are filtered independently with AF3 + PyRosetta.
- This is not a newly ‘trained’ model but a design-and-filtering pipeline: it reuses existing methods, gradients, and weights with no additional training, and was only validated experimentally.
- The experimental benchmark was run on four targets: PD-L1, IL-3, IL-20, and BHRF1.
- The authors checked that their designs weren’t just ‘regurgitations’ of known antibodies: CDR identities were computed against SAbDab and OAS (via MMseqs), and many designs show <50% CDR identity to any public sequence.
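The novelty check in the last point can be sketched as follows. This is a minimal illustration, not the authors' code: the paper uses MMseqs for the search, whereas this sketch substitutes a longest-common-subsequence identity as a stand-in. The function names and the normalisation by the longer sequence are assumptions.

```python
from typing import Iterable


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence (a crude gapped-match count)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def identity(a: str, b: str) -> float:
    """Assumed identity definition: matched residues / longer sequence length."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def max_identity(query: str, references: Iterable[str]) -> float:
    """Highest identity of a designed CDR to any reference CDR."""
    return max((identity(query, ref) for ref in references), default=0.0)


def is_novel(query: str, references: Iterable[str], threshold: float = 0.5) -> bool:
    """True if the design is below the identity threshold to every reference."""
    return max_identity(query, references) < threshold
```

In practice the reference set would be the CDRs extracted from SAbDab and OAS, and a dedicated search tool is needed at that scale; the quadratic alignment here is only suitable for small examples.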
2025-09-30
mBER: Controllable de novo antibody design with million-scale experimental screening
- binding prediction
- generative methods
- protein design
- experimental techniques
- Novel de novo antibody design method with massive experimental testing.
- The computational method involves integration, not retraining, of existing tools. It combines AlphaFold-Multimer, protein language models (ESM2/AbLang2), and NanoBodyBuilder2 with templating/sequence priors to design/filter antibody-format binders.
- They perform massive testing. >1.1 million VHH binders designed across 436 targets (145 tested); ~330k experimentally screened.
- Hit rates look low per binder (~0.5–1%) but that’s ~50× above random libraries, and still yields thousands of validated binders.
- Target-level success is 45%, i.e. the share of tested targets for which binders were obtained; some epitopes reached 30–38% hit rates after filtering.
- The big caveat is the choice of epitope: it really makes a difference, with some epitopes producing no binders at all.
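As a back-of-the-envelope check, the quoted figures can be related to each other. The sketch below assumes, purely for illustration, that the per-binder hit rate applies uniformly to all ~330k screened designs; the inputs are the approximate numbers from the points above, and the function names are hypothetical.

```python
# Rough consistency check of the quoted mBER numbers (all inputs approximate).
SCREENED = 330_000              # ~330k designs experimentally screened
HIT_RATE_RANGE = (0.005, 0.01)  # ~0.5-1% per-binder hit rate
FOLD_OVER_RANDOM = 50           # ~50x above random libraries


def expected_hits(n_screened: int, hit_rate: float) -> float:
    """Expected number of binders if the hit rate held uniformly (assumption)."""
    return n_screened * hit_rate


def implied_random_rate(hit_rate: float, fold: float) -> float:
    """Hit rate a random library would need for the quoted fold improvement."""
    return hit_rate / fold


if __name__ == "__main__":
    for rate in HIT_RATE_RANGE:
        hits = expected_hits(SCREENED, rate)
        baseline = implied_random_rate(rate, FOLD_OVER_RANDOM)
        print(f"hit rate {rate:.3%}: ~{hits:,.0f} binders, "
              f"implied random-library rate {baseline:.4%}")
```

Under these assumptions the expected number of binders lands in the low thousands, which is in line with the "thousands of validated binders" stated above.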
2025-09-12
ImmunoMatch learns and predicts cognate pairing of heavy and light immunoglobulin chains
- language models
- Novel model to predict heavy/light chain compatibility.
- Data: H/L with the same single-cell barcode; negatives = swap L chains between pairs but only if CDRL3 length matches; balanced set of 233,880 pairs with a 90/10 train–test split.
- Training: Full VH+VL into AntiBERTa2 with a classification head; fine-tuned 3 epochs, lr 2×10⁻⁵, weight decay 0.01; κ/λ-specific variants trained identically. Final AUC-ROC 0.75 (withheld) and 0.66 (external); κ/λ models: 0.885/0.831.
- Baselines: (i) V/J gene-usage → logistic reg. & XGBoost ≈ 0.50–0.52 acc.; (ii) CDRH3+CDRL3 CNNs → moderate; (iii) ESM-2 improves with fine-tuning but AntiBERTa2 FT is best.
- It seems to do better than just ‘matching to the database’. Weak gene-usage baselines, explicit control of CDRL3 length in negatives, external generalisation, and sensitivity to interface residues (CDRH1/2 & framework) in therapeutic-antibody tests argue the model learns sequence-level pairing rules, not just V/L distributions.
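The negative-sampling scheme described under "Data" can be sketched roughly as below. This is an assumed reading, not the authors' implementation: the `Pair` type, the `make_negatives` function, and the choice of one negative per positive (to keep the set balanced) are all hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Pair(NamedTuple):
    heavy: str  # VH sequence
    light: str  # VL sequence
    cdrl3: str  # CDRL3 of the light chain


def make_negatives(pairs: List[Pair], seed: int = 0) -> List[Pair]:
    """Build one non-cognate pair per positive by swapping in a light chain
    from another pair whose CDRL3 has the same length."""
    rng = random.Random(seed)

    # Index positives by CDRL3 length so swaps stay length-matched.
    by_length: Dict[int, List[int]] = defaultdict(list)
    for idx, pair in enumerate(pairs):
        by_length[len(pair.cdrl3)].append(idx)

    negatives = []
    for idx, pair in enumerate(pairs):
        candidates = [
            j for j in by_length[len(pair.cdrl3)]
            if j != idx and pairs[j].light != pair.light
        ]
        if not candidates:
            continue  # no length-matched donor available; skip this heavy chain
        donor = pairs[rng.choice(candidates)]
        negatives.append(Pair(pair.heavy, donor.light, donor.cdrl3))
    return negatives
```

Matching on CDRL3 length removes a trivial shortcut: without it, a classifier could separate positives from negatives using loop length alone.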
2025-09-12
The Therapeutic Nanobody Profiler: characterising and predicting nanobody developability to improve therapeutic design
- developability
- nanobodies
- Introduces TNP, a nanobody-specific developability profiler inspired by TAP.
- Uses six metrics: total CDR length, CDR3 length, CDR3 compactness, and patch scores for hydrophobicity, positive charge, and negative charge.
- Thresholds are calibrated to 36 clinical-stage nanobodies.
- In vitro assays on 108 nanobodies (36 clinical-stage + 72 proprietary) show partial agreement with TNP flags, indicating complementary—but not perfectly correlated—assessments.
2025-09-12
RESP2: An Uncertainty Aware Multi-Target Multi-Property Optimization AI Pipeline for Antibody Discovery
- binding prediction
- developability
- Combined in vitro/in silico method for optimization of binders.
- Start from a wild-type scFv (heavy chain), build a random-mutant library, FACS-sort on multiple antigens, deep-sequence bins + input, and use per-sequence enrichment (bin/library) as the supervised target for (antibody, antigen) training pairs.
- Train uncertainty-aware regressors (xGPR or ByteNet-SNGP) on those enrichment targets; run in-silico directed evolution (ISDE) from the WT, proposing single mutations and auto-rejecting moves with high predictive uncertainty while optimizing the worst-case score across antigens.
- Binding is protected by the multi-antigen objective + uncertainty gating during ISDE; risky proposals are discarded before they enter the candidate set.
- Filter candidates for humanness with SAM/AntPack and for solubility with CamSol v2.2 (framework is extensible to add other gates); final wet-lab set kept 29 designs after applying these filters and uncertainty checks.
- Beyond large in-silico tests, yeast-display across 10 SARS-CoV-2 RBDs shows most designs outperform WT; a representative clone (Delta-63) improves KD on 8/10 variants and competes with ACE2.
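The ISDE loop with uncertainty gating and a worst-case objective can be illustrated with the sketch below. It is a simplified stand-in, not the RESP2 code: the `predict` callable (returning a mean and a standard deviation per sequence and antigen), the `max_std` threshold, and the greedy acceptance rule are all assumptions made for the example.

```python
import random
from typing import Callable, Optional, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical model interface: (sequence, antigen) -> (predicted score, std).
Predictor = Callable[[str, str], Tuple[float, float]]


def worst_case(seq: str, antigens: Sequence[str], predict: Predictor,
               max_std: float) -> Optional[float]:
    """Worst predicted score across antigens, or None if any prediction
    is too uncertain (the proposal is then rejected outright)."""
    preds = [predict(seq, antigen) for antigen in antigens]
    if any(std > max_std for _, std in preds):
        return None
    return min(mean for mean, _ in preds)


def isde(wild_type: str, antigens: Sequence[str], predict: Predictor,
         max_std: float = 0.5, steps: int = 200, seed: int = 0) -> Tuple[str, float]:
    """Greedy single-mutation search that maximises the worst-case score."""
    rng = random.Random(seed)
    current = wild_type
    current_score = worst_case(current, antigens, predict, max_std)
    if current_score is None:
        raise ValueError("wild type itself is outside the confident region")

    for _ in range(steps):
        pos = rng.randrange(len(current))
        residue = rng.choice(AMINO_ACIDS)
        if residue == current[pos]:
            continue  # not a mutation
        candidate = current[:pos] + residue + current[pos + 1:]
        score = worst_case(candidate, antigens, predict, max_std)
        if score is None:
            continue  # uncertainty gate: discard risky proposals
        if score > current_score:
            current, current_score = candidate, score
    return current, current_score
```

Optimising the minimum across antigens means a mutation is only kept if it does not sacrifice binding to the weakest target, which is how the multi-antigen objective protects breadth.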
2025-09-12
Tokenizing Loops of Antibodies
- structure prediction
- generative methods
- Novel model for loop retrieval using embedded structural representation.
- It is a multimodal tokenizer at the antibody loop (CDR) level that fuses sequence with backbone dihedral-angle features and learns a latent space with a dihedral-distance contrastive loss—unlike residue-tokenizers and canonical clusters. It produces both continuous and quantized loop tokens that can plug into PLMs (IGLOOLM / IGLOOALM).
- Trained in a self-supervised manner on ~807k loops from experimental (SAbDab/STCRDab) and Ibex-predicted structures, with four objectives: masked dihedral reconstruction, masked AA prediction, contrastive learning over dihedral distance (with DTW alignment), and codebook learning; the paper also describes a two-phase training schedule and the H100 settings used.
- It was benchmarked on four computational tasks. (1) Loop retrieval: for H3 loops, IGLOO beats the best prior tokenizer by +5.9% (dihedral-distance criterion). (2) Cluster recovery: high purity vs. canonical clusters across CDRs. (3) Downstream PLM task: IGLOOLM improves binding-affinity prediction on 8/10 AbBiBench targets, rivaling larger models. (4) Controllable sampling: IGLOOALM generates diverse sequences with more structure consistency than inverse-folding baselines.
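To make the notion of a dihedral distance concrete, here is a minimal sketch for two loops of equal length. It uses the per-angle term 2(1 − cos Δθ), a common choice in the loop-clustering literature; whether IGLOO uses exactly this form is an assumption, and the DTW alignment that handles loops of different lengths is deliberately left out.

```python
import math
from typing import Sequence, Tuple

# One residue = (phi, psi, omega) in degrees; this layout is an assumption.
Residue = Tuple[float, float, float]


def dihedral_distance(loop_a: Sequence[Residue], loop_b: Sequence[Residue]) -> float:
    """Mean of 2 * (1 - cos(delta)) over all backbone dihedrals.

    The cosine makes the measure periodic, so -181 and 179 degrees
    are treated as the same angle. Ranges from 0 (identical) to 4.
    """
    if len(loop_a) != len(loop_b):
        raise ValueError("loops must have equal length (no alignment in this sketch)")
    if not loop_a:
        raise ValueError("loops must not be empty")

    total = 0.0
    count = 0
    for res_a, res_b in zip(loop_a, loop_b):
        for angle_a, angle_b in zip(res_a, res_b):
            total += 2.0 * (1.0 - math.cos(math.radians(angle_a - angle_b)))
            count += 1
    return total / count
```

A contrastive loss would then pull together loops with a small distance and push apart loops with a large one in the learned latent space.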
2025-09-05
Antibody immunogenicity prediction and optimization with ImmunoSeq
- developability
- Novel method to assess antibody immunogenicity.
- Created two reference libraries: a positive set from human proteins and antibodies (OAS + proteome) and a negative set from murine antibody sequences (OAS).
- Antibody sequences are fragmented into 8–12-mer peptides.
- Peptide fragments are scored: +1.0 if matching the positive reference, −0.2 if matching the negative reference.
- Validated on 217 therapeutic antibodies with known clinical ADA incidence, showing strong negative correlation between hit rate and ADA.
- On 25 humanized antibody pairs, ImmunoSeq correctly predicted reduced immunogenicity after humanization, consistent with experimental results.
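The fragment-and-score idea can be sketched as follows. The 8–12-mer window and the +1.0/−0.2 weights come from the points above; everything else is an assumption made for illustration, in particular the function names, the normalisation by the number of fragments, and the handling of a peptide that matches both libraries (both weights are applied).

```python
from typing import Iterator, Set


def fragments(seq: str, min_len: int = 8, max_len: int = 12) -> Iterator[str]:
    """Yield every overlapping peptide of length min_len..max_len."""
    for k in range(min_len, max_len + 1):
        for start in range(len(seq) - k + 1):
            yield seq[start:start + k]


def humanness_score(seq: str, positive: Set[str], negative: Set[str],
                    pos_weight: float = 1.0, neg_weight: float = -0.2) -> float:
    """Average per-fragment score: reward human-like peptides, penalise
    murine-like ones. Averaging is an assumed aggregation."""
    total = 0.0
    n_fragments = 0
    for peptide in fragments(seq):
        n_fragments += 1
        if peptide in positive:
            total += pos_weight
        if peptide in negative:
            total += neg_weight
    return total / n_fragments if n_fragments else 0.0
```

Here `positive` and `negative` would be sets of peptides pre-computed from the human and murine reference libraries; set lookups keep scoring fast even for large references.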
2025-09-05
Assessing the Performance of AF2 and AF3-Implementations on Antibody-Antigen Complexes
- structure prediction
- Benchmarking of docking/complex prediction methods for antibody-antigen (Ab-Ag) complexes.
- Authors used 200 antibody-antigen and nanobody-antigen complexes curated from prior studies, specifically chosen to exclude any complexes present in the training data of the evaluated models.
- Evaluated methods: AF2 (v2.3.2), Protenix, ESMFold, Chai-1, Boltz-1, Boltz-1x, and Boltz-2. (Note: Boltz-2 was only tested on 18 complexes; Protenix failed on 26 large complexes.)
- DockQ and CAPRI criteria were used as primary metrics to assess structural prediction quality.
- AF2 performed best overall, especially for antibody-antigen complexes. Chai-1 outperformed AF2 on nanobody-antigen complexes.
- A composite confidence metric, AntiConf, was introduced, combining pTM and pDockQ2 scores to better assess the quality of Ab-Ag models. AntiConf = 0.3 × pDockQ2 + 0.7 × pTM
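The AntiConf formula is simple enough to apply directly when ranking predicted models. The sketch below implements the stated weighting; the dictionary layout of a model record and the `rank_models` helper are hypothetical, and both input scores are assumed to lie in [0, 1].

```python
from typing import Dict, List


def anticonf(pdockq2: float, ptm: float) -> float:
    """AntiConf = 0.3 * pDockQ2 + 0.7 * pTM, as given above."""
    return 0.3 * pdockq2 + 0.7 * ptm


def rank_models(models: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Sort predicted models from most to least confident by AntiConf.

    Each model is assumed to be a dict with 'pdockq2' and 'ptm' keys.
    """
    return sorted(
        models,
        key=lambda m: anticonf(m["pdockq2"], m["ptm"]),
        reverse=True,
    )
```

Because pTM carries the larger weight, a model with a high global fold confidence can outrank one with a better interface score, which is worth keeping in mind when picking a single model per complex.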
2025-09-05
MD-LLM-1: A Large Language Model for Molecular Dynamics
- non-antibody stuff
- language models
- Demonstration showing how large language models (LLMs) can be adapted to reduce the computational cost of molecular dynamics (MD).
- They use the FoldToken encoding to discretize protein 3D conformations into tokens compatible with Mistral, and fine-tune the LLM on short MD trajectories of a single state. The model is then able to generate new sequences of conformations by predicting the next frame from previous frames.
- After fine-tuning, the model can extend trajectories beyond the training data. Starting from a native state, it can discover alternative conformations, potentially bypassing kinetic barriers that normally require long MD runs.
- The approach is system-specific (requires an MD trajectory for each protein), does not yet encode thermodynamics/kinetics explicitly, and relies on the choice of structural tokenization.
2025-09-05
AbSet: A Standardized Data Set of Antibody Structures for Machine Learning Applications
- databases
- Introduces AbSet, a curated dataset of >800,000 antibody structures, combining experimental PDB entries with in silico–generated antibody–antigen complexes.
- Adds value beyond SAbDab by standardizing structures, including decoy poses, and providing residue-level molecular descriptors for machine learning.
- Presents dataset profiling and validation, with analyses of structural resolution, antigen diversity, docking quality classification, and descriptor calculation efficiency.