Computational Antibody Papers

Title | Key points
    • Evedesign, an open-source, method-agnostic framework that standardizes biosequence design by enabling different machine learning models (sequence, structure, and evolutionary) to work together in a single workflow.
    • It works by framing design as a conditional modeling problem using three composable operations: Generate (creating new sequences), Score (predicting fitness or likelihood), and Transform (mapping between representations like sequence-to-structure).
    • The authors did not perform new wet-lab experiments; instead, they tested the framework by computationally reproducing previous studies, showing ESM-2 and ProteinMPNN could successfully rank and prioritize known beneficial mutations from existing antibody datasets.
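The three composable operations described above can be illustrated with a toy pipeline. This is a minimal sketch, not Evedesign's actual API: the function names, the point-mutation generator, the composition transform, and the hydrophobicity score are all hypothetical stand-ins chosen so the example runs without any model weights.

```python
import random
from typing import Callable, Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SEED = "QVQLVESGGGLVQ"  # arbitrary example sequence, not taken from the paper


def generate_point_mutants(seed: str, n: int, rng: random.Random) -> List[str]:
    """Generate: propose n single-residue variants of the seed sequence."""
    variants = []
    for _ in range(n):
        pos = rng.randrange(len(seed))
        new_aa = rng.choice([a for a in AMINO_ACIDS if a != seed[pos]])
        variants.append(seed[:pos] + new_aa + seed[pos + 1:])
    return variants


def transform_to_composition(seq: str) -> Dict[str, float]:
    """Transform: map a sequence to another representation (residue frequencies)."""
    return {a: seq.count(a) / len(seq) for a in AMINO_ACIDS}


def score_hydrophobic_fraction(composition: Dict[str, float]) -> float:
    """Score: toy proxy where a lower hydrophobic fraction scores higher."""
    return -sum(composition[a] for a in "AILMFVW")


def design(seed: str, n: int, generate: Callable, transform: Callable,
           score: Callable, rng: random.Random) -> List[Tuple[str, float]]:
    """Compose Generate -> Transform -> Score and return candidates, best first."""
    candidates = generate(seed, n, rng)
    scored = [(c, score(transform(c))) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = design(SEED, 20, generate_point_mutants, transform_to_composition,
                score_hydrophobic_fraction, random.Random(0))
```

Because each step is passed in as a plain callable, any one of them can be replaced without touching the others, which is the method-agnostic property the summary describes; a real scorer would wrap a model such as ESM-2 or ProteinMPNN.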
    • First autonomous nanobody design agent.
    • Prompted by high-level goals: it translates natural language objectives, like "inhibit X interaction with Y", into complete design campaigns.
    • The agent queries literature/databases, uses bioinformatics tools, and prompts the user for specific strategic clarifications.
    • 56x speedup relative to experts: it compresses weeks of expert research and computational tasks into hours by automating reasoning-intensive steps.
    • In lab tests, it successfully generated functional binders for 6 out of 9 attempted targets.
    • Authors advocate for a "prompt-to-drug" autonomous pipeline, using a central AI orchestrator to connect disparate pre-clinical and clinical steps agentically.
    • While modular proofs-of-concept exist, they remain domain-specific, brittle, and far from full-cycle implementation in actual drug discovery programs.
    • A primary recommendation is to eliminate "data silos" by making research open, peer-reviewed, and accessible via APIs to ensure outputs are easily "machine-readable" for AI training.
    • The system faces significant hurdles from LLM hallucinations and "cascading errors," where a single early-stage miscalculation (like an incorrect binding pocket) propagates through the entire chain.
    • Despite the push for autonomy, authors argue "human-in-the-loop" checkpoints remain legally and ethically mandatory for high-stakes regulatory and clinical transitions.
    • Analysis of developability data from 33 internal Biogen programs, covering 18,540 antibodies.
    • Focused on three dimensions: hydrophobicity (HIC), polyspecificity (PSR), and self-association (AC-SINS).
    • Labeled subsets included 4,594 (PSR), 1,792 (HIC), and 7,727 (AC-SINS) sequences.
    • Benchmarked three PLMs: ESM2 (general-purpose), plus IgBert and IgT5 (antibody-specific).
    • Domain-adaptive fine-tuning consistently boosted antibody-specific PLMs, but often degraded ESM2 performance.
    • Antibody-specific PLMs generally provided better embeddings for PSR and AC-SINS, while ESM2 remained highly competitive for HIC.
    • Perplexity was only weakly correlated with developability outcomes in aggregate, but showed a significant association with PSR/AC-SINS failure when controlling for a fixed light chain.
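For reference, perplexity is the exponential of the negative mean per-token log-probability. The sketch below assumes the per-residue log-probabilities have already been obtained from a language model; how the study extracted them from ESM2, IgBert, or IgT5 is not stated in this summary, and the function name is illustrative.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity = exp(-mean(log p_i)) over the residues of one sequence."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Sanity check: a model that is uniform over the 20 amino acids
# assigns log(1/20) to every residue, giving a perplexity of 20.
uniform_ppl = perplexity([math.log(1 / 20)] * 10)
```

Lower values mean the model finds the sequence more "expected"; the summary's point is that this signal only becomes informative for PSR/AC-SINS once the light chain is held fixed.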
    • NanoMAP, a novel experimental and computational pipeline designed to characterize nanobody immune repertoires following immunization and phage display selection.
    • It introduces a flexible clustering method that identifies clonal families by grouping sequences with similar V/J segments and CDR lengths, then applying a unique merging step that allows for minor CDR variations.
    • When benchmarked against MMseqs2 and Immcantation (SCOPer), NanoMAP scored higher on computational metrics (Silhouette, phenotypic quality, and stability) and showed better alignment with expert-curated "ground truth" labels.
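The two-stage clustering idea described above can be sketched as follows. This is not NanoMAP's implementation: the record fields, the use of CDR3 only, the Hamming-distance criterion, and the `max_mismatches` threshold are all assumptions made for illustration, since the summary does not specify how the merging step tolerates minor CDR variation.

```python
from collections import defaultdict
from typing import List, NamedTuple


class Nanobody(NamedTuple):
    seq_id: str
    v_gene: str
    j_gene: str
    cdr3: str


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster(records: List[Nanobody], max_mismatches: int = 1) -> List[List[str]]:
    # Stage 1: bucket by V gene, J gene, and CDR3 length.
    buckets = defaultdict(list)
    for r in records:
        buckets[(r.v_gene, r.j_gene, len(r.cdr3))].append(r)

    clusters = []
    for members in buckets.values():
        # Stage 2: single-linkage merge of CDR3s within max_mismatches (union-find).
        parent = list(range(len(members)))

        def find(i: int) -> int:
            while parent[i] != i:
                parent[i] = parent[parent[i]]  # path compression
                i = parent[i]
            return i

        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                if hamming(members[i].cdr3, members[j].cdr3) <= max_mismatches:
                    parent[find(i)] = find(j)

        groups = defaultdict(list)
        for i, m in enumerate(members):
            groups[find(i)].append(m.seq_id)
        clusters.extend(sorted(g) for g in groups.values())
    return sorted(clusters)


records = [
    Nanobody("a", "V1", "J1", "ARDYW"),
    Nanobody("b", "V1", "J1", "ARDFW"),   # one mismatch from "a"
    Nanobody("c", "V1", "J1", "GGGGG"),
    Nanobody("d", "V2", "J1", "ARDYW"),   # different V gene
    Nanobody("e", "V1", "J1", "ARDYWW"),  # different CDR3 length
]
families = cluster(records)
```

Single linkage is used here only because it is the simplest way to tolerate small variation; it can chain distant sequences together, which is one reason a real pipeline might choose a different merge rule.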
    • Novel generative framework to design protein binders from NVIDIA.
    • Antibodies/nanobodies are not singled out for analysis.
    • First framework to unify generative modeling with hallucination-based optimization, allowing for a strong generative prior to be steered by inference-time compute.
    • The authors introduced Teddymer, a dataset of ~510,000 synthetic dimers created from AlphaFold predicted domain-domain interactions to overcome the scarcity of experimental multimer data.
    • The model uses advanced search algorithms, including Beam Search, Feynman-Kac Steering, and MCTS, to navigate the generative space and find high-quality binders.
    • It achieved state-of-the-art results on protein targets, small molecules, and enzyme design tasks, consistently outperforming baselines like RFDiffusion and BindCraft.
    • No wet-lab testing, hopefully just not yet.
    • AnewOmni, a foundation model that unifies the design of small molecules, peptides, and antibodies into a single framework.
    • The team evaluated approximately 3,000 candidates for the "undruggable" KRAS G12D target by alternating between AnewOmni for CDR design and AlphaFold3 for structural validation.
    • Among 7 synthesized nanobodies, the success rate was 75% (3 out of 4) for those passing a conservative structural consistency filter.
    • The most successful nanobody design demonstrated a high binding affinity with a Kd of 587 nM.
    • CALM, a "sequence-native" foundation model that maps antibody and antigen primary sequences without requiring structural inference.
    • CALM employs modality-specific encoders (AntiBERTy for antibodies, ESM-2 for antigens) to align cognate pairs in a shared embedding space using cosine similarity.
    • Authors evaluate performance by the model's ability to pick the correct partner from a candidate pool in both directions (antibody-to-antigen and antigen-to-antibody).
    • CALM uses optional structural masks to restrict inputs to paratope and epitope residues, which significantly reduces sequence noise and improves accuracy (but clearly needs a structure).
    • CALM achieves a Top-1 accuracy of 2% in strict out-of-distribution tests, representing a 3x to 46x improvement over random baselines despite a low-data regime.
    • They lay out an autoregressive decoder for de novo design, though this generative component was not trained or tested in this study.
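The bidirectional retrieval evaluation can be illustrated with a small sketch. It assumes embeddings have already been produced by the two encoders and that `queries[i]` is the cognate partner of `candidates[i]`; the toy 2-D vectors and function names are made up for illustration and are not CALM outputs.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top1_accuracy(queries: List[Sequence[float]],
                  candidates: List[Sequence[float]]) -> float:
    """Fraction of queries whose most similar candidate is the cognate one."""
    hits = 0
    for i, q in enumerate(queries):
        sims = [cosine(q, c) for c in candidates]
        best = max(range(len(sims)), key=sims.__getitem__)
        hits += int(best == i)
    return hits / len(queries)


def random_top1_baseline(pool_size: int) -> float:
    """Expected Top-1 accuracy when picking uniformly from the pool."""
    return 1.0 / pool_size


# Toy embeddings (hypothetical): antibody i is paired with antigen i.
antibody_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
antigen_emb = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.7]]

ab_to_ag = top1_accuracy(antibody_emb, antigen_emb)
ag_to_ab = top1_accuracy(antigen_emb, antibody_emb)
```

The random baseline shrinks as the candidate pool grows, so a low absolute Top-1 figure can still be a large multiple of chance when the pool is big.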
    • The authors evaluated AlphaFold3, Boltz-2, and Chai-1 on their ability to distinguish cognate (correct) nanobody-antigen pairs from incorrect, non-binding pairings.
    • They used 106 experimental complexes and generated a combinatorial matrix of 11,132 shuffled non-cognate pairings to serve as ground-truth "incorrect" decoys.
    • Internal confidence scores (specifically ipTM) were very weakly predictive of true binding. In terms of Average Precision (PR-AUC), AF3 performed best, followed by Chai-1 and then Boltz-2.
    • Increased sampling improves structural geometry but does not help models "select" the correct binder. Most quality gains occur within 10–25 samples; deeper sampling primarily increases the number of plausible-looking false positives.
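To make the Average Precision metric concrete, the sketch below ranks pairs by a confidence score (for example ipTM) and averages the precision at each true cognate pair. The labels and scores are invented for illustration; only the counts 106 and 11,132 come from the summary above. Ties between scores are not handled specially in this simplified version.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of precision@k taken at the rank k of each positive example."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical example: 1 = cognate pair, 0 = shuffled decoy.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
ap = average_precision(labels, scores)  # (1/1 + 2/3) / 2

# With 106 cognate pairs among 106 + 11,132 total pairings, the positive
# prevalence, which is roughly what an uninformative ranker's AP approaches, is:
prevalence = 106 / (106 + 11_132)
```

With positives making up under 1% of pairings, PR-AUC is a stricter measure than ROC-AUC here, since even a small false-positive rate swamps the true cognate pairs.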
    • DASM, a novel training scheme for antibody language models that models phylogenetic relationships rather than relying on pure mutational masked language modeling (MLM).
    • Unlike AbLang2’s standard masked language modeling, DASM uses a mutation-selection framework that factors out nucleotide-level biases (like the codon table and SHM rates) to isolate purely functional selection effects.
    • The model was trained on approximately 2 million parent-child sequence pairs derived from reconstructed B cell phylogenies, using datasets such as JaffePaired, Tang, and Vanwinkle.
    • The model is a compact 4-million-parameter Transformer encoder featuring 5 layers, 8 attention heads, and a custom "wiggle" activation function to stabilize output selection factors.
    • DASM was validated on the FLAb collection (Koenig and Shanehsazzadeh datasets) and MAGMA-seq high-throughput binding assays for influenza and SARS-CoV-2 antibodies. It outperformed AbLang2, ProGen2, and ESM-2.
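The mutation-selection idea can be sketched as a product of two terms: a neutral probability that somatic hypermutation alone produces a given amino acid change, and a selection factor that rescales it. The code below is a simplified illustration under that assumption, not DASM's actual parameterization; the function name, the example residues, and all numbers are hypothetical.

```python
from typing import Dict


def substitution_probs(parent_aa: str,
                       neutral: Dict[str, float],
                       selection: Dict[str, float]) -> Dict[str, float]:
    """Combine neutral mutation probabilities with selection factors.

    neutral[aa]   : probability that mutation alone changes the parent to aa
                    (assumed to already reflect the codon table and SHM rates)
    selection[aa] : multiplicative factor for functional selection
                    (1.0 = neutral, <1 = disfavoured, >1 = favoured)
    The parent residue absorbs the remaining probability mass.
    """
    probs = {aa: p * selection.get(aa, 1.0)
             for aa, p in neutral.items() if aa != parent_aa}
    total = sum(probs.values())
    if total > 1.0:
        raise ValueError("selection-weighted probabilities exceed 1")
    probs[parent_aa] = 1.0 - total
    return probs


# Hypothetical site with parent serine: mutation alone favours S->N,
# but selection disfavours N, favours T, and forbids R.
neutral = {"N": 0.02, "T": 0.01, "R": 0.005}
selection = {"N": 0.5, "T": 2.0, "R": 0.0}
site_probs = substitution_probs("S", neutral, selection)
```

Separating the two terms is what lets a model attribute a frequently observed substitution to mutational bias rather than to functional benefit, which a plain masked language model cannot distinguish.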