Computational Antibody Papers

    • Method to select mutants computationally for lab testing.
    • "Stochastic beam search," a sequence-centric method that evaluates masked language models (MLMs) via pseudo-log-likelihood, rather than using costly mutation-centric approaches.
    • This technique is computationally efficient and produces higher-quality sequences by better balancing likelihood and diversity.
    • The method was extensively validated through both in silico evaluations across various models and direct head-to-head in vitro antibody campaigns.
    • In wet-lab testing, the optimized models effectively screened for synthesizability and binding, with supervised guidance achieving a 100% success rate in the experiments.
  • 2026-04-30
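A minimal sketch of the two ideas the summary names, pseudo-log-likelihood scoring under a masked LM and a stochastic beam step. `toy_mlm` is an illustrative stand-in for a trained antibody MLM (not the paper's model), and Gumbel-perturbed scores are one standard way to make a beam step stochastic so it keeps diverse, high-likelihood mutants:

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def toy_mlm(masked_seq, pos):
    """Stand-in masked LM: probability over residues at `pos`.
    Purely illustrative; a real pipeline queries a trained antibody MLM."""
    return {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

def pseudo_log_likelihood(seq, mlm=toy_mlm):
    """Mask each position in turn; sum the log-probability the model
    assigns to the true residue (avoids costly mutation-centric scoring)."""
    total = 0.0
    for i, aa in enumerate(seq):
        masked = seq[:i] + "?" + seq[i + 1:]
        total += math.log(mlm(masked, i)[aa])
    return total

def stochastic_beam_step(beam, beam_width, mlm=toy_mlm, rng=random):
    """One step over single-point mutants: PLL scores are perturbed with
    Gumbel noise so the kept beam balances likelihood against diversity
    instead of collapsing to near-duplicate sequences."""
    candidates = set()
    for seq in beam:
        for i in range(len(seq)):
            for aa in AMINO_ACIDS:
                if aa != seq[i]:
                    candidates.add(seq[:i] + aa + seq[i + 1:])
    def perturbed(s):
        gumbel = -math.log(-math.log(rng.random()))
        return pseudo_log_likelihood(s, mlm) + gumbel
    return sorted(candidates, key=perturbed, reverse=True)[:beam_width]
```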

    Lightning Boltz

    • protein design
    • Implementation adjustments to Boltz-2 that make it run much faster.
    • One of the biggest hurdles of running methods that use MSAs is the MSA server: servers are computationally expensive and difficult to set up.
    • MSAs carry a lot of predictive value, so skipping this step is unwise.
    • Integrates MMseqs2-GPU directly into the Boltz-2 pipeline, removing the primary CPU bottleneck and enabling high-throughput, local structure prediction.
    • This implementation streamlines the MSA process, making predictions an order of magnitude faster.
  • 2026-04-30

    PromptMOL

    • structure prediction
    • PromptMOL: a PyMOL plugin that enables direct interaction with molecular structures using natural language commands.
    • While PyMOL is a gold standard in structural biology, its interface can be complex; PromptMOL removes the barrier to entry by replacing convoluted script syntax with simple, descriptive prompts.
    • By hooking directly into LLMs (via local models like LM Studio, or cloud providers like OpenAI and Anthropic), the plugin intelligently handles selections, coloring, structural analysis, and rendering tasks on the fly.
    • Generation of a large-scale, heterogeneous antibody developability dataset for AI benchmarking.
    • Built from 50 seed antibodies with up to 99 engineered variants each, resulting in thousands of unique, wet-lab-validated sequences.
    • Assesses six critical developability traits: expression, purity, thermostability, aggregation, polyreactivity, and hydrophobicity.
    • Benchmark results are currently accessible via Amazon Bio Discovery, with further findings slated for a formal publication later this year.
    • Case study and framework for tying together available computational annotators to perform cross-reactivity optimization for a VHH.
    • It replaces inefficient, sequential screening pipelines with a multi-objective Bayesian optimization loop. It uses a Gaussian process surrogate model coupled with a genetic algorithm to navigate complex sequence spaces and identify Pareto-optimal candidates.
    • The framework is model-agnostic; users must provide and validate the in silico "oracles" (predictive models) relevant to their specific optimization goals. Objectives are defined by selecting and potentially weighting these interchangeable scoring functions.
    • The authors rigorously benchmarked BOAT against standard genetic algorithms and generative baselines (like LaMBO-2). Testing relied on computational benchmarks, including comparing results against exhaustive "ground truth" Pareto fronts in limited search spaces.
    • The study did not perform wet-lab validation. Because the framework relies entirely on in silico oracles as proxies, the final experimental success of the optimized candidates is ultimately tied to the predictive quality of the models the user selects.
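A minimal sketch of the Pareto-optimal selection at the core of such a multi-objective loop (the GP surrogate and genetic algorithm are omitted). The oracles here are user-supplied stand-ins, matching the framework's model-agnostic design:

```python
def dominates(a, b):
    """a dominates b when a is at least as good in every objective and
    strictly better in at least one (all objectives maximized here)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates, oracles):
    """candidates: sequences; oracles: interchangeable in silico scoring
    functions. Returns the non-dominated (Pareto-optimal) subset."""
    scored = [(seq, tuple(o(seq) for o in oracles)) for seq in candidates]
    return [seq for seq, s in scored
            if not any(dominates(t, s) for _, t in scored)]
```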
    • Case study: generation of novel HER2 binders using the Herceptin template, with five specific computational properties (HER2 specificity, FvNetCharge, FvCSP, HISum, and MHC II minPR) encoded as constraints.
    • The authors train a conditional CDRH3 GPT (based on a mini GPT-2 architecture) using large-scale sequences sourced from the OAS database.
    • Sequences are computationally annotated with property labels and refined via reinforcement learning (RL) to satisfy multi-property constraints.
    • Target-specific binding predictors (oracles) are used to guide the RL process to generate CDRH3 sequences that exhibit HER2-targeting capabilities similar to Herceptin.
    • Wet-lab validation confirms HER2-binding affinity and tumoricidal efficacy; while physical developability assays were not performed in the lab, these traits were primary objectives of the computational design stage.
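One simple way to collapse multi-property constraints into a scalar reward for RL fine-tuning of a conditional generator is the fraction of satisfied constraints. Property names and ranges below are illustrative assumptions, not the paper's API:

```python
def constraint_reward(seq, oracles, targets):
    """Fraction of property constraints a sequence satisfies; a scalar
    reward like this can guide RL refinement of a CDRH3 generator.
    oracles: property name -> scoring function (stand-ins here);
    targets: property name -> (lo, hi) acceptable range."""
    satisfied = sum(lo <= oracles[p](seq) <= hi
                    for p, (lo, hi) in targets.items())
    return satisfied / len(targets)
```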
    • Protein design model applied to antibodies and lab-tested.
    • Protenix-v2 is an integrated biomolecular modeling system that enables high-accuracy structure prediction, zero-shot generative binder design, and improved ligand-related plausibility.
    • The system incorporates refined architecture and training optimizations, while strictly excluding all wwPDB entries released on or after September 30, 2021, to prevent data leakage.
    • Performance was assessed using DockQ success rates on antibody-antigen interface benchmarks, BLI-confirmed hit rates across diverse soluble and membrane-protein targets, and PoseBusters-style chemical validity metrics.
    • Workflow for de novo nanobody design: Establishing an integrated computational-experimental pipeline for single-domain antibody discovery.
    • Selected a novel target for Desmoplastic Small Round Cell Tumor (DSRCT) with no prior structural or antibody data.
    • Used an AI agent to synthesize bioinformatics tool outputs and recommend 8 binding hotspots.
    • Employed RFantibody, mBER, and IgGM to generate 288,000 unique candidates.
    • Nominated 100,000 designs via Pareto-based filtering for yeast surface display and FACS enrichment.
    • 116 enriched candidates were characterized by SPR, yielding 46 confirmed binders (39.7% hit rate) with affinities as low as 0.66 nM.
    • Training of a baseline developability predictor on the Ginkgo dataset.
    • Utilized the GDPa1 benchmark from Ginkgo Bioworks, consisting of 242 therapeutic IgGs across five assays: HIC, AC-SINS, PR_CHO, Titer, and Tm2.
    • Employed frozen ESM-Cambrian encoders (up to 6B parameters) to generate embeddings, which were processed by property-specific attention decoders (Self, Self+Cross, or Bidirectional Cross) and a prediction head.
    • Achieved significant improvements over baselines in 3/5 properties: expression titer (+20%), thermal stability (+18%), and polyreactivity (+12%).
    • Optimal attention schemes differ by property; self-attention alone suffices for aggregation-related traits (HIC, PR_CHO), while bidirectional cross-attention is required for properties involving inter-chain compatibility (Titer, Tm2).
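A minimal numpy sketch of the two attention schemes over frozen per-residue embeddings; weight matrices are random stand-ins for the trained, property-specific decoder parameters:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_pool(H, Wq, Wk, Wv, w_out):
    """Self-attention decoder over embeddings H (L x d), mean-pooled and
    mapped to a scalar property (e.g. an aggregation-related trait)."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # L x L residue attention
    return float((A @ V).mean(axis=0) @ w_out)

def cross_attention_pool(Hh, Hl, Wq, Wk, Wv, w_out):
    """Bidirectional cross-attention: heavy-chain residues attend to the
    light chain and vice versa, capturing inter-chain compatibility."""
    def attend(X, Y):
        Q, K, V = X @ Wq, Y @ Wk, Y @ Wv
        return softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V
    pooled = np.concatenate([attend(Hh, Hl).mean(axis=0),
                             attend(Hl, Hh).mean(axis=0)])
    return float(pooled @ w_out)
```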
    • Very lightweight method to predict binding affinity of antibody-antigen complexes.
    • Local sequence fragments of length 2r+1 are extracted around mutation sites. Distances between these fragments are calculated using the Levenshtein distance to account for sequence shifts.
    • Targets are predicted using a k-nearest neighbors (kNN) approach (regression or classification) based on the closest matching fragments in the training set.
    • Despite its simplicity, the model achieves results comparable to state-of-the-art machine learning models on datasets like AB-Bind, AbDesign, and Alphaseq.
    • It serves as an interpretable benchmark particularly suited for data-sparse, target-specific antibody engineering where experimental data is limited.
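The fragment-kNN method is simple enough to sketch in full; the training pairs and k below are illustrative:

```python
def levenshtein(a, b):
    """Edit distance between two fragments; insertions and deletions let
    the comparison absorb small sequence shifts."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[-1] + 1,                 # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def fragment(seq, pos, r):
    """Local window of length 2r+1 centered on the mutation site
    (clipped at the sequence ends)."""
    return seq[max(0, pos - r): pos + r + 1]

def knn_predict(query_frag, train, k=3):
    """train: (fragment, target) pairs. Regression by averaging targets
    of the k fragments nearest in Levenshtein distance."""
    nearest = sorted(train, key=lambda t: levenshtein(query_frag, t[0]))[:k]
    return sum(y for _, y in nearest) / len(nearest)
```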