IsoDDE achieves 39% accuracy in the high-fidelity regime (DockQ > 0.8), which corresponds to near-experimental precision with an interface RMSD (iRMSD) typically below 1.0 Å. That is a 2.3x improvement over AF3.
Using a single model seed, IsoDDE successfully predicts 63% of interfaces at DockQ > 0.23 (corresponding to an iRMSD of roughly 4.0 Å or less), a 1.4x improvement over AF3's single-seed performance.
IsoDDE accurately models the backbone of the highly variable CDR-H3 loop (<2 Å) for 70% of antibodies in the test set, a 1.2x improvement over AF3's 58% success rate.
When scaled to 1,000 seeds, IsoDDE reaches an 82% success rate for correct interfaces and 59% for high-accuracy predictions, so reproducing these results is not exactly a laptop job.
It is a technical report; the architecture is not discussed.
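Since every number above is quoted against a DockQ cutoff, here is a minimal helper mapping DockQ to the standard quality tiers; the thresholds follow the usual DockQ convention, not anything IsoDDE-specific:

```python
def dockq_tier(dockq: float) -> str:
    """Map a DockQ score to the standard quality tier."""
    if dockq > 0.80:
        return "high"        # near-experimental; iRMSD typically < 1.0 A
    if dockq > 0.49:
        return "medium"
    if dockq > 0.23:
        return "acceptable"  # roughly iRMSD <= 4.0 A
    return "incorrect"
```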
New open-source reproduction of AlphaFold3 that either matches or surpasses it.
IntelliFold-2-Pro achieves a 58.2% success rate (DockQ > 0.23, i.e. roughly 4 Å iRMSD) on antibody-antigen interactions, outperforming AlphaFold 3's 47.9%.
For small-molecule co-folding, IntelliFold-2-Pro reaches 67.7%, surpassing AlphaFold 3's 64.9%.
Interface Precision vs. Monomers: IntelliFold-2 shows marginal gains in protein monomer accuracy (lDDT of 0.89 vs. AF3's 0.88).
A strategy for layer-wise selective fine-tuning of general protein language models.
Instead of full fine-tuning, they found that adapting only the first 50-75% of layers via LoRA provides optimal performance while saving computational cost (first sketch below).
For example, they perform sequence-specific "test-time" training, where they optimize the model with a Masked Language Modeling (MLM) objective on the target sequence itself before predicting its properties (second sketch below). This approach led to an 18.4% accuracy boost in predicting the notoriously difficult CDR-H3 antibody loop.
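A minimal sketch of what layer-selective LoRA could look like, assuming a Hugging Face ESM2 checkpoint and the peft library; the 60% cutoff, rank, and target modules are illustrative choices, not the paper's exact configuration:

```python
from transformers import AutoModelForMaskedLM
from peft import LoraConfig, get_peft_model

model = AutoModelForMaskedLM.from_pretrained("facebook/esm2_t12_35M_UR50D")
n_layers = model.config.num_hidden_layers        # 12 for this checkpoint
cutoff = int(0.6 * n_layers)                     # adapt only the first ~60% of layers

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query", "value"],           # attention projections in ESM2
    layers_to_transform=list(range(cutoff)),     # restrict LoRA to early layers
)
peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()          # only early-layer adapters train
```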
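And a hedged sketch of the test-time idea: a handful of MLM gradient steps on the single target sequence before prediction. The step count, learning rate, and 15% mask rate are assumptions standing in for the paper's settings:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

name = "facebook/esm2_t12_35M_UR50D"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

sequence = "EVQLVESGGGLVQPGGSLRLSCAAS"  # hypothetical target fragment
enc = tokenizer(sequence, return_tensors="pt", return_special_tokens_mask=True)
special = enc.pop("special_tokens_mask").bool()

model.train()
for _ in range(10):                                       # a few test-time steps
    labels = enc["input_ids"].clone()
    mask = (torch.rand(labels.shape) < 0.15) & ~special   # mask ~15% of residues
    if not mask.any():
        continue
    inputs = enc["input_ids"].clone()
    inputs[mask] = tokenizer.mask_token_id
    labels[~mask] = -100                                  # score masked positions only
    loss = model(input_ids=inputs,
                 attention_mask=enc["attention_mask"],
                 labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
# the model is now adapted to this sequence; run property prediction next
```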
Protocol for ultra-fast protein structure alignment.
FoldMason represents protein structures as 1D sequences using a structural alphabet (3Di+AA), which allows it to perform multiple alignments using fast string-comparison algorithms and a parallelized progressive alignment following a minimum spanning tree (the ordering idea is sketched after this block).
It operates two to three orders of magnitude faster than traditional structure-based methods, achieving a 722x speedup over tools like MUSTANG and scaling to align 10,000 structures in a fraction of the time required by competitors for just 100.
It matches the accuracy of gold-standard structure aligners and exceeds sequence-based tools, particularly in aligning distantly related proteins or flexible structures that global superposition-based methods struggle to handle.
It is used for large-scale structural analysis of massive databases like AlphaFoldDB, building structure-based phylogenies for proteins that have diverged past the "twilight zone" of sequence similarity, and providing interactive web-based visualizations of complex multiple structure alignments (MSTAs).
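A toy sketch of the minimum-spanning-tree ordering, with random placeholder distances standing in for structural similarity; FoldMason itself operates on 3Di+AA strings, so this only illustrates the merge order:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

n = 6                                   # six hypothetical structures
rng = np.random.default_rng(0)
d = rng.random((n, n))
d = (d + d.T) / 2                       # symmetric pairwise distance matrix
np.fill_diagonal(d, 0)

mst = minimum_spanning_tree(d).tocoo()  # n-1 guide-tree edges
# progressively merge alignment profiles along MST edges, closest pairs first
for i, j, w in sorted(zip(mst.row, mst.col, mst.data), key=lambda e: e[2]):
    print(f"merge profiles {i} <-> {j} (distance {w:.2f})")
```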
A method addressing binding-strength prediction when training on small, noisy datasets.
The researchers address the issue that the field's standard benchmark, SKEMPI2, has significant hidden data leakage, with different protein complexes sharing over 99% sequence identity, leading to inflated performance estimates for models that simply memorize these patterns. A problem raised by many, addressed by hardly any.
ProtBFF injects five interpretable physical priors (Interface, Burial, Dihedral, SASA, and lDDT) directly into residue embeddings using cross-embedding attention to prioritize the most structurally relevant parts of a protein (sketched below).
By evaluating models on stricter, homology-based sequence clusters (60% similarity), the authors show that ProtBFF allows general-purpose models like ESM to match or outperform specialized state-of-the-art predictors, even in data-limited "few-shot" scenarios.
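A minimal sketch of what that cross-embedding attention could look like; dimensions, module names, and the residual/norm placement are assumptions, not the ProtBFF implementation:

```python
import torch
import torch.nn as nn

class PriorCrossAttention(nn.Module):
    """Residue embeddings attend over projected physical prior features."""
    def __init__(self, d_model=1280, n_priors=5, n_heads=8):
        super().__init__()
        # project the 5 per-residue priors (interface, burial, dihedral,
        # SASA, lDDT) into the embedding space to serve as keys/values
        self.prior_proj = nn.Linear(n_priors, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, residue_emb, priors):
        # residue_emb: (batch, seq_len, d_model), e.g. from ESM
        # priors:      (batch, seq_len, 5) per-residue physical features
        prior_emb = self.prior_proj(priors)
        attended, _ = self.attn(residue_emb, prior_emb, prior_emb)
        return self.norm(residue_emb + attended)   # residual update

emb = PriorCrossAttention()(torch.randn(1, 64, 1280), torch.randn(1, 64, 5))
```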
Describing a protocol to design mini-binders for a multi-domain, not-especially-well-characterized target (IgE) using Latent-X1 and, to a lesser extent, Chai-1.
The protocol used Latent-X1 to generate de novo sequences and initial poses, which were then refolded using Chai-1 to ensure the designs were structurally consistent and plausible.
The final rank was determined by the equation score = 2.0 * binder pTM - 0.1 * min-iPAE - 0.1 * complex RMSD (transcribed below). This formula prioritized high global confidence (pTM) while penalizing designs where the Latent-X1 pose and the Chai-1 refolded structure disagreed (iPAE and RMSD).
To handle the complex, multidomain IgE interface, they first designed binders against a smaller, stable seed on the epsilon3 domain before iteratively expanding the interface toward the full receptor-binding site.
Out of hundreds of generated designs, fewer than 80 candidates across two rounds were selected for wet-lab testing, resulting in a 6% hit rate and the identification of three specific IgE-binding miniproteins.
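The ranking formula transcribed literally, with illustrative input values (the example numbers are assumptions, not reported scores):

```python
def rank_score(binder_ptm: float, min_ipae: float, complex_rmsd: float) -> float:
    """Composite rank: reward global confidence, penalize pose disagreement."""
    return 2.0 * binder_ptm - 0.1 * min_ipae - 0.1 * complex_rmsd

# a confident, self-consistent design vs. one where Latent-X1 and Chai-1 disagree
good = rank_score(binder_ptm=0.92, min_ipae=3.0, complex_rmsd=1.5)   #  1.39
bad = rank_score(binder_ptm=0.90, min_ipae=12.0, complex_rmsd=8.0)   # -0.20
```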
First fully open-source reproduction of the diffusion-based AlphaFold3 architecture that matches or exceeds its performance while strictly adhering to the same training data cutoff and model scale (especially on antibodies!).
Unlike previous open-source models, it exhibits a consistent improvement in accuracy as more computational budget is allocated (i.e., as you draw more samples).
Protenix-v1 beats others on antibody-antigen interface prediction, outperforming AlphaFold3 with a 52.31% vs. 48.75% success rate (DockQ > 0.23). That is more than double the accuracy of open-source alternatives like Chai-1 (23.12%).
Prompt-based, in-context prediction of antibody developability properties using large language models, rather than training separate predictors per property.
As a baseline, they evaluate TxGemma, a therapeutics-specific multimodal LLM that supports task switching via prompts and is fine-tuned using LoRA.
The study relies on a very large antibody dataset (~876k heavy chains) with in-silico–computed biophysical developability properties, combining sequence-based and structure-based predictors.
Models are trained and evaluated using prompts that include antibody sequences together with partially observed property/value pairs, asking the model to infer a missing property for a query sequence (the prompt format is sketched below).
To prevent shortcut learning where the model ignores context and relies only on sequence, the authors introduce AB-context-aware training, which applies a random latent transformation jointly to context properties and targets during training, forcing explicit use of contextual information (a minimal instance is also sketched below).
By simulating batch effects, they show that standard fine-tuned TxGemma degrades sharply as batch bias increases (from ~0.99 Spearman ρ with no bias to ~0.95 with moderate bias and ~0.58 with strong bias), whereas context-aware training remains robust even under strong batch effects.
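A hedged sketch of the in-context prompt format; the template wording and property names are assumptions, not the paper's prompts:

```python
def build_prompt(query_seq, context_pairs, target_property):
    """context_pairs: (property, value) pairs observed for this antibody."""
    context = "\n".join(f"{name}: {value}" for name, value in context_pairs)
    return (
        f"Antibody heavy chain: {query_seq}\n"
        f"Observed properties:\n{context}\n"
        f"Predict {target_property}:"
    )

prompt = build_prompt(
    "EVQLVESGGGLVQPGGSLRLSCAAS",                       # hypothetical sequence
    [("hydrophobicity", 1.23), ("isoelectric point", 8.4)],
    "aggregation propensity",
)
```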
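And a minimal, speculative instance of the context-aware transformation, here assumed to be a shared random affine map over context values and target; the paper's "random latent transformation" may be richer:

```python
import numpy as np

def transform_example(context_values, target, rng):
    # the SAME random map hits both context and target, so the target is only
    # recoverable by reading the context, never from the sequence alone
    a = rng.uniform(0.5, 2.0)    # random scale
    b = rng.uniform(-1.0, 1.0)   # random shift
    return [a * v + b for v in context_values], a * target + b

rng = np.random.default_rng(0)
ctx, tgt = transform_example([1.23, 8.4], 0.7, rng)
```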
De novo platform for epitope-specific antibody design against “zero-prior” targets, i.e. antigen sites with no known antibody–antigen or protein–protein complex structures and limited homology to previously solved interfaces.
The method combines three tightly integrated components: AbsciDiff, an all-atom diffusion model fine-tuned from Boltz-1 to generate epitope-conditioned antibody–antigen complex structures; IgDesign2, a structure-conditioned paired heavy–light CDR sequence design model; and AbsciBind, a modified AF-Unmasked / AlphaFold-Multimer–based scoring protocol using ipTM-derived interface confidence to rank and filter designs (a hypothetical sketch of the overall loop follows this block).
The platform was evaluated on 10 zero-prior protein targets, with fewer than 100 antibody designs per target advanced to experimental testing; specific binders were successfully identified for 4 targets (COL6A3, AZGP1, CHI3L2, IL36RA).
Experimental validation demonstrated both structural and functional accuracy, including cryo-EM confirmation at near-atomic resolution (DockQ 0.73–0.83) for two targets and AI-guided affinity maturation yielding a functional IL36RA antagonist with ~100 nM potency.
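A hypothetical sketch of how the three components compose into a generate-design-score loop; every function here is a toy stand-in, and the confidence cutoff and candidate budget are illustrative assumptions:

```python
import random

# Hypothetical stand-ins for AbsciDiff, IgDesign2, and AbsciBind (not real APIs)
def absci_diff(antigen, epitope):
    return {"antigen": antigen, "epitope": epitope}  # epitope-conditioned complex

def ig_design2(complex_structure):
    return "EVQLVESGG/DIQMTQSPS"                     # paired H/L design (toy)

def absci_bind(antigen, hl_seq):
    return random.random()                           # ipTM-style confidence (toy)

def design_campaign(antigen, epitope, n_backbones=1000, conf_cutoff=0.8):
    candidates = []
    for _ in range(n_backbones):
        cmplx = absci_diff(antigen, epitope)         # 1. generate complex structure
        hl_seq = ig_design2(cmplx)                   # 2. design paired H/L sequences
        conf = absci_bind(antigen, hl_seq)           # 3. score interface confidence
        if conf >= conf_cutoff:
            candidates.append((hl_seq, conf))
    candidates.sort(key=lambda c: -c[1])
    return candidates[:100]                          # <100 designs advanced per target
```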
Novel framework that identifies high-affinity leads using data from only a single round of FACS, significantly reducing the labor and reagents required for traditional multi-round affinity maturation campaigns.
Models were trained using log enrichment ratios (continuous) or binary labels (enriched vs. depleted), calculated by normalizing post-sorting FACS abundance against pre-sorting MACS abundance to account for expression biases (first sketch below).
They benchmarked linear/logistic regression and CNNs against a semi-supervised ESM2-MLP approach; notably, the linear models often outperformed deeper architectures in ranking validated substitutions and offered superior interpretability for identifying confounding signals like polyreactivity.
By generalizing information across all sequences, ML models effectively separated "affinity-driving" mutations from "passenger" substitutions, identifying sub-nanomolar binders that were not prioritized by traditional, more laborious raw sequencing count analysis.
The best-performing models were leveraged within a Gibbs sampling protocol to design novel sequences unseen in the original experiment (second sketch below), ultimately yielding multiple improved binders with up to a ~2,500-fold affinity increase over the wild-type.
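A minimal sketch of the enrichment-label computation described above; the pseudocount is an assumption for numerical stability:

```python
import numpy as np

def log_enrichment(post_counts, pre_counts, pseudo=1.0):
    """log2 ratio of post-FACS frequency to pre-sort (MACS) frequency."""
    post = (post_counts + pseudo) / (post_counts + pseudo).sum()
    pre = (pre_counts + pseudo) / (pre_counts + pseudo).sum()
    return np.log2(post / pre)

post = np.array([120.0, 3.0, 45.0])  # toy post-sort read counts per variant
pre = np.array([30.0, 60.0, 40.0])   # toy pre-sort read counts per variant
ratios = log_enrichment(post, pre)   # continuous labels
binary = ratios > 0                  # enriched vs. depleted labels
```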
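And a toy Gibbs sampling step, assuming a trained scorer `score_fn` (here a placeholder); the softmax temperature is illustrative:

```python
import numpy as np

AAS = "ACDEFGHIKLMNPQRSTVWY"

def gibbs_step(seq, score_fn, rng, temperature=1.0):
    """Resample one position from the model's softmax over all 20 mutants."""
    pos = rng.integers(len(seq))
    scores = np.array([score_fn(seq[:pos] + aa + seq[pos + 1:]) for aa in AAS])
    probs = np.exp((scores - scores.max()) / temperature)
    probs /= probs.sum()
    new_aa = rng.choice(list(AAS), p=probs)
    return seq[:pos] + new_aa + seq[pos + 1:]

rng = np.random.default_rng(0)
seq = "EVQLVESGG"                                       # toy starting sequence
for _ in range(100):                                    # a short design chain
    seq = gibbs_step(seq, lambda s: s.count("W"), rng)  # placeholder scorer
```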