Novel algorithm (ITsFlexible) for predicting conformational flexibility of antibody and TCR CDR3 loops and CDR3-like loops.
Dataset of 1.2M loops extracted from all antiparallel β-strand motifs in the PDB, including antibody and TCR CDR3s, representing the same secondary-structure pattern.
Flexibility defined by structural clustering: multiple conformations with pairwise Cα RMSD > 1.25 Å yield a “flexible” label.
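The labeling criterion reduces to a pairwise-RMSD check over a loop's deposited conformations. A minimal NumPy sketch (the paper's exact clustering procedure may differ, e.g. in how conformations are grouped into clusters):

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """Calpha RMSD between two (n, 3) coordinate sets after optimal
    superposition (Kabsch algorithm, rotation-only, no reflection)."""
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    H = Q.T @ P                             # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against improper rotations
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return float(np.sqrt(((P - Q @ R.T) ** 2).sum() / len(P)))

def is_flexible(conformers, cutoff=1.25):
    """Label a loop 'flexible' if any pair of its deposited
    conformations exceeds the pairwise Calpha RMSD cutoff."""
    return any(kabsch_rmsd(conformers[i], conformers[j]) > cutoff
               for i in range(len(conformers))
               for j in range(i + 1, len(conformers)))
```

Loops with a single deposited conformation trivially come out rigid under this check, which is one reason redundancy in the 1.2M-loop dataset matters.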
Model is a three-layer equivariant GNN (EGNN) trained as a binary classifier (rigid vs flexible).
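The EGNN backbone can be illustrated by a single E(n)-equivariant message-passing layer (Satorras et al. style). This is a NumPy sketch with illustrative dimensions and single linear maps standing in for the layer's learned MLPs; the paper's exact layer definition is not given here:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16                                              # hidden size (illustrative)

# Single linear maps standing in for the layer's learned MLPs.
W_e = rng.normal(scale=0.1, size=(2 * D + 1, D))    # edge/message map
W_h = rng.normal(scale=0.1, size=(2 * D, D))        # node-update map
W_x = rng.normal(scale=0.1, size=(D, 1))            # coordinate-update map

def egnn_layer(h, x):
    """One E(n)-equivariant layer: messages depend only on invariant squared
    distances, and coordinate updates point along difference vectors, so
    rotating/translating x transforms the output coordinates the same way
    and leaves the features h unchanged."""
    n = h.shape[0]
    diff = x[:, None, :] - x[None, :, :]             # (n, n, 3)
    d2 = (diff ** 2).sum(-1, keepdims=True)          # (n, n, 1), invariant
    pair = np.concatenate([np.broadcast_to(h[:, None, :], (n, n, D)),
                           np.broadcast_to(h[None, :, :], (n, n, D)),
                           d2], axis=-1)
    m = np.tanh(pair @ W_e)                          # per-edge messages
    h_new = np.tanh(np.concatenate([h, m.sum(axis=1)], axis=-1) @ W_h)
    x_new = x + (diff * np.tanh(m @ W_x)).sum(axis=1) / max(n - 1, 1)
    return h_new, x_new
```

Stacking three such layers and pooling `h` into a binary head gives a classifier of the general shape described.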
ITsFlexible beats random but only marginally outperforms loop-length, solvent-exposure, pLDDT, RMSPE, and AF2-MSA-subsampling baselines. Best results are obtained with crystal structures rather than predicted ones, suggesting that structure modeling remains the bottleneck for predictability. Overall, the gain over strong but simple baselines such as loop length is modest.
Cryo-EM validation is a major strength of the paper: three antibodies were experimentally solved; two predictions matched the data and one did not (likely due to antigen-binding-induced rigidification).
New biomolecular generative algorithm for protein/molecular design
It extends AlphaFold3 architecture into a generative “world model” that designs interactions across proteins, nucleic acids, and small molecules using a shared token space and conditional diffusion.
High-throughput in-silico design: It achieves up to 100- to 1000-fold higher computational throughput than diffusion or hallucination baselines (RFDiffusion, BoltzDesign, etc.) across 11 computational benchmark tasks.
Description of two models for antibody property prediction: ANTIPASTI (a CNN on structural correlation maps for affinity) and INFUSSE (a graph + ProtBERT hybrid for flexibility).
Both tested on curated antibody and antibody-antigen datasets (no new wet-lab validation, only structural data).
B-factor prediction links sequence, structure, and local dynamics, showing that antibody flexibility is partly learnable from data. Trained only on antibody/antigen data, the model outperforms a baseline trained on generic proteins.
A heavy-chain–only version of ABodyBuilder2, removing the light-chain component entirely.
The model is substantially faster than ABodyBuilder2 and comparable to IgFold or AlphaFold2 owing to (i) smaller embedding dimensions (128–256 vs 384 in ABB2), (ii) use of fewer submodels (3 vs 4), (iii) omission of the refinement step by default, and (iv) the inherently shorter sequence length of single heavy chains.
Accuracy-wise, HeavyBuilder performs on par with ABodyBuilder2, IgFold, and AlphaFold2 for framework and CDRH1–H2, and is slightly better for CDRH3 (∼3.4 Å RMSD vs ∼4 Å for others). While ABodyBuilder2 achieves 2.99 Å on CDRH3, that figure depends on the inclusion of the paired light chain, so the authors note that it is not a fair comparison.
It predicts antibody paratopes from sequence alone by concatenating embeddings from six protein language models: AbLang2, AntiBERTy, ESM-2, IgT5, IgBert, and ProtTrans.
It requires neither structural antibody data nor antigen data.
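The multi-model concatenation amounts to stacking per-residue feature matrices along the feature axis. A sketch with random arrays standing in for the real language-model embeddings (the dimensions shown are illustrative, not the models' true hidden sizes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a per-residue embedding from one language model;
# a real pipeline would call the model's encoder instead.
def embed(seq, dim):
    return rng.normal(size=(len(seq), dim))

seq = "EVQLVESGGGLVQPGG"
dims = {"AbLang2": 480, "AntiBERTy": 512, "ESM-2": 1280,
        "IgT5": 1024, "IgBert": 1024, "ProtTrans": 1024}

# One feature vector per residue: concatenate along the feature axis.
features = np.concatenate([embed(seq, d) for d in dims.values()], axis=1)
print(features.shape)                     # (16, 5344)
```

A per-residue binary head on `features` then scores each position as paratope or not.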
Across three benchmark datasets (PECAN, Paragraph, MIPE), it outperforms all sequence-based and structure-modeling methods, achieving PR-AUC up to ~0.76 and ROC-AUC up to ~0.97.
The training set is similar in size to those of previous methods, so the improved performance is not simply due to growth in the number of SAbDab structures.
It was benchmarked against a positional-likelihood baseline (predicting commonly binding positions) and surpassed it by a reasonable margin (PR-AUC ~0.73 vs. ~0.62).
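The positional-likelihood baseline amounts to scoring each position by how often it is part of the paratope in the training set. A toy sketch (position ids and data are invented for illustration):

```python
from collections import Counter

def fit_positional_baseline(examples):
    """examples: one dict per training antibody, mapping a position id
    (e.g. an IMGT number, here just a string) to a 0/1 paratope label.
    Returns per-position binding frequencies, used directly as scores."""
    seen, bound = Counter(), Counter()
    for ab in examples:
        for pos, label in ab.items():
            seen[pos] += 1
            bound[pos] += label
    return {pos: bound[pos] / seen[pos] for pos in seen}

# Toy data, invented for illustration.
train = [{"107": 1, "108": 1, "35": 0},
         {"107": 1, "108": 0, "35": 0}]
scores = fit_positional_baseline(train)
print(scores["107"], scores["108"])       # 1.0 0.5
```

Because CDR positions bind far more often than framework positions, even this trivial model reaches a respectable PR-AUC, which is what makes it a meaningful bar to clear.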
Peleke-1 models were fine-tuned on 9,500 antibody–antigen complexes from SAbDab, each annotated with interacting residues identified from crystal structures.
Structure was incorporated by annotating epitope residues explicitly in antigen sequences, allowing the LLMs to learn binding context without direct 3D input.
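Epitope conditioning of this kind can be sketched as inline markup of the antigen sequence; the marker tokens and exact annotation scheme below are assumptions for illustration, not the paper's format:

```python
def annotate_epitope(antigen_seq, epitope_idx, open_tok="[", close_tok="]"):
    """Wrap contiguous runs of epitope residues in marker tokens so a
    sequence-only LLM can condition on binding context without 3D input."""
    out, in_run = [], False
    for i, aa in enumerate(antigen_seq):
        is_epi = i in epitope_idx
        if is_epi and not in_run:
            out.append(open_tok)
            in_run = True
        if not is_epi and in_run:
            out.append(close_tok)
            in_run = False
        out.append(aa)
    if in_run:
        out.append(close_tok)
    return "".join(out)

print(annotate_epitope("MKTAYIAKQR", {2, 3, 4, 8}))   # MK[TAY]IAK[Q]R
```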
Generated antibodies were assessed for humanness, structural validity, stability (FoldX), and binding affinity (HADDOCK3) across seven benchmark antigens.
Novel protein design framework based on a unified all-atom diffusion model that performs both structure prediction and binder generation.
It is fully open and free.
Training setup resembles recent diffusion architectures (e.g., AlphaFold3, Chai), but its distinguishing feature is broad wet-lab validation across diverse target types.
Experimental scale: generated tens of thousands of nanobody and protein designs for 9 novel targets (no homologous complexes in PDB).
Results: tested 15 designs per target, obtaining nanomolar binders for 6 of 9 targets (≈66% success rate) — a notably strong experimental outcome.
Introduces a novel VH–VL chain-pairing predictor with a clever strategy for sampling negative pairs.
Defines three negative sampling strategies:
Random pairing, where heavy and light chains are shuffled without constraints.
V-gene mismatching, where non-native pairs are generated by combining VH and VL sequences drawn from different V-gene families, but within biologically plausible V-gene segments. This captures realistic but unobserved combinations that could occur during recombination.
Full V(D)J mismatching, where heavy and light chains are paired using completely distinct germline origins across V, D, and J gene segments. This produces negative examples that are maximally diverse yet biologically meaningful, reflecting combinations never seen in natural repertoires.
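The three strategies above can be sketched as filters over candidate replacement light chains. Field names and the exact mismatch criteria are an illustrative reading, not the paper's definitions (light chains carry no D segment, so the 'vdj' mode checks V and J only):

```python
def v_family(gene):                        # e.g. "IGKV1-39" -> "IGKV1"
    return gene.split("-")[0]

def negative_lights(native_light, light_pool, mode):
    """Candidate non-native light chains to pair with one heavy chain.
    Chains are dicts with germline calls 'v' and 'j' (illustrative schema);
    mode is 'random', 'v_mismatch', or 'vdj_mismatch'."""
    out = []
    for l in light_pool:
        if l is native_light:              # never re-create the true pair
            continue
        if (mode == "random"
                or (mode == "v_mismatch"
                    and v_family(l["v"]) != v_family(native_light["v"]))
                or (mode == "vdj_mismatch"
                    and l["v"] != native_light["v"]
                    and l["j"] != native_light["j"])):
            out.append(l)
    return out

pool = [{"id": "L1", "v": "IGKV1-39", "j": "IGKJ1"},
        {"id": "L2", "v": "IGKV1-5",  "j": "IGKJ2"},
        {"id": "L3", "v": "IGLV2-14", "j": "IGLJ3"}]
print([l["id"] for l in negative_lights(pool[0], pool, "v_mismatch")])   # ['L3']
```

The stricter the mismatch criterion, the more "biologically meaningful" the negatives, which is the property the paper ties to generalization.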
Shows that the space of possible VH–VL germline combinations is far larger than what is observed in public datasets, revealing non-random biological constraints on pairing.
Demonstrates that models trained on V-gene-mismatched and especially VDJ-mismatched datasets achieve the highest and most generalizable performance, outperforming existing methods such as ImmunoMatch, p-IgGen, and Humatch. This confirms that biologically grounded negative sampling is key to robust VH–VL pairing prediction.