Computational Antibody Papers

    • Using language models to predict polyreactivity.
    • Polyspecificity and polyreactivity are related concepts; however, the former is thought to be driven by factors such as overlapping epitopes, whereas polyreactivity is attributed to excess charge or hydrophobicity.
    • The baculovirus particle (BVP) assay is often used to test polyreactivity: mAbs are added at high concentrations to BVP-coated plates.
    • They generated a dataset of ~300 polyreactive antibodies that was heterogeneous in terms of antibodies vs. nanobodies, specificities, and formats.
    • They tested different concentrations (from 6.67 nM to 667 nM) and well-coating types (percentage of BVP) - this was aimed at reducing noise from experimental conditions.
    • They tested two prediction modes: language models and structural descriptors. For language models, ProtT5, ESM2, and AntiBERTy were used; descriptors were calculated from AlphaFold2-Multimer models. The language-model predictions were superior to those based on the AF2-Multimer descriptors.
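A minimal sketch of the language-model mode described above: embed each sequence, mean-pool, and fit a binary polyreactivity classifier. Here the composition-based `mean_pool_embedding` is only a stand-in for real per-residue PLM embeddings (e.g. from ESM2), and the sequences and labels are toy values:

```python
import numpy as np

AA = "ACDEFGHIKLMNPQRSTVWY"

def mean_pool_embedding(seq):
    """Stand-in for a PLM embedding: mean-pooled one-hot residue composition.
    In the paper, a model such as ESM2 would supply richer per-residue vectors."""
    onehot = np.zeros((len(seq), len(AA)))
    for i, aa in enumerate(seq):
        onehot[i, AA.index(aa)] = 1.0
    return onehot.mean(axis=0)

def fit_logreg(X, y, lr=0.5, steps=2000):
    """Minimal logistic-regression head on top of the pooled embeddings."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

def predict(w, b, seqs):
    X = np.array([mean_pool_embedding(s) for s in seqs])
    return (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)

# Toy labels: hydrophobic-rich sequences stand in for BVP-polyreactive ones.
pos = ["WWLLVVFF", "FFWWLLYY", "WLFVWLFV"]
neg = ["SSTTGGNN", "GGSSNNTT", "STGNSTGN"]
X = np.array([mean_pool_embedding(s) for s in pos + neg])
y = np.array([1, 1, 1, 0, 0, 0])
w, b = fit_logreg(X, y)
```

A real pipeline would swap the one-hot composition for actual language-model embeddings; the classifier head stays the same.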
    • They introduced a set of single and double mutations based on most likely variants proposed by an ensemble of language models (ESMs). Most of the mutations not only didn’t remove binding ability, but actually improved it.
    • They performed evolution with the ESM-1b language model and the ESM-1v ensemble of five language models (six language models in total)
    • In the first round of evolution, they measured the antigen interaction strength by biolayer interferometry (BLI) of variants that contain only a single-residue substitution from wild-type.
    • In the second round, they measured variants containing combinations of substitutions, selecting substitutions that preserved or improved binding based on the results of the first round.
    • They performed these two rounds for all seven antibodies, measuring 8–14 variants per antibody in round one and 1–11 variants per antibody in round two
    • Across all seven antibodies, they found that 71–100% of the first-round Fab variants (containing a single-residue substitution) retained sub-micromolar binding to the antigen, and 14–71% of first-round variants led to improved binding affinity (defined as a 1.1-fold or higher improvement in Kd compared to wild-type).
    • Thirty-six out of all 76 language-model-recommended, single-residue substitutions (and 18 out of 32 substitutions that lead to improved affinity) occur in framework regions.
    • They found that Fabs for 21 out of the 31 language-model-recommended, affinity-enhancing variants that they tested had a higher melting temperature (Tm) than wild-type, and all variants maintained thermostability (Tm > 70 °C).
    • They tested for polyspecificity and found no substantial changes in the polyspecificity profile.
    • Five out of 32 affinity-enhancing substitutions (~16%) involve changing the wild-type residue to a rare or uncommon residue
    • Approach based on general protein language models consistently outperformed all baseline methods, including the antibody-specific ones (!).
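The substitution-proposal rule described above can be sketched as an ensemble vote: a mutation is kept when enough models assign the mutant residue a higher likelihood than the wild-type residue at that position. The per-position probabilities below are toy numbers, not real ESM outputs:

```python
# Each model in the ensemble provides per-position probabilities over amino
# acids; a substitution is recommended when at least `min_votes` models rank
# the mutant residue above the wild-type residue at that position.

def recommended_substitutions(wt, ensemble_probs, min_votes):
    """ensemble_probs: one entry per model, each a list of {aa: prob} dicts,
    one dict per position of the wild-type sequence `wt`."""
    recs = []
    for pos, wt_aa in enumerate(wt):
        candidates = set().union(*(model[pos].keys() for model in ensemble_probs))
        for aa in sorted(candidates - {wt_aa}):
            votes = sum(
                model[pos].get(aa, 0.0) > model[pos].get(wt_aa, 0.0)
                for model in ensemble_probs
            )
            if votes >= min_votes:
                recs.append((pos, wt_aa, aa))
    return recs

wt = "QVK"
model_a = [{"Q": 0.4, "E": 0.5}, {"V": 0.6, "I": 0.2}, {"K": 0.3, "R": 0.5}]
model_b = [{"Q": 0.3, "E": 0.6}, {"V": 0.7, "I": 0.1}, {"K": 0.4, "R": 0.5}]
recs = recommended_substitutions(wt, [model_a, model_b], min_votes=2)
```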
    • They developed a method (SYNTERACT) to predict protein-protein interactions (PPIs) from sequence on the basis of a large language model.
    • They train ProtBERT to predict PPIs.
    • They use the BIOGRID dataset, where interactors are mapped if they are confirmed by two independent sources, such as two independent experimental techniques in two separate studies. In total they have 179,018 positive pairs.
    • They use negatome 2.0 as a negative dataset. It relies on various sources such as manual curation from the literature or subunits from the PDB that do not interact with each other. Total of 3,958 pairs were used.
    • They use ProtBERT-BFD to pretrain the model.
    • They mapped each protein pair as [CLS] Protein A [SEP] Protein B [SEP], mapping the final output to binary.
    • They achieve 92% accuracy on the test set.
    • They also evaluated on a dataset where negatives are generated by pairing proteins from different subcellular compartments. On positive samples in this dataset, the model was 85% accurate; on negative samples, SYNTERACT was only 38% accurate, classifying many of the compartment-sampled negatives as interactors.
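The input formatting described above can be sketched as follows; the token ids are toy placeholders rather than the actual ProtBERT vocabulary:

```python
# Each protein pair is serialised as "[CLS] Protein A [SEP] Protein B [SEP]"
# and the [CLS] output is mapped to a binary interact / not-interact label.

SPECIALS = {"[CLS]": 0, "[SEP]": 1, "[PAD]": 2}
AA_IDS = {aa: i + len(SPECIALS) for i, aa in enumerate("ACDEFGHIKLMNPQRSTVWY")}

def encode_pair(protein_a, protein_b, max_len=32):
    tokens = (
        [SPECIALS["[CLS]"]]
        + [AA_IDS[a] for a in protein_a]
        + [SPECIALS["[SEP]"]]
        + [AA_IDS[b] for b in protein_b]
        + [SPECIALS["[SEP]"]]
    )
    attention_mask = [1] * len(tokens)
    # pad to a fixed length, as a BERT-style model expects
    pad = max_len - len(tokens)
    return tokens + [SPECIALS["[PAD]"]] * pad, attention_mask + [0] * pad

ids, mask = encode_pair("MKT", "AGH")
```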
    • Proposing a Bayesian scheme to optimally select generated antibodies from a previously introduced language model (GLM).
    • They use the 1B GLM-AB model from BioMap. Training involves a variation on MLM that masks entire spans of sequence.
    • The entire point is how to ‘select’ better antibodies according to some unknown ‘fitness function’ f. If you only get a few experimental data points at a time to evaluate f, you had better make them count. Their combination of a Bayesian scheme and a language model optimizes how the next generated sequence points are picked so that the best approximation to f is reached.
    • They use the Absolut! framework (computational simulation) rather than wet-lab data.
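The selection loop can be sketched as a budgeted acquire-evaluate-refit cycle. Everything below is a toy stand-in: `toy_fitness` plays the role of an Absolut! evaluation, and the nearest-neighbour surrogate plus UCB acquisition replaces the paper's Bayesian machinery:

```python
import random

def toy_fitness(seq):
    """Hidden fitness standing in for an Absolut! evaluation: here simply
    the fraction of W/Y residues (illustrative only)."""
    return sum(seq.count(aa) for aa in "WY") / len(seq)

def surrogate(seq, observed):
    """1-nearest-neighbour surrogate: predicted mean = fitness of the closest
    observed sequence; uncertainty = normalised Hamming distance to it."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    dist, fit = min((hamming(seq, s), v) for s, v in observed.items())
    return fit, dist / len(seq)

def select_next(pool, observed, beta=1.0):
    """Upper-confidence-bound acquisition over the not-yet-evaluated pool."""
    def ucb(seq):
        mean, unc = surrogate(seq, observed)
        return mean + beta * unc
    return max((s for s in pool if s not in observed), key=ucb)

rng = random.Random(0)
pool = ["".join(rng.choice("AWYG") for _ in range(8)) for _ in range(50)]
observed = {pool[0]: toy_fitness(pool[0])}
for _ in range(10):  # evaluation budget of 10 extra measurements
    seq = select_next(pool, observed)
    observed[seq] = toy_fitness(seq)
best = max(observed.values())
```

The design point is that the acquisition trades off exploiting sequences the surrogate already likes against exploring sequences far from anything measured, so each of the few allowed evaluations is informative.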
    • Demonstration that training transformers on paired antibody data provides improvements over single-chain models. They created two models - one trained on single chains from Jaffe et al., the other with the paired information.
    • When comparing embeddings, they only extracted one sequence from the paired transformer to make the comparison with the single-sequence transformer sound.
    • They showed what happens when one performs UMAP on the light chains of the paired and unpaired transformers. The unpaired transformer produces a more random dispersal, whereas the paired one yields much tighter clustering, similar to heavy chains. Performance on heavy-chain clustering is similar.
    • They asked for prediction of masked positions in heavy chains when the chain is paired with the native mutated variety versus back-mutated germline one. The cross entropy loss was much better when the prediction was made in the presence of the native mutated light chain.
    • They contrasted their paired model with ESM2 (650M), which they fine-tuned on the paired data. They averaged all attention heads over all layers to get a single score. The fine-tuned ESM2 attends to conserved cysteines and CDR regions, whereas vanilla ESM2 does not, focusing more on linear stretches.
    • The architecture was RoBERTa: 24 layers, 16 attention heads, embedding dimension 1024, feed-forward dimension 4096.
    • Trained using MLM for 100 epochs.
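For scale, a back-of-the-envelope weight count for that configuration (attention and feed-forward matrices only, ignoring biases, layer norms, and embeddings) lands near 300M parameters:

```python
# d_model: embedding width; d_ff: feed-forward width. Per layer we count the
# four attention projections plus the two MLP matrices.
d_model, d_ff, n_layers = 1024, 4096, 24

attn_params = 4 * d_model * d_model   # Q, K, V and output projections
ffn_params = 2 * d_model * d_ff       # up- and down-projection
per_layer = attn_params + ffn_params
total = n_layers * per_layer          # roughly 302M weights
```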
    • Introducing ESM-2 and ESMFold. Scaling transformer model parameter size to 15B allows for more precise predictions of structures.
    • They make available an atlas of 617 million predicted structures
    • Learning objective is MLM, masking 15% of protein input.
    • Perplexity ranges from 1 for a perfect model to 20 for a model that makes predictions at random. Intuitively, perplexity describes the number of amino acids the model is uncertain between when it makes a prediction
    • After 270k training steps the 8M parameter model has a perplexity of 10.45, and the 15B model reaches a perplexity of 6.37.
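The perplexity figures above follow directly from its definition as the exponential of the mean per-token cross-entropy, which is what pins the range between 1 (perfect) and 20 (uniform over the amino-acid alphabet):

```python
import math

def perplexity(per_token_nll):
    """exp of the mean negative log-likelihood (natural log) per token."""
    return math.exp(sum(per_token_nll) / len(per_token_nll))

# A model guessing uniformly over the 20 amino acids pays log(20) nats per
# token; a perfect model pays 0.
uniform_nll = [math.log(20)] * 100
perfect_nll = [0.0] * 100
```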
    • The 15B model achieves best perplexity and structural modeling accuracy.
    • For some structures, accuracy of structure prediction jumps from 7.7 Å at 8M parameters to 7.0 Å at 35M parameters and to 3.2 Å at 150M parameters. The 3B model brings it down to 2.8 Å and the 15B model to 2.6 Å. For other structures, good prediction is only achieved at 15B.
    • Their structure predictor closely follows AlphaFold2, but instead of evoformer, they use the representation from the ESM-2.
  • 2024-03-13

    Language models enable zero-shot prediction of the effects of mutations on protein function

    • non-antibody stuff
    • language models
    • experimental techniques
    • They contrast ESM with some other language models and show that, in a zero-shot fashion, some correlations can be made with experimental measurements of variants.
    • They compare performance of ESM and DeepSequence on 41 deep mutational scanning datasets collated in a single paper. They claim ESM has better overall correlations, but it is not crystal clear from the graph and, by their own admission, from the paired t-test.
    • They find that pretraining on UniRef clustered at 30% identity gives the worst performance; performance is decent for UniRef50 or UniRef70, with a dip again at UniRef100.
    • Binding sites have much higher conservation.
    • Core of the protein also appears to have lower conservation.
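The zero-shot scoring used in this line of work can be sketched as follows: mask each mutated position and score a variant as the sum, over mutated positions, of log p(mutant aa) minus log p(wild-type aa) under the language model. The per-position probabilities below are toy numbers, not real ESM outputs:

```python
import math

def zero_shot_score(wt, variant, probs):
    """probs[i] is a dict {aa: p} for position i with that position masked."""
    score = 0.0
    for i, (wt_aa, mut_aa) in enumerate(zip(wt, variant)):
        if wt_aa != mut_aa:
            score += math.log(probs[i][mut_aa]) - math.log(probs[i][wt_aa])
    return score

wt = "AKV"
probs = [
    {"A": 0.5, "G": 0.1},
    {"K": 0.2, "R": 0.6},
    {"V": 0.4, "I": 0.3},
]
favorable = zero_shot_score(wt, "ARV", probs)    # K->R, model prefers R
unfavorable = zero_shot_score(wt, "GKV", probs)  # A->G, model prefers A
```

A positive score means the model considers the variant more plausible than wild-type, which is the quantity correlated against the experimental measurements.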
    • A 100B-parameter protein language model (xTrimoPGLM) and a 1B antibody-specific model fine-tuned from it.
    • For PLM training they employ data from UniRef90 and ColabFold. After filtering and deduplication they are left with approximately 350M sequences, or 100B tokens.
    • On proteins, xTrimoPGLM-100B outperforms ESM2-15B on 12 of 15 downstream tasks (e.g. thermostability, structure prediction).
    • They train a 1B protein model and then fine-tune it on antibodies from OAS.
    • Their masking procedure includes span masking rather than only masking individual residues.
    • They use 678M OAS sequences.
    • They benchmarked the antibody model on naturalness and antibody structure prediction, and xTrimoPGLM-OAS outperformed ESMFold, AlphaFold2, and IgFold.
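The span-masking idea mentioned above can be sketched as replacing a contiguous stretch with mask tokens rather than masking residues independently; the span position and length choices below are simplistic stand-ins for the GLM-style objective:

```python
import random

MASK = "#"  # placeholder mask symbol, not a real vocabulary token

def span_mask(seq, span_len, rng):
    """Mask one contiguous span; return the corrupted sequence, the hidden
    span the model must reconstruct, and where the span starts."""
    start = rng.randrange(0, len(seq) - span_len + 1)
    masked = seq[:start] + MASK * span_len + seq[start + span_len:]
    target = seq[start:start + span_len]
    return masked, target, start

rng = random.Random(42)
masked, target, start = span_mask("QVQLVESGGGLVQ", span_len=4, rng=rng)
```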
    • They developed and benchmarked the ProGen2 suite of protein language models. Zero-shot fitness predictions from the antibody-specific models did not provide better results.
    • Authors stipulate that data that should be fed to the models should be carefully selected, not simply provided in the raw format.
    • Learning objective is autoregressive, predicting the next token.
    • The family spans 151M, 764M, 2.7B, and 6.4B parameters.
    • They clustered OAS at 85% sequence identity using Linclust, yielding ~554M sequences.
    • Each sequence is then provided both as-is and flipped (reversed).
    • DMS studies: they collected expression and antigen-binding enrichment measurements for variants of the anti-VEGF g6 antibody from a DMS study (Koenig et al., 2017). From a second DMS study, they collected binding enrichment measurements for variants of the d44 anti-lysozyme antibody (Warszawski et al., 2019). Binding affinity (KD) and thermal stability (Tm) measurements for the remaining six antibodies (C143, MEDI8852UCA, MEDI8852, REGN10987, S309, and mAb114) were drawn from a recent study on antibody affinity maturation using pretrained language models (Hie et al., 2022).
    • The larger the number of parameters, the lower the perplexity on the hold out test set.
    • Testing on antibody binding, expression, and melting temperature, the OAS-trained model performs worse than the generalist models.
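The as-is-plus-flipped trick above is motivated by the autoregressive objective: the model only conditions on the left context, so presenting each sequence in both directions lets it model both. The '1'/'2' terminal markers below are placeholders, not the actual ProGen2 control tokens:

```python
def training_views(seq):
    """Return the two directional views of one training sequence."""
    forward = "1" + seq + "2"
    flipped = "2" + seq[::-1] + "1"
    return forward, flipped

fwd, rev = training_views("EVQLV")
```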
    • Diffusion-based antibody-antigen binding site structural co-design
    • Sampling of antibody sequence and structure directly conditional on the antigen structure.
    • Model receives antigen structure and antibody framework in complex. Then CDRs are randomly initialized with AA types, orientations and positions.
    • The advantage over GANs and VAEs should be that it generates candidates iteratively so filters can be applied on the fly to the sampling process.
    • Diffusion probabilistic models learn to generate data via denoising samples from a prior distribution
    • They predict the amino-acid type, the Cα coordinate, and the orientation in SO(3).
    • In addition to the joint design of sequences and structures, partial states can be constrained for other design tasks. For example, by fixing the backbone structure (positions and orientations) and sampling only sequences, fixed-backbone sequence design can be performed.
    • They cluster antibodies in the database according to CDR-H3 sequences at 50% sequence identity.
    • RMSD: the Cα root-mean-square deviation between the generated structure and the original structure, with only the antibody frameworks aligned - here, however, higher RMSD means that the generated structure is more diverse.
    • However, they also checked how accurate the structures are in RMSD terms when the sequences are fixed (so only the structure gets modified). Here, for H3, they achieve 3.246 Å.
    • AAR: is the amino acid recovery rate measured by the sequence identity between the reference CDR sequences and the generated sequences
    • They compared to RosettaAntibodyDesign using IMP (percentage of CDRs with better energy than the original CDR), AAR, and Cα RMSD.
    • They optimize an antibody by perturbing it for several steps (forward diffusion) and then denoising it (going backwards) to find antibodies with better IMP, while also tracking RMSD and sequence identity.
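A toy illustration of that perturb-then-denoise control flow. Real antibody co-design operates on residue types, Cα positions, and SO(3) orientations with a learned reverse process; this 1-D version with a hand-written "denoiser" only shows the shape of the optimization loop:

```python
import random

def forward_diffuse(x, steps, rng, noise=1.0):
    """Forward process: repeatedly perturb the current state with noise."""
    for _ in range(steps):
        x += rng.gauss(0.0, noise)
    return x

def denoise(x, steps, target, rate=0.3):
    """Stand-in reverse process: deterministically pull the state toward the
    mode of a (hypothetical) learned data distribution."""
    for _ in range(steps):
        x += rate * (target - x)
    return x

def optimise(x0, steps, target, rng):
    noisy = forward_diffuse(x0, steps, rng)  # perturb for several steps
    return denoise(noisy, steps, target)     # then walk backwards

rng = random.Random(0)
result = optimise(5.0, steps=10, target=0.0, rng=rng)
```

Running fewer forward steps keeps the result close to the starting antibody (local optimization); more steps allow larger departures, which is the RMSD/sequence-identity trade-off the bullet above refers to.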