Executive summary: AbDiver is an online tool that helps researchers make the most of the vast NGS data on antibody mutations by allowing them to spot parallels between natural and therapeutic antibodies. AbDiver accelerates the decision-making process during the lead optimization stage for a more efficient therapeutic pipeline.
Note: This article covers content published in Jakub Młokosiewicz, Piotr Deszyński, Wiktoria Wilman, Igor Jaszczyszyn, Rajkumar Ganesan, Aleksandr Kovaltsuk, Jinwoo Leem, Jacob Galson, Konrad Krawczyk. “AbDiver – A tool to explore the natural antibody landscape to aid therapeutic design.” Bioinformatics, btac151, https://doi.org/10.1093/bioinformatics/btac151
How can researchers design a new antibody-based drug faster? One effective approach is to take advantage of the natural sequence diversity of these molecules. Our understanding of antibody diversity for antibody engineering has grown significantly thanks to the deposition of hundreds of millions of human antibody sequences in next-generation sequencing (NGS) repositories.
Researchers can contrast a query antibody sequence with the naturally observed diversity of similar antibody sequences stored in NGS repositories to find a mutational roadmap for designing biotherapeutics.
However, the sheer scale of the antibody NGS datasets renders such searches computationally challenging.
To facilitate access to antibody NGS data, a group of researchers, including our team members, developed AbDiver, a free online tool that allows researchers to compare their query sequences to those observed in the natural repertoires.
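AbDiver addresses three antibody-specific use cases: profiling the variable region of a query against naturally observed amino acid frequencies, searching for matching full variable-region sequences, and searching for matching CDR3s.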
AbDiver was applied to a set of 742 therapeutic antibodies and retrieved relevant results for the majority of these sequences.
As the underlying data for AbDiver, we used publicly curated, unpaired BCR NGS datasets from the Observed Antibody Space (OAS) (Kovaltsuk et al., 2018).
In May 2021, the set consisted of 81 studies with 906,933,358 unique BCR sequences numbered according to the IMGT scheme (105,730,531 light chains and 801,202,827 heavy chains). We are going to update AbDiver as more datasets become available.
To benchmark our solution, we used a set of 742 therapeutic antibodies, which extended a set from our previous study (Krawczyk et al., 2021).
The AbDiver V-region natural profiling service annotates the variable region of the query antibody sequence with the naturally observed amino acid frequency statistics for each position.
The tool calculates frequency statistics from all antibodies that share the same combination of V-gene and J-gene. Positional amino acid frequencies from a study were included only if that study provided at least 100 observations at the given position. For each position, our team calculated the study-specific Shannon entropy and ranked the amino acids by frequency.
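To make the profiling step concrete, here is a minimal sketch of how the per-position statistics could be computed for one study. It is not the AbDiver implementation: the function name positional_profile, the input format (a plain sequence of single-letter residues seen at one IMGT position among gene-matched antibodies), and the use of base-2 logarithms for the entropy are all assumptions made for illustration. Only the 100-observation threshold comes from the description above.

```python
import math
from collections import Counter

MIN_OBSERVATIONS = 100  # inclusion threshold described in the article


def positional_profile(residues, min_observations=MIN_OBSERVATIONS):
    """Summarize one position in one study (hypothetical helper).

    `residues` holds the single-letter amino acids observed at one position
    across antibodies sharing the same V-gene/J-gene combination.
    Returns None when the study has too few observations at this position.
    """
    counts = Counter(residues)
    total = sum(counts.values())
    if total < min_observations:
        return None

    frequencies = {aa: n / total for aa, n in counts.items()}

    # Shannon entropy in bits (the log base is an assumption, not from the source).
    entropy = -sum(p * math.log2(p) for p in frequencies.values())

    # Rank 1 is the most frequent residue; ties are broken alphabetically.
    ordered = sorted(frequencies, key=lambda aa: (-frequencies[aa], aa))
    ranks = {aa: i + 1 for i, aa in enumerate(ordered)}

    return {"n": total, "frequencies": frequencies, "ranks": ranks, "entropy": entropy}
```

With this sketch, a position where two residues each appear 50 times gets an entropy of 1 bit, while a fully conserved position gets 0. A query residue with a low rank or a near-zero frequency at a conserved position is the kind of signal a researcher might inspect during lead optimization.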
We created separate k-mer-based indexes (k=5) for full variable-region sequences and for CDR3s. The tool identifies variable-region matches among sequences whose CDR1 and CDR2 have the same lengths as those of the query, with a discrepancy of one residue allowed for CDR3.
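The idea behind a k-mer index can be sketched in a few lines. The code below is an illustrative toy, not AbDiver's actual data structure: the function names, the in-memory dictionary, and the ranking by shared k-mer count are assumptions. The length filter also assumes that the "one residue discrepancy" for CDR3 means a length difference of at most one residue.

```python
from collections import Counter, defaultdict

K = 5  # k-mer length described in the article


def kmers(sequence, k=K):
    """Return the set of overlapping k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def build_index(sequences, k=K):
    """Map each k-mer to the ids (list positions) of sequences containing it."""
    index = defaultdict(set)
    for seq_id, seq in enumerate(sequences):
        for kmer in kmers(seq, k):
            index[kmer].add(seq_id)
    return index


def candidates(query, index, k=K):
    """Return (sequence id, shared k-mer count) pairs, best match first."""
    hits = Counter()
    for kmer in kmers(query, k):
        for seq_id in index.get(kmer, ()):
            hits[seq_id] += 1
    return hits.most_common()


def cdr_lengths_compatible(query_lengths, hit_lengths, cdr3_tolerance=1):
    """Check (CDR1, CDR2, CDR3) lengths: exact for CDR1/CDR2, tolerant for CDR3.

    Treating the CDR3 discrepancy as a length difference is an assumption.
    """
    q1, q2, q3 = query_lengths
    h1, h2, h3 = hit_lengths
    return q1 == h1 and q2 == h2 and abs(q3 - h3) <= cdr3_tolerance
```

For example, indexing the CDR3s "ARDYYGSSYFDY", "ARDYYGSGYFDY" and "AKGGWELLDY" and querying with the first one returns the exact match first, the single-mutation neighbor second, and skips the unrelated sequence entirely. The index narrows hundreds of millions of sequences down to a short candidate list before any more expensive comparison is run.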
AbDiver was created to help researchers navigate natural antibody diversity and draw parallels between natural and therapeutic antibodies for engineering purposes. Such parallels could help remove post-translational modification risks while maintaining favorable biophysical properties. AbDiver also surfaces sequences with potentially better product profiles than the lead therapeutic.
We hope that AbDiver supports researchers in designing and engineering therapeutics based on antibodies.