Antibody Knowledge Graph

Benefit from decades of research packed into the Antibody Knowledge Graph: a curated, comprehensive knowledge graph that opens the door to accumulated data on therapeutic antibodies. Accelerate your research with easy access to antibody data that is no longer fragmented or non-standardized. Run antibody-specific searches and draw novel conclusions about the biology of antibodies thanks to our data integration.

Problems we solve

Access to comprehensive data

The Antibody Knowledge Graph allows easy access to decades of accumulated knowledge on natural and therapeutic antibodies.

Unveil the connections

Analyze data in the graph to discover hidden links between natural and therapeutic antibodies - and accelerate your pipeline.

Reach beyond the in-house data

Compare your molecules with prior research findings and gain novel insights that go well beyond your in-house data.

Data sources

Patent Documents

We collect data on antibodies from patent documents held in recognized sources. Our database covers c. 250,000 sequences from c. 19,000 documents. Each antibody sequence is linked to the text metadata of the document it came from (for example, the patent title or abstract). This streamlines text-based searches for sequences associated with specific biological entities.
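As a rough illustration of how linking each sequence to its document metadata enables text-based search, here is a minimal Python sketch. The class names, field names, document IDs, and sequence fragments are hypothetical placeholders chosen for this example; they do not describe the actual schema or contents of the Antibody Knowledge Graph.

```python
from dataclasses import dataclass


@dataclass
class PatentDocument:
    # Hypothetical metadata fields; the real schema may differ.
    doc_id: str
    title: str
    abstract: str


@dataclass
class AntibodySequence:
    sequence: str
    document: PatentDocument  # link back to the source document


def search_sequences(records, term):
    """Return sequences whose source title or abstract mentions the term."""
    needle = term.lower()
    return [
        r for r in records
        if needle in r.document.title.lower()
        or needle in r.document.abstract.lower()
    ]


# Placeholder data for illustration only (not real patent records).
doc_a = PatentDocument("DOC-A", "Anti-PD-1 antibodies and uses thereof",
                       "Antibodies that bind human PD-1.")
doc_b = PatentDocument("DOC-B", "Antibodies binding to HER2",
                       "Antibodies that bind human HER2.")
records = [
    AntibodySequence("EVQLVESGGGLVQPGGSLRLSCAAS", doc_a),
    AntibodySequence("DIQMTQSPSSLSASVGDRVTITC", doc_b),
]

for hit in search_sequences(records, "PD-1"):
    print(hit.document.doc_id, hit.sequence)
```

A plain substring match keeps the sketch short; a production system would more likely rely on a full-text index and normalized entity names.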



The FDA has approved 100 antibodies for therapeutic use, with hundreds more undergoing clinical trials. We collect the sequences of these antibodies, including their assigned International Nonproprietary Names (INNs), and associate them with rich metadata such as target information. Our database currently covers more than 750 therapeutic antibodies.



The Protein Data Bank (PDB) is the primary public source of three-dimensional structural data on biomolecules. We identify antibody sequences in the PDB using sequence features and text mining of the metadata fields linked to particular chains and to entire PDB entries. We have identified close to 5,000 structural depositions containing antibodies.
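The idea of combining sequence features with text mining can be sketched in Python as follows. This is a deliberately crude heuristic under stated assumptions, not the method used to build the Antibody Knowledge Graph: the keyword list is hypothetical, and the only sequence feature checked is the J-region motif commonly found near the end of variable domains (W-G-x-G in heavy chains, F-G-x-G in light chains).

```python
import re

# Hypothetical keyword list for the text-mining side of the check.
KEYWORDS = ("antibody", "fab", "immunoglobulin", "nanobody", "scfv")

# Crude sequence feature: J-region motif W-G-x-G (heavy) or F-G-x-G (light).
J_MOTIF = re.compile(r"[WF]G.G")


def looks_like_antibody_chain(sequence, description):
    """Flag a chain only if both its metadata text and its sequence agree."""
    text = description.lower()
    text_hit = any(re.search(rf"\b{kw}\b", text) for kw in KEYWORDS)
    motif_hit = J_MOTIF.search(sequence.upper()) is not None
    return text_hit and motif_hit


# Placeholder inputs for illustration only.
print(looks_like_antibody_chain("AAAWGQGTLVTVSS", "Fab fragment heavy chain"))
print(looks_like_antibody_chain("AAAWGQGTLVTVSS", "Lysozyme C"))
```

Requiring both signals trades recall for precision. A realistic pipeline would more likely use profile-based numbering tools and richer metadata rules than a single short motif.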


Scientific publications

Biological sequences are often not deposited in standardized repositories such as GenBank and instead appear only in scientific publications and their supplementary material. No reliable automatic method currently exists to identify such sequences. This category therefore covers antibody sequences that we add to our database through manual curation of scientific publications. These sequences are linked to the metadata of the publications they originate from, which facilitates text-based retrieval.



Next Generation Sequencing (NGS) makes it possible to probe the vast variability of antibody sequences. We identify projects that applied NGS to antibodies and analyze their datasets to obtain richer information on sequence variability than the limited set of available germline sequences can provide. We currently track variable region sequences from almost 250 independent studies in an ever-growing dataset.