Antibody-specific patent search engine for faster design of antibody therapeutics


Executive summary: By accessing antibody data in patent documents via NaturalAntibody infrastructure, researchers can accelerate the antibody development process. They can quickly check whether the sequence they’re working on has already been developed by another company and use data stored in patents to study similar antibodies with their targets and engineering properties.

This article is based on findings from Konrad Krawczyk, Andrew Buchanan & Paolo Marcatili (2021) Data mining patented antibody sequences, mAbs, 13:1, DOI: 10.1080/19420862.2021.1892366


Patent documents offer a glimpse into the past 30 years of antibody engineering aimed at developing monoclonal antibody therapeutics. The information in patents is potentially valuable for antibody design. However, patents are designed to provide legal protection, not to communicate scientific knowledge.

Can antibody data from patent documents be helpful in sharing engineering know-how, or is it just a legal reference?

To answer this question, we quantified the number of antibody sequences in patents intended for medicinal purposes and checked how well they reflect the primary sequences of therapeutic antibodies in clinical use.

Our analysis of 245,109 antibody chains from patents showed that they closely reflect the primary sequences of antibody therapeutics in clinical use. This means that researchers can find therapeutically relevant information in patents if they identify and extract pertinent data points.


Accessing information about antibodies held in patent literature is challenging. Sequences are buried within documents, which makes it hard for researchers to quickly check whether sequences similar to the ones they are working on have already been developed by another entity.


To address this challenge, we developed the Patented Antibody Database: a collection of antibody data from patent documents encompassing major sources such as USPTO and WIPO.

Our patent database covers c. 250,000 sequences from c. 19,000 documents. We linked each antibody sequence to the text metadata of the document from which it originated, for example the patent title or abstract. This speeds up text-based searches for sequences associated with specific biological entities.
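To illustrate the idea of linking sequences to document metadata, here is a minimal Python sketch. It is not the Patented Antibody Database API: the `PatentRecord` type, the `search_by_text` helper, the document identifiers, and the sequence fragments are all hypothetical and exist only to show how a text query over titles and abstracts can return the associated sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatentRecord:
    """One antibody chain linked to the text metadata of its source document."""
    document_id: str  # hypothetical identifier, not a real patent number
    title: str
    abstract: str
    sequence: str     # illustrative amino-acid fragment, not real patent data


def search_by_text(records, term):
    """Return records whose title or abstract mentions the term (case-insensitive)."""
    needle = term.casefold()
    return [
        r for r in records
        if needle in r.title.casefold() or needle in r.abstract.casefold()
    ]


# Made-up example records for illustration only.
RECORDS = [
    PatentRecord("DOC-0001", "Anti-PD-1 antibodies and uses thereof",
                 "Antibodies that bind PD-1.", "EVQLVESGGGLVQPGGSLRLSCAAS"),
    PatentRecord("DOC-0002", "Antibodies binding HER2",
                 "Humanized antibodies targeting HER2.", "QVQLQQSGAELARPGASVKMSCKAS"),
    PatentRecord("DOC-0003", "Bispecific antibodies",
                 "Bispecific constructs binding PD-1 and CTLA-4.", "DIQMTQSPSSLSASVGDRVTITC"),
]

pd1_hits = search_by_text(RECORDS, "pd-1")
for record in pd1_hits:
    print(record.document_id, record.sequence)
```

Because each sequence carries its document's title and abstract, a single query for a target name returns the sequences directly, without opening each document.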

Key use cases for users:

  • Quickly checking whether an antibody was developed by another pharma company using an antibody-specific patent search engine.
  • Identifying antibodies similar to the ones being studied and exploring their targets and engineering properties.
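Both use cases come down to comparing a query sequence against patented sequences. The Python sketch below shows the general shape of such a check under stated assumptions: the `similarity` and `find_similar` helpers, the document identifiers, the threshold, and the sequence fragments are hypothetical, and the standard library's `difflib.SequenceMatcher` ratio stands in for the alignment-based sequence identity that a real antibody search would use.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Rough similarity in [0, 1]; a stand-in for alignment-based sequence identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def find_similar(query, patented, threshold=0.9):
    """Return (document_id, score) pairs at or above the threshold, best first."""
    scored = [(doc_id, similarity(query, seq)) for doc_id, seq in patented.items()]
    hits = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up identifiers and sequence fragments, for illustration only.
PATENTED = {
    "DOC-0001": "EVQLVESGGGLVQPGGSLRLSCAAS",
    "DOC-0002": "QVQLQQSGAELARPGASVKMSCKAS",
}

query = "EVQLVESGGGLVQPGGSLRLSCAAS"
hits = find_similar(query, PATENTED)

# A score of 1.0 means the query is identical to a patented sequence.
for doc_id, score in hits:
    print(doc_id, round(score, 2))
```

Lowering `threshold` widens the search from exact or near-exact matches to more distant relatives, whose targets and engineering properties can then be explored through the linked document metadata.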

See how our Patented Antibody Database works in a demo version of our solution.

Reach out to us to learn more about our Patented Antibody Database and its place within the NaturalAntibody data ecosystem.

Take a look at AbStudio - a solution that allows teams to create, collate, and discover antibody-specific datasets to accelerate research decision-making.