Antibodies are proteins found in (jawed) vertebrates. They play a key role in the immune system by minimizing or preventing damage caused by external agents such as viruses.
Where do antibodies come from? What is the structure of an antibody? And how does it all work? Keep on reading to learn the essentials of antibody function and how it comes about as a result of its sequence and structure.
Antibodies are naturally occurring proteins that recognize a wide variety of molecular surfaces. This is in fact how they defend living organisms against diseases. The primary goal of an antibody is to identify non-self molecular surfaces to neutralize noxious material (such as viruses).
This makes antibodies valuable for clinical diagnostic and therapeutic domains. Their versatile binding nature opens the doors to developing molecules that bind and modify the function of virtually any molecule - the key premise of diagnostics and drug development.
A typical antibody in its IgG isotype consists of four polypeptide chains. Two of them are identical to each other and are referred to as heavy chains. The remaining two are also identical and are known as light chains. The four chains are connected via disulfide bonds.
For purposes of antibody engineering, we primarily look at a single combination of the heavy and light chains.
We can divide each chain naturally into two parts: the constant and variable domain.
The constant domain exhibits little or no variability from one antibody to another and can be thought of as a tag. It lets other components of the immune system recognize an antibody that is bound to its target, and it allows the antibody to be transported to a specific location in the organism.
The variable domain exhibits much greater variability and determines which molecules the antibody will bind to.
The variable domain contains about 110-150 amino acids. Within the variable domain, we can identify even smaller regions that are known to be particularly important for the binding process: complementarity-determining regions (CDRs).
Every chain contains three such regions. These are denoted CDRH1, CDRH2, CDRH3 for the heavy chain and CDRL1, CDRL2, CDRL3 for the light chain.
The main role of antibodies is to bind to molecules or agents that pose a threat to the organism. The molecules antibodies bind to are referred to as antigens.
In some cases, the act of binding turns out to be sufficient as it deactivates the potentially dangerous agent - for example, by blocking its active site or forcing it to precipitate out of the solution.
In other cases, however, this is just the first step in which the antibody plays the role of a tag (see the entry on the constant domain above). This tag allows other components of the immune system to quickly identify and neutralize the antigen.
In both scenarios, it’s the binding process that determines the specificity and performance of an antibody. These properties are completely determined by the polypeptide chains and, in particular, their CDRs.
Hence, understanding the relation between the structure of an antibody and the antigens it binds to is key for engineering new antibodies with therapeutic potential.
Note: the subset of residues on the antibody that recognizes an antigen is called the paratope, whereas the cognate set of residues on the antigen is called the epitope (see Figure 1, reproduced from Norman et al. 2020).
Antibodies can be broadly categorized into two groups.
Let’s focus exclusively on the second type of antibody now.
In a living organism, antibodies can be either present on the surface of a B cell (B-cell receptors) or in a soluble form in extracellular fluids. The existence of these two forms is linked to how antibodies are produced by B cells.
The surface of every B cell contains a large number of antibodies. A typical human B cell carries some 50,000-100,000 antibodies on its surface.
When a B cell comes across an antigen to which its surface antibodies can bind, it becomes activated. Activated B cells start to proliferate and differentiate, producing effector B cells and memory B cells.
The main purpose of effector B cells is to generate soluble antibodies that will then combat the encountered antigen.
Memory B cells, on the other hand, are like improved versions of their parent cell: they’re triggered by the same antigen, but the response is faster and more effective as a result of prior exposure to the antigen. As the name ‘memory’ suggests, they can survive in the organism for years, providing a long-lasting immune response to that particular antigen.
Because the number of potential antigens is enormous, the naive repertoire must be at least as diverse, so that it can supply initial, if weak, binders from which an effective immune response can develop highly specific and strong binders.
In humans, the theoretical number of possible antibody sequences can reach one trillion (Soto et al. 2019; Birney et al. 2019). On the other hand, the number of genes that encode antibodies is relatively small, so generating such diversity requires a number of additional mechanisms.
Here’s a quick overview of such mechanisms:
The variable region of an antibody is encoded in several gene segments that come in three distinct classes: V (variable), D (diversity), and J (joining).
Gene segments of every class appear in multiple non-identical copies (for example, several non-identical V genes). A developing B cell randomly chooses a set of segments: one from each class for the heavy chain, and one each from the V and J classes for the light chain. For a reference of antibody genes, please see IMGT (Lefranc et al. 2005) or VBase (Retter et al. 2005).
This choice determines the exact sequence and structure of the variable region of antibodies produced by this B cell and is the first source of diversity of antibodies.
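To get a feel for how much diversity segment choice alone can produce, here is a minimal Python sketch. The segment counts are hypothetical placeholders chosen for illustration (they are not taken from IMGT or VBase), and the function names are ours. The heavy chain draws one segment from each of three classes (commonly labeled V, D, and J), and the light chain draws one each from V and J.

```python
import random
from math import prod

# Hypothetical segment counts per class, for illustration only.
HEAVY_SEGMENTS = {"V": 40, "D": 25, "J": 6}
LIGHT_SEGMENTS = {"V": 35, "J": 5}


def combinatorial_diversity(segment_counts):
    """Number of distinct ways to pick one segment from each class."""
    return prod(segment_counts.values())


def pick_segments(segment_counts, rng):
    """Randomly pick one segment index per class, mimicking a developing B cell."""
    return {cls: rng.randrange(n) for cls, n in segment_counts.items()}


heavy = combinatorial_diversity(HEAVY_SEGMENTS)  # 40 * 25 * 6
light = combinatorial_diversity(LIGHT_SEGMENTS)  # 35 * 5
paired = heavy * light  # any heavy choice can pair with any light choice

rng = random.Random(0)
print("heavy-chain choice:", pick_segments(HEAVY_SEGMENTS, rng))
print("light-chain choice:", pick_segments(LIGHT_SEGMENTS, rng))
print("segment combinations:", paired)
```

With these placeholder counts, segment choice alone yields roughly a million heavy/light combinations, far short of the theoretical sequence diversity quoted above. This illustrates why further diversification mechanisms are needed.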
Another fine-tuning mechanism is employed when a B cell becomes activated by binding to an antigen. Such a B cell begins to divide, and during this process the relevant genes are subject to a high rate of point mutations. This process is known as somatic hypermutation (Wagner and Neuberger 1996).
The process causes the new cells to have slightly different genetic material and produce slightly different antibodies. As these mutations are inherently random, the affinity of the resulting antibodies to the original antigen might increase or decrease.
The B cells thus undergo a directed evolutionary process in which cells producing particularly effective antibodies are preferentially selected for further proliferation.
The B cells that produce weakly binding antibodies proliferate at a much lower rate and are soon replaced by their more effective variants. This process is known as affinity maturation. It increases the quality of the produced antibodies and the effectiveness of the immune response.
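The mutate-and-select loop described above can be sketched as a toy simulation. This is only an illustration under strong assumptions: "affinity" is replaced by a made-up score (the number of positions matching an arbitrary target string), and the sequences, mutation rate, and population sizes are hypothetical. Real binding affinity cannot be computed this way.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_affinity(seq, target):
    """Toy stand-in for affinity: number of positions identical to the target."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq, rate, rng):
    """Redraw each position with probability `rate` (may redraw the same residue)."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq
    )


def affinity_maturation(start, target, rounds=30, offspring=50, keep=5,
                        rate=0.1, seed=0):
    """Repeat proliferation-with-mutation and selection; return the best sequence."""
    rng = random.Random(seed)
    population = [start]
    for _ in range(rounds):
        # Proliferation with hypermutation: every survivor yields mutated copies.
        children = [mutate(p, rate, rng) for p in population for _ in range(offspring)]
        # Selection: keep the top scorers; parents stay in the pool,
        # so the best score never decreases from round to round.
        pool = population + children
        pool.sort(key=lambda s: toy_affinity(s, target), reverse=True)
        population = pool[:keep]
    return population[0]


start = "AAAAAAAAAA"   # hypothetical starting sequence
target = "CDRHWYGKLM"  # arbitrary string standing in for an 'ideal' binder
best = affinity_maturation(start, target)
print(start, toy_affinity(start, target))
print(best, toy_affinity(best, target))
```

Because mutations are random, most offspring score no better than their parent. The selection step alone is what drives the population toward higher scores.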
In computational antibody engineering, we look to extract biological features from the ~150 amino acid long sequences of the heavy and light chains and link them to function. Efforts include predicting binding activity with respect to an antigen, correlating sequences with biophysical properties (e.g., solubility), and derisking therapeutics (e.g., deimmunization).
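As a minimal sketch of what extracting features from a sequence can look like, the Python snippet below computes the amino acid composition of a chain, one of the simplest sequence-derived feature vectors. The function name and the short example fragment are assumptions made for illustration. Practical pipelines typically use much richer representations.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return the fraction of each of the 20 standard amino acids in `seq`."""
    seq = seq.upper()
    counts = Counter(seq)
    unknown = set(counts) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return {aa: counts[aa] / len(seq) for aa in AMINO_ACIDS}


# Illustrative fragment only; not a sequence discussed in this article.
fragment = "EVQLVESGGGLVQPGGSLRLSCAAS"
features = amino_acid_composition(fragment)

# A fixed-order vector like this can be fed to a statistical model.
vector = [features[aa] for aa in AMINO_ACIDS]
print(len(vector), round(features["G"], 2))
```

Composition discards residue order, so it cannot tell apart two sequences with shuffled CDRs. It is shown here only as the simplest possible starting point.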
Such applications require a deeper understanding of antibody biology and antibody engineering, as well as of the computational toolkit at hand, which includes myriad datasets and statistical tools for antibody analytics (machine learning included).