In 1959, the American biochemist Walter Kauzmann proposed a radical answer to the problem of protein structure. At the time, it was unclear how proteins, the workhorses of the cell, fold into their distinctive three-dimensional shapes.
Every protein is a chain built from a repertoire of 20 amino acids, somewhat like beads on a string. The number and order of these amino acid beads dictate how that protein folds into its distinctive shape. This is important because the shape of a protein is critical to its function. A disruption to this structure can destroy the protein’s ability to do its job. How nature ensures correct protein folding every time remains one of the biggest mysteries in science.
At the heart of the problem is the fact that amino acids interact with water in two distinct ways. Some of them, like lysine, love water. These hydrophilic amino acids dissolve easily and mix well with water. And then there are those like tryptophan that don’t like water. These hydrophobic amino acids don’t mix with water and tend to avoid it as much as possible, to the extent that they often clump together to minimise water exposure.
Since about 70% of the cell is made of water, the way the amino acids in a protein are arranged, and how that arrangement interacts with water molecules, is pivotal to how the protein folds. If a protein contains a stretch of hydrophobic amino acids, they will naturally tend to aggregate, compacting the whole protein in the process.

Sensitive to change
Kauzmann built on this idea and proposed that proteins have a core largely made up of hydrophobic amino acids and a surface made mainly of hydrophilic amino acids.
The idea was confirmed in the following decade, when scientists began to accurately map protein structures using X-ray crystallography and saw that what he had predicted was true: the hydrophobic amino acids were often buried in the core, whereas the hydrophilic ones tended to localise to the surface.
Further research showed that, unlike those at the surface, the amino acids at the core were also very sensitive to changes. It appeared that even minor changes in the core could disrupt the protein’s shape and, consequently, its function.
Another piece of evidence supporting this line of thought was that the amino acid sequences from the cores of proteins common to different forms of life were remarkably similar. It was reasoned that this was so because nature couldn’t afford to change them without lethal consequences.
But this raised another question. If the consequences of a wrong amino acid combination are so drastic, how did nature, while relying on slow, incremental trial and error, manage to find functional protein structures at all?
Even for a modest 60-amino-acid protein core, the number of possible combinations is around 10^78 (20 choices at each of 60 positions), a number comparable to the estimated number of atoms in the known universe. It’s astonishing that evolution was able to navigate such a vast space of possibilities to find stable, functional sequences not once, but repeatedly, across the hundreds of thousands of proteins found in life today.
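The figure of roughly 10^78 can be checked with a few lines of arithmetic. The sketch below assumes that each of the 60 core positions can independently hold any of the 20 amino acids; the variable names are illustrative only.

```python
import math

AMINO_ACIDS = 20   # size of the standard amino acid alphabet
CORE_LENGTH = 60   # core size quoted in the text

# Exact count of sequences: 20 choices at each of 60 positions.
combinations = AMINO_ACIDS ** CORE_LENGTH

# Order of magnitude: log10(20^60) = 60 * log10(20).
exponent = CORE_LENGTH * math.log10(AMINO_ACIDS)

print(f"20^60 is about 10^{exponent:.1f}")
```

Python integers have arbitrary precision, so the exact count is available as well as its order of magnitude.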
This mystery has finally been put to rest by a team from the Centre for Genomic Regulation in Spain and the Wellcome Sanger Institute in the U.K.
Implications for therapeutic proteins
In a new paper in Science, the team challenged the original assumption that protein cores are sensitive to change by arguing that, of the astronomically high number of protein core combinations that are possible, few have been tested. The changes made in earlier studies were also localised to small regions and didn’t allow for compensating adjustments elsewhere in the protein.
The team proceeded to test this by first generating a library of 78,125 different amino acid combinations across seven positions in the cores of three proteins: the SH3 domain of the FYN tyrosine protein kinase from humans, the CI-2A protein from barley, and CspA from the bacterium Escherichia coli. They then tested the stability of some of these combinations to assess the impact of the changes they had introduced in the protein.
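The library size is consistent with simple counting: 78,125 equals 5^7, which is what one would get by allowing five alternatives at each of seven positions. The sketch below enumerates such a library under that assumption. The article does not say which residues were used, so the five-letter alphabet here is a hypothetical placeholder.

```python
from itertools import product

# Hypothetical alphabet: five residues per position is inferred from
# 78,125 = 5^7; the specific letters are placeholders, not the study's.
ALPHABET = "FILMV"
POSITIONS = 7  # number of core positions varied, as stated in the text

# Every possible assignment of one residue to each of the seven positions.
library = ["".join(variant) for variant in product(ALPHABET, repeat=POSITIONS)]

print(len(library))
```

Exhaustive enumeration is practical at this scale, in contrast to the full 60-position space, which is far too large to list.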

Remarkably, the authors found that while most combinations were indeed detrimental, many remained stable, showing that protein cores are more resilient to change than previously believed. The exact number of stable combinations varied from protein to protein, with the highest being the human SH3-FYN, which showed more than 12,000 different stable core combinations.
The team then fed this data into a machine-learning algorithm to check whether they could predict protein core stability from the amino acid sequence alone. They tested their model on 51,159 natural SH3 sequences from across all domains of life that are available in public databases and found that it could accurately predict stability even when the sequences were less than 25% similar to the human SH3.
The study’s results have several important implications for therapeutic protein engineering. Many proteins trigger an unwanted immune response when administered because of their amino acid sequence. Changing that amino acid sequence has been a slow and painful process, as it was believed that too many changes, especially at the core, would disrupt protein structure. Now, with the new insights, it may be possible to speed up the process by screening larger sets of combinations, with many more changes than were tried previously.
However, while the study holds clear promise for therapeutic applications, its deeper significance lies in what it means for fundamental biology. The finding that the protein core tolerates change to a greater degree than assumed is an insight that resonates beyond medicine, and into the very nature of evolution itself. It’s a reminder to us that life, at its deepest level, is far more adaptable than we imagined.
Arun Panchapakesan is an assistant professor at the Y.R. Gaitonde Centre for AIDS Research and Education, Chennai.






