Artificial intelligence has solved one of the biggest problems in biology by predicting the shape of every protein expressed in the human body.
The research was conducted by London-based artificial intelligence company DeepMind, which used its AlphaFold algorithm to build the most complete and accurate database to date of the human proteome, which underlies human health and disease.
Last week, DeepMind published the method and code for its model, AlphaFold2, in Nature, showing that it can predict the structure of known proteins with almost perfect accuracy.
A second Nature paper, published on Thursday, shows that the model can confidently predict the structural positions of nearly 60% of the amino acids (the building blocks of proteins) in the human body, as well as in many other organisms such as Drosophila, mice and Escherichia coli.
Only the structural positions of about 30% of the amino acids were previously known. Knowing the location of amino acids allows researchers to predict the three-dimensional structure of proteins.
This set of 350,000 protein structure predictions is now available through a public database hosted by the European Bioinformatics Institute of the European Molecular Biology Laboratory (EMBL-EBI).
“Accurately predicting their structure has a wide range of scientific applications, from developing new drugs and disease treatments, to designing future crops that can withstand climate change, or enzymes that can degrade plastics,” said Edith Heard, director general of EMBL. “Applications are only limited by our imagination.”
Protein structures are important because they determine how proteins function. Understanding the shape of a protein, such as a Y-shaped antibody, allows scientists to learn more about the role of that protein.
Misfolded proteins can cause diseases such as Alzheimer’s, Parkinson’s, and cystic fibrosis. Being able to easily predict the shape of a protein allows scientists to control and modify it, so they can improve its function by changing its DNA sequence or design drugs that can attach to it.
Accurately predicting the structure of proteins from DNA sequences has long been one of the biggest challenges in biology. Current experimental methods for determining the shape of a single protein take months or years in the laboratory, which is why only about 180,000 protein structures have been solved out of the more than 200 million known proteins across living organisms.
Demis Hassabis, CEO of DeepMind, said: “We believe that this will represent the most important contribution that artificial intelligence has made to advancing scientific knowledge so far. Our goal is to expand [the database] in the next few months to cover the entire protein universe of more than 200 million proteins.”
Scientists who were not involved in the DeepMind research used phrases such as “spine-tingling” and “transformative” to describe the impact of the advance, comparing the data set to the human genome.
John McGeehan, a structural biologist and director of the Centre for Enzyme Innovation at the University of Portsmouth, who has been testing AlphaFold over the past few months, said: “This is one of those moments when the hair on the back of my neck stands up.”
“We can directly use this information to develop faster enzymes to break down plastics. These experiments are already under way, so the project has been accelerated by several years.”
AlphaFold is not without limits. According to Minkyung Baek, a researcher at the Institute for Protein Design at the University of Washington, proteins are dynamic molecules that constantly change shape depending on what they bind to, but DeepMind’s algorithm can only predict the static structure of a protein.
However, she said that its greatest contribution to scientists is that it is open source. “Last year they showed [this] was all possible, but provided no code, so people knew it was there but could not use it.”
In the seven months after DeepMind’s announcement, Baek and her colleagues used DeepMind’s ideas to build their own open source version of the algorithm, which they called RoseTTAFold, and published it in the journal Science last week. “I’m really happy that they made everything public, which is a huge contribution to biological research and commercial pharmaceuticals,” she said. “Now more people can benefit from their methods [and] it advances the field faster.”