This is the reason Demis Hassabis started DeepMind – MIT Technology Review

Hassabis has been thinking about proteins on and off for 25 years. He was introduced to the problem when he was an undergraduate at the University of Cambridge in the 1990s. "A friend of mine there was obsessed with this problem," he says. "He would bring it up at any opportunity (in the bar, playing pool), telling me if we could just crack protein folding, it would be transformational for biology. His passion always stuck with me."

That friend was Tim Stevens, who is now a Cambridge researcher working on protein structures. "Proteins are the molecular machines that make life on earth work," Stevens says.

Nearly everything your body does, it does with proteins: they digest food, contract muscles, fire neurons, detect light, power immune responses, and much more. Understanding what individual proteins do is therefore crucial for understanding how bodies work, what happens when they don't, and how to fix them.

A protein is made up of a ribbon of amino acids, which chemical forces fold up into a knot of complex twists and twirls. The resulting 3D shape determines what it does. For example, hemoglobin, a protein that ferries oxygen around the body and gives blood its red color, is shaped like a little pouch, which lets it pick up oxygen molecules in the lungs. The structure of SARS-CoV-2's spike protein lets the virus hook onto your cells.

The catch is that it's hard to figure out a protein's structure, and thus its function, from the ribbon of amino acids. An unfolded ribbon can take 10^300 possible forms, a number on the order of all the possible moves in a game of Go.

Determining this structure in a lab, using techniques such as x-ray crystallography, is painstaking work. Entire PhDs have been spent working out the folds of a single protein. The long-running CASP (Critical Assessment of Structure Prediction) competition was set up in 1994 to speed things up by pitting computerized prediction methods against each other every two years. But no technique ever came close to matching the accuracy of lab work. By 2016, progress had been flatlining for a decade.

Within months of its AlphaGo success in 2016, DeepMind hired a handful of biologists and set up a small interdisciplinary team to tackle protein folding. The first glimpse of what they were working on came in 2018, when DeepMind won CASP 13, outperforming other techniques by a significant margin. But beyond the world of biology, few paid much attention.

That changed when AlphaFold2 came out two years later. It won the CASP competition, marking the first time an AI had predicted protein structure with an accuracy matching that of models produced in an experimental lab, often with margins of error just the width of an atom. Biologists were stunned by just how good it was.

Watching AlphaGo play in Seoul, Hassabis says, he'd been reminded of an online game called FoldIt, which a team led by David Baker, a leading protein researcher at the University of Washington, released in 2008. FoldIt asked players to explore protein structures, represented as 3D images on their screens, by folding them up in different ways. With many people playing, the researchers behind the game hoped, some data about the probable shapes of certain proteins might emerge. It worked, and FoldIt players even contributed to a handful of new discoveries.

Hassabis played that game when he was a postdoc at MIT in his 20s. He was struck by the way basic human intuition could lead to real breakthroughs, whether making a move in Go or finding a new configuration in FoldIt.

"I was thinking about what we had actually done with AlphaGo," says Hassabis. "We'd mimicked the intuition of incredible Go masters. I thought, if we can mimic the pinnacle of intuition in Go, then why couldn't we map that across to proteins?"

The two problems weren't so different, in a way. Like Go, protein folding is a problem with such vast combinatorial complexity that brute-force computational methods are no match. Another thing Go and protein folding have in common is the availability of lots of data about how the problem could be solved. AlphaGo used an endless history of its own past games; AlphaFold used existing protein structures from the Protein Data Bank, an international database of solved structures that biologists have been adding to for decades.

AlphaFold2 uses attention networks, a standard deep-learning technique that lets an AI focus on specific parts of its input data. This tech underpins language models like GPT-3, where it directs the neural network to relevant words in a sentence. Similarly, AlphaFold2 is directed to relevant amino acids in a sequence, such as pairs that might sit together in a folded structure. "They wiped the floor with the CASP competition by bringing together all these things biologists have been pushing toward for decades and then just acing the AI," says Stevens.
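
For readers curious what "attention" means in practice, here is a minimal sketch of generic scaled dot-product attention in plain Python. It is not AlphaFold2's actual architecture, which is far more elaborate. The function names and the four two-number "residue" vectors are invented purely for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Generic scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this position against every position in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to the attention weights.
        blended = [sum(w * v[j] for w, v in zip(weights, values))
                   for j in range(len(values[0]))]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical embeddings for a four-residue toy sequence (made-up numbers).
residues = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.0, -1.0]]
outputs, weights = attention(residues, residues, residues)
```

In this toy example the first residue puts more weight on the third, whose vector is similar to its own, than on the second. That is the sense in which attention "directs" a network toward the parts of a sequence that are relevant to each other.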

Over the past year, AlphaFold2 has started having an impact. DeepMind has published a detailed description of how the system works and released the source code. It has also set up a public database with the European Bioinformatics Institute that it is filling with new protein structures as the AI predicts them. The database currently has around 800,000 entries, and DeepMind says it will add more than 100 million, nearly every protein known to science, in the next year.

"A lot of researchers still don't fully grasp what DeepMind has done," says Charlotte Deane, chief scientist at Exscientia, an AI drug discovery company based in the UK, and head of the protein informatics lab at the University of Oxford. Deane was also one of the reviewers of the paper that DeepMind published on AlphaFold in the scientific journal Nature last year. "It's changed the questions you can ask," she says.
