DeepMind's David Silver speaks to the Bulletin of the Atomic Scientists about games, beauty, and AI's potential to avert human-made disasters. Photo provided by David Silver and used with permission.
David Silver thinks games are the key to creativity. After competing in national Scrabble competitions as a kid, he went on to study at Cambridge and co-found a video game company. Later, after earning his PhD in artificial intelligence, he led the DeepMind team that developed AlphaGo, the first program to beat a world champion at the ancient Chinese game of go. But he isn't driven by competitiveness.
That's because for Silver, now a principal research scientist at DeepMind and computer science professor at University College London, games are playgrounds in which to understand how minds, human and artificial, learn on their own to achieve goals.
Silver's programs use deep neural networks, machine learning algorithms inspired by the brain's structure and function, to achieve results that resemble human intuition and creativity. First, he provided the program with information about what humans would do in various positions for it to imitate, a learning style known as supervised learning. Eventually, he let the program learn by playing against itself, an approach known as reinforcement learning.
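The contrast between the two learning styles can be made concrete with a toy sketch (this is an illustration, not DeepMind's code; the moves, rewards, and labels are invented for the example). A supervised learner copies a human label, while a reinforcement learner estimates each move's value from trial and error:

```python
import random

random.seed(0)

# One board position with three candidate moves. The true expected reward of
# each move is hidden from the learner; it can only be sampled noisily by playing.
TRUE_REWARD = {"a": 0.2, "b": 0.9, "c": 0.5}

# Supervised learning: imitate the human label ("humans play 'a' here"),
# whether or not that move is actually best.
human_label = "a"
supervised_choice = human_label

# Reinforcement learning: try moves, observe noisy rewards, keep a running
# mean estimate of each move's value.
values = {m: 0.0 for m in TRUE_REWARD}
counts = {m: 0 for m in TRUE_REWARD}
for _ in range(3000):
    move = random.choice(list(TRUE_REWARD))            # explore uniformly
    reward = TRUE_REWARD[move] + random.gauss(0, 0.1)  # noisy game outcome
    counts[move] += 1
    values[move] += (reward - values[move]) / counts[move]  # incremental mean

rl_choice = max(values, key=values.get)
print(supervised_choice, rl_choice)  # imitation stays with 'a'; trial and error finds 'b'
```

The point of the sketch is the one Silver makes above: the imitator can never outplay its labels, while the self-learner can discover that the "standard" move was never the best one.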
Then, during a pivotal match between AlphaGo and the world champion, he had an epiphany: Perhaps the machine should have no human influence at all. That idea became AlphaGo Zero, the successor to AlphaGo that received zero human knowledge about how to play well. Instead, AlphaGo Zero relies only on the game's rules and reinforcement learning. It beat AlphaGo 100 games to zero.
I first met Silver at the Heidelberg Laureate Forum, an invitation-only gathering of the most exceptional mathematicians and computer scientists of their generations. In Heidelberg, he was recognized for having received the Association for Computing Machinery's prestigious Prize in Computing for breakthrough advances in computer game-playing.
"Few other researchers have generated as much excitement in the AI field as David Silver," Association for Computing Machinery President Cherri M. Pancake said at the time. "His insights into deep reinforcement learning are already being applied in areas such as improving the efficiency of the UK's power grid, reducing power consumption at Google's data centers, and planning the trajectories of space probes for the European Space Agency." Silver is also an elected Fellow of the Royal Society and was the first recipient of the Mensa Foundation Prize for the best scientific discovery in the field of artificial intelligence.
Silver's stardom contrasts with his quiet, unassuming nature. In this condensed, edited, from-the-heart interview, I talk with Silver about games, the meaning of creativity, and AI's potential to avert disasters such as climate change, human-made pathogens, mass poverty, and environmental catastrophe.
As a kid, did you play games differently from other kids?
I had some funny moments playing in National School Scrabble competitions. In one event, at the end of the final game, I asked my opponent, "Are you sure you want to play that? Why not play this other word which scores more points?" He changed his move and won the game and championship, which made me really happy.
More than winning, I am fascinated with what it means to play a game really well.
How did you translate that love of games into a real job?
Later on, I played junior chess, where I met [fellow DeepMind co-founder] Demis Hassabis. At that time, he was the strongest boy chess player of his age in the world. He would turn up in my local town when he needed pocket money, play in these tournaments, win the 50-pound prize money, and then go back home. Later, we got to know each other at Cambridge and together we set up Elixir, our games company. Now were back together at DeepMind.
What did this fascination with games teach you about problem solving?
Humans want to believe that we've got this special capacity called creativity that our algorithms don't or won't have. It's a fallacy.
We've already seen the beginnings of creativity in our AIs. There was a moment in the second game of the [2016] AlphaGo match [against world champion Lee Sedol] where it played a particular move called move 37. The go community certainly felt that this was creative. It tried something new which didn't come from examples of what would normally be done there.
But is that the same kind of broad creativity that humans can apply to anything, rather than just moves within a game?
The whole process of trial-and-error learning, of trying to figure out for yourself, or asking AI to figure out for itself, how to solve the problem is a process of creativity. You or the AI start off not knowing anything. Then you or it discover one new thing, one creative leap, one new pattern or one new idea that helps in achieving the goal a little bit better than before. And now you have this new way of playing your game, solving your puzzle, or interacting with people. The process is a million mini discoveries, one after the other. It is the essence of creativity.
If our algorithms aren't creative, they'll get stuck. They need an ability to try out new ideas for themselves, ideas that we're not providing. That has to be the direction of future research, to keep pushing on systems that can do that for themselves.
If we can crack [how self-learning systems achieve goals], it's more powerful than writing a system that just plays go. Because then we'll have an ability to learn to solve a problem that can be applied to many situations.
Many thought that computers could only ever play go at the level of human amateurs. Did you ever doubt your ability to make progress?
When I arrived in South Korea [for the 2016 AlphaGo match] and saw row upon row of cameras set up to watch and heard how many people [over 200 million] were watching online, I thought, "Hang on, is this really going to work?" It was scary. The world champion is unbelievably versatile and creative in his ability to probe the program for weaknesses. He would try everything in an attempt to push the program into weird situations that don't normally occur.
I feel lucky that we stood up to that test. That spectacular and terrifying experience led me to reflect. I stepped back and asked, "Can we go back to the basics to understand what it means for a system to truly learn for itself?" To find something purer, we threw away the human knowledge that had gone into it and came up with AlphaZero.
Humans have developed well-known strategies for go over millennia. What did you think as AlphaZero quickly discovered, and rejected, these in favor of novel approaches?
We set up board positions where the original version of AlphaGo had made mistakes. We thought if we could find a new version that gets them right, we'd make progress. At first, we made massive progress, but then it appeared to stop. We thought it wasn't getting 20 or 30 positions right.
Fan Hui, the professional player [and European champion] we were working with, spent hours studying the moves. Eventually, he said that the professional players were wrong in these positions and AlphaZero was right. It found solutions that made him reassess what was in the category of being a mistake. I realized that we had an ability to overturn what humans thought was standard knowledge.
After go, you moved on to a program that mastered StarCraft, a real-time strategy video game. Why the jump to video games?
Go is one narrow domain. Extending from that to the human brain's breadth of capabilities requires a huge number of steps. We're trying to add any dimensions of complexity where humans can do things, but our agents can't.
AlphaStar moves toward things which are more naturalistic. Like human vision, the system only gets to look at a certain part of the map. It's not like playing go or chess where you see all of your opponent's pieces. You see nearby information and have to scout to acquire information. These aspects bring it closer to what happens in the real world.
What's the end goal?
I think it's AI agents that are as broadly capable as human brains. We don't know how to get there yet, but we have a proof of existence in the human brain.
Replicating the human brain? Do you really think that's realistic?
I don't believe in magical, mystical explanations of the brain. At some level, the human brain is an algorithm which takes inputs and produces outputs in a powerful and general way. We're limited by our ability to understand and build AIs, but that understanding is growing fast. Today we have systems that are able to crack narrow domains like go. We've also got language models which can understand and produce compelling language. We're building things one challenge at a time.
So, you think theres no ceiling to what AI can do?
We're just at the beginning. Imagine if you run evolution for another 4 billion years. Where would we end up? Maybe we would have much more sophisticated intelligences which could do a much better job. I see AI a little bit like that. There is no limit to this process because the world is essentially infinitely complex.
And so, is there a limit? At some point, you hit physical limits, so it's not that there are no bounds. Eventually you use up all of the energy in the universe and all of the atoms in the universe in building your computational device. But relative to where we are now, that's essentially limitless intelligence. The spectrum beyond human intelligence is vast, and that's an exciting thought.
Stephen Hawking, who served on the Bulletin's Board of Sponsors, worried about unintended consequences of machine intelligence. Do you share his concern?
I worry about the unintended consequences of human intelligence, such as climate change, human-made pathogens, mass poverty, and environmental catastrophe. The quest for AI should result in new technology, greater understanding, and smarter decision making. AI may one day become our greatest tool in averting such disasters. However, we should proceed cautiously and establish clear rules prohibiting unacceptable uses of AI, such as banning the development of autonomous weapons.
You've had many successes meeting these grand challenges through games, but have there been any disappointments?
Well, supervised learning, this idea that you learn from examples, has had an enormous mainstream impact. Most of the big applications that come out of Google use supervised learning somewhere in the system. Machine translation systems from English to French, for example, in which you want to know the right translation of a particular sentence, are trained by supervised learning. It is a very well understood problem and we've got clear machinery now that is effective at scaling up.
One of my disappointments at the moment is that we haven't yet seen that level of impact with self-learning systems through reinforcement learning. In the future, I'd love to see self-learning systems which are interacting with people, in virtual worlds, in ways that are really achieving our goals. For example, a digital assistant that's learning for itself the best way to accomplish your goals. That would be a beautiful accomplishment.
What kinds of goals?
Maybe we don't need to say. Maybe it's more like we pat our AI on the back every time it does something we like, and it learns to maximize the number of pats on the back it gets and, in doing so, achieves all kinds of goals for us, enriching our lives and helping us do things better. But we are far from this.
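The "pats on the back" idea is reward maximization in miniature. Here is a hedged toy sketch (the assistant, its actions, and the pat probabilities are all invented for illustration, not a real system): the agent never sees our goals, only a pat (reward of 1) when we like what it did, and it learns which action earns the most pats.

```python
import random

random.seed(1)

# Hypothetical assistant actions; the probability that the user "pats" the
# agent after each action is hidden from the agent itself.
PAT_PROB = {"set_alarm": 0.3, "tidy_inbox": 0.8, "play_music": 0.5}

# Optimistic initial values encourage trying every action at least once.
value = {a: 1.0 for a in PAT_PROB}
pulls = {a: 0 for a in PAT_PROB}

for _ in range(5000):
    # Epsilon-greedy: mostly repeat what has earned pats, occasionally explore.
    if random.random() < 0.1:
        action = random.choice(list(PAT_PROB))
    else:
        action = max(value, key=value.get)
    pat = 1.0 if random.random() < PAT_PROB[action] else 0.0
    pulls[action] += 1
    value[action] += (pat - value[action]) / pulls[action]  # running mean of pats

best = max(value, key=value.get)
print(best)
```

Nothing in the code states a goal like "keep the inbox tidy"; the preference emerges purely from the pattern of pats, which is exactly the indirection Silver describes.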
Do you have a personal goal for your work?
During the AlphaGo match with Lee Sedol, I went outside and found a go player in tears. I thought he was sad about how things were going, but he wasn't. In this domain in which he had invested so much, AlphaGo was playing moves he hadn't realized were possible. Those moves brought him a profound sense of beauty.
I'm not enough of a go player to appreciate that at the level he could. However, we should strive to build intelligence where we all get a sense of that.
If you look around, not just in the human world but in the animal world, there are amazing examples of intelligence. I'm drawn to say, "We built something that's adding to that spectrum of intelligence." We should do this not because of what it does or how it helps us, but because intelligence is a beautiful thing.