Google and Facebook Race to Solve the Ancient Game of Go With AI
Rémi Coulom spent the last decade building software that can play the ancient game of Go better than practically any other machine on earth. He calls his creation Crazy Stone. Early last year, at the climax of a tournament in Tokyo, it challenged the Go grandmaster Norimoto Yoda, one of the world’s top human players, and it performed remarkably well. In what’s known as the Electric Sage Battle, Crazy Stone beat the grandmaster. But the win came with a caveat.
Over the last 20 years, machines have topped the best humans at so many games of intellectual skill that we now assume computers can beat us at just about anything. But Go, the Eastern version of chess in which two players compete with polished stones on a 19-by-19-line grid, remains the exception. Yes, Crazy Stone beat Yoda. But it started with a four-stone advantage. That was the only way to ensure a fair fight.
In the mid-’90s, a computer program called Chinook beat the world’s top player at the game of checkers. A few years later, IBM’s Deep Blue supercomputer shocked the chess world when it wiped the proverbial floor with world champion Garry Kasparov. And more recently, another IBM machine, Watson, topped the best humans at Jeopardy!, the venerable TV trivia game. Machines have also mastered Othello, Scrabble, backgammon, and poker. But in the wake of Crazy Stone’s victory over Yoda, Coulom predicted that another ten years would pass before a machine could beat a grandmaster without a head start.
At the time, that ten-year runway seemed rather short. In playing Go, the grandmasters often rely on something that’s closer to intuition than carefully reasoned analysis, and building a machine that duplicates this kind of intuition is enormously difficult. But a new weapon could help computers conquer humans much sooner: deep learning. Inside companies like Google and Facebook, deep learning is proving remarkably adept at recognizing images and grasping spatial patterns, a skill well suited to Go. As they explore the many other opportunities this technology presents, Google and Facebook are also racing to see whether it can finally crack the ancient game.
As Facebook AI researcher Yuandong Tian explains, Go is a classic AI problem—a problem that’s immensely attractive because it’s immensely difficult. The company believes that solving Go will not only help refine the AI that drives its popular social network, but also prove the value of artificial intelligence. Rob Fergus, another Facebook researcher, agrees. “The goal is advancing AI,” he says. But he also acknowledges that the company is driven, at least in a small way, by a friendly rivalry with Google. There’s pride to be found in solving the game of Go.
Building A Brain for Go
Today, Google and Facebook use deep learning to identify the faces in photos you post to the ‘net. It’s how computers recognize the commands barked into a phone and translate things from one language to another. Sometimes, it can even understand natural language—the natural way that we humans converse.
This technology relies on what are called deep neural networks, vast networks of machines that approximate the web of neurons in the human brain. If you feed enough tree photos into these neural nets, they can learn to identify a tree. If you feed them enough dialogue, they can learn to carry on a decent (if sometimes weird) conversation. And if you feed them enough Go moves, they can learn to play Go.
“Deep neural networks are very appropriate for Go because Go is very driven by patterns on the board. These methods are very good at generalizing from patterns,” says Amos Storkey, a professor at the University of Edinburgh, who is using deep neural networks to tackle Go, much like Google and Facebook.
The belief is that these neural nets can finally close the gap between machines and humans. In playing Go, you see, the grandmasters don’t necessarily examine the results of each possible move. They often play based on how the board looks. With deep learning, researchers can begin to duplicate this approach. In feeding images of successful moves into neural networks, they can help machines learn what a successful move looks like. “Rather than just trying to work out what the best things to do are, they learn from how humans play the game,” Storkey says of neural nets. “They effectively copy human play.”
Deeper Than Deep Learning
Building a machine that can win at Go isn’t just a matter of computing power. That’s why programs like Coulom’s haven’t cracked it. Crazy Stone relies upon what’s called a Monte Carlo tree search, a system that essentially analyzes the outcomes of every possible move. This is how machines mastered checkers and chess and other games. They looked further ahead than the humans they beat. But with Go, there are too many possibilities to consider. In chess, on any given turn, the average number of possible moves is 35. With Go, it’s 250. And after each of those 250 possible moves, there are another 250. And so on. It’s impossible for a tree search to consider the results of every move (at least not in a reasonable amount of time).
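To put rough numbers on that gap, here is a minimal sketch of the arithmetic. It assumes a search that considers every move at every turn and uses the averages quoted above (35 for chess, 250 for Go); the function name and the chosen depths are illustrative, not taken from any Go program.

```python
def sequences_to_examine(branching_factor: int, turns: int) -> int:
    """Count the move sequences an exhaustive search would have to
    examine when looking `turns` moves ahead (branching_factor ** turns)."""
    return branching_factor ** turns


CHESS_BRANCHING = 35   # average legal moves per turn in chess, as quoted above
GO_BRANCHING = 250     # average legal moves per turn in Go, as quoted above

# Compare the two games at a few look-ahead depths (depths are arbitrary).
for turns in (1, 2, 4, 6):
    chess = sequences_to_examine(CHESS_BRANCHING, turns)
    go = sequences_to_examine(GO_BRANCHING, turns)
    print(f"{turns} turns ahead: chess {chess:.2e}, Go {go:.2e}, "
          f"Go is {go / chess:,.0f} times larger")
```

Two turns ahead, chess yields 1,225 sequences and Go yields 62,500. The ratio itself multiplies with every further turn, which is why adding computing power alone does not close the gap.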
But deep learning can fill the gap, providing a level of intuition, as opposed to brute force. Last month, in a paper posted to the academic research site arXiv, Facebook demonstrated a method that combines the Monte Carlo tree search with deep learning. In competition with humans, the system held its own, and according to the company, it even played with a style that felt human. After all, it has learned from real human moves. Coulom calls the company’s results “very spectacular.”
Ultimately, Coulom says, this kind of hybrid approach will crack the problem. “What people are trying to do is combine the two approaches so that it’s better than each,” he says. He points out that Crazy Stone already uses a form of machine learning in concert with Monte Carlo. It’s just that his methods aren’t as complex as the neural networks employed by Facebook.
Facebook’s paper shows the power of deep learning, but it’s also a reminder that big AI tasks are ultimately solved by more than a single technology. They’re solved by many technologies. Deep learning does many things well. But it can always use help from other forms of AI.
Trial and Error
After Facebook revealed its Go work, Google soon unloaded a response. A top Google AI researcher, Demis Hassabis, said that, in a few months, the company would reveal “quite a big surprise” related to the game of Go. Google declined to say more for this story, and it’s unclear what the company has in store. Coulom, for one, says it’s unlikely Google could so quickly produce something that can beat the top Go players, but he believes the company will take a significant step down that road.
In all likelihood, this too will rely on multiple technologies. And we’re guessing that one of them is something called reinforcement learning. While deep learning is good at perception—recognizing how something looks, sounds, or behaves—reinforcement algorithms can teach machines to act on this perception.
Hassabis oversees DeepMind, a Google subsidiary based in London, and DeepMind has already made good use of deep learning in tandem with reinforcement algorithms. Earlier this year, he and his team published a paper that described how the two technologies could be used to play old Atari video games and, in some cases, beat professional game testers. After a deep neural net helps the system understand the state of play (what the board looks like at any given time), the reinforcement algorithms use trial and error to help the system understand how to respond to this state of play. Basically, the computer tries a particular move, and if that move brings a reward (points in the game), it recognizes the move as a good one. After trying enough moves, the system comes to understand the best ways of playing. The same kind of thing can work with Go.
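The trial-and-error loop can be sketched in a few lines. What follows is a deliberately simplified stand-in (an epsilon-greedy "bandit" with one fixed situation and three candidate moves), not DeepMind's Atari system, which pairs this idea with a deep neural net. The reward probabilities, function name, and parameters are all hypothetical, chosen only for illustration.

```python
import random


def learn_by_trial_and_error(reward_probs, trials=5000, epsilon=0.1, seed=0):
    """Estimate how good each move is by trying moves and averaging rewards.

    reward_probs[i] is the (hidden) chance that move i earns a point.
    The learner never reads it directly; it only sees the rewards.
    """
    rng = random.Random(seed)
    value = [0.0] * len(reward_probs)   # running average reward per move
    count = [0] * len(reward_probs)     # how often each move was tried

    for _ in range(trials):
        if rng.random() < epsilon:
            # Explore: occasionally try a move at random.
            move = rng.randrange(len(reward_probs))
        else:
            # Exploit: otherwise play the move that looks best so far.
            move = max(range(len(value)), key=value.__getitem__)

        # The "game" hands back a point or nothing.
        reward = 1.0 if rng.random() < reward_probs[move] else 0.0

        # Nudge the running average for that move toward the new reward.
        count[move] += 1
        value[move] += (reward - value[move]) / count[move]

    return value


# Hypothetical chances that each of three moves scores a point.
MOVE_REWARD_PROBS = [0.2, 0.5, 0.8]

estimates = learn_by_trial_and_error(MOVE_REWARD_PROBS)
best_move = max(range(len(estimates)), key=estimates.__getitem__)
print("estimated value of each move:", [round(v, 2) for v in estimates])
print("move the learner prefers:", best_move)
```

The learner is never told which move is best. It settles on one only because that move keeps paying off, which is the sense in which rewards teach the system "the best ways of playing."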
This approach is different from a standard tree search in that the system is learning what a good move looks like. Researchers train it to play before the real match begins. As with deep learning, it plays through a kind of “knowledge” rather than applying brute force to the problem.
Ultimately, if they are to solve the game of Go, machines will need all of these technologies. Reinforcement learning can feed off of deep learning. And both can dovetail with a traditional approach like the Monte Carlo tree search. Cracking Go remains enormously difficult. But modern AI is getting closer. When Hassabis reveals his “big surprise,” we’ll know just how close it has come.