Technology
Humans set to fold as poker bots raise the stakes
Computer software that can beat top poker players could bring advances in financial markets and biology
NIC FLEMING
A DOZEN men wearing dark green T-shirts and wide grins whoop, shake hands and high-five, while another group in navy blue baseball caps do their best to look magnanimous in defeat in front of several dozen onlookers. In most respects this was a low-key event, but the scene, at a nondescript booth of a Las Vegas convention centre in July this year, may prove to be a pivotal moment for the development of artificial intelligence.

That’s because at the Gaming Life Expo at the Rio All-Suite Hotel & Casino, a computer program called Polaris became the first to beat a team of world-class poker players, each of whom had previously won more than $1 million.

Some may see the victory as the latest dismal step in silicon’s march towards superiority over humans. Others will view it as an exciting move forward in artificial intelligence – a foretaste of the sophisticated tasks computers should be able to perform for us in years to come.

In 1997, IBM’s Deep Blue became the first computer to defeat a human chess world champion in a full match when it beat Garry Kasparov. Then, last year, researchers announced the development of a program that had mastered draughts (checkers) – meaning it could never lose a game no matter how skilled its opponent. These games have a crucial factor in common: they favour players with the mathematical ability, or processing power, to calculate the consequences of choices many moves down the line. So it is perhaps hardly surprising that in this area computers have become pre-eminent.

Poker is different. It is a game of cunning, bluff and deception – not attributes we traditionally associate with motherboards, logic gates and processor chips. So does Polaris’s success mean human poker players like Dave “Devilfish” Ulliott and Phil “The Unabomber” Laak are now busted flushes? The answer is no, not yet.

The version of poker at which Polaris excels is heads-up (two-player) limit Texas hold ’em. For the uninitiated, Texas hold ’em is the popular form of the game in which players are initially dealt two cards. They can elect to either bet or fold in a series of rounds before and after three open “community” cards are dealt, and after fourth and fifth community cards are dealt. If more than one player remains at the table at the end, the winner is the one who can make the best five-card hand from the seven cards he or she has available. The “limit” in the game’s name
means players can only bet certain fixed amounts. In short, it is a simpler version of the game, with fewer permutations, than the more popular multiplayer pot-limit or no-limit forms, in which bets can be as large as the value of the chips on the table at any one time, or are completely unrestricted.

Computers do not yet hold all the aces in these more complex incarnations of the game. However, Polaris’s developers at the University of Alberta in Edmonton, Canada, believe it is only a matter of time before machines get the upper hand. They have a new version of the software, updated to play heads-up no-limit hold ’em, now under test. If the program performs well enough, they hope to pit it against top human players next year.

“If I were a betting man, which I’m not, I’d be willing to bet a poker program will be able to surpass all human players within two years at heads-up no-limit Texas hold ’em,” says Darse Billings, a founder member of the university’s Computer Poker Research Group. Michael Bowling, who leads the Alberta group, is hedging his bets. “It’ll happen within my lifetime. It could be five years, or 50.”

In a poker game, players must decide on their next move despite being unable to see their opponents’ cards. This makes it a much tougher challenge for computer programs than, say,
chess or draughts, where all the pieces are on the board when decisions are made. When enough processing power is available, an optimal strategy for games such as chess and draughts can be worked out by creating a “decision tree” – a map of all possible future plays in which each branch of the tree represents a possible play. This allows the consequences of each play to be broken down into manageable sections and evaluated to determine how likely the play is to lead to a win.
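As a rough illustration of the decision-tree idea, the sketch below exhaustively searches a toy perfect-information game rather than chess or draughts. The game (players alternately remove 1, 2 or 3 stones from a pile, and whoever takes the last stone wins) and the function names `can_win` and `best_move` are assumptions chosen for brevity; they are not taken from Deep Blue or the draughts program.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def can_win(stones: int) -> bool:
    """Return True if the player about to move can force a win.

    Each loop iteration follows one branch of the decision tree:
    a legal play that removes 1, 2 or 3 stones from the pile.
    """
    for take in (1, 2, 3):
        # A play is winning if it leaves the opponent in a losing position.
        if take <= stones and not can_win(stones - take):
            return True
    # No legal play leads to a win (this includes an empty pile).
    return False


def best_move(stones: int):
    """Return a winning number of stones to take, or None if every play loses."""
    for take in (1, 2, 3):
        if take <= stones and not can_win(stones - take):
            return take
    return None


if __name__ == "__main__":
    for pile in range(1, 9):
        print(pile, can_win(pile), best_move(pile))
```

Because every position is visible to both players, the search can label each one as won or lost with certainty. The tree for chess or draughts is vastly larger, but the principle of working back from the outcomes of future plays is the same.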
TOP BOTTERS EARN WHILE THEY SLEEP

Hundreds of online poker players use fully automated bots in the hope of making money without lifting a finger, even though this is against poker websites’ rules. Most are crude, off-the-shelf programs bought online, designed to evade the sites’ detection systems. They generally lose money for their owners. It is estimated by industry and leading botters that only around 1 in 10 players using bots make a profit, mainly in low-stakes games.

Those “botters” who do make money are understandably secretive. Being identified can lead to their accounts being frozen and funds seized. One London-based botter told New Scientist his program made in the region of $35,000 per year.

The online poker industry recognises the threat from increasingly sophisticated bots. “It is a growing problem,” says Darse Billings of the University of Alberta in Edmonton, Canada, who acts as a security consultant for the Full Tilt Poker site. “It is becoming easier for people to produce a poker-playing program and to plug it in to play online.”

Photo caption: Bluffing isn’t a computer’s strong suit (NICK KOUDIS/GETTY)

Imperfect information

With poker this approach is problematic, and not only because there are so many potential permutations of cards and bets. One of the fundamental problems for any poker player is that the best strategy varies, depending on your opponent’s style of play. “Everybody knows that computers are really fast and excellent at doing well-defined calculations,” says Billings. “But we’re moving into territory where the information can be unreliable, can be imperfect, can be the result of deliberate deception.”

The larger the number of possible states in a game, the more memory a computer needs to run its calculations. In 2005 the Alberta group developed new algorithms capable of handling 10 billion game states, up from the previous best of 100 million. The latest algorithms can handle 1000 billion states. But even heads-up limit hold ’em has around a billion billion (10^18) permutations.

To simplify the calculations a computer has to do, researchers bracket certain combinations of cards and game states together. For example, the software might be instructed to act in an identical way if dealt two cards both lower than 7 that are not of the same suit, in sequence or a pair. As improved algorithms appear, fewer states need to be grouped together, reducing the potential for errors.

The new, no-limit version of Polaris will band together bets of different sizes, so it might react identically to an opponent raising by 10 or 12 chips, for example, further simplifying the computational task. The program has been trained to “learn” optimal game strategies by examining a database derived from simulations of 800 million two-player hands. These strategies are embodied in a series of software bots with names such as Mr Blonde, Mr Pink and Agent Orange, each one tailored to counter particular styles of play.

The research has attracted a great deal of interest because it illustrates how computers might in future be used to solve problems in fields where there are similar uncertainties to those encountered at the poker table. Large companies are already using game theorists to help with tasks such as bidding for contracts and setting prices for their products, and biologists have used a similar approach to examine decision-making in the animal kingdom. The fundamental idea that the strategy of one individual depends on the strategies of others in a population has helped researchers uncover the forces at work in shaping behaviours such as contests, reciprocal altruism and habitat selection.

Noel Sharkey, professor of artificial intelligence and robotics at the University of Sheffield in the UK, says the new work will help computer systems make inroads into new areas. “This type of technology might be
very successful in the financial markets,” he says. “The markets are more like poker than chess because there is incomplete information about the state of play at any time.” Billings warns against judging the value of the research purely in terms of its immediate practical applications. “When mathematicians began working out how to solve quadratic equations, they were not thinking about how they could be used in industry, they were just solving a problem,” he says. “When you are pushing the boundaries of what can be done, you are opening doors to applications that don’t even exist yet and we can’t possibly imagine what those might be. I’m a firm proponent of solving things for the sake of solving things.” ● 15 November 2008 | NewScientist | 29
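The bracketing of card combinations and bet sizes that the Alberta researchers describe can be illustrated with a small sketch. It is not Polaris’s actual abstraction: the function names, the bucket label and the 5-chip bucket width are all hypothetical, and only the article’s two examples (two unsuited, unpaired, non-consecutive cards below 7, and raises of 10 or 12 chips) are modelled.

```python
RANKS = "23456789TJQKA"  # position in this string gives rank order, low to high
LOW_BUCKET = "low-unconnected-offsuit"  # hypothetical label for the grouped hands


def hole_card_bucket(card1: str, card2: str) -> str:
    """Map two hole cards such as '4h' and '2c' to a bucket label.

    Hands matching the article's example share one bucket; every other
    hand keeps its own label (e.g. 'AKs'), which is an assumption made here.
    """
    r1, r2 = RANKS.index(card1[0]), RANKS.index(card2[0])
    suited = card1[1] == card2[1]
    paired = r1 == r2
    in_sequence = abs(r1 - r2) == 1
    both_below_seven = max(r1, r2) < RANKS.index("7")

    if both_below_seven and not (suited or paired or in_sequence):
        return LOW_BUCKET

    high, low = RANKS[max(r1, r2)], RANKS[min(r1, r2)]
    if paired:
        return high + low
    return high + low + ("s" if suited else "o")


def bet_bucket(chips: int, width: int = 5) -> int:
    """Group bet sizes into bands `width` chips wide (width is an assumption)."""
    return chips // width


if __name__ == "__main__":
    print(hole_card_bucket("4h", "2c"), hole_card_bucket("6d", "3s"))
    print(bet_bucket(10), bet_bucket(12))
```

With these choices, 4h 2c and 6d 3s fall into the same bucket, and raises of 10 and 12 chips are treated as the same bet, so a strategy only has to be stored once for each group. Coarser buckets save memory but blur distinctions between hands, which is the trade-off the article notes when it says fewer grouped states reduce the potential for errors.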