No, a Computer Did Not ‘Solve’ Poker

A paper published in Science is being picked up far and wide with such declarations as “Game Theorists Crack Poker” and “Humanity Folds: Computers Have Cracked Texas Hold-Em.” I don’t have access to the full paper, so I’m going off the report in Nature about it and I’d say no, the computer has not cracked poker.

First, the program only deals with one relatively simple variation of poker, Heads Up Limit Hold Em (HULHE). That means the program would only play against a single opponent and the bets would be limited and preset.

A new computer algorithm can play one of the most popular variants of poker essentially perfectly. Its creators say that it is virtually “incapable of losing against any opponent in a fair game”.

This is a step beyond a computer program that can beat top human players, as IBM’s chess-playing computer Deep Blue famously did in 1997 against Garry Kasparov, at the time the game’s world champion. The poker program devised by computer scientist Michael Bowling and his colleagues at the University of Alberta in Edmonton, Canada, along with Finnish software developer Oskari Tammelin, plays perfectly, to all intents and purposes.

That means that this particular variant of poker, called heads-up limit hold’em (HULHE), can be considered solved. The algorithm is described in a paper in Science.

The strategy the authors have computed is so close to perfect “as to render pointless further work on this game”, says Eric Jackson, a computer-poker researcher based in Menlo Park, California.

But there are some important limitations here. First, heads up limit hold’em is not, in fact, one of the most popular variants of poker. It’s very rarely played, showing up at only a few tournaments throughout the year. Second, the preset bet limit all but eliminates the utility of bluffing and thus removes a great many of the relatively subjective elements from the game. And HULHE is far simpler than, say, heads up no-limit hold’em or, far more complex yet, no-limit hold’em with a table full of players.

The program actually does bluff, but it only does so based on mathematical calculations. Specifically, it calculates how often one should bluff in HULHE. But how often one should bluff is not the same as determining when one should bluff, and as I said, the utility of bluffing in a limit game is virtually nil compared to a no-limit game. Deciding when to bluff, as opposed to how often, requires one to consider variables that, at this point, a computer almost certainly cannot consider effectively.
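To make "how often one should bluff" concrete, here is a minimal sketch of the standard pot-odds indifference calculation for a single bet on the last round. It is a toy model and not how the Alberta program works: it assumes the bettor holds either an unbeatable hand or a pure bluff, and the function names and the example pot size are made up for illustration.

```python
from fractions import Fraction


def bluff_fraction(pot, bet):
    """Share of the bettor's bets that should be bluffs so that the
    caller gains nothing by calling or by folding."""
    pot, bet = Fraction(pot), Fraction(bet)
    return bet / (pot + 2 * bet)


def call_frequency(pot, bet):
    """How often the caller must call so that a pure bluff breaks even."""
    pot, bet = Fraction(pot), Fraction(bet)
    return pot / (pot + bet)


def caller_ev(pot, bet, bluff_frac):
    """Caller's expected value of calling: wins pot + bet against a bluff,
    loses the bet against a value hand."""
    return bluff_frac * (pot + bet) - (1 - bluff_frac) * bet


# Hypothetical limit hold'em spot: the pot holds 4 big bets, the bet is 1.
print(bluff_fraction(4, 1))                    # → 1/6
print(call_frequency(4, 1))                    # → 4/5
print(caller_ev(4, 1, bluff_fraction(4, 1)))   # → 0
```

With a fixed bet that is small relative to the pot, the caller is getting a good price, so only about one bet in six can be a bluff in this model. That is one way to see why bluffing carries less weight in a limit game than in no-limit, where the bet size is itself a choice.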

A computer program could, of course, use a learning algorithm to track its opponent’s tendencies, and poker players actually already do that. There are programs that do it for you for online poker play, and the top pros chart their opponents’ play in live tournaments as well, doing the same kind of analysis a computer would do (how often do they bluff from early or late position, how often do they fold to a reraise out of position, etc.). But as Chong Li says in Bloodsport, “brick not hit back.”

What I mean by that is this: While you can track a player’s tendencies to inform a decision on whether to make a reraise or fold, that player is, if they’re any good, also thinking about their own tendencies. While you’re thinking about how they’re most likely to respond to a bet or a raise, they should also be thinking about how you’re going to predict their behavior. It’s that constant back and forth that I don’t think a computer could really consider effectively.

Then there are physical tells that a program can’t incorporate into its decision-making process (not yet, at least; one can imagine that if it had a camera on its opponent or could read things like blood pressure and skin temperature, a computer might ultimately be better than a human player at this). And there are factors such as relative stack size (if a player has a small number of chips relative to you, are they more or less likely to call you? Are they going to play tighter or looser?); their perception of how you play; their perception of whether you’re bluffing a lot or getting a run of good cards; whether they’ve just sat down at the table or have been playing for a while; whether they just got back to even, are way down or way up in chips; etc. All of these things should be considered when making a decision to call, bet, raise or fold.

The reality is that making the mathematically correct decision is not always the same thing as making the actually correct decision. Poker is not just a game of math, it’s a game of psychology. It’s a game of reading people and reading situations. It’s a game of incomplete information and it’s the incompleteness of that information, and the fact that any decent poker player can do the math pretty well in their heads, that diminishes the value of purely mathematical play.

Ultimately, computers may reach the point where they could outplay humans at a game like no-limit hold-em with multiple players, but I suspect that’s still a ways off. Computers do have a huge advantage in terms of data storage, recall and analysis. But I don’t think they are sophisticated enough yet to have an advantage at the human elements of the game, and that’s where the real difference between an average poker player and a great player can be found.

  • garnetstar

    As we know, computers, though good at handling large amounts of data, are also extremely stupid. They do only what they’re told. And, it’s not as if even humans fully understand psychology and could tell the computers the right decisions.

    They’re no competition.

  • http://www.thelosersleague.com theschwa

    It’s nice to see the WOPR is still getting work these days. Good job, Joshua.

  • http://en.uncyclopedia.co/wiki/User:Modusoperandi Modusoperandi

    Talk about out of date! This was covered in an episode of Small Wonder way back in 1985!

  • Artor

    A friend of mine has a very analytical mind, and he plays like this. He gets really frustrated when someone makes a play that wasn’t mathematically correct by his consideration. When I have a number of choices I could make, I look at whom I’m playing against, and decide based on who is the biggest threat, or who screwed me most recently, not which choice is dictated by the algorithm running in my head. For some reason, he completely disregards the social aspect of the game, and is oblivious to his own subjective preferences. We can’t play cards together anymore.

  • dingojack

    ‘I haven’t actually seen the film, but based on a couple of bad reviews in the more sensationalist tabloids…’

    @@

    Dingo

  • http://www.facebook.com/profile.php?id=523300770 stuartsmith

    To be fair, the designers openly state that the game is not actually the best possible poker player. Rather, over an infinite series of games, it is guaranteed to do no worse than break even (and will do better if up against anyone who plays a strategy other than its own.) They clearly state that it lacks the ability to take advantage of unskilled players in order to maximize its winnings, so that in actual play a human could easily win bigger and faster than it does over the short term. Also, they clearly state that the program is designed for online poker, so no-one has the advantage of reading body language.

  • http://kamakanui.zenfolio.com Kamaka

    I doubt the computer can set traps, either. I wonder if the programmers even considered the idea.

  • mmfwmc

    Hi,

    I write poker AI for a living and have competed against these guys in the Alberta competition.

    You’re definitely right about the fact that limit isn’t popular, and that this is much simpler than NLHE. I do have some quibbles about your analysis of what it does.

    Specifically, it calculates how often one should bluff in HULHE. But how often one should bluff is not the same as determining when one should bluff

    This is wrong. It also calculates when to bluff. It definitely does calculate them effectively.

    The second thing is that you’re confusing the game theory definition of solved with what you (and any other real player) consider perfect play. Perfect poker is exploitative – when an opponent has a weakness (e.g. is super tight) you can exploit them by raising everything and folding when they show any aggression. The ai will play the same way against anyone.

    The difference is that if you employ your “raise fold” strategy against a player who isn’t a calling station, you’ll lose. Probably lose big. By playing exploitatively, you become exploitable yourself. Good players fix this by changing their strategies depending on the various factors you list above.

    The key difference is that the AI is not trying to play exploitative poker. It’s playing a very close approximation to a Nash equilibrium, which means there is no strategy you can employ that will beat it reliably in a human lifetime. If you were to play it perfectly on a poker site, the result is that you would both go bankrupt from rake.

    It might make money against most poker players, because of the “Fundamental theorem of poker” (which is not actually a theorem, and demonstrably incorrect in some circumstances). It will not, however, make as much money against a weak player as the best AI would.

    My real quibble with their paper is the definition of solved. It’s like proof – real proof is an unarguable truth. This is like the difference between a Teapot atheist and someone who could actually prove logically that there is no god. The difference is vanishingly small, but it is significant. In this case, the AI plays almost unexploitably, but not quite.

    Ignoring that minor bump, yes this is a solution in the game theory sense. But you probably wouldn’t want to enter it in high stakes on a poker site, because you’d just be losing rake.

    Btw, more poker posts please!
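The distinction drawn in the comment above, between equilibrium play and exploitative play, can be sketched with a much smaller zero-sum game. Rock-paper-scissors is used here purely as a stand-in for poker. The strategies and the "rock-heavy" opponent are invented for illustration, and nothing below reflects the method used in the paper.

```python
from fractions import Fraction

# Payoff to the row player (win = +1, loss = -1, tie = 0).
# Rows and columns are ordered rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def expected_value(row_strategy, col_strategy):
    """Row player's average payoff when both sides play mixed strategies."""
    return sum(
        row_strategy[i] * col_strategy[j] * PAYOFF[i][j]
        for i in range(3)
        for j in range(3)
    )


def worst_case(row_strategy):
    """Row player's payoff against an opponent who knows the strategy and
    picks the best counter (a single pure move is always enough)."""
    return min(
        sum(row_strategy[i] * PAYOFF[i][j] for i in range(3))
        for j in range(3)
    )


third = Fraction(1, 3)
equilibrium = [third, third, third]   # the Nash equilibrium of this game
exploitative = [0, 1, 0]              # always paper, aimed at rock-heavy play
rock_heavy = [Fraction(3, 5), Fraction(1, 5), Fraction(1, 5)]  # weak opponent

print(expected_value(equilibrium, rock_heavy))    # → 0
print(expected_value(exploitative, rock_heavy))   # → 2/5
print(worst_case(equilibrium))                    # → 0
print(worst_case(exploitative))                   # → -1
```

The equilibrium mix cannot be beaten even by an opponent who knows it in advance, but it also earns nothing extra from the weak player. The exploitative strategy wins more against that one opponent and loses every round to anyone who adjusts. In rock-paper-scissors every counter-strategy merely breaks even against the equilibrium, whereas in poker an equilibrium strategy can also profit from an opponent's outright mistakes.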

  • mmfwmc

    Sorry, “It will not, however, make as much money against a weak player as the best AI would.” should have been “as the best human player would”

  • Abby Normal

    One good thing about the study: it gives scientific ammunition to people who wish to classify poker as a game of skill, rather than a game of chance. That classification has significant impact on how many anti-gambling statutes are interpreted. If a computer can “solve” it, it must be a game of skill.

  • eric

    But how often one should bluff is not the same as determining when one should bluff

    If it’s consistently beating really good human players, I don’t think that’s a very strong criticism. Obviously it knows when to bluff if it’s (statistically and consistently) making more money off its bluffs than it’s losing.

    The reality is that making the mathematically correct decision is not always the same thing as making the actually correct decision

    Again, however, people only care about the “actually correct decision” if it allows them to make (more) money in the long run. If the computer is making decisions that cause it to win money from the best players consistently, that’s a pretty darn good argument in itself for it making “actually correct” decisions. This is very much like correct play in blackjack: if you come up with a heuristic that allows you to win more often than any other heuristic, that becomes what the community is going to consider to be the correct play.

    Having said all that, I very much like limit hold’em. Mainly because the people who play it are often no-limit players looking for something different, and they don’t calculate implied pot odds correctly. They do not get that limit is something of a fundamentally different game, and if you play it like no-limit, you’re going to lose.

  • moarscienceplz

    I would only be impressed if they came up with an AI that could reliably predict an opponent’s bluff.

  • http://en.uncyclopedia.co/wiki/User:Modusoperandi Modusoperandi

    *Bleep bloop* My cards are sooo bad. *Whir* I should probably fold. *Bink clunk* Oh, heck, I will raise.

  • tsig

    I’d run the program so I’d know what it was going to do.

  • mmfwmc

    I’d run the program so I’d know what it was going to do.

    And if you did that, you’d break almost exactly even. That’s their definition of solved. If you ran the program so you knew what it was going to do and then played appropriately, you wouldn’t win a noticeable amount in your entire lifetime.

    Except you’d go broke from the rake before that :)
