Two people play the following game: Both players contribute to an ante of amount A (not necessarily equally). Each player receives a uniformly distributed random number between 0 and 1. Each player only knows the value of his number. The first player may bet any nonnegative amount. The second player may either match the amount bet or give the ante to the first player (the second player may not raise or bet on his own). If the second player matches the bet then the whole pot is given to the the player with the higher number (which are revealed after the second player decides what he is going to do). The questions is: what fraction of the ante should be contributed by each player to make the game "fair"? The solution follows: Optimal strategies for the players: Let X and Y be player 1's and player 2's numbers, respectively. Let R be the ratio of the bet to the ante. 2 ARCCOS(-SQRT(7X)) 0 < X <= 1/7 R = -------- COS(-----------------) - 1 SQRT(7X) 3 1/7 <= X <= 4/7 R = 0 3 4/7 <= X < 1 R = SQRT(------) - 1 7(1-X) Always call if R = 0. 1/7 + R If R > 0 call if Y >= ------- 1 + R The first player should get back 4/7 of the ante, the second player 3/7. How I derived the solution follows: There are going to be many optimal solutions for each player for this game. Many of these will not be analytic functions of the information available. Finding one of these to start would be pretty difficult. The first step is to put some extra conditions on the solutions to make the problem more tractable. The extra conditions come from strategies that are good against nonoptimal play by your opponent. Two conditions for player 2 are fairly obvious. He should always call a bet of 0. For any particular bet he should call on his best hands and fold on his worst hands. For the first player things aren't as clear. However if the first player only bets on his best hands he won't do very well since the second player will only call when he has him beat. The first player can always guaranty an expected value of his number times the ante by betting 0. Since when bluffing you expect to lose if you are called you should bluff with your worst hands and bet 0 with ones that are just poor. At this point you want to find a pair of optimal strategies for playing the game. These strategies have the property that neither person has any incentive to change his strategy (though he may be indifferent to making changes) to some other strategy if the other person is using an optimal strategy. Searching for strategies that leave the other person indifferent to his choices is often a very usefull way to search for optimal strategies. Specificly, in this case the first player should bluff exactly enough to keep the second player indifferent to calling or folding with the worst hand he is calling with. For the second player things aren't so clear. Two possibilities stick out, either make the first player indifferent to where he bets on good hands or to where he bets on bad hands (bluffs). The first option won't work since the amount to bet for the maximum expected value for betting on good hands depends on the hand, whereas for bluffing it doesn't. This means that the betting will be spread out but that the bluffing won't. This will induce the second player to change his strategy so that this won't result in an optimal solution. To see this, let E be the expected value divided by A and let F (a function of R) be the smallest value the second player will call with. Then for good hands (X > F), E = F + (1 + R)(X - F) - R(1 - X). Simplifying gives E = X + 2RX - RF - R. dE dF -- = 2X - F - 1 - R-- which is a function of X. dR dR For bad hands (X < F), E = F - R(1 - F). dE dF -- = (1+R)-- + F - 1 which is independent of X. dR dR So we will try to find a strategy that leaves the first player indifferent to where he bluffs. The expected value for bluffing with the best hand you will bluff with should be the same as betting 0 with the hand instead. Let this hand be B. Equating expected values gives: B = F - R(1 - F). B + R So that F = ----- . 1 + R Plugging this into the expected value for good hands gives: B + R B + R E = ----- + (1 + R)(X - -----) - R(1 - X) 1 + R 1 + R 1 - B = (2R + 1)X - 2R + 1 - B - ----- 1 + R dE 1 - B -- = 2(X - 1) + -------- dR 2 (1 + R) Setting to 0 and solving for R gives: 1 - B R = SQRT(--------) - 1 2(1 - X) 2 d E 1 - B --- = - 2 -------- < 0 2 3 dR (1 - R) So that the above value of R gives the maximum expected value. Let C be the best hand that the first player will bet 0 on. Then he will be 1 - B indifferent to betting 0 or betting R = SQRT(--------) - 1 (which we would 2(1 - C) also expect to be zero in this case). Setting the expected values to be equal gives: 1 - B E = C = (2R + 1)C - 2R + 1 - B - ----- 1 + R 1 - B 1 - B => -------- = (1 + R) = SQRT(--------) 2(1 - c) 2(1 - C) 1 - B => 1 - C = ----- 2 To get another relation between B and C, I looked at the total amount of hands needed for bluffing. The first player should bluff exactly enough so that the second player is indifferent to calling or folding with his worst calling hand. Let L be the ratio of bluffs to legitimate hands. The second players expected return when calling is: E = (1 + R)L - R, and is 0 if he R folds. Setting these equal gives L = -----. 1 + R / 1 R(X) / 1 1 So that B = | -------- dX = | (1 - --------) dX / C R(X) + 1 / C R(X) + 1 / 1 2(1 - X) 2 1 - B 3/2 = | (1 - SQRT(--------)) dX = 1 - C - - SQRT(-----)(1 - C) / C 1 - B 3 2 1 - B 2 2 1 - B 3/2 1 - B 2 1 - B 1 1 = ----- - -SQRT(-----)(-----) = ----- - - ----- = - - -B 2 3 1 - B 2 2 3 2 6 6 => B = 1/7, C = 4/7 While against an optimal strategy it doesn't matter what hands the first player bluffs with particular hands, in order to get an analytic function for bluffing, we will use the following strategy: The first player will bluff having his best hands with his worst hands. The rational for this is that the second player is more likely to incorrectly call small bets with bad hands (Y < B) than large bets. Let Z be a hand to be bluffed with that is to be paired with some good hand, X. / 1 R(X) 2 SQRT(21) 3/2 3 2 Z = | -------- dX = 1 - X - ----------(1 - X) = --------- - --------- / X R(X) + 1 9 2 3 7(R + 1) 7(R + 1) Solving the cubic equation for R gives these three solutions: 2 ARCCOS(-SQRT(7Z)) R = -------- COS(-----------------) - 1 SQRT(7Z) 3 2 ARCCOS(-SQRT(7Z)) + 2Pi R = -------- COS(-----------------------) - 1 SQRT(7Z) 3 2 ARCCOS(-SQRT(7Z)) + 4Pi R = -------- COS(-----------------------) - 1 SQRT(7Z) 3 The second solution has the wrong value for Z = 1/7 and the third solution approaches the wrong limit as Z -> 0, so that the first solution is the one we want to use when bluffing. The expected value to the first player of the game (dividing by the amount of the ante and not counting his contribution to it) is: / 1/7 / 4/7 / 1 1 - B E = | 1/7 dX + | X dX + | ((2R + 1)X - 2R + 1 - B - -----) dX / 0 / 1/7 / 4/7 R + 1 / 1 3 3 1 = 1/49 + 15/98 + | (2 SQRT(--------)X - X - 2 SQRT(--------) + 3 - - - / 4/7 7(1 - X) 7(1 - X) 7 6/7 ______________) dX = 4/7 3 SQRT(--------) 7(1 - X) So that the first player will win back 4/7 of the ante. Hence, the second player will get back 3/7 of the ante. These are also the amounts they should contribute to the ante to make the game fair. Comments: It is interesting that bluffing is naturally part of an optimal stratgey for playing poker. It will be interesting to see the affects on the betting stategies if the game is extended to allowing raising. I suspect that the first player is giving away too much information about his hand, using the current strategy, for it to be used if the second player is allowed to raise. I am also curious to see if sandbagging (checking and raising) will be part of an optimal strategy. I believe that solving similar games with raising allowed is going to be difficult (for me at least). Normally in multidecision games the trick is to work backwards from terminal positions in the game or for cases with loops, work backwards until the same position reappears. However, in this case since betting reveals information about your hand distribution which will affect future decisions, but the distribution is also based on the expected value of the next stage in the game, it seems it will be difficult to apply the above strategy in solving these kind of games.
A secure version of this page is located at: https://wolff.to/bruno/poker.html