Optimal Marbles I
Problem Statement
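Two players, A and B, each have 100 marbles. Each simultaneously and secretly decides to put between 1 and 100 marbles into a box. Two draws with replacement follow: each time, one marble is drawn uniformly at random from the box.
- If the marble drawn belongs to A, A is paid $100 - a$ by a third party, where $a$ is the number of marbles A put in.
- If the marble drawn belongs to B, B is paid $100 - b$ by a third party, where $b$ is the number of marbles B put in.
Both players play optimally. Find A's total expected payout over the two draws.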
Two players, say $A$ and $B$, play the following game: Both players have $100$ marbles and may put anywhere between $1$ and $100$ marbles in the box each. This decision is not revealed to the other player. Then, they draw $2$ marbles with replacement between trials. If the marble belongs to $A$, then assuming that $A$ put $a$ marbles in the box, $A$ is paid $100 - a$ monetary units from a third party. Similarly, if the marble belongs to $B$, then assuming $B$ put $b$ marbles in the box, $B$ is paid $100 - b$ monetary units from a third party. Assume both players play optimally. Find the expected total payout of player $A$.
Solution
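For a single draw, A's expected payout is
$$f(a, b) = \frac{a}{a + b}\,(100 - a).$$
The two draws are independent and the strategy does not depend on the number of draws, so the optimal strategy is the same as in the single-draw game; the single-draw expectation is doubled at the end.
The continuous best-response condition gives a symmetric equilibrium at $a = b = \frac{100}{3} \approx 33.33$. Under the integer restriction, $a = b = 33$ satisfies the equilibrium condition.
The single-draw expected payout is then
$$\frac{33}{33 + 33}\,(100 - 33) = 33.5.$$
The total expected payout over two draws is $2 \times 33.5 = 67$.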
Original Explanation
We can first make some simplifications to the game. First, each player commits to a marble count once, before any draw, and the two draws are identically distributed, so by linearity of expectation the total expected payout is twice the single-draw expected payout. Therefore, the optimal strategy for the game with $2$ consecutive marble draws is the same as the optimal strategy for the game with one draw, and we just multiply the single-draw expected profit by $2$ to represent the two draws. Furthermore, as this game is symmetric for the two players, their optimal strategy will be the same. This point will be important later.
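The doubling argument can be checked exactly with a short sketch. It assumes the payout to $A$ per winning draw is $100 - a$ (the formula reconstructed from the derivation in this solution); the function names are illustrative, not part of the original problem.

```python
from fractions import Fraction
from itertools import product


def single_draw_payout(a: int, b: int) -> Fraction:
    """Expected payout to A for one draw, assuming A is paid 100 - a per win."""
    return Fraction(a, a + b) * (100 - a)


def two_draw_payout(a: int, b: int) -> Fraction:
    """Expected total payout to A over two draws with replacement,
    computed by enumerating all four (draw 1, draw 2) outcomes."""
    p_a = Fraction(a, a + b)  # probability that a single draw is A's marble
    total = Fraction(0)
    for first_is_a, second_is_a in product([True, False], repeat=2):
        prob = (p_a if first_is_a else 1 - p_a) * (p_a if second_is_a else 1 - p_a)
        wins = int(first_is_a) + int(second_is_a)
        total += prob * wins * (100 - a)
    return total


if __name__ == "__main__":
    for a, b in [(33, 33), (34, 34), (10, 80)]:
        print(a, b, single_draw_payout(a, b), two_draw_payout(a, b))
```

Exact fractions are used so that the comparison with twice the single-draw value is not blurred by floating-point rounding.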
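Let $f(a, b)$ be the expected profit that $A$ obtains with player $A$ putting in $a$ balls and $B$ putting in $b$ balls. Namely, for the one-draw game,
$$f(a, b) = \frac{a}{a + b}\,(100 - a),$$
as player $A$'s ball is drawn with probability $\frac{a}{a + b}$ and $100 - a$ is the payout. Let's fix $b$ and find the $a$ that is the best response to this $b$. In other words, given $b$, what $a$ optimizes $f(a, b)$? To do this, we take the partial derivative of $f$ in $a$ and treat $a$ as continuous for now. We will then account for discreteness at the end.
This yields that
$$\frac{\partial f}{\partial a} = \frac{100b - a^2 - 2ab}{(a + b)^2} = 0 \implies a^2 + 2ab - 100b = 0.$$
Solving the above with the quadratic formula yields that $a = -b \pm \sqrt{b^2 + 100b}$. However, the $-$ root results in a negative value, so $a = -b + \sqrt{b^2 + 100b}$ is the best response for player $A$ if player $B$ puts $b$ marbles in. Similarly, as this game is symmetric, the optimal response for player $B$ if player $A$ puts $a$ marbles in is $b = -a + \sqrt{a^2 + 100a}$.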
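As a sanity check on the best-response formula, the following sketch compares the continuous solution $-b + \sqrt{b^2 + 100b}$ with a brute-force search over integer choices. It assumes the single-draw payout $\frac{a}{a+b}(100 - a)$; the helper names are hypothetical.

```python
import math
from fractions import Fraction


def payout(a: int, b: int) -> Fraction:
    """Single-draw expected payout to A, assuming A is paid 100 - a per win."""
    return Fraction(a, a + b) * (100 - a)


def continuous_best_response(b: float) -> float:
    """Positive root of a^2 + 2ab - 100b = 0, treating a as continuous."""
    return -b + math.sqrt(b * b + 100 * b)


def integer_best_responses(b: int) -> list:
    """All integer a in 1..100 that maximize A's payout against a fixed b."""
    best = max(payout(a, b) for a in range(1, 101))
    return [a for a in range(1, 101) if payout(a, b) == best]


if __name__ == "__main__":
    for b in (1, 33, 34, 100):
        print(b, round(continuous_best_response(b), 3), integer_best_responses(b))
```

Because the payout has a single peak in $a$ for fixed $b$, the integer maximizers should sit next to the continuous one, which is why only the neighboring integers need testing later.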
To find the optimal strategy for each player, we need to find the pair $(a, b)$ such that neither player can do better by adjusting their strategy. We already know from before that $a = b$ by the symmetry of the game. Therefore, to solve for this, we just substitute $b = a$ into the best-response equation for $A$. This yields
$$a = -a + \sqrt{a^2 + 100a} \implies 4a^2 = a^2 + 100a \implies 3a^2 = 100a \implies a = 0 \text{ or } a = \frac{100}{3}.$$
As $a = 0$ is not possible, we conclude that $a = b = \frac{100}{3}$ is the optimal strategy in the continuous game. However, this is not actually possible, as the number of marbles must be an integer. Therefore, we should test $(33, 33)$ and $(34, 34)$ to see if they are Nash equilibria.
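For $(33, 33)$, the expected payout for one draw for each player is $\frac{33}{66} \cdot 67 = 33.5$. One can check that by varying $a$ and keeping $b$ fixed at $33$, player $A$ can't do any better. Therefore, $(33, 33)$ is a Nash equilibrium. For $(34, 34)$, the expected payout is $\frac{34}{68} \cdot 66 = 33$. One can also verify that the expected payout for $A$ at $(33, 34)$ is also $\frac{33}{67} \cdot 67 = 33$, so $A$ is indifferent between $33$ and $34$ when $B$ plays $34$. However, $(33, 34)$ can't be an equilibrium, as $B$ should change to $33$ marbles to yield a higher expected payout ($33.5$ instead of $\frac{34}{67} \cdot 66 \approx 33.49$). Thus, while $(34, 34)$ is also a Nash equilibrium, $(33, 33)$ is preferable because of the higher expected payout. This means that the optimal strategy is for both players to place $33$ marbles and have a total expected payout of $2 \cdot 33.5 = 67$.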
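The integer equilibrium check can also be done by brute force. The sketch below enumerates every pair of integer choices and keeps those where each player's choice is a best response to the other's. It assumes the single-draw payout $\frac{a}{a+b}(100 - a)$ and relies on the game's symmetry so that one payout function serves both players; all names are illustrative.

```python
from fractions import Fraction

N = 100  # marbles available to each player


def payout(own: int, other: int) -> Fraction:
    """Single-draw expected payout to the player who put in `own` marbles,
    assuming that player is paid N - own when their marble is drawn."""
    return Fraction(own, own + other) * (N - own)


def best_responses(other: int) -> set:
    """Set of integer choices in 1..N that maximize payout against `other`."""
    payouts = {own: payout(own, other) for own in range(1, N + 1)}
    best = max(payouts.values())
    return {own for own, value in payouts.items() if value == best}


def pure_nash_equilibria() -> list:
    """All (a, b) where a is a best response to b and b is a best response to a."""
    br = {x: best_responses(x) for x in range(1, N + 1)}
    return sorted((a, b) for b in br for a in br[b] if b in br[a])


if __name__ == "__main__":
    for a, b in pure_nash_equilibria():
        print((a, b), "single draw:", payout(a, b), "two draws:", 2 * payout(a, b))
```

Under these assumptions the enumeration should agree with the hand check above: only the two symmetric candidates survive, and $(33, 33)$ has the higher payout.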