30 面骰 vs 20 面骰:Bob 可不看 Alice
30 Die Split III
题目详情
Alice 有一枚公平 30 面骰,Bob 有一枚公平 20 面骰。两人同时掷出点数,点数更大者获胜;若平局,则 Bob 获胜。
Bob 在掷完第一次后可以选择是否重掷一次(他看不到 Alice 的点数),若选择重掷则用第二次结果作为最终点数。
在 Bob 最优策略下,求 Alice 获胜概率。
Alice and Bob have fair -sided and -sided dice, respectively. The goal for each player is to have the largest value on their die. Alice and Bob both roll their dice. However, Bob has the option to re-roll his die in the event that he is unhappy with the outcome. He can't see Alice's die beforehand. Bob then keeps the value of the new die roll. In the event of a tie, Bob is the winner. Assuming optimal play by Bob, find the probability Alice is the winner.
解析
Alice 获胜当且仅当 。
由于 ,要最小化 等价于让 Bob 最大化自己最终点数 的期望(因为 )。
若 Bob 采用阈值策略:当第一次掷到 就重掷,否则保留,则
该二次函数在 处取最大值。
因此 Bob 最优为:第一次掷到 1..10 就重掷,掷到 11..20 就保留。此时 ,所以
Original Explanation
Let be the event Alice wins. First, we want to find . Let and be the value of Alice's roll and Bob's first roll, respectively. The key here is to condition on whether or not . Namely, this means We can quickly see that , as this accounts for of equally-likely values. If , then clearly Alice wins regardless of what Bob obtains, so .
If , then we are really rolling independent sided fair dice. Namely, to compute , we see that the probability of a tie is , as there are values they can agree upon out of outcomes. Therefore, the probability the two rolls are different is . Because of the fact that the two dice are symmetric, .
Combining these all together, we see that Now, Bob is going to roll again whenever . This is because he wants to minimize Alice's chance of winning. It is easy to see that Alice just needs to roll a value of at least .
We just need to find the maximum value of , say , such that . Solving this inequality, we get that , so is the maximum value that Bob rolls again. Let be the event that Alice wins in this new game where Bob can roll again. We have that We get this because of the fact that Bob rolls again with probability which is when he rolls at most . Then, the probability Alice wins is going to change by
To calculate , this is simply just averaging all the cases where for , as when , the value is uniformly distributed on the first positive integers. Therefore, Therefore, we get that