掷或拿 I
Take and Roll I
题目详情
你有一枚公平 20 面骰与 100 次动作。骰子起始朝上为 1。
每次动作你可以选择:
- 掷(roll):重掷当前朝上点数;
- 拿(take):把当前朝上点数兑现为收益(但游戏不会结束,后续仍可继续 take 或 roll;例如你可以对初始 1 连续 take 100 次获得 100)。
你的策略是:在游戏开始前选定一个阈值 ,当且仅当你第一次掷到点数 时开始一直 take(在此之后不再 roll)。
在理性选择 的情况下,期望收益最大是多少?
You are given a fair 20-sided die and 100 actions in a game. The die starts with upface . The two options you can perform are to roll and to take. Performing a roll re-rolls the current upface of the die. Performing a take allows you to cash out the current upface of the die. Note that the game does not end when you perform a take and that you do not have to roll between takes. Therefore, for example, you can just perform 100 takes on the initial upface and walk away with guaranteed. Your strategy is to cash out the upface when you roll at least some threshold for the first time. You fix this at the beginning of the game. Assuming rational strategy in selecting , what is your expected payout on this game?
解析
若阈值为 :
- 一旦掷到 ,其条件期望为 。
- 每次 roll 命中 的概率为 ,因此命中所需 roll 次数期望为 。
- 于是可用于 take 的动作数期望约为 。
期望收益函数为
比较整数 可得最优为 ,最大期望收益为
Original Explanation
If you take when you have at least for the first time, your expected face showing is . Since there are values on the die at least , the probability on any given roll of seeing at least is . So, the average number of rolls needed to see a value at least is . Therefore, you are able to claim on turns. Hence, your expected payout would be
To find the maximum, one can treat as continuous and differentiate in and then consider the two integers is between as potential maximizers.
Doing this by product rule and simplifying, . The zeros of this polynomial, by quadratic formula, are . The root where we add is larger than 20, so is our optimizer.
In an interview, one could notice that , so that's how one could deduce . Lastly, plugging in and yields that yields the optimal value with .