抽牌红加 1 黑减 1
Dynamic Card Game
题目详情
一副牌 52 张:红 26、黑 26。洗牌后不放回依次抽牌,你可以在任意时刻停止。
- 抽到红牌:收益 +1;
- 抽到黑牌:收益 -1。
问:最大化期望收益的最优停规则是什么?这个游戏的期望价值是多少?
A deck has 52 cards: 26 red and 26 black. The deck is shuffled, and cards are drawn one by one (without replacement). You can stop at any time. For each red card drawn, you gain +1; for each black card, you lose -1. What is the optimal stopping rule to maximize expected payoff, and how much is the game worth (in expectation)?
解析
用动态规划。
设状态 为剩余黑牌数 与红牌数 。
立即停下的“价值”为 (可从对称性理解:你的当前得分 + 未见牌的红黑差为常数)。
若继续抽:下一张为黑的概率 (转到 ),为红的概率 (转到 )。令 为最优期望收益,则
边界:,。
计算可得 ,因此游戏价值约为 2.62 美元。
Original Explanation
Let represent the count of black and red cards remaining in the deck. By symmetry, the net from stopping immediately is (because the difference between black and red among the unseen cards is but your “score so far” + “remaining color difference” is constant).
If you continue, the next card is black with probability (transition to ) or red with probability (transition to ). Let be the maximum expected payoff. Then with boundary conditions and . By dynamic programming, one finds . So the game is worth about $2.62.