抽帽子：付 1 元可再抽一次

The Picking Hat

专题: Brainteaser / 脑筋急转弯
难度: L6
来源: QuantQuestion

题目详情

帽子里有数字 1..100。每轮你从帽子里抽一个数，记下来，然后把它放回帽子里。游戏结束时你最后一次写下的数字就是你得到的收益（元）。

你可以玩任意多轮，但每多玩一轮需要支付 1 元。

在最优策略下，这个游戏的公平价值（期望净收益）是多少？

A hat contains the numbers 1-100. The rules to the game are as follows. Each round, AJ draws a number out of the hat, writes it down, and puts the number back in the hat. The last number written down is the number of dollars awarded to AJ. AJ may play as many rounds as he would like, but each round costs $1. Assuming optimal play, what is the fair value of this game?

解析

最优策略为“阈值停”：若抽到 $\ge x$ 则停，否则继续。

计算可得最优整数阈值为 $x=87$ ，对应的期望净收益为

\frac{1209}{14}\approx 86.36.

Original Explanation

We want to determine an optimal threshold $x$ , where, if AJ's draw is at least $x$ , then he will not play another round. Notice that, if AJ draws again, without taking into consideration the $1 penalty, then AJ's expected payoff is the same as the previous round. So, we can write the following expression, where$ \mathbb{E}[A]$ denotes the total winnings.

\begin{aligned} \mathbb{E}[A] & = - 1 + \sum_{i = x}^{100} \frac{i}{100} + \sum_{i = 1}^{x - 1} \frac{\mathbb{E}[A]}{100} \\ & = -1 + \frac{(100 - x + 1)(100 + x)}{200} + \frac{x - 1}{100} \mathbb{E}[A] \end{aligned}

Let's isolate $\mathbb{E}[A]$ .

\begin{aligned} \mathbb{E}[A] & = \frac{(101 - x)(100 + x) - 200}{2(101-x)} \end{aligned}

We now wish to determine

x_\text{optimal} = \underset{x}{\text{arg max}} \left\{\frac{(101 - x)(100 + x) - 200}{2(101-x)} \right\}.

Let's take the derivative with respect to $x$ .

\frac{d}{dx} \mathbb{E}[A] = \frac{10001 - 202 x + x^2}{4(101-x)^2}

Setting $\frac{d}{dx} \mathbb{E}[A] = 0$ and solving for $x$ , we find

\begin{aligned} 10001 - 202 x_\text{optimal} + x_\text{optimal}^2 & = 0 \\ x_\text{optimal} & = \frac{202 \pm \sqrt{800}}{2} \\ & = 101 \pm 10 \sqrt{2} \end{aligned}

Since $x \leq 100$ ,

x_\text{optimal} = 101 - 10 \sqrt{2} \approx 86.9.

Our optimal integer $x$ is then either 86 or 87. Plugging both into $\mathbb{E}[A]$ , we find $\mathbb{E}[A]$ is maximized when $x = 87$ and has value $\frac{1209}{14}$ .