返回题库

抽帽子:付 1 元可再抽一次

The Picking Hat

专题
Brainteaser / 脑筋急转弯
难度
L6

题目详情

帽子里有数字 1..100。每轮你从帽子里抽一个数,记下来,然后把它放回帽子里。游戏结束时你最后一次写下的数字就是你得到的收益(元)。

你可以玩任意多轮,但每多玩一轮需要支付 1 元。

在最优策略下,这个游戏的公平价值(期望净收益)是多少?

A hat contains the numbers 1-100. The rules to the game are as follows. Each round, AJ draws a number out of the hat, writes it down, and puts the number back in the hat. The last number written down is the number of dollars awarded to AJ. AJ may play as many rounds as he would like, but each round costs $1. Assuming optimal play, what is the fair value of this game?

解析

最优策略为“阈值停”:若抽到 x\ge x 则停,否则继续。

计算可得最优整数阈值为 x=87x=87,对应的期望净收益为

12091486.36.\frac{1209}{14}\approx 86.36.

Original Explanation

We want to determine an optimal threshold xx, where, if AJ's draw is at least xx, then he will not play another round. Notice that, if AJ draws again, without taking into consideration the 1penalty,thenAJsexpectedpayoffisthesameasthepreviousround.So,wecanwritethefollowingexpression,where1 penalty, then AJ's expected payoff is the same as the previous round. So, we can write the following expression, where\mathbb{E}[A]$ denotes the total winnings.

E[A]=1+i=x100i100+i=1x1E[A]100=1+(100x+1)(100+x)200+x1100E[A]\begin{aligned} \mathbb{E}[A] & = - 1 + \sum_{i = x}^{100} \frac{i}{100} + \sum_{i = 1}^{x - 1} \frac{\mathbb{E}[A]}{100} \\ & = -1 + \frac{(100 - x + 1)(100 + x)}{200} + \frac{x - 1}{100} \mathbb{E}[A] \end{aligned}

Let's isolate E[A]\mathbb{E}[A].

E[A]=(101x)(100+x)2002(101x)\begin{aligned} \mathbb{E}[A] & = \frac{(101 - x)(100 + x) - 200}{2(101-x)} \end{aligned}

We now wish to determine

xoptimal=arg maxx{(101x)(100+x)2002(101x)}.x_\text{optimal} = \underset{x}{\text{arg max}} \left\{\frac{(101 - x)(100 + x) - 200}{2(101-x)} \right\}.

Let's take the derivative with respect to xx.

ddxE[A]=10001202x+x24(101x)2\frac{d}{dx} \mathbb{E}[A] = \frac{10001 - 202 x + x^2}{4(101-x)^2}

Setting ddxE[A]=0\frac{d}{dx} \mathbb{E}[A] = 0 and solving for xx, we find

10001202xoptimal+xoptimal2=0xoptimal=202±8002=101±102\begin{aligned} 10001 - 202 x_\text{optimal} + x_\text{optimal}^2 & = 0 \\ x_\text{optimal} & = \frac{202 \pm \sqrt{800}}{2} \\ & = 101 \pm 10 \sqrt{2} \end{aligned}

Since x100x \leq 100,

xoptimal=10110286.9.x_\text{optimal} = 101 - 10 \sqrt{2} \approx 86.9.

Our optimal integer xx is then either 86 or 87. Plugging both into E[A]\mathbb{E}[A], we find E[A]\mathbb{E}[A] is maximized when x=87x = 87 and has value 120914\frac{1209}{14}.