最优策略是把所有 place 放在前面、take 放在后面(place 放在 take 之后只会降低期望可回收金额)。
若先做 p 次 place,则盒子里总金额为 p;再做 m=100−p 次 take。
每次 take 的期望能拿走当前总金额的一半,因此 m 次 take 的期望回收比例为
i=1∑m2i1=1−2−m.
所以期望收益为
G(p)=p(1−2−(100−p)).
比较 p 可得最优为 p=94(即 94 次 place 后接 6 次 take),此时期望为
94(1−2−6)=94⋅6463=322961≈92.5313.
Original Explanation
There are two key elements to this game that should be considered: the sequence of places and takes, and the total number of places and takes. We'll begin by addressing the first element. Intuitively, no place actions should follow a take. Greedily, it makes sense to place a take after all places to maximize the expected value of that take (since each place action after a take reduces the potential payout by 1);therefore,alltakesshouldbeclusteredattheendofthe100turns.Thiscanalsobeprovenusinginduction,butwe′llleavethatasanexerciseforthereader.Now,let′sconsiderthesecondelement:thenumberofplacesandtakes.Iftherearepplaceactions,therewillbe100 - ptakes,andthepplaceactionsmusthappenbeforethe100 - ptakes.Todeterminetheoptimalvalueofp,wecanassessthepointatwhichaddingonemoreplaceactionnolongerincreasestheexpectedpayout.Lookatthep−thactionwhere1 < p < 100andassumethisisthelastplacebeforewestarttaking,orthenextaction(p + 1)wouldbethefinalplacebeforetakingbegins.Whatistheexpectedoutcomeofthegameifthep−thactionisthefinalplace?Therearepdollarsintotal,and100 - ptakesremaining.Ourexpectedpayoffwillbep \times \sum_{i=1}^{100 - p} \frac{1}{2^i},whichcanbeseenfromthelinearityofexpectation−ateverytake,wewilltakehalfofwhatisremainingonaverage.Whatistheexpectedvalueofthegameifactionp + 1isthelastplace?Therearep + 1dollarsinaggregateandwehave100 - (p + 1)takes.Thus,ourexpectedpayoffis(p + 1) \times \sum_{i=1}^{100 - p - 1} \frac{1}{2^i}.Wearelookingforthelargestvalueofpsuchthatthepayoffwherethep−thplaceisthelastplaceisgreaterthanthepayoffwheretheincrementalp + 1$-th place is the last place. Thus:
pi=1∑100−p2i1>(p+1)i=1∑100−p−12i1
and
2100−pp>i=1∑99−p2i1
approaches 1, and thus p>2100−p⇒p>93. With 94 places, the expected payoff is
94i=1∑62i1≈92.5
To check this is the maximum payoff, we can look at the 93 and 95 placing cases. With 93 places, the expected payoff is
93i=1∑72i1≈92.3
With 95 places, the expected payoff is
95i=1∑52i1≈92.0
Both surrounding values are less than our expected payoff, and hence the optimal number of times we should place is 94 with an expected payoff of
322961≈92.5