机器人游泳计时赛
Robot Swimming Trials
题目详情
正如我们之前两次一样,我们发现自己沉浸在一场机器人体育比赛中。在机器人游泳计时赛中,3N 个相同的机器人通过游 N 场比赛争夺决赛中 N 个同等的席位。每个机器人预先承诺在每场比赛中花费一定量的燃料。所有比赛结束后,决赛的席位将交给比赛的获胜者,从最快的获胜者到最慢的获胜者依次递推。(一旦机器人赢得一场比赛,它就没有资格再赢得另一场比赛。)机器人的速度随着它花费的燃料量严格增加,平局通过在花费相同燃料量的机器人中随机选择获胜者来打破。
从数学上讲,3N 个机器人各提交一个策略,这是一个由非负实数“出价”组成的 N 元组,其总和为 1,代表在 N 场比赛中每场燃烧的燃料。然后从最高出价(横跨所有比赛和所有机器人)向下确定比赛的获胜者,平局随机打破。一旦机器人赢得一场比赛,其其他出价将被删除,因此我们保证能得到 N 个不同的决赛晋级者。
例如,假设 N=3 且 3N=9 个机器人提交它们的策略如下
机器人
第1场
第2场
第3场
Automatonya
0.6
0.1
0.3
Botty
0.6
0.3
0.1
Chroma
0
1
0
Data
0.3
0.5
0.2
Electro
0.2
0.8
0
Fernandroid
0.4
0.5
0.1
Gregulator
0.5
0.5
0
Hannanobot
0
0.9
0.1
IO
0.2
0.7
0.1
第二场比赛首先决出胜负,因为 Chroma 的出价 1 是总最高出价,所以 Chroma 被宣布为该计时赛的获胜者。接下来,第一场比赛决出胜负,因为 0.6 是剩余出价中的最高者(我们忽略第二场比赛中的 0.9、0.8 和 0.7,因为它已经有了获胜者)。我们抛一枚公平的硬币来决定 Automatonya 和 Botty 之间的获胜者;假设 Automatonya 走运并被宣布为获胜者。然后决定第三场比赛,Data 被宣布为获胜者,因为在尚未获胜的机器人中,0.2 是最高出价(Automatonya 的 0.3 被忽略)。
在机器人游泳计时赛(RST)的传奇历史中,元游戏定格在人们普遍认为的纳什均衡中:每个机器人均匀随机地选择一场比赛,并将他们所有的燃料都分配给它。我们称之为离散策略。然而,有传言称这种传统智慧并不完全准确:对于足够大的 N,离散策略不是纳什均衡。你的任务是找出两条信息:
计时赛的纳什均衡不再是离散策略的最小 N 是多少?
对于这个 N,如果其他 3N-1 个机器人天真地采用离散策略,而你的机器人进行最优比赛(利用对对手策略的了解),你进入决赛的概率 p 是多少(四舍五入到 6 位有效数字)?
请以“N, p”的格式提交你的答案。
As we have twice before, we find ourselves engrossed in a robot sports competition. In the Robot Swimming Trials, 3N identical robots compete for N equivalent spots in the finals by swimming N races. Each robot precommits to spending a certain amount of its fuel in each race. After all the races are run, the spots in the finals are given to the winners of the races, moving from the fastest winner to the slowest. (Once a robot wins a race, it is ineligible to win another race.) A robot’s speed is strictly increasing in the amount of fuel it spends, and ties are broken by randomly choosing the winner among the robots that have spent the same amount of fuel.
Mathematically speaking, the 3N robots each submit a strategy, which is an N-tuple of nonnegative real number “bids” summing to 1, representing the fuel burned in each of the N races. The winners of the races are then determined from the highest bid (across all races and all robots) on down, with ties broken randomly. Once a robot wins a race their other bids are deleted, so we are guaranteed to get N distinct qualifiers for the finals.
For example, suppose N=3 and the 3N=9 robots submit their strategies as
| Robot | Race 1 | Race 2 | Race 3 |
|---|---|---|---|
| Automatonya | 0.6 | 0.1 | 0.3 |
| Botty | 0.6 | 0.3 | 0.1 |
| Chroma | 0 | 1 | 0 |
| Data | 0.3 | 0.5 | 0.2 |
| Electro | 0.2 | 0.8 | 0 |
| Fernandroid | 0.4 | 0.5 | 0.1 |
| Gregulator | 0.5 | 0.5 | 0 |
| Hannanobot | 0 | 0.9 | 0.1 |
| IO | 0.2 | 0.7 | 0.1 |
The second race gets resolved first because Chroma’s bid of 1 is the highest overall, and Chroma is declared the winner of that time trial. Next, the first race is resolved because 0.6 is the highest remaining bid (we ignore the 0.9, 0.8, and 0.7 in the second race because it already has a winner). We flip a fair coin to determine who is the winner between Automatonya and Botty; say that Automatonya gets lucky and is declared the winner. Then the third race is decided, and Data is declared the winner, because 0.2 is the highest bid among robots that have not yet won (Automatonya’s 0.3 is ignored).
Over the storied history of the RST, the metagame settled into what was widely believed to be the Nash equilibrium: each robot uniformly randomly selects a race and devotes all of their fuel to it. Let’s call this the discrete strategy. However, rumors are circulating that this conventional wisdom is not entirely accurate: for a large enough N, the discrete strategy is not the Nash equilibrium. You’ve been tasked to find two pieces of information:
What is the smallest N for which the trial does not have the discrete strategy as the Nash equilibrium?
For this N, if the other 3N-1 robots naively play the discrete strategy and your robot plays optimally (exploiting this knowledge of your opponents’ strategies), with what probability p will you make the finals (rounded to 6 significant digits)?
Please submit your answer in the form “N, p”.
解析
我们已知对手都在采用离散策略,这意味着他们将各自独立地均匀随机选择一场比赛,并把所有燃料都投入进去。由于我们要寻找使离散策略不再最优的最小 N,因此我们将采用与此不同的策略。这意味着我们不会把所有的燃料分配给任何一场比赛,但由于我们所有的对手在每场比赛中都“出价” 1 或 0,最好的非离散策略将涉及在每场比赛中出价一个非零的金额,这将击败所有的 0 出价(但输给所有的 1 出价)。因此,不失一般性,我们可以假设在每场比赛中出价 1/N 的策略,我们可以称之为均匀策略。
现在我们需要计算击败离散策略的最小 N。均匀策略赢得一场比赛当且仅当其他 3N-1 个采用离散策略的机器人中没有一个选择该比赛投入其燃料。计算单场比赛的这个概率很简单,(1-1/N)^(3N-1)。但是,离散策略的分配在 N 场比赛中并不是独立的!假设它们是独立的会给出一个错误的答案 N=9 和 p=0.350245…
同样,使用均匀策略赢得比赛的这些事件在 N 场比赛中也不是互斥的!假设它们互斥会给出一个错误的答案 N=8 和 p=0.370916…
相反,我们可以找到一个很好的递归来计算其他 3N-1 名玩家使得至少一场比赛无人防守的排列数量。例如,令 P(R,m,n) 等于如果我们需要将 R 个机器人分配给 (m+n) 场总比赛(其中 m 场已经分配了机器人,n 场尚未分配),我们最终将至少分配一名机器人给所有 (m+n) 场比赛的概率。然后均匀随机地将下一个机器人分配给一场比赛意味着如下递归:
P(R,m,n) = (m * P(R-1,m,n) + n * P(R-1,m+1,n-1))/(m+n)。
加上边界值
P(R,m,0) = 1(所有比赛都已分配了至少一名机器人)
以及对于 n>0,
P(0,m,n) = 0(我们的机器人用完了,而且还有一场未分配的比赛),
我们可以通过归纳法有效地计算出 P 的值。我们要找出使得
1 - P(3N-1,0,N) > 1/3 的最小 N,
这在 N=8 和 p=0.334578… 时出现。
恭喜本月成功计算出正确规模和赢得游泳计时赛概率的解谜者!
Original Explanation
We are given that our opponents are all playing the discrete strategy, which means they will all be independently choosing a race uniformly randomly and devoting all of their fuel to it. Since we are searching for the smallest N so that the discrete strategy is not optimal, we will be playing something other than this strategy. This means we won’t be assigning all of our fuel to any one race, but since all of our opponents are “bidding” 1 or 0 on every race, the best non-discrete strategies will involve bidding a nonzero amount on every race, which will beat all of the 0 bids (but lose to all of the 1 bids). So without loss of generality we can assume the strategy of bidding 1/N on every race, we can call this the uniform strategy.
Now we need to compute the smallest N for which this beats out the discrete strategy. The uniform strategy wins a race exactly when none of the 3N-1 other discrete-strategy-playing robots select that race for their fuel. It’s straightforward to compute this probability for a single race, (1-1/N)^(3N-1). But the assignment of discrete strategies is not independent across the N races! Assuming they were would give an incorrect answer of N=9 and p=0.350245…
Also, these events of winning a race with the uniform strategy are not disjoint across the N races! Assuming they were would give an incorrect answer of N=8 and p=0.370916…
Instead, a nice recursion could be found to count up the number of arrangements of the 3N-1 other players that left at least one race undefended. For example, let P(R,m,n) equal the probability that, if we need to assign R robots to (m+n) total races, m of which already have a robot assigned and n of which don’t, we will eventually assign at least one robot to all (m+n) races. Then assigning the next robot to a race uniformly randomly implies the recursion:
P(R,m,n) = (m * P(R-1,m,n) + n * P(R-1,m+1,n-1))/(m+n).
Along with the boundary values
P(R,m,0) = 1 (all races have been assigned at least one robot)
and for n>0,
P(0,m,n) = 0 (we’ve run out of robots and we still have an unassigned race),
we can compute values of P efficiently by induction. We want to find the smallest N such that
1 - P(3N-1,0,N) > 1/3,
which occurs at N=8 and p=0.334578….
Congrats to this month’s solvers who computed the correct size and probability of winning the swimming trials!