机器人更新版游泳计时赛

Robot Updated Swimming Trials

专题: Probability / 概率
难度: L7
来源: Jane Street

题目详情

早在10月份，我们就接到了一个谜题：寻找我们能邀请参加机器人游泳计时赛的最少机器人数量，以使得参赛者的最优策略不再是离散的。点击这里查看原谜题的解释和示例；在下一段中，我们将简要回顾一下设定。

锦标赛主管选择一个正整数 N，然后邀请 3N 个机器人参加计时赛，即所有 3N 个机器人之间进行的 N 场比赛。机器人承诺将它们相同数量的燃料分配给这 N 场比赛，在任何给定的比赛中，燃烧燃料最多的机器人获胜（如果在某场比赛中花费的燃料最多且相同，则在这些打平的机器人中均匀随机分配胜者）。所有机器人根据其时间表参加所有比赛，然后确定 N 个不同的获胜者，每场比赛一名，方法是从剩余的比赛中依次选择花费燃料最多且领先于所有尚未赢得比赛的其他机器人的机器人。所有机器人都通过这种方式竞争（即选择它们的燃料分配分布以最大化概率）进入决赛。离散策略是指机器人均匀随机地选择一场比赛，并将所有燃料分配给该比赛，而对其他比赛分配零燃料的策略。

我们发现（剧透警告！）邀请 24 个机器人参加 8 场计时赛（即 N=8）创建了一个离散策略不再是最优策略的锦标赛。锦标赛组织者采用了这种锦标赛规模，观众也获得了更丰富的策略和更多样化的比赛结果的奖励[1](#fn:1)。不出所料，参赛者从每个人总是使用离散策略转变为使用更微妙和连续的燃料分配。元游戏不断演变，最终定格在一个纳什均衡中，即给定竞争者以一定的概率 p 选择离散策略，否则选择将非零燃料分配给至少两场比赛的策略。

求 p，即当 N=8 时，在机器人更新版游泳计时赛中使用纳什均衡策略的机器人将其所有燃料投入单场比赛的概率，精确到6位有效数字。

1. \$1

Back in October, we were tasked with the puzzle of finding the smallest number of robots we could invite to a Robot Swimming Trial in order to change the optimal strategy of participants away from discrete. Click here to see the original puzzle explanation and example; in the next paragraph we will go over a quick refresher on the setup.

The tournament directors choose a positive integer N, and then 3N robots are invited to compete in the trials, which are N races between all 3N robots. The robots commit to using a schedule of their identical fuel amounts to the N races, and on a given race whatever robot burns the most fuel wins (ties for most fuel spent on a given race are split uniformly randomly among the tying robots). All robots swim in all races according to their schedules, and then N distinct winners are determined, one from each race, by successively selecting the robot from the remaining races that spent the most fuel and finished ahead of all other robots that haven’t yet won a race. Robots all compete (i.e. choose their fuel allotment distributions to maximize the probability) to be selected for the finals by this method. The discrete strategy is the one in which a robot chooses a race uniformly randomly and assigns all of its fuel to that race, and zero fuel to the other races.

We found (spoiler alert!) that inviting 24 robots to compete in 8 trial races (i.e. N=8) created a tournament in which the discrete strategy was no longer optimal. The tournament organizers adopted this tournament size, and viewers were rewarded with a richer set of strategies and more diverse race results¹. As expected, the competitors switched from everyone always using the discrete strategy to more subtle and continuous allotments of fuel. The metagame evolved and eventually settled into a Nash equilibrium in which a given competitor chooses the discrete strategy with a certain probability p, and otherwise elects for an allotment that distributes nonzero fuel to at least two races.

Find p, the probability a robot using the Nash equilibrium strategy when competing at a Robot Updated Swimming Trial with N=8 devotes all of its fuel to a single race, to 6 significant digits.

Also, many fewer extremely slow races in which all robots devoted zero fuel, and whichever robot’s random motion took it across the finish line first was declared the winner. ↩

解析

Original Explanation

May’s challenge asked for the probability that a robot should adopt the discrete strategy in the Nash equilibrium solution of the newly structured Robot Updated Swimming Trials (or RUST for short). This equilibrium probability p has the property that if all opponents are choosing the discrete strategy with this probability, then the remaining competitor is indifferent to choose the discrete strategy or not. Interestingly, in order to solve for p we do not need to compute what the alternate continuous strategy is, we must only assume it would put nonzero weight on every race with probability 1. We leave the argument for this assumption to the reader.

By symmetry, the remaining competitor’s probability of winning must be exactly 1/3 whether they choose discrete or continuous. So we just have to compute the probability of winning with discrete or continuous with all other competitors using probability p, set it equal to 1/3, and then backsolve for p. In the image above, we show the derivation of how to solve for p by computing the win probability when using the continuous strategy (a similar computation could be done using the discrete strategy, but care is required to account for ALL the possible ways to win!).

The final result shows that the probability of choosing the discrete strategy in the Nash equilibrium is very high, p ~ 0.999560…

Congrats to this month’s solvers that were able to compute this tricky probability!