Robot Baseball

Topic
Probability
Difficulty
L7

Problem Details

The Artificial Automaton Athletics Association (Quad-A) is at it again: to compete with postseason baseball, they are developing a Robot Baseball competition. Games are composed of a series of independent at-bats in which the batter is trying to maximize expected score and the pitcher is trying to minimize expected score.

An at-bat is a series of pitches with a running count of balls and strikes, both starting at zero. For each pitch, the pitcher decides whether to throw a ball or strike, and the batter decides whether to wait or swing; these decisions are made secretly and simultaneously. The results of these choices are as follows.

  • If the pitcher throws a ball and the batter waits, the count of balls is incremented by 1.
  • If the pitcher throws a strike and the batter waits, the count of strikes is incremented by 1.
  • If the pitcher throws a ball and the batter swings, the count of strikes is incremented by 1.
  • If the pitcher throws a strike and the batter swings, with probability p the batter hits a home run[1] and with probability 1-p the count of strikes is incremented by 1.

An at-bat ends when either:

  • The count of balls reaches 4, in which case the batter receives 1 point.
  • The count of strikes reaches 3, in which case the batter receives 0 points.
  • The batter hits a home run, in which case the batter receives 4 points.

By varying the size of the strike zone, Quad-A can adjust the value p, the probability that a pitched strike that is swung at results in a home run. They have found that viewers are most excited by at-bats that reach a full count, that is, the at-bats that reach the state of three balls and two strikes. Let q be the probability of an at-bat reaching a full count; q is dependent on p. Assume the batter and pitcher are both using optimal mixed strategies and Quad-A has chosen the p that maximizes q. Find this q, the maximal probability that an at-bat reaches a full count, to ten decimal places.

  [1] Robot fielding has a long way to go.

Solution

We were tasked with fine-tuning the strike zone of Robot Baseball to maximize the probability of an at-bat reaching a full count, and with finding this maximal probability. The most straightforward approach is backward induction: for a given strike zone probability p, work backwards from the full count to the start of the at-bat, determining at each count the Nash equilibrium, that is, the pitcher’s probability of throwing a strike and the batter’s probability of swinging.
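The backward pass described above can be sketched in Python as follows. This is a minimal illustration, not the puzzle authors' code: the names `solve_count` and `at_bat_values` are hypothetical, and the closed-form solution of each 2x2 game assumes the equilibrium is genuinely mixed at every count (which holds for 0 < p < 1 in this game).

```python
def solve_count(ball, strike, swung):
    """Solve the 2x2 zero-sum game with batter payoffs
        [[ball,   strike],      rows: pitcher throws ball / strike
         [strike, swung ]]      cols: batter waits / swings
    Returns (x, v): the shared mixing probability and the game value.
    Assumes ball > strike and swung > strike, so neither player
    has a pure-strategy best choice.
    """
    denom = ball + swung - 2 * strike
    x = (ball - strike) / denom
    v = (ball * swung - strike * strike) / denom
    return x, v


def at_bat_values(p):
    """Map each live count (balls, strikes) to (x, value) by backward induction.

    value is the batter's expected points from that count; x is the
    probability that the pitcher throws a strike (and the batter swings).
    """
    table = {}

    def value(balls, strikes):
        if balls == 4:      # walk: batter receives 1 point
            return 1.0
        if strikes == 3:    # strikeout: batter receives 0 points
            return 0.0
        return table[(balls, strikes)][1]

    # Start at the full count (3, 2) and finish at (0, 0).
    for balls in range(3, -1, -1):
        for strikes in range(2, -1, -1):
            ball = value(balls + 1, strikes)
            strike = value(balls, strikes + 1)
            # Swinging at a strike: home run (4 points) with probability p,
            # otherwise one more strike.
            swung = 4 * p + (1 - p) * strike
            table[(balls, strikes)] = solve_count(ball, strike, swung)
    return table


if __name__ == "__main__":
    for count, (x, v) in sorted(at_bat_values(0.25).items()):
        print(count, round(x, 4), round(v, 4))
```

At the full count the game reduces to payoffs 1, 0 and 4p, so x = 1/(1+4p) there; this gives a quick check on the sketch.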

An interesting symmetry simplifies the work: the solver might notice that the outcome of a pitch is symmetric with respect to the pitcher’s choice and the batter’s choice. A ball that is swung at and a strike that is waited on both add one strike, so the payoff table is unchanged when ball is paired with wait and strike with swing. This means the probability of the pitcher throwing a strike and the probability of the batter swinging will be the same at any count. Both players want to choose a probability that makes the opponent indifferent between their two options; otherwise their play is exploitable, which lowers the batter’s expected points or, for the pitcher, raises the opponent’s.
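The indifference argument can be written out explicitly. The symbols A, B, C, x, y and V below are our own notation rather than the original write-up's, and the derivation assumes A > B and C > B so that both players really do mix.

```latex
% Batter's expected points at a fixed count:
%   A = value of the count reached when a ball is added,
%   B = value of the count reached when a strike is added,
%   C = 4p + (1-p)B = expected value when a pitched strike is swung at.
\begin{align*}
\text{Pitcher throws a strike with probability } x:\quad
  & \underbrace{(1-x)A + xB}_{\text{batter waits}}
    = \underbrace{(1-x)B + xC}_{\text{batter swings}} \\
\text{Batter swings with probability } y:\quad
  & \underbrace{(1-y)A + yB}_{\text{pitcher throws a ball}}
    = \underbrace{(1-y)B + yC}_{\text{pitcher throws a strike}} \\
\Rightarrow\quad
  & x = y = \frac{A-B}{A+C-2B}, \qquad
    V = (1-x)A + xB = \frac{AC-B^{2}}{A+C-2B}.
\end{align*}
```

The two indifference equations are identical up to renaming x and y, which is why a single probability per count is enough.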

Once we have worked backwards to the starting count of zero balls and zero strikes, we can compute the probability of reaching a full count and then adjust p in order to maximize it. Happily, we are asked for the maximal height of the function, which is easier to estimate than the location of that maximum (the p value that achieves it): the function is flat near its peak, so a small error in p changes q only at second order. Careful numerical optimization finds the maximum probability of reaching full count to be 0.2959679934… (achieved at approximately p=0.2269732…).
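One way to carry out this step numerically is sketched below: a backward pass for the equilibrium mixing probabilities, a forward pass for the probability of reaching each count, and a golden-section search over p. The function names are hypothetical, and the search assumes q(p) has a single peak on the interval searched; the source does not say which optimizer was actually used.

```python
import math


def full_count_probability(p):
    """Probability q(p) that an at-bat reaches the 3-2 count under equilibrium play."""
    # Backward pass: batter's expected points and the shared mixing probability.
    value = {(4, s): 1.0 for s in range(3)}        # four balls: 1 point
    value.update({(b, 3): 0.0 for b in range(4)})  # three strikes: 0 points
    mix = {}
    for b in range(3, -1, -1):
        for s in range(2, -1, -1):
            ball = value[(b + 1, s)]
            strike = value[(b, s + 1)]
            swung = 4 * p + (1 - p) * strike
            denom = ball + swung - 2 * strike
            mix[(b, s)] = (ball - strike) / denom
            value[(b, s)] = (ball * swung - strike * strike) / denom

    # Forward pass: probability of ever reaching each count.
    reach = {(b, s): 0.0 for b in range(5) for s in range(4)}
    reach[(0, 0)] = 1.0
    for b in range(4):
        for s in range(3):
            x = mix[(b, s)]
            add_ball = (1 - x) ** 2                            # ball thrown, batter waits
            add_strike = 2 * x * (1 - x) + (1 - p) * x * x     # every other non-home-run case
            reach[(b + 1, s)] += reach[(b, s)] * add_ball
            reach[(b, s + 1)] += reach[(b, s)] * add_strike
    return reach[(3, 2)]


def golden_max(f, lo, hi, tol=1e-12):
    """Golden-section search for the maximizer of f on [lo, hi] (f assumed unimodal)."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


if __name__ == "__main__":
    # p = 0 is excluded because the 2x2 games degenerate there.
    p_star = golden_max(full_count_probability, 0.01, 0.99)
    print(f"p = {p_star:.7f}, q = {full_count_probability(p_star):.10f}")
```

Because the peak is flat, double-precision arithmetic pins down p to only about half as many digits as q, which is consistent with the remark above about height versus location.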

Other stats of interest at the optimal p: robot pitchers have an impressive 68% strikeout rate and give up walks in 21% of plate appearances. Batters hit home runs in 11% of plate appearances, and have a slash line of .139/.318/.556.
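These rates can be reproduced by tracking how each at-bat ends during the forward pass. The sketch below is illustrative only: `outcome_rates` and `slash_line` are hypothetical names, p is fixed at the approximate optimum quoted above, and the slash-line formulas are an assumption based on standard baseball conventions (walks do not count as at-bats; every hit is a home run worth four bases).

```python
def outcome_rates(p):
    """Return the probabilities that an at-bat ends in a strikeout, walk, or home run."""
    # Backward pass for the equilibrium mixing probability at each count.
    value = {(4, s): 1.0 for s in range(3)}
    value.update({(b, 3): 0.0 for b in range(4)})
    mix = {}
    for b in range(3, -1, -1):
        for s in range(2, -1, -1):
            ball = value[(b + 1, s)]
            strike = value[(b, s + 1)]
            swung = 4 * p + (1 - p) * strike
            denom = ball + swung - 2 * strike
            mix[(b, s)] = (ball - strike) / denom
            value[(b, s)] = (ball * swung - strike * strike) / denom

    # Forward pass, accumulating the three ways an at-bat can end.
    reach = {(b, s): 0.0 for b in range(5) for s in range(4)}
    reach[(0, 0)] = 1.0
    home_run = 0.0
    for b in range(4):
        for s in range(3):
            x = mix[(b, s)]
            r = reach[(b, s)]
            home_run += r * p * x * x  # strike thrown, batter swings, ball leaves the park
            reach[(b + 1, s)] += r * (1 - x) ** 2
            reach[(b, s + 1)] += r * (2 * x * (1 - x) + (1 - p) * x * x)
    walk = sum(reach[(4, s)] for s in range(3))
    strikeout = sum(reach[(b, 3)] for b in range(4))
    return {"strikeout": strikeout, "walk": walk, "home_run": home_run}


def slash_line(rates):
    """AVG / OBP / SLG under the assumptions stated in the lead-in."""
    at_bats = 1.0 - rates["walk"]              # walks are not at-bats
    avg = rates["home_run"] / at_bats
    obp = rates["home_run"] + rates["walk"]
    slg = 4 * rates["home_run"] / at_bats      # a home run counts four bases
    return avg, obp, slg


if __name__ == "__main__":
    rates = outcome_rates(0.2269732)
    print(rates)
    print("AVG/OBP/SLG = %.3f/%.3f/%.3f" % slash_line(rates))
```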

Congrats to this month’s solvers!