Dice Game 2

Topic: Probability
Difficulty: L5
Source: OpenQuant

Problem

We play a game with two tetrahedral dice, one red and one blue, each with faces labeled 1 to 4. On each turn, you roll both dice simultaneously.

If the blue die shows more than the red die, you are paid the difference; otherwise you are paid nothing. If the two dice show the same value, the game ends immediately; otherwise you take another turn.

What is the fair value of the game?

Solution

Let's solve this by conditioning on the result of a single roll. Call $b$ the outcome of the blue die and $r$ the outcome of the red die. First, let's find $P(b > r)$, $P(b = r)$, and $P(b < r)$, since each case affects the game differently.

Total cases: $4 \cdot 4 = 16$

Of these $16$ cases, there are $4$ ways to create an outcome such that $b = r$. Thus, $P(b = r) = \frac4{16} = \frac14$.

Of the $12$ remaining cases, half will result in $b>r$ and, by symmetry, the other half in $b<r$. Each outcome has probability $\frac{12}{16} \cdot \frac12 = \frac38$. Thus, $P(b > r) = P(b < r) = \frac38$.
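The counts above can be checked by listing all 16 outcomes. This is a minimal sketch, not part of the original solution; the names `outcomes`, `p_tie`, `p_blue_wins`, and `p_red_wins` are assumptions introduced for illustration.

```python
from fractions import Fraction
from itertools import product

# All 16 equally likely (blue, red) pairs for two four-sided dice.
outcomes = list(product(range(1, 5), repeat=2))

# Count each of the three cases (names here are illustrative assumptions).
ties = sum(1 for blue, red in outcomes if blue == red)
blue_wins = sum(1 for blue, red in outcomes if blue > red)
red_wins = sum(1 for blue, red in outcomes if blue < red)

p_tie = Fraction(ties, len(outcomes))
p_blue_wins = Fraction(blue_wins, len(outcomes))
p_red_wins = Fraction(red_wins, len(outcomes))

print(p_tie, p_blue_wins, p_red_wins)  # 1/4 3/8 3/8
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the hand-computed values.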

Now let's find our expected earnings in each of these three scenarios. Let $x$ denote our total earnings.

In the situation $b = r$, the game stops and we earn nothing, so the conditional expectation is $E[x | b = r] = 0$.

In the situation $b<r$, we earn nothing but the game does not stop. We are in the same position as at the start, so the expected earnings are unchanged: $E[x | b < r] = E[x]$.

In the final situation, $b>r$, we are paid the difference and continue playing. Of the six cases where $b>r$, one gives a difference of $3$, two give a difference of $2$, and three give a difference of $1$. The average difference collected is $\frac16\cdot3+\frac26\cdot2+\frac36\cdot1 = \frac53$. This is added to the expected earnings from the rest of the game, so $E[x | b > r] = E[x] + \frac53$.
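The breakdown of differences in the $b>r$ case can be tallied directly. The sketch below is illustrative only; `diffs`, `diff_counts`, and `avg_diff` are assumed names, not part of the original solution.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Differences blue - red over the six outcomes where blue beats red.
diffs = [blue - red
         for blue, red in product(range(1, 5), repeat=2)
         if blue > red]

diff_counts = Counter(diffs)                 # how often each difference occurs
avg_diff = Fraction(sum(diffs), len(diffs))  # exact conditional mean

print(sorted(diff_counts.items()))  # [(1, 3), (2, 2), (3, 1)]
print(avg_diff)                     # 5/3
```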

Putting everything together, we get $E[x] = \frac14(0) + \frac38 E[x] + \frac38\left(E[x] + \frac53\right)$. Collecting the $E[x]$ terms gives $\frac14 E[x] = \frac58$, so $E[x] = \boxed{2.5}$.
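As a cross-check, the same equation can be solved in exact arithmetic. This is a sketch under the probabilities derived above, not part of the original solution; the names `p_tie`, `p_less`, `p_more`, `avg_gain`, and `fair_value` are assumptions chosen for readability.

```python
from fractions import Fraction

p_tie = Fraction(1, 4)     # P(b = r): game ends, nothing more is earned
p_less = Fraction(3, 8)    # P(b < r): nothing paid, game continues
p_more = Fraction(3, 8)    # P(b > r): difference paid, game continues
avg_gain = Fraction(5, 3)  # average difference given b > r

# E = p_tie * 0 + p_less * E + p_more * (E + avg_gain)
# => E * (1 - p_less - p_more) = p_more * avg_gain
fair_value = p_more * avg_gain / (1 - p_less - p_more)

print(fair_value)  # 5/2
```

Isolating $E[x]$ on one side before dividing mirrors the algebra done by hand, which makes it easy to see where each probability enters.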

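# Monte Carlo check of the fair value
import random


def roll_die():
    """Roll one fair four-sided die (faces 1-4)."""
    return random.randint(1, 4)


num_iters = 10000
earnings = []
for _ in range(num_iters):
    profit = 0
    while True:
        red = roll_die()
        blue = roll_die()
        if blue == red:
            # A tie ends the game immediately.
            break
        if blue > red:
            # Blue beat red: collect the difference and keep playing.
            profit += blue - red
        # Otherwise blue < red: nothing is paid and we roll again.
    earnings.append(profit)

# The sample mean should land near the analytic value of 2.5.
print(sum(earnings) / num_iters)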