Dice Game 2

Topic
Probability
Difficulty
L5

Problem

We play a game with two tetrahedral dice, one red and one blue, each with faces labeled 1-4. On each turn, you roll both dice simultaneously.

If the blue die shows more than the red, you are paid the difference; otherwise you are paid nothing. If the blue and red dice show the same value, the game ends immediately; otherwise you take another turn.

What is the fair value of the game?

Explanation

Let's solve this problem by conditioning on the outcome of a single turn. Let's call $b$ the outcome of the blue die and $r$ the outcome of the red die. First, let's find $P(b > r)$, $P(b = r)$, and $P(b < r)$, since each case leads to a different result.

Total cases: $4 \cdot 4 = 16$

Of these $16$ cases, there are $4$ in which $b = r$. Thus, $P(b = r) = \frac{4}{16} = \frac{1}{4}$.

Of the $12$ remaining cases, half have $b > r$ and, by symmetry, the other half have $b < r$. Each outcome has probability $\frac{12}{16} \cdot \frac{1}{2} = \frac{3}{8}$. Thus, $P(b > r) = P(b < r) = \frac{3}{8}$.
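
As a quick check on these probabilities, the 16 equally likely outcomes can be enumerated directly. This is only an illustrative sketch, not part of the original solution: it assumes both dice are fair, and the names `outcomes` and `prob` are introduced here.

```python
from fractions import Fraction
from itertools import product

# All (blue, red) pairs for two fair four-sided dice (assumed equally likely).
outcomes = list(product(range(1, 5), repeat=2))


def prob(event):
    """Exact probability of an event given as a predicate on (blue, red)."""
    hits = sum(1 for b, r in outcomes if event(b, r))
    return Fraction(hits, len(outcomes))


p_eq = prob(lambda b, r: b == r)
p_gt = prob(lambda b, r: b > r)
p_lt = prob(lambda b, r: b < r)

print(p_eq, p_gt, p_lt)  # 1/4 3/8 3/8
```

Using `Fraction` keeps the results exact, so they can be compared with the hand-computed values without rounding.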

Now let's find our earnings in each of these three scenarios. Let's call $x$ our total earnings over the whole game.

In the case $b = r$, the game stops and we earn nothing, so the conditional expectation is $E[x \mid b = r] = 0$.

In the case $b < r$, we earn nothing but the game doesn't stop, so we are in the same position as at the start. Therefore, $E[x \mid b < r] = E[x]$.

In the final case, $b > r$, we are paid the difference and continue playing. Of the six cases where $b > r$, one gives a difference of $3$, two give a difference of $2$, and three give a difference of $1$. The average difference collected is $\frac{1}{6} \cdot 3 + \frac{2}{6} \cdot 2 + \frac{3}{6} \cdot 1 = \frac{5}{3}$. This is in addition to the expected earnings from the rest of the game, so $E[x \mid b > r] = E[x] + \frac{5}{3}$.
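
The count of differences and their average can be confirmed by listing the six winning outcomes. The snippet below is a small sketch under the same fair-dice assumption; the names `diffs` and `mean_diff` are illustrative.

```python
from fractions import Fraction

# Differences blue - red over the outcomes where blue beats red.
diffs = [b - r for b in range(1, 5) for r in range(1, 5) if b > r]

# Exact average payout on a winning turn, assuming fair dice.
mean_diff = Fraction(sum(diffs), len(diffs))

print(sorted(diffs))  # [1, 1, 1, 2, 2, 3]
print(mean_diff)      # 5/3
```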

Putting everything together, we get $E[x] = \frac{1}{4}(0) + \frac{3}{8}E[x] + \frac{3}{8}\left(E[x] + \frac{5}{3}\right)$. Collecting the $E[x]$ terms gives $\frac{1}{4}E[x] = \frac{5}{8}$, so $E[x] = \boxed{2.5}$.
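
The same linear equation can be solved exactly in a few lines. This is a sketch for checking the arithmetic, with the probabilities and average payout taken from the steps above; the variable names are chosen here for illustration.

```python
from fractions import Fraction

p_eq = Fraction(1, 4)       # P(b = r): game ends
p_lt = Fraction(3, 8)       # P(b < r): no payout, game restarts
p_gt = Fraction(3, 8)       # P(b > r): payout, then game restarts
mean_diff = Fraction(5, 3)  # average payout given b > r

# E = p_eq*0 + p_lt*E + p_gt*(E + mean_diff)
# => E * (1 - p_lt - p_gt) = p_gt * mean_diff
expected_value = p_gt * mean_diff / (1 - p_lt - p_gt)

print(expected_value)         # 5/2
print(float(expected_value))  # 2.5
```

Because $1 - P(b < r) - P(b > r) = P(b = r)$, the result can also be read as the expected payout per turn, $\frac{3}{8} \cdot \frac{5}{3} = \frac{5}{8}$, divided by the probability $\frac{1}{4}$ that a turn ends the game.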

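import random


def roll_dice():
    """Roll one fair four-sided die and return a value from 1 to 4."""
    return random.randint(1, 4)


def play_game():
    """Play one full game and return the total amount paid out."""
    profit = 0
    while True:
        red = roll_dice()
        blue = roll_dice()
        if blue == red:
            # Matching values end the game immediately.
            return profit
        if blue > red:
            # Blue beat red: collect the difference and keep playing.
            profit += blue - red
        # If blue < red, nothing is paid and the game continues.


num_iters = 10000
earnings = [play_game() for _ in range(num_iters)]

# The sample mean should be close to the analytical value of 2.5.
print(sum(earnings) / num_iters)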
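
A single sample mean does not show how close to 2.5 the estimate should be expected to land. The sketch below is an addition, not part of the original solution: it repeats the simulation with a seeded generator and reports a standard error alongside the estimate. The function `simulate_game`, the seed, and the sample size are choices made here for illustration.

```python
import random
import statistics


def simulate_game(rng):
    """Play one game with the given random generator; return total payout."""
    profit = 0
    while True:
        red = rng.randint(1, 4)
        blue = rng.randint(1, 4)
        if blue == red:
            return profit
        if blue > red:
            profit += blue - red


rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
num_iters = 100_000
samples = [simulate_game(rng) for _ in range(num_iters)]

mean = statistics.fmean(samples)
# Standard error of the mean: sample standard deviation over sqrt(n).
std_err = statistics.stdev(samples) / num_iters ** 0.5

print(f"estimate: {mean:.3f} +/- {std_err:.3f}")
```

If the analysis is right, the estimate should fall within a few standard errors of 2.5.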