返回 Guide 目录

概率与统计 / Probability & Statistics

马尔可夫链

Markov Chains

本页结构

核心概念

  • 马尔可夫性质与转移矩阵 Markov property and transition matrices
  • 状态分类、常返性、周期性与吸收态 State classification, recurrence, periodicity and absorbing states
  • 平稳分布与期望到达时间方程 Stationary distributions and expected hitting time equations

学习顺序

  1. 先画状态图,再列方程。 Draw the state graph before writing equations.
  2. 长期占比问题通常转化为平稳分布。 For long-run proportions, solve the stationary distribution.
  3. 期望时间问题通常对第一步进行条件分解。 For expected time problems, condition on the first step.

概览

Overview

Markov Chains are a fundamental tool for modeling systems that transition between a finite number of states, where the future state depends only on the current state, not on the sequence of events that preceded it.

马尔可夫链是对在有限数量的状态之间转换的系统进行建模的基本工具,其中未来状态仅取决于当前状态,而不取决于之前的事件序列。

一、核心定义与性质

I. Core Definitions and Properties

The Markov Property

马尔可夫性质

A sequence of random variables X1,X2,X3,X_1, X_2, X_3, \dots is a Markov Chain if it satisfies the Markov Property (or memoryless property): the conditional probability distribution of the next state, given the present state and all the past states, depends only on the present state.

随机变量序列 X1,X2,X3,X_1, X_2, X_3, \dots 是一个马尔可夫链,如果它满足 马尔可夫特性 (或无记忆特性):给定当前状态和所有过去状态,下一个状态的条件概率分布仅取决于当前状态。

Transition Matrix

转移矩阵

For a discrete state space X={x1,x2,,xn}\mathcal{X} = \{x_1, x_2, \dots, x_n\}, the dynamics of the chain are governed by the n×nn \times n Transition Matrix PP.

对于离散状态空间 X={x1,x2,,xn}\mathcal{X} = \{x_1, x_2, \dots, x_n\},链的动态由 n×nn \times n 转换矩阵 PP 控制。

  • Entry PijP_{ij}: The probability of transitioning from state xix_i to state xjx_j.
  • Properties: Each entry Pij[0,1]P_{ij} \in [0, 1], and the sum of entries for each row must total 1 (i.e., j=1nPij=1\sum_{j=1}^n P_{ij} = 1). This makes PP a stochastic matrix.
  • kk-step Transition: The probability of moving from state ii to state jj in kk steps is given by the (i,j)(i, j)-th entry of the matrix PkP^k.
  • 条目 PijP_{ij}:从状态 xix_i 转换到状态 xjx_j 的概率。
  • 属性:每个条目 Pij[0,1]P_{ij} \in [0, 1],以及每行条目的总和必须为 1(即 j=1nPij=1\sum_{j=1}^n P_{ij} = 1)。这使得 PP 成为随机矩阵
  • kk 步转换:在 kk 步中从状态 ii 移动到状态 jj 的概率由矩阵 PkP^k 的第 (i,j)(i, j) 条目给出。

二、状态与链的分类

II. Classification of States and Chains

The long-term behavior of a Markov Chain is determined by the properties of its states.

马尔可夫链的长期行为由其状态的属性决定。

Property Definition Relevance
Irreducible Every state is reachable from every other state. Guarantees that a unique stationary distribution may exist.
Aperiodic The chain does not return to a state in a fixed, regular cycle. Necessary for the chain to converge to the stationary distribution regardless of the starting state.
Ergodic A chain that is both irreducible and aperiodic. Crucial: An ergodic chain has a unique stationary distribution, and the chain will converge to it over time.
Recurrent The chain is guaranteed to return to the state it left. All states in a finite, irreducible chain are recurrent.
Transient The chain has a non-zero probability of never returning to the state it left. The chain will eventually leave transient states forever.
财产 定义 关联
不可约 每个州都可以从其他州到达。 保证唯一的平稳分布可能存在。
非周期性 链条不会以固定的、有规律的周期返回到状态。 无论起始状态如何,链都必须收敛到平稳分布。
历经 一条不可约且非周期的链。 至关重要:遍历链具有独特的平稳分布,并且随着时间的推移,该链将收敛到它。
经常性 链条保证返回到它离开时的状态。 有限的、不可约的链中的所有状态都是循环的。
瞬态 该链有一个非零的概率永远不会返回到它离开的状态。 该链最终将永远离开瞬态。

三、平稳分布与长期行为

III. Stationary Distribution and Long-Term Behavior

The Stationary Distribution π=(π1,,πn)\pi = (\pi_1, \dots, \pi_n) is a probability vector that, once reached, remains unchanged by further transitions.

平稳分布 π=(π1,,πn)\pi = (\pi_1, \dots, \pi_n) 是一个概率向量,一旦达到该概率向量,进一步的转换将保持不变。

  • Defining Equation:
  • 定义方程
π=πP,i=1nπi=1\pi = \pi P, \quad \sum_{i=1}^n \pi_i = 1
  • Interpretation: πi\pi_i is the long-run proportion of time the chain spends in state xix_i. In finance, this can represent the long-run probability of a market being in a certain regime (e.g., high volatility).
  • Existence and Uniqueness: A stationary distribution exists for any finite-state Markov Chain. It is unique if and only if the chain is irreducible.
  • 解释πi\pi_i 是链处于状态 xix_i 的长期时间比例。在金融领域,这可以代表市场处于某种状态(例如高波动性)的长期概率。
  • 存在性和唯一性:任何有限状态马尔可夫链都存在平稳分布。当且仅当链是不可约的时它才是唯一的。

四、吸收链与期望到达时间

IV. Absorbing Chains and Expected Hitting Time

An Absorbing State xix_i is a state from which the chain cannot leave (i.e., Pii=1P_{ii} = 1). A chain is Absorbing if it has at least one absorbing state and every non-absorbing state can reach an absorbing state.

吸收状态 xix_i 是链条无法离开的状态(即 Pii=1P_{ii} = 1)。如果一条链至少具有一个吸收状态并且每个非吸收状态都可以达到吸收状态,则该链是吸收

Expected Time to State (Expected Hitting Time)

预计状态时间(预计击球时间)

To find the expected number of steps μi\mu_i to reach a target state (often an absorbing state) starting from state xix_i, we solve a system of linear equations.

为了找到从状态 xix_i 开始达到目标状态(通常是吸收状态)的预期步数 μi\mu_i,我们求解线性方程组。

For a target state xnx_n (where μn=0\mu_n = 0):

对于目标状态 xnx_n(其中 μn=0\mu_n = 0):

  • Example: To find the expected time to reach x3x_3 from x1x_1 in a 3×33 \times 3 chain:
  • 示例:要查找 3×33 \times 3 链中从 x1x_1 到达 x3x_3 的预期时间:
μ1=1+P11μ1+P12μ2+P13μ3(where μ3=0)\mu_1 = 1 + P_{11}\mu_1 + P_{12}\mu_2 + P_{13}\mu_3 \quad (\text{where } \mu_3 = 0)
μ2=1+P21μ1+P22μ2+P23μ3(where μ3=0)\mu_2 = 1 + P_{21}\mu_1 + P_{22}\mu_2 + P_{23}\mu_3 \quad (\text{where } \mu_3 = 0)

Gambler's Ruin Problem (A Classic Absorbing Chain)

赌徒破产问题(经典吸收链)

This is a classic example of an absorbing Markov Chain where the states are the player's current capital, and the absorbing states are 0 (ruin) and a+ba+b (opponent's ruin).

这是吸收马尔可夫链的经典示例,其中状态是玩家当前的资本,吸收状态是 0(毁灭)和 a+ba+b(对手毁灭)。

  • Fair Coin (p=0.5p=0.5): The probability of ruin (reaching 0) starting with capital aa against an opponent with capital bb is:
  • Fair Coin (p=0.5p=0.5):以资本 aa 开始对抗资本 bb 的对手的破产概率(达到 0)为:
P(Ruin)=ba+b\mathbb{P}(\text{Ruin}) = \frac{b}{a+b}

(Correction: The probability of ruin is ba+b\frac{b}{a+b}, not aa+b\frac{a}{a+b} as stated in the original content. The probability of winning is aa+b\frac{a}{a+b}.)

  • Unfair Coin (p0.5p \ne 0.5): Let ρ=1pp\rho = \frac{1-p}{p} (the odds ratio of losing to winning). The probability of ruin is:

(更正:破产的概率是ba+b\frac{b}{a+b},而不是原文中所说的aa+b\frac{a}{a+b}。获胜的概率是aa+b\frac{a}{a+b}。)

  • 不公平硬币(p0.5p \ne 0.5:让ρ=1pp\rho = \frac{1-p}{p}(输与赢的几率比)。破产的概率为:
P(Ruin)=ρaρa+b1ρa+b\mathbb{P}(\text{Ruin}) = \frac{\rho^a - \rho^{a+b}}{1 - \rho^{a+b}}

(Correction: The original formula was for the probability of reaching state a+ba+b starting from aa in a slightly different formulation. The standard ruin probability is given above.)

(更正:原始公式是从 aa 开始达到状态 a+ba+b 的概率,公式略有不同。上面给出了标准破产概率。)

补充讲解

状态选择决定难度

Choose states carefully

好的状态定义应刚好包含下一步所需信息。如果下一步转移仍然依赖过去,说明状态定义太小。

A good state contains exactly the information needed for the next step. If the next transition still depends on the past, the state definition is too small.

用第一步分析列方程

Use first-step equations

期望到达时间和吸收概率通常对第一步转移做条件分解,再在状态空间上解线性方程组。

Expected hitting times and absorption probabilities are usually solved by conditioning on the first transition, then solving a linear system over states.

区分长期占比和到达问题

Separate long run from hitting

平稳分布回答长期占比问题;到达时间方程回答何时或是否到达目标。混淆这两类问题会直接导致方程错误。

Stationary distributions answer long-run proportions. Hitting-time equations answer when or whether a target is reached. Mixing these two interpretations leads to wrong equations.