R Squared Range

Topic
Machine Learning
Difficulty
L4

Problem Details

Using OLS, we regress $y$ onto $X_1$ and find that the model has an $R^2$ of 0.15. We also regress $y$ onto $X_2$, and this time the model has an $R^2$ of 0.2. Let $[\text{min}, \text{max}]$ denote the lower and upper bounds of the $R^2$ of a model that regresses $y$ onto $X_1, X_2$. Express your answer as $\frac{\text{max}}{\text{min}}$.

Explanation

Let $R_1^2 = 0.15$ be the $R^2$ from regressing $y$ on $X_1$ alone, and $R_2^2 = 0.20$ be the $R^2$ from regressing $y$ on $X_2$ alone.

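Lower bound: Adding regressors in OLS cannot reduce $R^2$. Therefore, the $R^2$ from regressing $y$ on $(X_1, X_2)$ must be at least as large as the larger of the two individual $R^2$ values. The bound is attained, for example, when $X_1$ is a multiple of $X_2$ plus noise uncorrelated with both $y$ and $X_2$, so that $X_1$ adds nothing beyond $X_2$:

$$\min R^2 = \max\{R_1^2, R_2^2\} = 0.20.$$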
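The lower bound can be checked numerically. The following is a minimal sketch, not part of the original solution: the helper `r_squared` and the four-point data set built from the mean-zero orthonormal vectors `e1`, `e2`, `e3` are hypothetical choices made only for illustration. Here $X_1$ is a scaled copy of $X_2$ plus a component orthogonal to both $y$ and $X_2$, so adding it to the regression should leave $R^2$ at 0.20.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an OLS fit of y on the columns of X, with an intercept."""
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ coef
    tss = np.sum((y - y.mean()) ** 2)              # total sum of squares
    return 1.0 - (resid @ resid) / tss

# Mean-zero, mutually orthogonal unit vectors (illustrative choice).
e1 = np.array([1.0, -1.0, 0.0, 0.0]) / np.sqrt(2.0)
e2 = np.array([1.0, 1.0, -2.0, 0.0]) / np.sqrt(6.0)
e3 = np.array([1.0, 1.0, 1.0, -3.0]) / np.sqrt(12.0)

y = e1
x2 = np.sqrt(0.20) * e1 + np.sqrt(0.80) * e2   # squared correlation with y: 0.20
x1 = np.sqrt(3.0) * x2 + e3                    # X2 scaled, plus noise orthogonal to y and X2

r2_x1 = r_squared(y, x1)
r2_x2 = r_squared(y, x2)
r2_joint = r_squared(y, np.column_stack([x1, x2]))
print(round(r2_x1, 6), round(r2_x2, 6), round(r2_joint, 6))
```

The scale factor $\sqrt{3}$ is what makes the single-variable $R^2$ of `x1` come out at 0.15 in this construction; other constructions that reach the bound are possible.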

Upper bound: There is no restriction that prevents $X_1$ and $X_2$ together from perfectly explaining $y$, even if each variable alone explains little. For example, $y$ could lie in the span of $(X_1, X_2)$ while being poorly aligned with each variable individually. Hence, the $R^2$ of the joint regression can be as high as 1:

$$\max R^2 = 1.$$
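A concrete instance of the upper bound can be sketched as follows. This is an illustrative construction under stated assumptions, not the original author's: the helper `r_squared` and the vectors `e1`, `e2` are hypothetical. Both regressors live in the plane spanned by `e1` and `e2`, and $y$ is `e1` itself, so the pair spans $y$ exactly even though each regressor alone is only weakly aligned with it.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an OLS fit of y on the columns of X, with an intercept."""
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ coef
    tss = np.sum((y - y.mean()) ** 2)              # total sum of squares
    return 1.0 - (resid @ resid) / tss

# Mean-zero, mutually orthogonal unit vectors (illustrative choice).
e1 = np.array([1.0, -1.0, 0.0, 0.0]) / np.sqrt(2.0)
e2 = np.array([1.0, 1.0, -2.0, 0.0]) / np.sqrt(6.0)

y = e1
x1 = np.sqrt(0.15) * e1 + np.sqrt(0.85) * e2   # squared correlation with y: 0.15
x2 = np.sqrt(0.20) * e1 - np.sqrt(0.80) * e2   # squared correlation with y: 0.20

r2_x1 = r_squared(y, x1)
r2_x2 = r_squared(y, x2)
r2_joint = r_squared(y, np.column_stack([x1, x2]))
print(round(r2_x1, 6), round(r2_x2, 6), round(r2_joint, 6))
```

Because `x1` and `x2` are linearly independent combinations of `e1` and `e2`, their span contains `e1`, and the joint fit leaves no residual.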

Therefore,

$$\frac{\max}{\min} = \frac{1}{0.20} = 5.$$
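As a cross-check on the final ratio, one can use the standard two-regressor identity $R^2 = \frac{r_1^2 + r_2^2 - 2 r_1 r_2 \rho}{1 - \rho^2}$, where $r_i$ is the correlation between $y$ and $X_i$ and $\rho$ is the correlation between $X_1$ and $X_2$. These symbols, and the function name `joint_r2`, are introduced here and do not appear in the original solution. The sketch below assumes both $r_1$ and $r_2$ are positive (flipping the sign of a regressor does not change any $R^2$) and scans the values of $\rho$ for which the correlation matrix is valid.

```python
import math

def joint_r2(r1, r2, rho):
    """R^2 of y on (X1, X2) from pairwise correlations; requires |rho| < 1."""
    return (r1**2 + r2**2 - 2.0 * r1 * r2 * rho) / (1.0 - rho**2)

# Correlations implied by the single-variable R^2 values (signs assumed positive).
r1, r2 = math.sqrt(0.15), math.sqrt(0.20)

# Feasible rho: the 3x3 correlation matrix must be positive semidefinite,
# which is equivalent to joint R^2 <= 1.
half_width = math.sqrt((1.0 - r1**2) * (1.0 - r2**2))
rho_lo, rho_hi = r1 * r2 - half_width, r1 * r2 + half_width

# Scan the feasible interval on a grid, and also evaluate at rho = r1 / r2,
# where the identity reduces to R^2 = r2^2.
n = 2001
grid = [rho_lo + (rho_hi - rho_lo) * k / (n - 1) for k in range(n)]
values = [joint_r2(r1, r2, rho) for rho in grid]
r2_min = min(min(values), joint_r2(r1, r2, r1 / r2))
r2_max = max(values)
ratio = r2_max / r2_min
print(round(r2_min, 6), round(r2_max, 6), round(ratio, 6))
```

The maximum is reached at the two endpoints of the feasible interval, and the minimum at $\rho = r_1 / r_2$, which agrees with the bounds argued above.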