
Sample Duplication

Topic
Statistics
Difficulty
L2

Problem

You are in a research lab in a tropical rainforest and are studying the relationship between temperature and the growth of Cacao trees. Assume you have 50 data points and find a correlation coefficient of 0.75. You decide to duplicate your 50 data points and re-calculate the correlation coefficient. How much did the correlation increase?

Explanation

Recall that the equation for correlation is defined as:

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$

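Notice that duplicating the sample does not change $\bar{x}$ or $\bar{y}$, so every deviation $(x_i - \bar{x})$ and $(y_i - \bar{y})$ simply appears twice. The numerator therefore doubles. Each sum of squares in the denominator also doubles, so the denominator grows by a factor of $\sqrt{2 \cdot 2} = 2$. The factor of 2 cancels, and the correlation coefficient stays at 0.75. The correlation increases by 0.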
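
The cancellation can also be checked numerically. The following is a minimal sketch, assuming NumPy is available. The data are synthetic stand-ins generated for illustration (not real measurements), and the helper `pearson_r` and the variable names are hypothetical. The sketch computes r from the definition above for 50 points, duplicates the points, and recomputes r.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation computed directly from the definition above."""
    dx = x - x.mean()
    dy = y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())


# Synthetic stand-in data (an assumption for illustration, not lab data).
rng = np.random.default_rng(0)
temperature = rng.normal(27.0, 2.0, size=50)
growth = 0.8 * temperature + rng.normal(0.0, 1.5, size=50)

r_original = pearson_r(temperature, growth)

# Duplicate the sample: 100 points, each original pair appearing twice.
temperature_dup = np.concatenate([temperature, temperature])
growth_dup = np.concatenate([growth, growth])
r_duplicated = pearson_r(temperature_dup, growth_dup)

print(f"r with 50 points:  {r_original:.6f}")
print(f"r with 100 points: {r_duplicated:.6f}")
print(f"change:            {r_duplicated - r_original:.2e}")
```

The two values of r should agree up to floating-point rounding, whatever the synthetic data happen to be. What duplication does change is the apparent sample size, so any significance test that treats the 100 points as independent would be overconfident.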