Estimating Clan Size
Clan Size
Problem Details
The people in a country are partitioned into clans. To estimate the average size of a clan, a survey is conducted in which 1000 randomly selected people are asked to state the size of the clan to which they belong. How should one compute an estimate of the average clan size from the data collected?
Analysis
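Note: sampling people at random yields a sample that is weighted by clan size, because the chance that a sampled person belongs to a given clan is proportional to that clan's size (large clans are more likely to be drawn). Averaging the reported sizes directly therefore overestimates the average clan size.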
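To make the size bias concrete, here is a small sketch that compares the true average clan size with the expected value of a single reported size. The clan sizes are hypothetical, chosen only for illustration, and the calculation assumes each person is equally likely to be sampled.

```python
from fractions import Fraction

# Hypothetical clan sizes, chosen only for illustration (not from the puzzle).
clan_sizes = [1, 1, 1, 1, 6, 10]

population = sum(clan_sizes)  # 20 people in total

# True average clan size: total people divided by number of clans.
true_mean = Fraction(population, len(clan_sizes))  # 20/6 = 10/3

# A uniformly chosen person belongs to a clan of size s with probability
# s / population, so the expected reported size is sum(s * s) / population.
expected_reported = Fraction(sum(s * s for s in clan_sizes), population)

print(float(true_mean))          # about 3.33
print(float(expected_reported))  # 7.0
```

In this toy population the naive average of the answers is expected to land near 7, more than twice the true average of about 3.33.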
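One reasonable estimate uses inverse-probability weighting: if $c_n$ people in the sample report a clan size of $n$, then

$$\hat{\mu} = \frac{\sum_n c_n}{\sum_n c_n / n} = \frac{1000}{\sum_n c_n / n}$$

can be used as the estimate of the average clan size. Dividing each count by $n$ undoes the size weighting, so the correction is made at the level of clans rather than people.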
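The inverse-weighting idea above can be sketched in a few lines of Python. Writing c_n for the number of respondents who report size n, the estimate is sum(c_n) / sum(c_n / n). The function name `estimate_mean_clan_size` and the sample responses are hypothetical, introduced here only for illustration.

```python
from collections import Counter


def estimate_mean_clan_size(reported_sizes):
    """Inverse-size-weighted estimate: sum(c_n) / sum(c_n / n)."""
    counts = Counter(reported_sizes)  # c_n: respondents who reported size n
    if not counts:
        raise ValueError("need at least one response")
    if any(n <= 0 for n in counts):
        raise ValueError("clan sizes must be positive")
    total_people = sum(counts.values())
    # Each respondent from a clan of size n stands for 1/n of a clan.
    estimated_clans = sum(c / n for n, c in counts.items())
    return total_people / estimated_clans


# Hypothetical responses: 2 people report size 1, 4 report size 2, 4 report size 4.
responses = [1, 1, 2, 2, 2, 2, 4, 4, 4, 4]
estimate = estimate_mean_clan_size(responses)
naive = sum(responses) / len(responses)

print(estimate)  # 2.0
print(naive)     # 2.6
```

The estimate is the harmonic mean of the reported sizes, which is why it comes out below the naive arithmetic mean.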
Original Explanation
Solution
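This is more of a puzzle to ponder than a puzzle to learn from. In my opinion, the best estimator for the average is sum(#n) / sum(#n/n), where #n is the number of people who report clan size 'n', and both sums run over all the values of 'n' we receive. Since sum(#n) is just the sample size, this equals 1000 / sum(#n/n), which is the harmonic mean of the reported sizes.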
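A short derivation may help show why dividing by n is the right correction. The symbols below are assumptions introduced for this sketch and do not appear in the original: $m_n$ is the number of clans of size $n$, $K = \sum_n m_n$ is the number of clans, $N = \sum_n n\,m_n$ is the population, and $\mu = N/K$ is the average clan size. Assuming each person is equally likely to be sampled, a respondent reports size $n$ with probability $n\,m_n/N$, so

$$\mathbb{E}\left[\frac{1}{n}\right] = \sum_n \frac{1}{n}\cdot\frac{n\,m_n}{N} = \frac{K}{N} = \frac{1}{\mu}$$

The sample average of the reciprocals, sum(#n/n) / 1000, therefore estimates $1/\mu$, and taking its reciprocal gives the estimator above. Because the reciprocal is a nonlinear function, the result is consistent but not exactly unbiased in a finite sample.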