Ch4. Sampling Distributions and the Central Limit Theorem
Population and Sample
Population: The entire set of subjects of interest in a study
Sample: A subset selected from the population
Why use samples?
- A census of the entire population is often impractical due to cost and time
- Samples allow us to infer population characteristics
Parameter: A characteristic of the population (μ, σ), for example the true mean household income of all US households
Statistic: A value calculated from the sample (x̄, s), for example the mean household income in a BLS survey sample
Sampling Methods
| Method | Description |
|---|---|
| Simple Random Sampling | Every individual (and every possible sample of size n) has an equal probability of selection |
| Stratified Sampling | Divide into strata (subgroups), then sample from each stratum |
| Cluster Sampling | Randomly select clusters (groups), then take a census within each selected cluster |
| Systematic Sampling | Randomly select the first unit, then select every kth unit |
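The four methods in the table can be sketched in a few lines of Python. This is a minimal illustration, not a survey-grade implementation: the function names, the toy population of 100 numbered units, and the way strata and clusters are defined are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n units without replacement; every subset of size n is equally likely."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """strata maps a stratum label to its units; sample separately inside each stratum."""
    return {label: rng.sample(units, n_per_stratum) for label, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep every unit in the chosen clusters."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [unit for label in chosen for unit in clusters[label]]

def systematic_sample(population, k, rng):
    """Random start among the first k units, then every k-th unit after it."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(0)                      # fixed seed so the run is repeatable
population = list(range(100))               # hypothetical population of 100 units
strata = {"low": population[:50], "high": population[50:]}
clusters = {c: population[c * 10:(c + 1) * 10] for c in range(10)}

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)
clus = cluster_sample(clusters, 2, rng)
syst = systematic_sample(population, 10, rng)
```

Note the difference in what is randomized: stratified sampling draws units inside every stratum, while cluster sampling draws whole groups and then keeps all of their members.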
Distribution of the Sample Mean
When we repeatedly draw samples of size n from the same population, the sample mean x̄ has its own distribution.
For a population with mean μ and variance σ²:
Sample mean x̄ ~ N(μ, σ²/n) (exactly if the population is normal; approximately, for large n, otherwise)
Expected value: E(x̄) = μ
Variance: Var(x̄) = σ²/n
Standard error: SE = σ/√n
Key insight: As sample size n increases, the variability of the sample mean (standard error) decreases.
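The expected value and variance above follow in a few lines if the n observations X_1, ..., X_n are assumed to be independent draws with common mean μ and variance σ². The independence assumption and the X_i notation are not stated in the text and are introduced here only for this sketch.

```latex
\begin{align*}
E(\bar{x}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}\cdot n\mu = \mu, \\
\mathrm{Var}(\bar{x}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\mathrm{Var}(X_i)
            = \frac{1}{n^{2}}\cdot n\sigma^{2} = \frac{\sigma^{2}}{n}, \\
\mathrm{SE} &= \sqrt{\mathrm{Var}(\bar{x})} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

Independence is what allows the variance of the sum to be written as the sum of the variances in the second line.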
Central Limit Theorem (CLT)
Often described as the most important theorem in statistics, because most inference about means rests on it.
For a population with finite variance, whatever its shape, when n is sufficiently large (a common rule of thumb is n ≥ 30), the distribution of the sample mean is approximately normal.
x̄ ~ N(μ, σ²/n), approximately (when n is sufficiently large)
Standardization: Z = (x̄ − μ) / (σ/√n)
Significance: Even if the population is not normally distributed (e.g., uniform, binomial, skewed), the sample mean from a large enough sample is approximately normally distributed. The more skewed the population, the larger the n needed for a good approximation.
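A small simulation can make the theorem concrete. The sketch below assumes an exponential population with rate 1 (strongly right-skewed, with μ = 1 and σ = 1), draws many samples of size n = 50, and standardizes each sample mean with Z = (x̄ − μ) / (σ/√n). The choice of population, the seed, the number of repetitions, and the helper name `sample_means` are all illustrative assumptions.

```python
import math
import random
import statistics

def sample_means(draw, n, reps, rng):
    """Return `reps` sample means, each computed from n draws of draw(rng)."""
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]

rng = random.Random(42)        # fixed seed so the run is repeatable
mu, sigma = 1.0, 1.0           # exponential with rate 1: mean 1, SD 1
n, reps = 50, 5000

means = sample_means(lambda r: r.expovariate(1.0), n, reps, rng)

se = sigma / math.sqrt(n)                  # theoretical standard error
z = [(m - mu) / se for m in means]         # standardized sample means

mean_of_means = statistics.fmean(means)    # should be close to mu
sd_of_means = statistics.stdev(means)      # should be close to se
share_within = sum(abs(v) < 1.96 for v in z) / reps  # near 0.95 if Z is roughly N(0, 1)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f} (theory: {se:.3f})")
print(f"share with |Z| < 1.96: {share_within:.3f}")
```

If the normal approximation holds, roughly 95% of the standardized means should fall within ±1.96, even though individual exponential observations are far from normal.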
Standard Error (SE)
SE = σ/√n (when population SD is known)
SE = s/√n (when estimating with sample SD)
Standard Deviation vs Standard Error:
- Standard deviation: variability of individual data points
- Standard error: variability of the sample mean (precision of the estimate)
If sample size is multiplied by 4, the standard error is cut in half.
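Both formulas can be written as short helpers. This is a minimal sketch: the function names and the eight-value sample are made up for illustration, and σ = 15 is borrowed from the practice question.

```python
import math
import statistics

def standard_error(sigma, n):
    """SE of the sample mean when the population SD sigma is known."""
    return sigma / math.sqrt(n)

def estimated_standard_error(sample):
    """SE of the sample mean using the sample SD s (n - 1 in the denominator)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

se_100 = standard_error(15, 100)   # 15 / 10 = 1.5
se_400 = standard_error(15, 400)   # 15 / 20 = 0.75

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
se_hat = estimated_standard_error(sample)
```

Going from n = 100 to n = 400 multiplies the sample size by 4 and halves the standard error from 1.5 to 0.75, because n enters only through √n.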
Distribution of Sample Proportions
Sample proportion p̂ = number of successes in sample / n
Expected value: E(p̂) = p
Variance: Var(p̂) = p(1−p)/n
Standard error: √[p(1−p)/n]
By the CLT, p̂ is approximately normal when n is large. A common rule of thumb is that np and n(1−p) should each be at least about 10.
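The formulas for p̂ can be checked the same way. The sketch below assumes a true proportion p = 0.3 and samples of size n = 200, simulates many sample proportions, and compares their spread with √[p(1−p)/n]. The values of p, n, the seed, and the helper name `proportion_se` are illustrative assumptions.

```python
import math
import random
import statistics

def proportion_se(p, n):
    """Standard error of the sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

rng = random.Random(7)         # fixed seed so the run is repeatable
p, n, reps = 0.3, 200, 4000

# Each p_hat is the share of successes among n independent trials.
p_hats = [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

theoretical_se = proportion_se(p, n)
empirical_mean = statistics.fmean(p_hats)   # should be close to p
empirical_se = statistics.stdev(p_hats)     # should be close to theoretical_se

print(f"mean of p-hat: {empirical_mean:.4f} (theory: {p})")
print(f"SD of p-hat:   {empirical_se:.4f} (theory: {theoretical_se:.4f})")
```

In practice p is unknown, so the standard error is usually estimated by substituting p̂ for p in the same formula.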
Key Concept Cards
Central Limit Theorem (CLT) ★★★★★ : Regardless of population distribution (given finite variance), when n is large (rule of thumb: n ≥ 30), the sample mean is approximately N(μ, σ²/n). The foundation of statistical inference. Memory tip: large n → sample mean → normal distribution
Standard Error (SE) ★★★★★ : The standard deviation of the sample mean. SE = σ/√n. Larger n → smaller SE → more precise estimates. Memory tip: SE = σ/√n; n × 4 → SE ÷ 2
Parameter vs Statistic ★★★★☆ : Parameters (μ, σ) are fixed population values, usually unknown. Statistics (x̄, s) are computed from a sample, vary from sample to sample, and are used to estimate the parameters. Memory tip: parameter = population; statistic = sample
Practice Questions
Q. A population has standard deviation 15. What is the standard error for a sample of n = 225?
SE = 15/√225 = 15/15 = 1. Sample means therefore vary around μ with a standard deviation of 1, compared with 15 for individual observations.
Q. A population has a right-skewed distribution. What is the distribution of the sample mean for n = 50?
By the Central Limit Theorem, with n = 50 the distribution of the sample mean is approximately normal even though the population is right-skewed. Expected value = population mean μ; standard error = population SD / √50 = σ/√50.
OIYO Editorial