
Ch4. Sampling Distributions and the Central Limit Theorem

OIYO Editorial Contributor

Population and Sample

Population: The entire set of subjects of interest in a study
Sample: A subset selected from the population

Why use samples?

  • A census of the entire population is often impractical due to cost and time
  • Samples allow us to infer population characteristics

Parameter: A characteristic of the population (μ, σ), for example the true mean income of all US adults
Statistic: A value calculated from the sample (x̄, s), for example the mean income in a BLS survey sample
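
The distinction can be made concrete with a small simulation. The sketch below is illustrative only: the "population" is 100,000 made-up incomes drawn from a lognormal distribution (an assumption, not real data), and the names population, mu, sigma, x_bar, and s are chosen here for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 made-up incomes (not real data)
population = [random.lognormvariate(11, 0.5) for _ in range(100_000)]

# Parameters: computed from the entire population (usually unknown in practice)
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Statistics: computed from one random sample of n = 500
sample = random.sample(population, 500)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)

print(f"parameter mu    = {mu:.0f}   statistic x_bar = {x_bar:.0f}")
print(f"parameter sigma = {sigma:.0f}   statistic s     = {s:.0f}")
```

Drawing a different sample would change x_bar and s, while mu and sigma stay fixed.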


Sampling Methods

Method                    Description
Simple Random Sampling    Every individual has an equal probability of selection
Stratified Sampling       Divide the population into strata (subgroups), then sample from each stratum
Cluster Sampling          Randomly select clusters (groups), then take a census within each selected cluster
Systematic Sampling       Randomly select the first unit, then select every kth unit
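
The four methods can be sketched in a few lines each. This is a minimal illustration using only the standard library. The sampling frame of 1,000 unit IDs and the region labels are hypothetical, and the same labels stand in for both strata and clusters purely to keep the example short.

```python
import random

random.seed(1)

# Hypothetical frame: 1,000 unit IDs, each tagged with one of 4 regions
units = list(range(1000))
region = {u: u % 4 for u in units}  # made-up labels 0..3

def simple_random(units, n):
    # every unit has the same chance of being chosen
    return random.sample(units, n)

def stratified(units, stratum_of, n_per_stratum):
    # group units by stratum, then draw a simple random sample from each group
    groups = {}
    for u in units:
        groups.setdefault(stratum_of[u], []).append(u)
    return [u for g in groups.values() for u in random.sample(g, n_per_stratum)]

def cluster(units, cluster_of, n_clusters):
    # pick whole clusters at random, then keep every unit inside them
    labels = sorted(set(cluster_of.values()))
    chosen = set(random.sample(labels, n_clusters))
    return [u for u in units if cluster_of[u] in chosen]

def systematic(units, k):
    # random start among the first k units, then every kth unit after it
    start = random.randrange(k)
    return units[start::k]

print(len(simple_random(units, 50)), len(stratified(units, region, 10)),
      len(cluster(units, region, 2)), len(systematic(units, 20)))
```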

Distribution of the Sample Mean

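When we repeatedly draw samples of size n from the same population, the sample mean x̄ has its own distribution, called the sampling distribution of the mean.

For independent random draws from a population with mean μ and variance σ²:

Sample mean x̄ ~ N(μ, σ²/n)  (exact if the population is normal; approximate for large n otherwise)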

Expected value: E(x̄) = μ
Variance:       Var(x̄) = σ²/n
Standard error: SE = σ/√n

Key insight: As sample size n increases, the variability of the sample mean (standard error) decreases.
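
A quick simulation can check the σ/√n formula. The sketch below assumes a normal population with μ = 50 and σ = 10 (values chosen only for illustration) and compares the observed spread of many sample means with the theoretical standard error. The function name simulate_se is hypothetical.

```python
import random
import statistics

random.seed(2)

mu, sigma = 50.0, 10.0  # assumed population parameters (illustrative)

def simulate_se(n, reps=5000):
    # draw `reps` samples of size n and return the SD of their sample means
    means = [
        statistics.fmean([random.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(means)

for n in (4, 16, 64):
    # simulated spread of x_bar next to the theoretical sigma / sqrt(n)
    print(n, round(simulate_se(n), 2), "theory:", sigma / n ** 0.5)
```

The simulated values should land close to 5, 2.5, and 1.25, shrinking as n grows.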


Central Limit Theorem (CLT)

One of the most important results in statistics.

For any population with finite variance, when n is sufficiently large (n ≥ 30 is a common rule of thumb; strongly skewed or heavy-tailed populations may need more), the distribution of the sample mean is approximately normal.

x̄ ~ N(μ, σ²/n)  (approximately, when n is sufficiently large)

Standardization: Z = (x̄ − μ) / (σ/√n)

Significance: Even if the population is not normally distributed (e.g., uniform, binomial, skewed), the mean of a large enough sample is approximately normally distributed.
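
One way to see the theorem at work is to start from a clearly skewed population and watch the skewness of the sample means shrink as n grows. The sketch below assumes an exponential population with rate 1, which has skewness 2. The helper names skewness and skew_of_means are hypothetical.

```python
import random
import statistics

random.seed(3)

def skewness(xs):
    # standardized third moment: 0 for a symmetric distribution
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / sd) ** 3 for x in xs])

def skew_of_means(n, reps=5000):
    # skewness of `reps` sample means, each from n exponential(rate=1) draws
    means = [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return skewness(means)

for n in (1, 30, 200):
    # for this population the theoretical skewness of x_bar is 2 / sqrt(n)
    print(n, round(skew_of_means(n), 2), "theory:", round(2 / n ** 0.5, 2))
```

At n = 1 the "sample mean" is a single draw and keeps the population's skew. By n = 200 the distribution of means is close to symmetric, as a normal distribution would be.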


Standard Error (SE)

SE = σ/√n  (when population SD is known)
SE = s/√n  (when estimating with sample SD)

Standard Deviation vs Standard Error:

  • Standard deviation: variability of individual data points
  • Standard error: variability of the sample mean (precision of the estimate)

If sample size is multiplied by 4, the standard error is cut in half.
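
Both formulas fit in a pair of small helper functions. This is a minimal sketch with hypothetical function names. The value σ = 12 is an arbitrary choice to show the "n × 4 halves the SE" rule.

```python
import math
import statistics

def standard_error(sd, n):
    # SE when the population SD is known: sigma / sqrt(n)
    return sd / math.sqrt(n)

def standard_error_from_sample(sample):
    # SE estimated from the data: s / sqrt(n), with s the sample SD
    return statistics.stdev(sample) / math.sqrt(len(sample))

print(standard_error(12, 36))    # 12 / 6  = 2.0
print(standard_error(12, 144))   # 12 / 12 = 1.0 (n x 4, SE halved)
print(standard_error_from_sample([2, 4, 4, 4, 5, 5, 7, 9]))
```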


Distribution of Sample Proportions

Sample proportion p̂ = number of successes in sample / n

Expected value: E(p̂) = p
Variance:       Var(p̂) = p(1−p)/n
Standard error: √[p(1−p)/n]

By the CLT, p̂ is approximately normal when n is large. A common rule of thumb is that np and n(1−p) should both be at least 10 (some texts use 5).
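
The formulas for p̂ can be checked by simulation. The sketch below assumes p = 0.3 and n = 400 (illustrative values), simulates 3,000 samples, and compares the mean and spread of the simulated proportions with p and √[p(1−p)/n]. The function name se_proportion is hypothetical.

```python
import math
import random
import statistics

random.seed(4)

def se_proportion(p, n):
    # standard error of the sample proportion: sqrt(p(1 - p) / n)
    return math.sqrt(p * (1 - p) / n)

p, n = 0.3, 400  # assumed true proportion and sample size

# each p_hat is the share of successes in one simulated sample of size n
p_hats = [sum(random.random() < p for _ in range(n)) / n for _ in range(3000)]

mean_p_hat = statistics.fmean(p_hats)
sd_p_hat = statistics.stdev(p_hats)

print("mean of p_hat:", round(mean_p_hat, 4), "expected:", p)
print("SD of p_hat:  ", round(sd_p_hat, 4), "formula:", round(se_proportion(p, n), 4))
```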


Key Concept Cards

Central Limit Theorem (CLT) ★★★★★ : Regardless of the population distribution (given finite variance), when n is large (rule of thumb: n ≥ 30), the sample mean is approximately N(μ, σ²/n). The foundation of statistical inference. Memory tip: large n → sample mean → normal distribution

Standard Error (SE) ★★★★★ : The standard deviation of the sample mean. SE = σ/√n. Larger n → smaller SE → more precise estimates. Memory tip: SE = σ/√n; n × 4 → SE ÷ 2

Parameter vs Statistic ★★★★☆ : Parameters (μ, σ) are fixed population values. Statistics (x̄, s) are estimated values that vary from sample to sample. Memory tip: parameter = population; statistic = sample


Practice Questions

Q. A population has standard deviation 15. What is the standard error for a sample of n = 225?

SE = 15/√225 = 15/15 = 1. The variability in the sample mean is reduced to a standard deviation of 1.
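
The arithmetic can be verified in code. The second half of the sketch goes one step beyond the question and is an assumed follow-up, not part of the original exercise: it uses the standardization Z = (x̄ − μ) / SE to estimate how often the sample mean falls within 2 units of μ, assuming the normal approximation holds.

```python
import math
from statistics import NormalDist

sigma, n = 15, 225
se = sigma / math.sqrt(n)  # 15 / 15 = 1.0

# Assumed follow-up (not in the question): P(|x̄ − μ| < 2) under normality
z = 2 / se
prob = NormalDist().cdf(z) - NormalDist().cdf(-z)

print(se, round(prob, 4))  # 1.0 0.9545
```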

Q. A population has a right-skewed distribution. What is the distribution of the sample mean for n = 50?

By the Central Limit Theorem, with n = 50 the distribution of the sample mean is approximately normal even though the population is right-skewed (a very strongly skewed population could need a larger n). Expected value = population mean; standard deviation (standard error) = population SD / √50.

OIYO Editorial

Content Editor

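A knowledge incubator and professional content creator. Researches and shares in-depth, practice- and certification-oriented information on business, economics, law, and everyday life.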