11 - Interval Estimation
General Info
- a sample cannot give the exact population mean μ → use the sample mean to build an interval that is likely to contain the population mean → confidence interval
| notation | sample | population |
|---|---|---|
| mean | $\bar{x}$ | $\mu$ |
| std | $s$ | $\sigma$ |
Normal distribution of Sample mean
- $\bar{X}_n$ is approximately normally distributed with mean $\mu$ and std $\sigma/\sqrt{n}$ (Practical Interpretation of Limit Theorems)
- standard normal random var / test statistic: $Z = \dfrac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}$, $Z \sim N(0, 1^2)$. (tag normal random var using z-test)
- eg. 95% confidence interval → find $a$ in $P(-a \le Z \le a) = 0.95$ → $a \approx 1.96$
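The claim that the standardized sample mean falls in $[-1.96, 1.96]$ about 95% of the time can be checked by simulation. This is only a sketch: the population values (`mu=50`, `sigma=8`), the sample size, and the function name `coverage_of_z_interval` are made up for illustration, and the population is assumed to be normal.

```python
import math
import random

def coverage_of_z_interval(mu, sigma, n, reps, a=1.96, seed=0):
    """Fraction of simulated samples whose standardized mean Z lies in [-a, a]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n from a normal population (assumption).
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Standardize: Z = (sample mean - mu) / (sigma / sqrt(n)).
        z = (sample_mean - mu) / (sigma / math.sqrt(n))
        if -a <= z <= a:
            hits += 1
    return hits / reps

coverage = coverage_of_z_interval(mu=50.0, sigma=8.0, n=16, reps=10_000)
print(round(coverage, 3))  # should land close to 0.95
```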
Background Info for Confidence Interval
- to find the confidence interval containing μ with prob 1−α, find a such that:
- toExplore: why is the probability written as 1−α instead of using the confidence level directly? (refer to the normal distribution graph: α is the total area left over in the two tails, α/2 on each side)
- $P(-a \le Z \le a) = 1 - \alpha \rightarrow P(Z \le -a) = \frac{\alpha}{2}$
- $a$ is then labelled $z_{\alpha/2}$, determined for any $\alpha$ using tech
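One way to "determine $z_{\alpha/2}$ using tech" is the inverse normal CDF in Python's standard library. A minimal sketch (the helper name `z_critical` is hypothetical): since $P(Z \le -a) = \alpha/2$, the value $a$ is the point with cumulative probability $1 - \alpha/2$.

```python
from statistics import NormalDist

def z_critical(alpha):
    """Return z_{alpha/2}: the value a with P(-a <= Z <= a) = 1 - alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be in (0, 1)")
    # P(Z <= a) = 1 - alpha/2, so invert the standard normal CDF there.
    return NormalDist().inv_cdf(1 - alpha / 2)

a = z_critical(0.05)
lower_tail = NormalDist().cdf(-a)  # area to the left of -a, equals alpha/2
print(round(a, 3), round(lower_tail, 3))  # 1.96 0.025
```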
Confidence Interval for mean of known std
Calculating CI for mean of known std
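- For an interval containing $\mu$ with prob $1-\alpha$, given sample mean $\bar{x}$:
- sample point estimate − margin of error ≤ population parameter ≤ sample point estimate + margin of error
- CI: $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$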
- sample mean, if unavailable: =AVERAGE(input,range)
- Confidence level of the interval: (1−α)⋅100%
- (1−α): confidence coefficient
- margin of error: $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
- Excel: =CONFIDENCE.NORM(α,σ,n)
- width of confidence interval: $2 \cdot z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
- ↑ sample size = ↓ width of interval (CI width is smaller as we have a larger sample size)
- ↑ confidence level = ↑ width of interval
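The known-σ interval can be sketched in Python as follows. The data and the value `sigma=0.3` are made up, and the helper names `margin_of_error` and `z_interval` are hypothetical; the margin computed here is the same quantity $z_{\alpha/2}\,\sigma/\sqrt{n}$ described above.

```python
import math
from statistics import NormalDist, mean

def margin_of_error(sigma, n, alpha=0.05):
    """z_{alpha/2} * sigma / sqrt(n) for a known population std sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / math.sqrt(n)

def z_interval(sample, sigma, alpha=0.05):
    """(lower, upper) confidence bounds for the population mean."""
    x_bar = mean(sample)
    e = margin_of_error(sigma, len(sample), alpha)
    return x_bar - e, x_bar + e

sample = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0, 10.2]  # made-up data
low, high = z_interval(sample, sigma=0.3, alpha=0.05)
print(round(low, 3), round(high, 3))

# Larger sample -> narrower interval; higher confidence -> wider interval.
print(margin_of_error(0.3, 9) > margin_of_error(0.3, 36))
print(margin_of_error(0.3, 9, alpha=0.01) > margin_of_error(0.3, 9, alpha=0.05))
```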
Determining sample size
- using confidence intervals to estimate population means, ↓ width of interval = ↑ sample size
- $n = \left(\frac{2 \cdot z_{\alpha/2}\,\sigma}{w}\right)^2$
- σ: population std, n: sample size, w: desired width of the interval
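A small sketch of the sample-size formula, with made-up inputs. Rounding up to the next whole number is an assumption added here (the formula rarely gives an integer, and rounding down would make the interval slightly too wide); `sample_size_for_mean` is a hypothetical name.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, width, alpha=0.05):
    """Smallest n with CI width 2 * z_{alpha/2} * sigma / sqrt(n) <= width."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Assumption: round up so the achieved width does not exceed the target.
    return math.ceil((2 * z * sigma / width) ** 2)

n = sample_size_for_mean(sigma=15.0, width=4.0, alpha=0.05)
print(n)  # 217
```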
Confidence Interval for mean of unknown std
Calculating CI for mean of unknown std
- when σ unknown, use same sample to estimate both μ and σ
- when using sample standard deviation s to estimate σ, interval estimate for population mean is based on the t-distribution
- Interval estimate of a population mean, σ unknown:
- $\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$
- s: sample std
- Confidence level of the interval: (1−α)⋅100%
- (1−α): confidence coefficient
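- $t_{\alpha/2}$: find the t-value from the t-distribution with $n-1$ degrees of freedom (Excel: =T.INV(α/2, df) returns the negative value $-t_{\alpha/2}$, so take the positive result; =T.INV.2T(α, df) gives the positive value directly)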
- n: sample size, n≥30 to use this expression
- margin of error: $t_{\alpha/2}\frac{s}{\sqrt{n}}$ (Excel: =CONFIDENCE.T(α, s, n))
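The t-based interval can be sketched in pure Python. The standard library has no t-distribution, so this example assumes a simple numerical approach: integrate the t density with Simpson's rule and find $t_{\alpha/2}$ by bisection. In practice a statistics library (for example `scipy.stats.t.ppf`) would normally supply the critical value. The data and all helper names are made up for illustration.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|] plus symmetry about 0."""
    b = abs(x)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(alpha, df):
    """Positive t_{alpha/2} (upper-tail area alpha/2), found by bisection."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # grow the bracket until it contains the root
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(sample, alpha=0.05):
    """(lower, upper) bounds for the mean when sigma is unknown."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample std, divides by n - 1
    margin = t_critical(alpha, n - 1) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.0, 12.1]  # made-up data
low, high = t_interval(sample, alpha=0.05)
print(round(t_critical(0.05, 9), 3))  # 2.262
print(round(low, 3), round(high, 3))
```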
Calculating T-distribution in EXCEL
- calculating t-distribution: T.DIST(x, df, cumulative): find the probability value for a given t-value
- x: test statistic, or t-value
- df: degrees of freedom, n − 1
- cumulative: logical value that determines the form of the function; if TRUE, return the cdf (left-tail probability), else return the pdf
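To make the role of the `cumulative` flag concrete, here is a rough Python analogue of the function described above. It is a sketch, not Excel's implementation: the name `t_dist` is hypothetical, and the CDF is approximated by Simpson's rule on the density.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_dist(x, df, cumulative, steps=2000):
    """Left-tail probability P(T <= x) if cumulative is true, else the density at x."""
    if not cumulative:
        return t_pdf(x, df)
    b = abs(x)
    h = b / steps  # even number of steps, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

print(round(t_dist(2.262, 9, True), 3))   # 0.975
print(round(t_dist(-2.262, 9, True), 3))  # 0.025
print(round(t_dist(0.0, 9, False), 3))    # 0.388
```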
Confidence Interval for Proportion
Calculating CI for Proportion
- true population proportion $p$ is unknown → take a sample and use the sample proportion $\bar{p}$ to estimate $p$
- $\bar{p} = X/n$
- $\bar{p}$: sample proportion
- X: number of successes in the sample
- n: sample size
- $X \sim B(n, p)$, $E[X] = np$, $\sigma_X = \sqrt{np(1-p)}$
→ $E[\bar{p}] = p$, $\sigma(\bar{p}) = \sqrt{\frac{p(1-p)}{n}}$
- By the Central Limit Theorem, $\bar{p}$ is approximately normally distributed if the sample size n is sufficiently large
- Confidence Interval for p:
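- $\bar{p} - z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \le p \le \bar{p} + z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$
- margin of error: $z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$
- width of confidence interval: $2 \cdot z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$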
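The interval for a proportion can be sketched the same way. The counts (120 successes out of 400) are invented for illustration, and `proportion_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, alpha=0.05):
    """(lower, upper) confidence bounds for the population proportion p."""
    p_bar = successes / n  # sample proportion X / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * math.sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin

low, high = proportion_interval(successes=120, n=400, alpha=0.05)
print(round(low, 4), round(high, 4))  # 0.2551 0.3449
```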
Determining Sample size
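- for a $(1-\alpha)$ confidence interval (critical value $z_{\alpha/2}$), width $w$, preliminary proportion $\bar{p} = p^*$, the sample size is
- $n = \left(\frac{2 \cdot z_{\alpha/2}}{w}\right)^2 p^*(1-p^*)$
- if no preliminary $\bar{p}$ is available, choose $p^* = 0.5$: it is the worst case, since it maximizes $p^*(1-p^*)$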
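A short sketch of the sample-size calculation for a proportion, including the worst-case default $p^* = 0.5$. The target width of 0.06 is a made-up example, the function name is hypothetical, and rounding up to a whole number is an assumption added here.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(width, p_star=0.5, alpha=0.05):
    """n = (2 * z_{alpha/2} / width)^2 * p*(1 - p*), rounded up (assumption)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((2 * z / width) ** 2 * p_star * (1 - p_star))

n_worst = sample_size_for_proportion(width=0.06)              # no prior estimate
n_prior = sample_size_for_proportion(width=0.06, p_star=0.3)  # preliminary estimate 0.3
print(n_worst, n_prior)  # 1068 897
```

The worst-case choice always asks for at least as many observations as any other value of $p^*$, which is why it is a safe default when no preliminary estimate exists.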
Test a claim using confidence interval (a, b)
Popular confidence level
| Confidence level | α | $z_{\alpha/2}$ |
|---|---|---|
| 90% | 0.1 | 1.645 |
| 95% | 0.05 | 1.960 |
| 98% | 0.02 | 2.326 |
| 99% | 0.01 | 2.576 |
- find $z_{\alpha/2}$ by using Inverse Normal (casio) with area = α/2 (take the positive value)
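As a cross-check on the table, the same critical values can be recomputed with Python's standard library instead of a calculator. This is only a sketch; the dictionary name `table` is arbitrary.

```python
from statistics import NormalDist

table = {}
for level in (0.90, 0.95, 0.98, 0.99):
    alpha = 1 - level
    # z_{alpha/2} is the point with cumulative probability 1 - alpha/2.
    table[level] = NormalDist().inv_cdf(1 - alpha / 2)

for level, z in table.items():
    print(f"{level:.0%}  alpha={1 - level:.2f}  z={z:.3f}")
```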