Fundamental Statistics Theory Notes (9)

Sufficient Statistics

Def: $x_1,x_2,\dots,x_n$ are r.v.'s, and $\theta$ is an unknown parameter. The function $T(x_1,x_2,\dots,x_n)$ is a sufficient statistic for $\theta$ if the conditional distribution of $x_1,x_2,\dots,x_n$ given $T(x_1,x_2,\dots,x_n)$ does not depend on $\theta$.

For estimating $\theta$, $T(x_1,x_2,\dots,x_n)$ captures all the information about $\theta$ contained in the sample: estimation of $\theta$ depends on $x_1,x_2,\dots,x_n$ only through $T(x_1,x_2,\dots,x_n)$.

How to check that a statistic is sufficient

Factorization theorem

Let $f(x_1,x_2,\dots,x_n|\theta)$ be the joint PDF/PMF.
$T(x_1,x_2,\dots,x_n)$ is sufficient for $\theta \Leftrightarrow f(x_1,x_2,\dots,x_n|\theta)$ can be factored as $g(T(x_1,x_2,\dots,x_n)|\theta)\,h(x_1,x_2,\dots,x_n)$, where $h$ does not depend on $\theta$.

Definition:
Let $I$ denote the indicator function:
$ I(A) =
\begin{cases}
1, & \text{if $A$ is true} \\
0, & \text{otherwise}
\end{cases}$

Ex1. $x_1,x_2,\dots,x_n \overset{\text{i.i.d}}\sim Expo(\beta)$
$L(\beta|x_1,x_2,\dots,x_n)= \frac{1}{\beta^n}e^{-\frac{\sum x_i}{\beta}} \cdot I(\beta > 0)$
$I(\beta > 0)$ captures the bounds on the parameter space.
$ =\underbrace {\frac{1}{\beta^n}e^{-\frac{\sum x_i}{\beta}}I(\beta > 0)}_{g(\sum x_i|\beta)} \cdot \underbrace{1}_{h(x_1,x_2,\dots,x_n)}$
Since $g$ depends on the data only through $\sum x_i$ (and $h \equiv 1$), $\sum x_i$ is sufficient for $\beta$ by the factorization theorem.
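
As a quick numerical illustration (a sketch I am adding, assuming Python with NumPy; it is not part of the original notes): because the likelihood depends on the data only through $\sum x_i$, two different samples with the same sum give exactly the same likelihood curve in $\beta$.

```python
# Sketch: the Expo(beta) likelihood depends on the sample only through sum(x_i),
# so two samples with the same n and the same sum produce identical likelihoods.
import numpy as np

def expo_likelihood(x, beta):
    # L(beta | x) = beta^(-n) * exp(-sum(x)/beta), for beta > 0
    x = np.asarray(x, dtype=float)
    return beta ** (-len(x)) * np.exp(-x.sum() / beta)

x_a = np.array([1.0, 2.0, 3.0])   # n = 3, sum = 6
x_b = np.array([0.5, 2.5, 3.0])   # different values, same n and sum

betas = np.linspace(0.5, 5.0, 10)
print(np.allclose(expo_likelihood(x_a, betas), expo_likelihood(x_b, betas)))  # True
```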

Ex2. Suppose $x_1,x_2,\dots,x_n$ are iid with PDF $f(x|\theta) = e^{-(x-\theta)},\ \theta \in \mathbb{R}, x > \theta$; find a sufficient statistic for $\theta$.
Joint density $f(x_1,x_2,\dots,x_n|\theta) = \prod_{i=1}^nf(x_i|\theta)=\prod_{i=1}^ne^{-(x_i - \theta)}$
$=e^{-\sum x_i}e^{n\theta}\cdot I(\theta<\min\lbrace x_i \rbrace)$, since the density is nonzero only when $\theta < \min\lbrace x_i \rbrace$
$=\underbrace{e^{n\theta}I(\theta<\min\lbrace x_i\rbrace)}_{g(\min\lbrace x_i\rbrace |\theta)}\cdot \underbrace{e^{-\sum x_i}}_{h(x_1,x_2,\dots,x_n)}$
So $\min\lbrace x_i\rbrace$ is sufficient for $\theta$.
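
A small check I am adding (assumes Python/NumPy; the example data are made up): the $\theta$-dependent factor $g$ involves only $\min\lbrace x_i\rbrace$, so for two samples with the same minimum (and the same $n$), the likelihood ratio is constant in $\theta$.

```python
# Sketch: L(theta|x) = exp(-sum(x)) * exp(n*theta) * I(theta < min(x)),
# so samples sharing min(x_i) have a likelihood ratio that does not depend on theta.
import numpy as np

def shifted_expo_likelihood(x, theta):
    x = np.asarray(x, dtype=float)
    if theta >= x.min():
        return 0.0                        # outside the support: need theta < min(x_i)
    return np.exp(-np.sum(x - theta))     # prod_i exp(-(x_i - theta))

x_a = np.array([2.0, 3.0, 5.0])           # min = 2
x_b = np.array([2.0, 4.0, 4.5])           # same min, different sum

thetas = np.linspace(0.0, 1.9, 5)
ratios = [shifted_expo_likelihood(x_a, t) / shifted_expo_likelihood(x_b, t) for t in thetas]
print(np.allclose(ratios, ratios[0]))     # True: the ratio is free of theta
```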

Ex3. $x_1,x_2,\dots,x_n \overset{\text{i.i.d}}\sim Gamma(\alpha, \beta)$, where $\alpha$, $\beta$ are unknown, find sufficient statistics for $\alpha$, $\beta$.

$\begin{equation}\begin{split}
f(x_1,x_2,\dots,x_n|\alpha, \beta) & = \prod_{i=1}^{n}\frac{1}{\beta^{\alpha}\Gamma (\alpha)}x_i^{\alpha-1}e^{-x_i/\beta} \\
& =\underbrace{\frac{1}{\beta^{n\alpha}(\Gamma(\alpha))^n}\left(\prod_{i=1}^{n}x_i\right)^{\alpha-1}e^{-\sum x_i /\beta}I(\alpha>0)I(\beta>0)}_{g(\prod x_i, \sum x_i\,|\,\alpha, \beta)}\cdot \underbrace{1}_{h(x_1,x_2,\dots,x_n)}
\end{split}\end{equation}$

So $(\sum x_i, \prod x_i)$ is jointly sufficient for $(\alpha, \beta)$.
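
As an illustration (my addition, assuming Python with NumPy; the two samples below are constructed by hand to share both statistics): the Gamma likelihood depends on the data only through $(\sum x_i, \prod x_i)$, so samples matching on both give identical likelihood values.

```python
# Sketch: the Gamma(alpha, beta) joint density depends on the data only via
# (sum(x_i), prod(x_i)); two samples sharing both give the same likelihood.
import numpy as np
from math import gamma

def gamma_likelihood(x, alpha, beta):
    x = np.asarray(x, dtype=float)
    return np.prod(x ** (alpha - 1) * np.exp(-x / beta)) / (beta ** alpha * gamma(alpha)) ** len(x)

x_a = np.array([1.0, 5.0, 6.0])                                # sum = 12, product = 30
x_b = np.array([2.0, 5.0 - np.sqrt(10), 5.0 + np.sqrt(10)])    # also sum = 12, product = 30

for alpha, beta in [(0.5, 1.0), (2.0, 3.0), (4.0, 0.5)]:
    print(np.isclose(gamma_likelihood(x_a, alpha, beta),
                     gamma_likelihood(x_b, alpha, beta)))       # True for each (alpha, beta)
```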

Note:
1. Sufficient statistics cannot contain unknown parameters.
2. $x_1,x_2,\dots,x_n$ is always a (trivial) sufficient statistic.

Minimal Sufficiency: use the fewest statistics possible (apply the factorization theorem to identify them).

We can also find MLEs for multiple unknown parameters by maximizing $L$ over the entire parameter space (however, it is harder to verify that the result is actually a maximum).

Ex4. $x_1,x_2,\dots,x_n \overset{\text{i.i.d}}\sim N(\mu, \sigma^2)$, where $\mu$ and $\sigma^2$ are unknown.
$\begin{equation}\begin{split}
L(\mu, \sigma^2|x_1,x_2,\dots,x_n) &=f(x_1,x_2,\dots,x_n|\mu, \sigma^2) \\
& = \prod_{i=1}^n \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x_i-\mu)^2}{2\sigma^2}} \\
&=\frac{1}{(\sigma^2 2\pi)^{n/2}}e^{-\frac{\sum (x_i-\mu)^2}{2\sigma^2}},\mu \in \mathbb{R}, \sigma^2 >0
\end{split}\end{equation}$
$l(\mu, \sigma^2)=-\frac{n}{2}\log(\sigma^2)-\frac{n}{2}\log(2\pi)-\frac{\sum(x_i-\mu)^2}{2\sigma^2}$
For more than one parameter, we set $\frac{\partial l}{\partial \theta}=0$ for each parameter.

$\begin{equation}\begin{split}
\frac{\partial l}{\partial \mu} &=-\frac{1}{2\sigma^2}\sum 2(x_i-\mu)(-1) \\
&=\frac{1}{\sigma^2}\sum(x_i-\mu)\overset{\text{set}}=0
\end{split}\end{equation}$
$\Rightarrow \sum(x_i - \mu)=\sum x_i-n\mu=0$
$\Rightarrow \mu=\frac{\sum x_i}{n} = \bar x$

$\begin{equation}\begin{split}
\frac{\partial l}{\partial \sigma^2} &=-\frac{n}{2}\frac{1}{\sigma^2}-\left(\frac{\sum(x_i-\mu)^2}{2}\right)\left(-\frac{1}{\sigma^4}\right)\\
&=-\frac{n}{2\sigma^2}+\frac{1}{2\sigma^4}\sum(x_i-\mu)^2\overset{\text{set}}=0
\end{split}\end{equation}$
$\Rightarrow -n\sigma^2 + \sum(x_i-\mu)^2=0$
$\Rightarrow \sigma^2 = \frac{\sum (x_i-\mu)^2}{n}$

To satisfy both equations simultaneously, the MLEs are:
$\hat \mu = \bar x$
$\hat \sigma^2 = \frac{1}{n}\sum(x_i-\bar x)^2=\frac{n-1}{n}S^2$
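
A quick sanity check I am adding (a sketch assuming Python with NumPy and SciPy; it is not part of the original derivation): maximize the normal log-likelihood numerically and compare with the closed-form MLEs above.

```python
# Sketch: numerically maximize l(mu, sigma^2) and compare with mu_hat = xbar
# and sigma2_hat = ((n-1)/n) * S^2.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=200)   # simulated data for the check
n = len(x)

def neg_log_lik(params):
    mu, log_sigma2 = params                    # optimize log(sigma^2) so sigma^2 > 0
    sigma2 = np.exp(log_sigma2)
    return 0.5 * n * np.log(2 * np.pi * sigma2) + np.sum((x - mu) ** 2) / (2 * sigma2)

res = minimize(neg_log_lik, x0=[0.0, 0.0])
mu_hat, sigma2_hat = res.x[0], np.exp(res.x[1])

print(np.isclose(mu_hat, x.mean(), atol=1e-4))           # True
print(np.isclose(sigma2_hat, x.var(ddof=0), atol=1e-4))  # True: var(ddof=0) = (n-1)/n * S^2
```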

Verification:
Recall:

$\begin{equation}\begin{split}
\sum(x_i-\mu)^2& = \sum(x_i - \bar x +\bar x-\mu)^2\\
&=\sum(x_i-\bar x)^2 + \sum (\bar x - \mu)^2 +\underbrace{2\sum(x_i-\bar x)(\bar x -\mu)}_0 \\
&=(n-1)S^2 + n(\bar x-\mu)^2
\end{split}\end{equation}$
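
The decomposition is easy to verify numerically; a small check (my addition, assuming Python/NumPy):

```python
# Sketch: check sum((x_i - mu)^2) = (n-1)*S^2 + n*(xbar - mu)^2 for arbitrary mu.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=50)
n, xbar, S2 = len(x), x.mean(), x.var(ddof=1)    # S^2 is the sample variance

for mu in [-2.0, 0.0, 3.5]:
    lhs = np.sum((x - mu) ** 2)
    rhs = (n - 1) * S2 + n * (xbar - mu) ** 2
    print(np.isclose(lhs, rhs))                  # True for each mu
```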

So the joint PDF is equal to:
$f(x_1,x_2,\dots,x_n|\mu, \sigma^2)=\underbrace{\frac{1}{(\sigma^2 2\pi)^{n/2}}e^{-\frac{1}{2\sigma^2}[(n-1)S^2+n(\bar x-\mu)^2]}I(\sigma^2>0)}_{g(\bar x, S^2|\mu, \sigma^2)}\cdot \underbrace{1}_{h(x_1,x_2,\dots,x_n)}$
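
By the factorization theorem, $(\bar x, S^2)$ is therefore sufficient for $(\mu, \sigma^2)$. As a final illustration (my addition, assuming Python/NumPy): two samples with the same sample mean and sample variance have identical normal likelihoods for every $(\mu, \sigma^2)$.

```python
# Sketch: the N(mu, sigma^2) likelihood depends on the data only through (xbar, S^2).
import numpy as np

def normal_log_lik(x, mu, sigma2):
    x = np.asarray(x, dtype=float)
    return -0.5 * len(x) * np.log(2 * np.pi * sigma2) - np.sum((x - mu) ** 2) / (2 * sigma2)

rng = np.random.default_rng(2)
x_a = rng.normal(size=10)
z = rng.normal(size=10)
# rescale z so it has exactly the same sample mean and sample variance as x_a
x_b = (z - z.mean()) / z.std(ddof=1) * x_a.std(ddof=1) + x_a.mean()

for mu, sigma2 in [(0.0, 1.0), (1.0, 2.0), (-0.5, 0.3)]:
    print(np.isclose(normal_log_lik(x_a, mu, sigma2),
                     normal_log_lik(x_b, mu, sigma2)))   # True for each (mu, sigma^2)
```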

If you like my article, please feel free to donate!