Fundamental Statistics Learning Note(20)

Bayes’ rule revisited

How is $p(H_0\text{ is true})$ related to $p(\text{reject } H_0|H_0\text{ is true})$?
Ans: According to Bayes’ rule, $$p(\text{reject } H_0|H_0\text{ is true}) = \frac{p(\text{reject } H_0\cap H_0\text{ is true})}{p(H_0\text{ is true})} = \frac{p(H_0\text{ is true}|\text{reject } H_0)p(\text{reject } H_0)}{p(H_0\text{ is true})}$$
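As a quick sanity check, here is a minimal numeric sketch of this identity; the three probabilities on the right-hand side are purely hypothetical values chosen only to illustrate the arithmetic.

```python
# A quick numeric check of the identity above.
# All three inputs are hypothetical values, chosen only to illustrate the arithmetic.
p_true = 0.5                # hypothetical p(H_0 is true)
p_reject = 0.06             # hypothetical p(reject H_0)
p_true_given_reject = 0.1   # hypothetical p(H_0 is true | reject H_0)

# Bayes' rule: p(reject H_0 | H_0 is true) = p(H_0 true | reject) * p(reject) / p(H_0 true)
p_reject_given_true = p_true_given_reject * p_reject / p_true
print(p_reject_given_true)  # 0.012
```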

Limitations of classical (frequentist) statistics
For illustration, suppose the hypothesis is $H_0: \theta \leq 1$. Then $p(H_0\text{ is true})=p(\theta\leq 1)$. But this probability does not make sense in the classical framework, because $\theta$ is not a random variable; it is an unknown constant. This is why classical statistics does not work with $p(H_0\text{ is true})$ directly and uses the p-value instead.

Bayesian statistics

The Bayesian approach to statistical inference is to treat all unknowns as random variables. This includes both the usual random variables $X_1,X_2,\dots,X_n$ representing the data and the parameter(s) $\theta$.
Probabilities like $p(\theta\leq 1)$ now make sense, and $p(H_0\text{ is true})$ can be calculated in Bayesian statistical inference, which makes hypothesis testing more interpretable.
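As a minimal sketch of this point, suppose (purely hypothetically) that $\theta$ is given a $N(0,1)$ prior; then $p(H_0\text{ is true}) = p(\theta\leq 1)$ is just a CDF evaluation.

```python
# Minimal sketch: once theta is a random variable, p(theta <= 1) is a CDF evaluation.
# The prior theta ~ N(0, 1) is a hypothetical choice, not part of the note itself.
from scipy.stats import norm

prior = norm(loc=0, scale=1)  # hypothetical prior for theta
p_H0_true = prior.cdf(1)      # p(H_0 is true) = p(theta <= 1)
print(p_H0_true)              # ~0.841
```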

Bayesian inference

All unknowns, $X_1,X_2,\dots,X_n$ (data) and $\theta$ (parameter), are random variables, so they have a joint distribution.
$p(\theta, X_1,X_2,\dots,X_n)$ is the joint distribution of the data and the parameter.
$$p(\theta, X_1,X_2,\dots,X_n) = \underbrace{f(X_1,X_2,\dots,X_n|\theta)}_{\text{likelihood function}}\underbrace{\pi(\theta)}_{\underbrace{\text{prior}}_{\text{before data}}}=\underbrace{\pi(\theta|X_1,X_2,\dots,X_n)}_{\underbrace{\text{posterior}}_{\text{after data}}}\underbrace{m(X_1,X_2,\dots,X_n)}_{\underbrace{\text{marginal distribution of data}}_{\text{does not depend on }\theta\text{, ignore for estimating }\theta}}$$
Here $\pi$ is just the conventional notation for the prior and posterior of $\theta$; $f$ and $m$ denote generic probability functions for the data.
$$\Rightarrow \pi(\theta|X_1,X_2,\dots,X_n) = \frac{f(X_1,X_2,\dots,X_n|\theta)\pi(\theta)}{m(X_1,X_2,\dots,X_n)} \propto f(X_1,X_2,\dots,X_n|\theta)\pi(\theta)$$
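To make the proportionality concrete, here is a minimal sketch under a hypothetical Beta-Binomial model (not a model discussed above, just a conjugate example where the posterior has a closed form and the marginal $m(X_1,\dots,X_n)$ never has to be computed).

```python
# Minimal sketch of posterior ∝ likelihood × prior with a hypothetical
# Beta-Binomial model: X_1,...,X_n iid Bernoulli(theta), prior theta ~ Beta(a, b).
from scipy.stats import beta

n, x = 10, 3   # hypothetical data: 3 successes in 10 trials
a, b = 1, 1    # Beta(1, 1) prior, i.e. uniform on (0, 1)

# Conjugacy: the posterior is Beta(a + x, b + n - x); the marginal m(x_1,...,x_n)
# is only a normalizing constant and never has to be computed explicitly.
posterior = beta(a + x, b + n - x)

# A Bayesian answer to "is H_0: theta <= 0.5 true?" is a posterior probability.
print(posterior.mean())    # posterior mean of theta, ~0.333
print(posterior.cdf(0.5))  # p(theta <= 0.5 | data), ~0.887
```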

If you like my article, please feel free to donate!