Fundamental Statistics Learning Note (24)

R Square

To test $H_0: \beta_1 = 0\text{ vs } H_A: \beta_1 \neq 0$, the statistic $W = \frac{\hat\beta_1}{S_{reg}/\sqrt{\sum(x_i-\bar x)^2}}\sim t_{n-2}$ if $H_0$ is true.
Reject $H_0$ if $|W|$ exceeds the $(1-\alpha/2)$ quantile of $t_{n-2}$.
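As a sketch of this test (the data here are simulated, and the names `W`, `S_reg`, `Sxx` are just illustrative choices for the symbols above):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 30
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1, n)  # simulated data, true beta_1 = 0.5

# least-squares estimates of the slope and intercept
xbar, ybar = x.mean(), y.mean()
Sxx = np.sum((x - xbar) ** 2)
beta1_hat = np.sum((x - xbar) * (y - ybar)) / Sxx
beta0_hat = ybar - beta1_hat * xbar

# residual standard error S_reg, with n - 2 degrees of freedom
resid = y - (beta0_hat + beta1_hat * x)
S_reg = np.sqrt(np.sum(resid ** 2) / (n - 2))

# t-statistic W and two-sided test at level alpha
W = beta1_hat / (S_reg / np.sqrt(Sxx))
alpha = 0.05
crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
reject = abs(W) > crit
```

Since the true slope here is far from 0, the test should reject for most simulated samples.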

Recall that $t_p^2 \sim F_{1,p}$.
Define the F-statistic for the regression as $F=\frac{\hat \beta_1^2}{S_{reg}^2/\sum(x_i-\bar x)^2} = W^2$; then $F\sim F_{1,n-2}$ if $H_0$ is true.
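A quick numerical check of the equivalence (simulated data; variable names are illustrative): $F = W^2$ exactly, and the two-sided t-test p-value equals the upper-tail F-test p-value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 25
x = rng.normal(0, 2, n)
y = 1.0 + 0.8 * x + rng.normal(0, 1, n)  # simulated data

xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)
beta1_hat = np.sum((x - xbar) * (y - y.mean())) / Sxx
# fitted line passes through (xbar, ybar), so use the centered form
resid = y - (y.mean() + beta1_hat * (x - xbar))
S_reg2 = np.sum(resid ** 2) / (n - 2)  # S_reg squared

W = beta1_hat / np.sqrt(S_reg2 / Sxx)  # t-statistic, df = n - 2
F = beta1_hat ** 2 / (S_reg2 / Sxx)    # F-statistic, df = (1, n - 2)

# two-sided t p-value vs upper-tail F p-value
p_t = 2 * stats.t.sf(abs(W), df=n - 2)
p_F = stats.f.sf(F, dfn=1, dfd=n - 2)
```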

Def $\hat y_i = \hat \beta_0 + \hat \beta_1 x_i$ are the fitted values, and $r_i = y_i - \hat y_i$ are the residuals (the difference between the observed value and the fitted value).
$\Rightarrow \sum(y_i - \bar y)^2$ represents the total variability in the $y_i$'s (total sum of squares).

$$\begin{aligned}
\underbrace{\sum(y_i - \bar y)^2}_{\text{SS(Total)}} &= \sum(y_i - \hat y_i + \hat y_i - \bar y)^2 \\
&= \underbrace{\sum(y_i - \hat y_i)^2}_{\text{SS(residual)}} + \underbrace{\sum(\hat y_i - \bar y)^2}_{\text{SS(reg)}},
\end{aligned}$$
since the cross term $2\sum(y_i - \hat y_i)(\hat y_i - \bar y)$ equals zero for least-squares fits. This is called the ANOVA decomposition.
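The decomposition can be verified numerically on a simulated fit (names like `ss_total` are illustrative labels for the sums of squares above):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
x = rng.uniform(-3, 3, n)
y = 0.5 - 1.2 * x + rng.normal(0, 0.7, n)  # simulated data

# least-squares fit
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()
y_hat = beta0 + beta1 * x

# the three sums of squares in the ANOVA decomposition
ss_total = np.sum((y - y.mean()) ** 2)
ss_resid = np.sum((y - y_hat) ** 2)
ss_reg = np.sum((y_hat - y.mean()) ** 2)
```

For a least-squares fit, `ss_total` equals `ss_resid + ss_reg` up to floating-point error; for any other line the cross term would not vanish.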

$R^2 = \frac{\sum(\hat y_i - \bar y)^2}{\sum(y_i - \bar y)^2} = \frac{\text{SS(reg)}}{\text{SS(Total)}}$, the fraction of variability in the $y_i$'s explained by the regression line.
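A sketch of computing $R^2$ on simulated data: by the decomposition it can equivalently be written as $1 - \text{SS(residual)}/\text{SS(Total)}$, and in simple linear regression it equals the squared sample correlation between $x$ and $y$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
x = rng.normal(size=n)
y = 3.0 + 2.0 * x + rng.normal(size=n)  # simulated data

# least-squares fitted values (line passes through the means)
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
y_hat = y.mean() + beta1 * (x - x.mean())

# R^2 two equivalent ways: SS(reg)/SS(Total) and 1 - SS(residual)/SS(Total)
r2 = np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
r2_alt = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```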

If you like my article, please feel free to donate!