Fundamental Statistics Learning Note (23)

Linear Regression

This chapter talks about using MLE to estimate the parameters in linear regression. Although least squares is the more common approach nowadays, I think MLE can bridge the gap between probability and regression theory.

Suppose we want to construct a regression model for the relationship between the weight of a moose and the size of its antlers.
Model: $y_i\sim N(\beta_0+\beta_1x_i,\sigma^2)$, where $y_i$ is the weight of the moose and $x_i$ is the size of its antlers. Conditional on the $x_i$, the $y_i$ are independent (each $y_i$ depends only on its own $x_i$).
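As a quick sketch of what this model generates, we can simulate data from it. The parameter values and ranges below are made up for illustration; they are not from the note.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" parameters, chosen only for this simulation.
beta0, beta1, sigma = 2.0, 0.5, 1.0

n = 200
x = rng.uniform(100, 400, size=n)          # antler size (arbitrary units)
y = rng.normal(beta0 + beta1 * x, sigma)   # y_i ~ N(beta0 + beta1 * x_i, sigma^2)
```

Each `y[i]` is drawn independently around its own mean `beta0 + beta1 * x[i]`, which is exactly the conditional-independence assumption above.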
First, write down the likelihood function:
$$\begin{aligned}
L(\beta_0,\beta_1,\sigma^2\mid y_1,\dots,y_n) & = \prod\limits_{i=1}^n \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(y_i-\beta_0-\beta_1x_i)^2}{2\sigma^2}} \\
& = \frac{1}{\sigma^n (2\pi)^{\frac{n}{2}}}e^{-\frac{\sum(y_i-\beta_0-\beta_1x_i)^2}{2\sigma^2}}
\end{aligned}$$
Then take the log-likelihood:
$$l(\beta_0,\beta_1,\sigma^2) = -\frac{n}{2}\log(\sigma^2)-\frac{n}{2}\log(2\pi)-\frac{\sum(y_i-\beta_0-\beta_1x_i)^2}{2\sigma^2}$$
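We can sanity-check this simplification numerically: the simplified log-likelihood should equal the log of the product of the individual normal densities. The parameter values and data below are arbitrary, used only to test the algebra.

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary parameters and data, for checking the algebra only.
beta0, beta1, sigma2 = 1.0, 2.0, 0.25
x = rng.uniform(0, 10, size=50)
y = rng.normal(beta0 + beta1 * x, np.sqrt(sigma2))

def loglik(beta0, beta1, sigma2, x, y):
    """Simplified log-likelihood: -n/2 log(sigma^2) - n/2 log(2 pi) - RSS/(2 sigma^2)."""
    n = len(y)
    rss = np.sum((y - beta0 - beta1 * x) ** 2)
    return -n / 2 * np.log(sigma2) - n / 2 * np.log(2 * np.pi) - rss / (2 * sigma2)

# Direct form: sum of the logs of the individual normal densities (the product form).
density = np.exp(-(y - beta0 - beta1 * x) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
direct = np.sum(np.log(density))

assert np.isclose(loglik(beta0, beta1, sigma2, x, y), direct)
```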
Take the partial derivatives and set them to 0:
$$\begin{aligned}
\frac{\partial l}{\partial \beta_0} & = -\frac{1}{2\sigma^2}\sum 2(y_i-\beta_0-\beta_1x_i)(-1)\overset{\text{set}}=0 \\
& \Rightarrow \sum(y_i-\beta_0-\beta_1x_i)=0 \\
& \Rightarrow \sum y_i-n\beta_0-\beta_1\sum x_i = 0 \\
& \Rightarrow n\beta_0 = \sum y_i - \beta_1\sum x_i \\
& \Rightarrow \hat\beta_0 = \bar y - \beta_1 \bar x
\end{aligned}$$
which is the intercept.
$$\begin{aligned}
\frac{\partial l}{\partial \beta_1} & = -\frac{1}{2\sigma^2}\sum 2(y_i-\bar y+\beta_1\bar x-\beta_1x_i)(-x_i) \\
& = -\frac{1}{2\sigma^2}\sum2\lbrace (y_i-\bar y)-\beta_1(x_i-\bar x)\rbrace(x_i-\bar x)(-1) \overset{\text{set}}=0 \\
& \Rightarrow \sum(y_i-\bar y)(x_i-\bar x)-\beta_1\sum(x_i-\bar x)^2=0 \\
& \Rightarrow \hat \beta_1 = \frac{\sum(y_i-\bar y)(x_i-\bar x)}{\sum(x_i-\bar x)^2}
\end{aligned}$$
Here the first line substitutes $\beta_0=\bar y-\beta_1\bar x$, and in the second line $x_i$ may be replaced by $x_i-\bar x$ because $\sum\lbrace(y_i-\bar y)-\beta_1(x_i-\bar x)\rbrace=0$ from the previous equation.
which is the slope.
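To confirm that these MLE formulas agree with the usual least-squares fit, here is a short check on simulated data. The "true" parameter values are made up; any data set would do.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data with hypothetical true values beta0=2.0, beta1=0.5, sigma=5.0.
x = rng.uniform(100, 400, size=300)
y = rng.normal(2.0 + 0.5 * x, 5.0)

# Closed-form MLE estimates derived above.
xbar, ybar = x.mean(), y.mean()
beta1_hat = np.sum((y - ybar) * (x - xbar)) / np.sum((x - xbar) ** 2)  # slope
beta0_hat = ybar - beta1_hat * xbar                                    # intercept

# np.polyfit minimises least squares, which coincides with the MLE here.
ls_beta1, ls_beta0 = np.polyfit(x, y, deg=1)
assert np.isclose(beta1_hat, ls_beta1) and np.isclose(beta0_hat, ls_beta0)
```

The agreement is no accident: maximising the log-likelihood in $\beta_0,\beta_1$ is exactly minimising $\sum(y_i-\beta_0-\beta_1x_i)^2$.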

If you like my article, please feel free to donate!