Linear Regression

This chapter uses maximum likelihood estimation (MLE) to derive the parameters of linear regression. Although least squares is the more common approach nowadays, the MLE derivation bridges the gap between probability theory and regression.

Suppose we want to build a regression model for the relationship between a moose's weight and the size of its antlers.
Model: $y_i\sim N(\beta_0+\beta_1x_i,\sigma^2)$, where $y_i$ is the weight of the moose and $x_i$ is the size of its antlers. Conditional on $x_i$, the $y_i$ are independent.
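This model is easy to simulate. The sketch below draws synthetic data from $y_i\sim N(\beta_0+\beta_1x_i,\sigma^2)$; the parameter values, units, and sample size are illustrative assumptions, not estimates from real moose data.

```python
import numpy as np

# Simulate from the model y_i ~ N(beta_0 + beta_1 * x_i, sigma^2).
# The "true" parameters below are hypothetical.
rng = np.random.default_rng(0)
beta_0, beta_1, sigma = 100.0, 2.5, 10.0    # intercept, slope, noise sd
n = 200
x = rng.uniform(50, 150, size=n)            # antler size (arbitrary units)
y = rng.normal(beta_0 + beta_1 * x, sigma)  # weight, normal around the line
```

Each $y_i$ is drawn independently around its own mean $\beta_0+\beta_1x_i$, which is exactly the independence assumption used in the likelihood below.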
First we write down the likelihood function:
$$\begin{split} L(\beta_0,\beta_1,\sigma^2|y_1,\dots,y_n) & = \prod\limits_{i=1}^n \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(y_i-(\beta_0+\beta_1x_i))^2}{2\sigma^2}} \\ & = \frac{1}{\sigma^n (2\pi)^{\frac{n}{2}}}e^{-\frac{\sum(y_i-\beta_0-\beta_1x_i)^2}{2\sigma^2}} \end{split}$$
Then take the log to get the log-likelihood function:
$$\begin{split} l(\beta_0,\beta_1,\sigma^2) & = -\frac{n}{2}\log(\sigma^2)-\frac{n}{2}\log(2\pi)-\frac{\sum(y_i-\beta_0-\beta_1x_i)^2}{2\sigma^2} \end{split}$$
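As a sanity check on the algebra, the simplified log-likelihood should equal the sum of per-observation normal log-densities. The sketch below (synthetic data, arbitrary parameter values) computes both:

```python
import numpy as np

# Vectorized log-likelihood from the simplified formula above.
def log_likelihood(beta_0, beta_1, sigma2, x, y):
    resid = y - (beta_0 + beta_1 * x)
    n = len(y)
    return (-n / 2 * np.log(sigma2)
            - n / 2 * np.log(2 * np.pi)
            - np.sum(resid**2) / (2 * sigma2))

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 1.0 + 2.0 * x + rng.normal(0, 1.5, size=50)

# Sum of per-observation N(beta_0 + beta_1 * x_i, sigma^2) log-densities,
# written out term by term; should match log_likelihood(1.0, 2.0, 1.5**2, x, y).
ll_direct = np.sum(
    -0.5 * np.log(2 * np.pi * 1.5**2)
    - (y - (1.0 + 2.0 * x))**2 / (2 * 1.5**2)
)
```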
Take partial derivatives and set them to 0:
$$\begin{split} \frac{\partial l}{\partial \beta_0} & = -\frac{1}{2\sigma^2}\sum 2(y_i-\beta_0-\beta_1x_i)(-1)\overset{\text{set}}=0 \\ & \Rightarrow \sum(y_i-\beta_0-\beta_1x_i)=0 \\ & \Rightarrow \sum y_i-n\beta_0-\beta_1\sum x_i = 0 \\ & \Rightarrow n\beta_0 = \sum y_i - \beta_1\sum x_i \\ & \Rightarrow \hat\beta_0 = \bar y - \hat\beta_1 \bar x \end{split}$$
which is the MLE of the intercept.
$$\begin{split} \frac{\partial l}{\partial \beta_1} & = -\frac{1}{2\sigma^2}\sum 2(y_i-\beta_0-\beta_1x_i)(-x_i) \\ & = -\frac{1}{2\sigma^2}\sum 2\lbrace (y_i-\bar y)-\beta_1(x_i-\bar x)\rbrace(-x_i) \quad (\text{substituting } \beta_0=\bar y-\beta_1\bar x) \\ & = -\frac{1}{2\sigma^2}\sum 2\lbrace (y_i-\bar y)-\beta_1(x_i-\bar x)\rbrace(-(x_i-\bar x)) \overset{\text{set}}=0 \quad (\text{since } \textstyle\sum\lbrace (y_i-\bar y)-\beta_1(x_i-\bar x)\rbrace = 0) \\ & \Rightarrow \sum(y_i-\bar y)(x_i-\bar x)-\beta_1\sum(x_i-\bar x)^2=0 \\ & \Rightarrow \hat \beta_1 = \frac{\sum(y_i-\bar y)(x_i-\bar x)}{\sum(x_i-\bar x)^2} \end{split}$$
which is the MLE of the slope.
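The closed forms derived above can be checked numerically: on synthetic data (hypothetical parameter values below), the MLE slope and intercept should coincide with NumPy's least-squares fit, which is exactly the "bridge" between probability and regression this chapter set out to show.

```python
import numpy as np

# Synthetic data from an assumed true line y = 100 + 2.5 * x plus noise.
rng = np.random.default_rng(2)
x = rng.uniform(50, 150, size=100)
y = 100.0 + 2.5 * x + rng.normal(0, 10.0, size=100)

# Closed-form MLE estimates derived above.
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
beta0_hat = y.mean() - beta1_hat * x.mean()

# Least-squares fit for comparison; MLE and least squares agree here.
slope, intercept = np.polyfit(x, y, 1)
```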

If you like my article, please feel free to donate!