A Graduate Course in Econometrics
Introduction to matrix econometrics
One can write:
\[y_i = \beta_0 + \beta_1 x_{1i} + \ldots + \beta_p x_{pi} + \epsilon_i\]In matrix form this can be written as:
\[\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}\]where \(\mathbf{Y}\) is an \(n \times 1\) column vector, \(\mathbf{X}\) is an \(n \times (p+1)\) matrix whose first column is all ones (carrying the intercept), \(\boldsymbol \beta\) is a \((p+1) \times 1\) column vector of coefficients, and \(\boldsymbol \epsilon\) is an \(n \times 1\) vector of errors.
Minimizing the sum of squared residuals yields the famous expression for the least-squares estimator:
\[\hat{\boldsymbol \beta} = \left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T \mathbf{Y}\]
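As a quick illustration, here is a minimal NumPy sketch with simulated data (the data, variable names, and parameter values are illustrative, not part of the course):

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 100, 2                        # sample size and number of regressors
X = np.column_stack([np.ones(n),     # column of ones for the intercept
                     rng.normal(size=(n, p))])
beta = np.array([1.0, 2.0, -0.5])    # "true" coefficients, assumed for the demo
y = X @ beta + rng.normal(scale=0.3, size=n)

# beta_hat = (X'X)^{-1} X'y, solved as a linear system rather than
# inverting X'X explicitly (cheaper and numerically safer)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)                      # close to [1.0, 2.0, -0.5]
```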
Variance of a random vector times a matrix
For a constant matrix \(\mathbf{A}\) and a random vector \(\mathbf{x}\):
\[\textrm{Var}\lbrack\mathbf{A} \mathbf{x}\rbrack = \mathbf{A}\, \textrm{Var}\lbrack\mathbf{x}\rbrack\, \mathbf{A}^T\]
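For instance, taking \(\mathbf{A} = (1 \;\; 1)\) and a bivariate \(\mathbf{x}\) recovers the familiar scalar rule:
\[\textrm{Var}\lbrack x_1 + x_2 \rbrack = \textrm{Var}\lbrack x_1 \rbrack + \textrm{Var}\lbrack x_2 \rbrack + 2\,\textrm{Cov}\lbrack x_1, x_2 \rbrack\]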
Applying this property to the estimator:
\[\textrm{Var}\lbrack \hat{\boldsymbol{\beta}}\rbrack = \textrm{Var}\lbrack \left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T \mathbf{Y} \rbrack\]Here \(\left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T\) plays the role of the constant matrix \(\mathbf{A}\); moreover, since \(\mathbf{Y} = \mathbf{X}\boldsymbol \beta + \boldsymbol \epsilon\) and \(\mathbf{X}\boldsymbol \beta\) is non-random, \(\textrm{Var}\lbrack \mathbf{Y} \rbrack = \textrm{Var}\lbrack \boldsymbol \epsilon \rbrack\). Hence:
\[\textrm{Var}\lbrack \hat{\boldsymbol{\beta}} \rbrack = \left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T\, \textrm{Var}\lbrack \mathbf{Y} \rbrack\, \mathbf{X} \left( \mathbf{X}^T \mathbf{X} \right)^{-1} = \left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T \sigma^2 \mathbf{I}\, \mathbf{X} \left( \mathbf{X}^T \mathbf{X} \right)^{-1} = \sigma^2 \left( \mathbf{X}^T \mathbf{X} \right)^{-1}\]because under homoskedasticity (and no autocorrelation) the errors have variance matrix \(\sigma^2 \mathbf{I}\), so the middle factors cancel and the expression collapses to the familiar \(\sigma^2 \left( \mathbf{X}^T \mathbf{X} \right)^{-1}\).
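A quick Monte Carlo check of this result (again a sketch with simulated data of my own): hold \(\mathbf{X}\) fixed, redraw the noise many times, re-estimate \(\hat{\boldsymbol \beta}\), and compare the empirical covariance of the estimates with \(\sigma^2 \left( \mathbf{X}^T \mathbf{X} \right)^{-1}\).

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 200, 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])

# Analytic covariance: sigma^2 (X'X)^{-1}
analytic = sigma**2 * np.linalg.inv(X.T @ X)

# Empirical covariance of beta_hat across simulated datasets (X held fixed)
draws = np.empty((5000, 2))
for i in range(5000):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    draws[i] = np.linalg.solve(X.T @ X, X.T @ y)
empirical = np.cov(draws, rowvar=False)

print(analytic)
print(empirical)   # the two matrices should agree closely
```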
Geometric interpretation of OLS
Least squares can be viewed as two steps: first, identify the vector \(\hat{\mu} = \mathbf{X}\hat{\boldsymbol \beta}\) in the column space of \(\mathbf{X}\) that is closest to \(\mathbf{Y}\), i.e. the orthogonal projection of \(\mathbf{Y}\) onto that space; second, recover the coefficient vector \(\hat{\boldsymbol \beta}\) that satisfies \(\mathbf{X}\hat{\boldsymbol \beta} = \hat{\mu}\).
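A numerical sketch of this projection view (NumPy, with illustrative data of my own): \(\hat{\mu}\) is obtained by applying the hat matrix \(\mathbf{H} = \mathbf{X}\left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T\) to \(\mathbf{Y}\), the residual \(\mathbf{Y} - \hat{\mu}\) is orthogonal to the columns of \(\mathbf{X}\), and \(\hat{\boldsymbol \beta}\) then solves \(\mathbf{X}\hat{\boldsymbol \beta} = \hat{\mu}\).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

# Hat (projection) matrix onto the column space of X
H = X @ np.linalg.solve(X.T @ X, X.T)
mu_hat = H @ y                                # step 1: project y onto col(X)

print(np.allclose(X.T @ (y - mu_hat), 0))     # residual orthogonal to col(X): True
print(np.allclose(H @ H, H))                  # H is idempotent, as a projection must be: True

# step 2: recover beta_hat from X beta_hat = mu_hat
beta_hat, *_ = np.linalg.lstsq(X, mu_hat, rcond=None)
print(beta_hat)
```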