Recall the following expression you used last week to compute the coefficient estimates for multiple regression:
\(\hat{\beta} = (X'X)^{-1} X'Y\)
There were some very nice and simple functions for performing these operations in R. While you may not have noticed it, computing the inverse of \(X'X\) can actually be quite costly. In particular, if we have a large number of predictors, then \(X'X\) is very large and can be computationally infeasible to invert without taking special care. What is this special care, you ask? Stay awhile and listen!
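For concreteness, here is a minimal sketch (using simulated data, not any dataset from the course) of that direct formula in R:

```r
# Direct computation of beta-hat = (X'X)^{-1} X'Y
set.seed(1)
n <- 50
X <- cbind(1, rnorm(n), rnorm(n))      # design matrix: intercept + 2 predictors
Y <- X %*% c(1, 2, -0.5) + rnorm(n)    # simulated response with known coefficients

beta_hat <- solve(t(X) %*% X) %*% t(X) %*% Y
beta_hat   # should be close to c(1, 2, -0.5)
```

The solve() call here is exactly the expensive step: it explicitly inverts \(X'X\), which grows quickly as the number of predictors grows.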
One way to ease the task of computing coefficient estimates in regression is to decompose the \(X\) matrix into more manageable pieces (in this case, “more manageable” means “easier to invert”). There are multiple ways to decompose a matrix, depending on its properties, but the one we’ll explore here is called QR-decomposition.
Once again, we’re not going to dwell too much on the mathematical derivations, but now that you’ve gained some knowledge of matrix operations in R, you should try to understand the general ideas here. You are not responsible for knowing the calculus and linear algebra used in the video.
Notice that this video didn’t actually involve any R!
Recall that the QR-decomposition writes \(X = QR\), where \(Q\) has orthonormal columns (so \(Q'Q = I\)) and \(R\) is upper triangular. Because \(R\) is upper triangular, it is much easier to invert, so this QR-decomposition should be able to help us quite a bit!
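To see why, substitute \(X = QR\) into the normal equations: \(X'X\hat{\beta} = X'Y\) becomes \(R'Q'QR\hat{\beta} = R'Q'Y\), and since \(Q'Q = I\) (and \(R'\) is invertible when \(X\) has full column rank), this reduces to the triangular system \(R\hat{\beta} = Q'Y\). Here is a small sketch of that idea using base R functions and simulated data (not the course’s exact code):

```r
# Solving the normal equations via QR instead of inverting X'X directly
set.seed(42)
n <- 100
X <- cbind(1, matrix(rnorm(n * 3), n, 3))   # design matrix with intercept
Y <- X %*% c(2, -1, 0.5, 3) + rnorm(n)      # simulated response

decomp <- qr(X)      # QR-decomposition of X
Q <- qr.Q(decomp)    # orthonormal columns: t(Q) %*% Q = I
R <- qr.R(decomp)    # upper triangular

# R beta = Q'Y is triangular, so backsolve() handles it by back-substitution
beta_qr <- backsolve(R, t(Q) %*% Y)

# Compare with the direct inverse-based formula
beta_inv <- solve(t(X) %*% X) %*% t(X) %*% Y
all.equal(as.vector(beta_qr), as.vector(beta_inv))   # TRUE
```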
You might still be wondering, though, how we actually obtain \(Q\) and \(R\). Without going further into the mathematics behind it, check out the following reading for how to do it in R.
Notes:
- Pay close attention to the qr() functions and the other code used in this reading. There are multiple ways to employ these functions to help us with regression!
- What does the qr() function return?
- Note the differences between uppertri_solve(), solve(), backsolve(), and easy_invert().
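As a starting point for those questions, here is a quick, unofficial look at what qr() returns in base R and how backsolve() differs from solve(). (uppertri_solve() and easy_invert() are defined in the reading itself, so check their definitions there.)

```r
X <- cbind(1, 1:5, (1:5)^2)   # a small 5 x 3 design matrix
decomp <- qr(X)

names(decomp)   # "qr" "rank" "qraux" "pivot": a compact encoding of Q and R
decomp$rank     # the numerical rank of X

# backsolve() exploits the triangular structure of R; solve() is the
# general-purpose routine for any invertible matrix.
R <- qr.R(decomp)
b <- c(1, 2, 3)
backsolve(R, b)   # back-substitution on the upper triangular system R x = b
solve(R, b)       # same answer, but doesn't use the triangular structure
```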