Week 5: Regression Via Matrix Operations

Stat 431

Time Estimates:
     Videos: 13 min
     Readings: 30 min
     Activities: 60 min
     Check-ins: 2

Simple Linear Regression

In your prerequisite statistics coursework you undoubtedly spent some time on the topic of regression. The idea is that we have a response variable and at least one explanatory variable that is suspected to be related to the response in some way we’d like to model.

In simple linear regression there is just one explanatory variable and the model we’re estimating has an assumed straight-line form. In general, multiple regression extends this idea to accommodate many different complex relationships between a response and any number of explanatory variables, all within a single model.

It’s likely that you used software like R, JMP, or Minitab to estimate the regression model after specifying which variable was your response and which variables were your explanatory variables (predictors). There are many nice functions for performing regression, but today you’ll learn how to do it yourself!

We’re not going to go through the mathematical derivations here, but now that you’ve gained some knowledge of matrix operations in R, it should be relatively straightforward to implement regression yourself. The video, however, does go into the mathematics of estimating the regression coefficients. You are not responsible for knowing the calculus and linear algebra used in the video.

Required Video: Linear Regression with Matrices

Notice that this video didn’t actually involve any R! However, it makes the computations required to estimate regression coefficients extremely clear.
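To connect the video back to R, here is a minimal sketch of the same computation, assuming the usual least-squares solution of the normal equations. The simulated data and the names `X`, `y`, and `beta_hat` are illustrative assumptions, not taken from the video.

```r
# Simulated data: the true intercept (2) and slope (3) are made up for illustration.
set.seed(431)
n <- 50
x <- runif(n, 0, 10)
y <- 2 + 3 * x + rnorm(n)

# Design matrix: a column of 1s next to the explanatory variable.
X <- cbind(1, x)

# Solve the normal equations (X'X) b = X'y for the coefficient estimates.
# solve(A, b) avoids forming the inverse of X'X explicitly.
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

# Compare with R's built-in regression function.
beta_hat
coef(lm(y ~ x))
```

The two sets of estimates should agree up to rounding, which is a quick way to check a hand-built implementation.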

Check-In 1: Regression with Matrices

  1. What is the column of 1’s in the X matrix needed for?
  • To compute the residuals
  • To include an intercept in the model
  • So that we only include each observation once
  • Toothbrush
  2. The rows of the X matrix represent _____.
  • explanatory variables
  • the response variable
  • observations
  • residuals
  3. Including more than one explanatory variable in the model means the X matrix _____.
  • will have more rows
  • will have more columns
  • will have more rows and more columns
  4. After we’ve computed the coefficient estimates, what does XA represent?
  • the residuals
  • the response values
  • the fitted values
  5. In the slide shown at 3:18, there is a typo in the regression equation.
  • True
  • False

Canvas Link     

Ridge Regression

There are all sorts of things that can go wrong when performing regression! Unfortunately, it’s a bit beyond the scope of this class to go into them all. So, we’ll encourage you to explore more on your own or take more statistics courses like Cal Poly’s STAT 419 or STAT 434.

One of the things that can be challenging when performing traditional linear regression is having a very large number of explanatory variables. It’s even possible to have more explanatory variables than observations! However, if this is the case, then part of the matrix algebra needed to estimate the coefficients will not work: the matrix \(X^TX\) cannot be inverted. One method for dealing with this challenge is known as regularization, or penalized regression. Two popular models fall under this umbrella: ridge regression and the lasso.
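A small sketch of why the matrix algebra breaks down when there are more explanatory variables than observations. The dimensions and random data are illustrative assumptions. The rank of \(X^TX\) equals the rank of \(X\), which can never exceed the number of observations.

```r
# Illustrative setup: 5 observations but 10 explanatory variables.
set.seed(431)
n <- 5
p <- 10
X <- cbind(1, matrix(rnorm(n * p), nrow = n))  # n x (p + 1) design matrix

XtX <- t(X) %*% X   # (p + 1) x (p + 1) matrix we would need to invert
dim(XtX)

# The rank of X (and hence of X'X) is at most n, which is less than p + 1.
rank_X <- qr(X)$rank
rank_X
ncol(X)

# Because X'X is not full rank it has no inverse, so a call such as
# solve(XtX) is expected to stop with a "singular" error.
```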

You’re going to learn a bit more about ridge regression, including how to implement it! The basic idea is that we want to fit the regression model in such a way that it does not lean heavily on every single explanatory variable we have in our dataset. To do this, ridge regression penalizes coefficients for being large in magnitude (i.e., far from zero) but still tries to minimize the error (the sum of squared residuals), just as traditional regression does.
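In symbols, the penalized criterion just described is usually written as follows. This is the standard textbook form and may not match the notation of the reading; here \(\beta_0\) is the intercept, \(\beta_1, \dots, \beta_p\) are the slopes, and \(\lambda \ge 0\) controls how strong the penalty is. By convention the intercept is left out of the penalty.

\[
\hat{\beta}^{\text{ridge}} = \underset{\beta}{\arg\min} \left\{ \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \right\}
\]

Setting \(\lambda = 0\) removes the penalty and recovers traditional least squares.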

This may sound a bit complicated, but it actually simplifies somewhat nicely. Check out the reading below for more details!

Required Reading: Ridge Regression Introduction


  • One of the primary gems of this reading is the expression in the section titled “What We Really Want to Find”. This expression should look very familiar, though slightly different from the least-squares expression in the video.
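For experimenting after the reading, here is a minimal sketch of ridge regression in R. It assumes the standard closed-form ridge estimate, \((X^TX + \lambda I)^{-1}X^Ty\), which may be written differently in the reading. The function name `ridge_coefs` and the simulated data are hypothetical, and the intercept is omitted to keep the example short.

```r
# Simulated data with three explanatory variables and no intercept (assumption).
set.seed(431)
n <- 50
X <- matrix(rnorm(n * 3), nrow = n)
y <- X %*% c(4, -2, 0.5) + rnorm(n)

# Hypothetical helper: closed-form ridge estimate for a given lambda.
ridge_coefs <- function(X, y, lambda) {
  p <- ncol(X)
  # Adding lambda to the diagonal of X'X is the only change from least squares.
  solve(t(X) %*% X + lambda * diag(p), t(X) %*% y)
}

ols_fit   <- ridge_coefs(X, y, lambda = 0)   # no penalty: ordinary least squares
ridge_fit <- ridge_coefs(X, y, lambda = 10)  # penalized fit

# Put the two sets of estimates side by side to compare them.
cbind(ols = as.numeric(ols_fit), ridge = as.numeric(ridge_fit))
```

Try a few different values of `lambda` and watch how the estimates change; this is a useful warm-up for the check-in questions.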

Check-In 2: Ridge Regression

  1. As we increase the value of \(\lambda\), what happens to the coefficient estimates?
  • They increase in magnitude
  • They decrease in magnitude
  • They stay the same
  1. In prediction, how does ridge regression generally perform relative to traditional linear regression?
  • Better
  • Worse
  • The same
  1. Will any of the regression coefficients ever actually reach zero?
  • Yes
  • No
  • Maybe?

Canvas Link     

Extra Resources: