In your prerequisite statistics coursework you undoubtedly spent some time on the topic of regression. The idea is that we have a response variable and at least one explanatory variable that is suspected to be related to the response in some way we’d like to model.
In simple linear regression there is just one explanatory variable and the model we’re estimating has an assumed straight-line form. In general, multiple regression extends this idea to accommodate many different complex relationships between a response and any number of explanatory variables, all within a single model.
It’s likely that you used software like R, JMP, or Minitab to estimate the regression model after specifying which variable was your response and which variables were your explanatory variables (predictors). There are many nice functions for performing regression, but today you’ll learn how to do it yourself!
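For instance, R’s built-in lm() function estimates a linear model from a formula. Here is a minimal sketch on made-up data (the variable names and numbers are just for illustration):

```r
# Simulate a small made-up dataset: one response (y) and two predictors (x1, x2)
set.seed(42)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 3 + 2 * x1 - 1.5 * x2 + rnorm(n)

# Fit the model: the formula names the response (left of ~) and the predictors
fit <- lm(y ~ x1 + x2)
summary(fit)  # estimated coefficients, standard errors, R-squared, etc.
```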
We’re not going to go through the mathematical derivations, but now that you’ve gained some knowledge of matrix operations in R, it should be relatively straightforward to implement regression in R. The video, however, does go into the mathematics of estimating the regression coefficients. You are not responsible for knowing the calculus and linear algebra used in the video.
Notice that this video didn’t actually involve any R! However, it makes the computations required to estimate regression coefficients extremely clear.
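To preview where this is going, here is a minimal sketch (not the video’s code; the simulated data are just for illustration) of computing the coefficient estimates directly with matrix operations in R, using the familiar formula b = (X'X)^(-1) X'y:

```r
# Simulated data: one response (y) and two predictors (x1, x2)
set.seed(42)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 3 + 2 * x1 - 1.5 * x2 + rnorm(n)

# Design matrix X: a column of 1s for the intercept plus the predictors
X <- cbind(1, x1, x2)

# Coefficient estimates via b = (X'X)^(-1) X'y
# t() transposes, %*% is matrix multiplication, and solve(A, b) computes A^(-1) b
b_hat <- solve(t(X) %*% X, t(X) %*% y)
b_hat

# These should match what lm() reports
coef(lm(y ~ x1 + x2))
```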
There are all sorts of things that can go wrong when performing regression! Unfortunately, it’s a bit beyond the scope of this class to go into them all. So, we’ll encourage you to explore more on your own or take more statistics courses like Cal Poly’s STAT 419 or STAT 434.
One of the things that can be challenging when performing traditional linear regression is having a very large number of explanatory variables. It’s even possible to have more explanatory variables than observations! However, if this is the case, then parts of the matrix algebra needed to estimate the coefficients will not work (the matrix X'X that we need to invert is no longer invertible). One method for dealing with this challenge is known as regularization or penalized regression. There are a couple of specific but popular models that fall under this umbrella: ridge regression and the lasso.
You’re going to learn a bit more about ridge regression, including how to implement it! The basic idea is that we don’t want every single explanatory variable in our dataset to contribute heavily to the fitted model. To do this, ridge regression penalizes coefficients for being large in magnitude (i.e., far from zero), shrinking them toward zero, while still trying to minimize the error (sum of squared residuals) just like traditional regression does.
This may sound a bit complicated, but it actually simplifies somewhat nicely. Check out the reading below for more details!
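As a preview of that simplification, the ridge estimates can be written as b = (X'X + lambda * I)^(-1) X'y. Here is a minimal sketch on made-up data, ignoring details like the intercept and standardizing the predictors; the penalty value lambda below is just an illustrative choice:

```r
# Simulated data with more predictors than observations (p > n)
set.seed(42)
n <- 20
p <- 30
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- X[, 1] - 2 * X[, 2] + rnorm(n)

# Ridge estimates: b = (X'X + lambda * I)^(-1) X'y
# lambda is an arbitrary illustrative value here; in practice it is tuned,
# e.g. by cross-validation
lambda  <- 1
b_ridge <- solve(t(X) %*% X + lambda * diag(p), t(X) %*% y)
head(b_ridge)

# Plain least squares would fail here: t(X) %*% X is singular when p > n,
# but adding lambda * diag(p) makes it invertible
```

In practice you’d typically let a package such as glmnet choose lambda for you, but the matrix version above shows why the penalty fixes the more-variables-than-observations problem.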
Notes: