Introduction

In this lab, you will write functions to implement QR-decomposition and gradient descent for simple linear regression and multiple linear regression. You will be adding functionality to the regress431 package that you worked on in Lab 5.

Formulas for calculating coefficients of these regressions via these methods can be found in this week’s course materials.

This repository is structured like a package (regress431), but you do not have to do any package creation or management tasks. You only need to edit the body of the functions in the .R files to perform the correct calculations.

Click Here to find the skeleton package repo, or continue working off your Lab 5 repo.

Guidelines

  • You should not alter any of the function names, inputs, or outputs that I have provided for you. This is very important: we cannot test your work if you change these things!

  • You should not alter the package name, for the same reasons.

  • You should not alter the unit tests that are already written, for the same reasons. You may write more unit tests if this is helpful to you, but you are not required to.

  • You may (and probably should) write helper functions beyond those provided. If you do, you do not have to carefully document or test them, although you might find this helpful to your coding process.

  • You may alter the code I provided for you, if you wish, so long as the function inputs do not change and outputs (where specified) do not change.

  • You may not rely on any existing functions designed specifically for regression, including, but not limited to, lm, lm.ridge, and predict. Instead, you should do the necessary matrix calculations directly from the data. (You may, however, use these functions to check or test the output of the ones you write.)

  • You may use the qr functions included in this week’s course materials.

  • You must implement gradient descent yourself from the expressions in this week’s course materials.

  • You may rely on existing functions and external packages designed for faster general computation, such as data.table or furrr.
    If you use an external package, don’t forget to include it in your package dependencies by running e.g. usethis::use_package("data.table"), and to either use the :: prefix in your code or add an @import line to the documentation.

Tasks

1. Simple Linear Regression

In linear_regression_alt.R, edit the body of the slr_gd function to properly compute the slope and intercept of the regression line, using the gradient descent method.
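As a rough illustration of the idea (not the required slr_gd signature, which is specified in the skeleton file), gradient descent for a line y = b0 + b1*x repeatedly steps both coefficients against the gradient of the mean squared error. The helper name, arguments, and defaults below are all assumptions for the sketch:

```r
# Sketch only: slr_gd_sketch is a hypothetical helper, not the required
# slr_gd function. It fits y = b0 + b1*x by gradient descent on the MSE.
slr_gd_sketch <- function(x, y, learn_rate = 0.01, max_iter = 10000, tol = 1e-10) {
  b0 <- 0
  b1 <- 0
  for (i in seq_len(max_iter)) {
    resid   <- y - (b0 + b1 * x)
    grad_b0 <- -2 * mean(resid)      # d/d(b0) of mean((y - b0 - b1*x)^2)
    grad_b1 <- -2 * mean(resid * x)  # d/d(b1)
    b0_new  <- b0 - learn_rate * grad_b0
    b1_new  <- b1 - learn_rate * grad_b1
    if (abs(b0_new - b0) < tol && abs(b1_new - b1) < tol) break
    b0 <- b0_new
    b1 <- b1_new
  }
  c(intercept = b0, slope = b1)
}
```

A fixed learning rate like this is sensitive to the scale of x; centering and scaling the predictor (then back-transforming the coefficients) tends to make convergence faster and more reliable.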

Load your new code (Ctrl-Shift-L) and then run the unit tests (Ctrl-Shift-T) to confirm that your function performs as expected.

2. Multiple Regression

In linear_regression_alt.R, edit the body of the mlr_gd function to properly compute all the required coefficients, using the gradient descent method.
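The multiple-regression case is the same idea in matrix form: all coefficients update at once from a single vectorized gradient. A sketch with an assumed matrix interface (the required mlr_gd takes a data frame and a response column instead):

```r
# Sketch only (assumed matrix interface, not the required mlr_gd
# signature): gradient descent for y = X %*% beta, where X already
# contains an intercept column of 1s.
mlr_gd_sketch <- function(X, y, learn_rate = 0.01, max_iter = 100000, tol = 1e-10) {
  X    <- as.matrix(X)
  y    <- as.numeric(y)
  n    <- nrow(X)
  beta <- rep(0, ncol(X))
  for (i in seq_len(max_iter)) {
    grad     <- as.vector(-2 * crossprod(X, y - X %*% beta) / n)  # d(MSE)/d(beta)
    beta_new <- beta - learn_rate * grad
    if (max(abs(beta_new - beta)) < tol) break
    beta <- beta_new
  }
  beta
}
```

Using crossprod() rather than a loop over coefficients keeps each iteration a single matrix multiply, which matters for the speed testing below; as in the simple case, an appropriate learning rate depends on the scale of the predictors.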

In linear_regression_alt.R, edit the body of the mlr_qr function to properly compute all the required coefficients, using the matrix decomposition method.

Note that these functions assume that you want to include every column of the supplied data frame in your regression, except the specified response.
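For the decomposition route, writing X = QR turns the normal equations into the triangular system R beta = Q'y, which backsolve() solves directly without forming the inverse of X'X. A sketch with an assumed matrix interface (not the required mlr_qr signature), using base R's qr functions:

```r
# Sketch only (assumed matrix interface): least-squares coefficients
# via QR decomposition. Assumes X has full column rank (base R's qr()
# may pivot columns when it does not).
mlr_qr_sketch <- function(X, y) {
  X      <- as.matrix(X)
  decomp <- qr(X)                          # X = QR
  Q      <- qr.Q(decomp)
  R      <- qr.R(decomp)
  as.vector(backsolve(R, crossprod(Q, y))) # beta = R^{-1} Q'y
}
```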

Load your new code and then run the unit tests to confirm that your functions perform as expected.

Challenge

This week, there is no additional Challenge. Instead, your code will be tested for speed.

I will be loading your finished packages and running something like the following (tic() and toc() are from the tictoc package, and each timed call needs its own tic()/toc() pair):

library(regress431)
library(tictoc)

tic()
slr_gd(mtcars, mpg, cyl)
toc()

tic()
mlr_gd(mtcars, mpg)
toc()

tic()
mlr_qr(mtcars, mpg)
toc()

However, I will run this code on different datasets than this example, of varying sizes.

Bonus Points will be given for:

  • +5 for fastest slr_gd

  • +5 for fastest mlr_gd

  • +5 for fastest mlr_qr

  • +10 for top 3 in overall total speed

  • +5 for top 4-10 in overall total speed