Basis functions
The unifying through-line of module 7: replace $x$ with basis functions $b_1(x), \dots, b_K(x)$ and you are still doing linear regression. Polynomial regression, step functions, and regression splines are all the same idea - a different choice of $b_j$ stuffed into the design matrix; OLS does the rest.
Definition (prof’s framing)
“Instead of looking at $x$ directly we’re going through basis functions, and that’s a very general term, a very powerful term that has many many versions.” - L16-beyondlinear-1
The model becomes

$$y_i = \beta_0 + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \cdots + \beta_K b_K(x_i) + \epsilon_i.$$

The $b_1, \dots, b_K$ are fixed, known transformations chosen ahead of time. The design matrix $B$ is built from the $b_j(x_i)$ values; everything else (closed form $\hat\beta = (B^\top B)^{-1} B^\top y$, sampling distribution, CIs, F-tests) carries over unchanged because the model is linear in $\beta$.
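A minimal numpy sketch of the whole pipeline, assuming toy data and an arbitrary basis choice of my own (here cubic-polynomial columns): build the design matrix from the $b_j$, then ordinary least squares does the rest.

```python
# Sketch: fit y = beta_0 + sum_j beta_j * b_j(x) by OLS.
# Data and basis functions are illustrative, not from the lecture.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)

# Any fixed, known transformations can serve as basis functions.
basis = [lambda t: t, lambda t: t**2, lambda t: t**3]

# Design matrix: a column of ones (intercept) plus one column per b_j.
B = np.column_stack([np.ones_like(x)] + [b(x) for b in basis])

# OLS via least squares; same closed form as multiple linear regression.
beta_hat, *_ = np.linalg.lstsq(B, y, rcond=None)
y_hat = B @ beta_hat
print(B.shape)  # (50, 4): n rows, K + 1 columns
```

Swapping in a different `basis` list (indicators, truncated powers) changes the fitted curve but not a single line of the fitting code.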
Notation & setup
- $x$: scalar predictor (the slide deck does the one-predictor case for clarity; the multivariate version is just the GAM).
- $b_1(\cdot), \dots, b_K(\cdot)$: the chosen basis functions. The intercept is the implicit "$b_0(x) = 1$".
- Design matrix dimensions $n \times (K+1)$, the same shape as in MLR.
Each column is one basis function evaluated across the data; each row is one observation.
The four module-7 instances
| Method | Basis functions |
|---|---|
| Polynomial regression (degree $d$) | $b_j(x) = x^j$, $j = 1, \dots, d$ |
| Step functions ($K$ cutpoints) | $b_j(x) = I(c_j \le x < c_{j+1})$ |
| Cubic regression spline ($K$ knots) | $x,\ x^2,\ x^3,\ (x-\xi_1)_+^3, \dots, (x-\xi_K)_+^3$ |
| Natural cubic spline ($K$ interior knots) | re-parametrised version of the above with linearity enforced past the boundary knots |
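The cubic-regression-spline row is easy to make concrete. A sketch of the truncated-power basis, with knot locations of my own choosing (R's `bs()` uses a B-spline basis spanning the same column space):

```python
# Sketch: truncated-power basis for a cubic regression spline.
# Knots are illustrative.
import numpy as np

def truncated_power_basis(x, knots):
    """Columns: 1, x, x^2, x^3, then (x - xi_k)_+^3 for each knot xi_k."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.maximum(x - xi, 0.0) ** 3 for xi in knots]
    return np.column_stack(cols)

x = np.linspace(0, 10, 100)
B = truncated_power_basis(x, knots=[2.5, 5.0, 7.5])
print(B.shape)  # (100, 7): K knots -> K + 4 columns, intercept included
```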
Smoothing splines and local regression drop the basis-function frame - they minimize over a function $f$ directly rather than over $\beta$. The prof deliberately separates them from the basis-function methods.
Insights & mental models
“It’s nonlinear, but linear. It’s linear in the parameters $\beta$, but it’s nonlinear in what you get.” - L16-beyondlinear-1
This slogan covers all of polynomial regression, step functions, and regression splines. The whole module is one trick repeated with richer columns in the design matrix.
The book also collects this idea into one section (ISLP §7.3): polynomials, indicators, splines, wavelets, Fourier - all just choices of $b_j$. Once the design matrix is built, every linear-model tool from module 3 (least squares, $t$-/$F$-tests, CIs, residual diagnostics) is in scope.
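To see the inference toolbox carrying over, here is a sketch computing coefficient standard errors and $t$-statistics on a basis-expanded design with the usual module-3 formulas; the data and quadratic basis are mine:

```python
# Sketch: once B is built, standard OLS inference applies unchanged.
# Toy data; formulas are the textbook OLS ones.
import numpy as np

rng = np.random.default_rng(1)
n = 80
x = rng.uniform(-2, 2, n)
y = 1.0 + 0.5 * x - 0.3 * x**2 + rng.normal(scale=0.2, size=n)

B = np.column_stack([np.ones_like(x), x, x**2])  # quadratic basis
beta_hat = np.linalg.solve(B.T @ B, B.T @ y)     # normal equations
resid = y - B @ beta_hat

p = B.shape[1]
sigma2_hat = resid @ resid / (n - p)             # unbiased error variance
cov_beta = sigma2_hat * np.linalg.inv(B.T @ B)   # estimated Var(beta_hat)
se = np.sqrt(np.diag(cov_beta))
t_stats = beta_hat / se                          # same t-tests as in MLR
```

Nothing here knows or cares that the second column is a transformation of the first.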
Exam signals
“There’s a commonality, right? We’re talking about basis functions, but all you have to do is fit it with regression.” - L16-beyondlinear-1
“We’re now going to do something more than just finding betas.” - L16-beyondlinear-1 (introducing smoothing splines as the break with the basis-function frame)
The contrast is itself a fair-game exam point: which module-7 methods are fit by OLS on a basis (polynomial, step, regression spline), and which need a different objective (smoothing spline, local regression).
Pitfalls
- The design matrix gets wide fast - $K$ knots in a cubic spline already give $K + 4$ columns (intercept included). High-degree polynomial × many knots = collinearity, instability.
- “Linear in $\beta$” does not mean the fitted curve is linear in $x$. Don’t confuse the two.
- The truncated-power basis is the textbook basis, but R uses `bs()` (B-splines), which gives the same fit with different columns. The prof: “I don’t know why they call it BS.” Cosmetic - same predictions.
- For the natural-spline basis (Exercise 7.3), the textbook formula is asymmetric: start from $b_1(x) = x$, then $b_{k+1}(x) = d_k(x) - d_{K-1}(x)$ for $k = 1, \dots, K-2$, with $d_k(x) = \frac{(x-\xi_k)_+^3 - (x-\xi_K)_+^3}{\xi_K - \xi_k}$. Easy to mis-write the indexing.
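Because the natural-spline indexing is so easy to mis-write, a sketch of one common textbook form of that basis (the $d_k$-difference construction; knot locations are my own) may help - note how the differences kill the cubic and quadratic terms beyond the last knot, which is exactly the enforced linearity:

```python
# Sketch of a textbook natural-cubic-spline basis: b_1(x) = x, then
# differences d_k(x) - d_{K-1}(x) of scaled truncated cubics.
# Knots are illustrative.
import numpy as np

def d(x, k, knots):
    """d_k(x) = [ (x - xi_k)_+^3 - (x - xi_K)_+^3 ] / (xi_K - xi_k)."""
    xi_k, xi_K = knots[k], knots[-1]
    return (np.maximum(x - xi_k, 0.0) ** 3
            - np.maximum(x - xi_K, 0.0) ** 3) / (xi_K - xi_k)

def natural_spline_basis(x, knots):
    K = len(knots)
    cols = [x]                                     # linear term
    cols += [d(x, k, knots) - d(x, K - 2, knots)   # d_k - d_{K-1}
             for k in range(K - 2)]
    return np.column_stack(cols)

x = np.linspace(0, 10, 50)
N = natural_spline_basis(x, knots=[2.0, 5.0, 8.0])
# Beyond the last knot the cubic and quadratic pieces cancel in each
# difference, so every column is linear there.
print(N.shape)  # (50, 2): K knots -> K - 1 columns before the intercept
```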
Scope vs ISLP
- In scope: the basis-function frame as the unifier of module 7; the design-matrix construction; that OLS / its inference toolbox carries over.
- Look up in ISLP: §7.3 (the explicit “polynomial and step are special cases” framing); §7.4 for the spline-basis derivation.
- Skip in ISLP: wavelets and Fourier-basis examples - name-checked in §7.3, not in lecture.
Exercise instances
- Exercise 7.3: derive the basis functions for a natural cubic spline in `year` with one interior knot at 2006 from the textbook basis formula. Pure design-matrix construction, no fitting.
- Exercise 7.4: write `mybs()`, `myns()`, `myfactor()` to build the full additive design by hand and verify the fit matches what `gam()` returns. The “two different design matrices, same prediction” punchline = different bases, same column space.
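In the spirit of the exercise's `myfactor()` idea, a sketch of a hand-built step-function block (the function name and cutpoints are my own, not the exercise's): one indicator column per interval.

```python
# Sketch: hand-built indicator (step-function) design block.
# Cutpoints are illustrative.
import numpy as np

def my_step_basis(x, cuts):
    """One indicator column per interval; K cuts -> K + 1 columns."""
    edges = [-np.inf] + list(cuts) + [np.inf]
    return np.column_stack(
        [(x >= lo) & (x < hi) for lo, hi in zip(edges[:-1], edges[1:])]
    ).astype(float)

x = np.linspace(0, 10, 30)
B = my_step_basis(x, cuts=[3.0, 6.0])
print(B.shape)        # (30, 3)
print(B.sum(axis=1))  # all ones: each row lies in exactly one interval
```

Because the indicator columns sum to one, this block already spans the intercept, which is why factor coding drops one level when an intercept column is also present.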
How it might appear on the exam
- “Given a basis $b_1, \dots, b_K$, write down the design matrix” - pure construction, like Exercise 7.3.
- “Why is fitting a polynomial / cubic spline / step function still ‘linear regression’?” - recite the linear-in-$\beta$ slogan.
- “How many parameters does a degree-$d$ polynomial / step function with $K$ cuts / cubic spline with $K$ knots have?” - degree-of-freedom counting ($1 + d$ for the polynomial, $K + 1$ for the step function including the intercept, $K + 4$ for the cubic spline).
- Method-comparison: given two different bases that give the same fitted values, explain why (same column space).
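The last point can be checked directly. A sketch with made-up data: raw powers versus shifted powers give different design matrices but identical fitted values, because each basis is an invertible linear transform of the other (same column space, same hat matrix).

```python
# Sketch: two different bases, same column space, same fitted values.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 40)
y = rng.normal(size=40)

B1 = np.column_stack([np.ones_like(x), x, x**2])                # 1, x, x^2
B2 = np.column_stack([np.ones_like(x), x - 0.5, (x - 0.5)**2])  # shifted

def fitted(B, y):
    beta, *_ = np.linalg.lstsq(B, y, rcond=None)
    return B @ beta

print(np.allclose(fitted(B1, y), fitted(B2, y)))  # True
```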
Related
- polynomial-regression: basis functions $b_j(x) = x^j$; the simplest instance.
- step-functions: basis functions = indicators of intervals.
- regression-splines: truncated-power basis, the headline module-7 application.
- generalized-additive-models: the multivariate generalisation - stack basis-function blocks for each predictor.
- linear-regression: the host model that the basis-function trick rides on.
- design-matrix-and-hat-matrix: what’s actually being constructed.