Validation set approach

The prof’s “lazy version” of cross-validation: one random 50/50 split, fit once, evaluate once. The resulting error estimate is highly variable across splits and biased upward, but it’s conservative and easy to explain. Use it when you have lots of data and don’t want to think hard.

Definition (prof’s framing)

Randomly partition the data into two halves: a training set (fit the model) and a validation set (estimate prediction error on held-out points). Used to compare models or pick a hyperparameter by reading off the validation MSE / misclassification rate.

“If you have a lot of data… screw it, just do this, because this is going to be more conservative than the other approaches. And it’s also very easy to explain.” - L10-resample-1

Strictly speaking it isn’t cross-validation; there’s only one split. The prof grouped it with CV because it’s the simplest member of the family.
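
A minimal sketch of the recipe in code (scikit-learn names, synthetic stand-in data; not the prof’s or ISLP’s exact code):

```python
# Validation set approach: one random 50/50 split, fit once, evaluate once.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))                               # stand-in predictor
y = 1 + 2 * X[:, 0] - 0.5 * X[:, 0] ** 2 + rng.normal(0, 1, 200)    # stand-in response

# Random 50/50 partition: training half fits the model, validation half scores it.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.5, random_state=1)

model = LinearRegression().fit(X_train, y_train)
val_mse = mean_squared_error(y_val, model.predict(X_val))
print(f"validation MSE: {val_mse:.3f}")                              # one run = one number
```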

Notation & setup

  • Sample size n, split into a training half of size ≈ n/2 and a validation half of size ≈ n/2 (often a test set is held out separately as well, making it more like a 33/33/33 three-way split).
  • Fit the model on the training half, predict on the validation half, and compute the MSE (regression) or misclassification rate (classification); formulas after this list.
  • One run = one number per candidate model.
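
The numbers being read off are the usual held-out estimates; with 𝒱 the validation half and f̂ the model fit on the training half:

```latex
\mathrm{MSE}_{\mathrm{val}} \;=\; \frac{1}{|\mathcal{V}|}\sum_{i \in \mathcal{V}} \bigl(y_i - \hat f(x_i)\bigr)^2,
\qquad
\mathrm{Err}_{\mathrm{val}} \;=\; \frac{1}{|\mathcal{V}|}\sum_{i \in \mathcal{V}} \mathbf{1}\bigl(y_i \neq \hat y_i\bigr).
```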

Drawbacks (verbatim from slides + L10)

  1. High variability of the validation error across splits. The prof’s Auto data demo: same polynomial regression run 10 times with 10 different random splits → 10 different curves and “no consensus which model really gives the lowest validation set MSE.” - L10-resample-1. (A sketch of the rerun experiment follows this list.)
  2. Smaller training set: only half the data fits the model → tends to overestimate the error rate of the model fit on all the data. (Models trained on more data generally do better; you’re handicapping yourself.)
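
A sketch of that rerun experiment, with synthetic data standing in for Auto (the variable ranges below are made up; only the shape of the experiment matters):

```python
# Rerun the 50/50 split 10 times and record the validation MSE for degrees 1-10.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(10, 50, size=(392, 1))                                   # stand-in for horsepower
y = 50 - 0.8 * X[:, 0] + 0.006 * X[:, 0] ** 2 + rng.normal(0, 4, 392)    # stand-in for mpg

for split in range(10):                                                  # 10 different random splits
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.5, random_state=split)
    curve = []
    for degree in range(1, 11):                                          # polynomial degrees 1-10
        fit = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
        curve.append(mean_squared_error(y_va, fit.predict(X_va)))
    print(f"split {split}: best degree = {int(np.argmin(curve)) + 1}")
# Different splits can nominate different "best" degrees -- the no-consensus picture.
```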

In bias-variance terms (per the L11-resample-2 recap): high bias (only half the data fits the model), but probably low variance, and only because the validation half is large; “you’re kind of unnecessarily hurting yourself.” - L11-resample-2

Insights & mental models

  • The “many-curves” picture is the canonical visual: rerun the random split 10 times for polynomial degrees 1–10 on Auto data → 10 wildly different curves. This is the data-driven justification for needing k-fold CV (which averages multiple folds to stabilize the estimate).
  • Conservative is a feature, not a bug: the upward-biased error estimate keeps you from being overconfident. The prof’s quick verdict: “if you have a lot of data… just do this.”
  • Validation set ≈ 2-fold CV? Almost, but not quite: 2-fold CV averages the errors from both splits, while the single-split validation set approach uses only one of the two roles. “The validation set approach is the same as 2-fold CV” appears on CE1 problem 4b as a true/false; treat it as roughly true in spirit but technically not identical. The cleaner answer: validation set ≈ ½ of 2-fold CV (see the sketch after this list).
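
A tiny sketch of the distinction on placeholder data (assumed names; not a CE1 solution): 2-fold CV fits in both directions and averages, while the validation set approach keeps one fold’s error.

```python
# 2-fold CV vs. validation set approach on the same random partition.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(0, 1, 100)

fold_errors = []
for train_idx, val_idx in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    fit = LinearRegression().fit(X[train_idx], y[train_idx])
    fold_errors.append(mean_squared_error(y[val_idx], fit.predict(X[val_idx])))

validation_set_estimate = fold_errors[0]            # one split, one role
two_fold_cv_estimate = float(np.mean(fold_errors))  # average over both roles
```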

Exam signals

“If you have a lot of data… screw it, just do this, because this is going to be more conservative than the other approaches. And it’s also very easy to explain.” - L10-resample-1

“No consensus which model really gives the lowest validation set MSE.” - L10-resample-1 (slide commentary on the 10-rerun Auto example)

Pitfalls

  • Treating the single-split number as final. It’s noisy. If you only ran it once, you got lucky or unlucky.
  • Confusing it with cross-validation. The prof was deliberate: “not strictly a cross-validation approach”; there’s no rotation of the validation role across the data.
  • Spatial / temporal correlation breaks the “random” partition: nearby points leak between train and validation. “Two points right next to each other, one in your training, one in your validation, it’s the same damn thing.” - L10-resample-1. Same independence trap as for k-fold and LOOCV. (A group-aware split sketch follows this list.)
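
One way to dodge that leak, as a hedged sketch that is not from the lecture: split by group (e.g. site or time block) rather than by row, here via scikit-learn’s GroupShuffleSplit.

```python
# Not from the lecture: if points come in correlated clusters (e.g. repeated
# measurements per site), split by group so neighbours don't straddle the
# train/validation boundary.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
groups = np.repeat(np.arange(50), 4)        # 50 sites, 4 correlated points each
X = rng.normal(size=(200, 3))
y = rng.normal(size=200)

splitter = GroupShuffleSplit(n_splits=1, test_size=0.5, random_state=0)
train_idx, val_idx = next(splitter.split(X, y, groups=groups))
assert not set(groups[train_idx]) & set(groups[val_idx])   # no site on both sides
```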

Scope vs ISLP

  • In scope: definition, the two drawbacks, when to use it (“plenty of data, don’t want to think hard”), how it differs from k-fold/LOOCV, the independence trap.
  • Look up in ISLP: §5.1.1, pp. 198–200; Figure 5.2 (the right panel showing the 10 rerun curves) is the slide image. ISLP labs in §5.3.1.

Exercise instances

No dedicated recommended-exercise problem. Comparison with k-fold and LOOCV is in:

  • Exercise 5.2: compare k-fold to the validation set approach and LOOCV (bias / variance / compute); see k-fold-cv for that atom’s coverage.
  • CE1 problem 4b: the true/false statement “The validation set approach is the same as 2-fold CV”; covered in k-fold-cv and leave-one-out-cv.

How it might appear on the exam

  • True/false on the bias-variance properties: “The validation set approach gives a less variable estimate of test error than 5-fold CV” → false; rerunning splits gives wildly different answers (slide demo).
  • Compare-and-contrast with k-fold or LOOCV: small table of bias / variance / compute. Validation set is the high-bias, single-shot, cheapest entry.
  • “Why do we need k-fold CV at all if we have validation set?” → because of variability across splits + smaller training sample → overestimate of error.