Discriminant score and decision-boundary derivation

The procedural atom for the prof’s most-flagged exam pattern in module 4: “given $\pi_k$, $\mu_k$, $\sigma$ (or $\Sigma$), solve for the decision boundary.” The discriminant score $\delta_k(x)$ is what you maximize over $k$; setting $\delta_0(x) = \delta_1(x)$ and solving gives the boundary. Said twice on the exam-relevance flag list.

Definition (prof’s framing)

“Same exact trick as finding the point that divides the classes in two. You do the same thing, only now in a higher dimension and then you get a line out of it, or a plane, whatever it is.” - L09-classif-3

The discriminant score $\delta_k(x)$ is what’s left of $\log\big(\pi_k f_k(x)\big)$ after dropping every term that doesn’t depend on $k$. It is not a probability or likelihood, but it preserves the argmax over $k$, so it gives the same classification.

Notation & setup

  • $\pi_k$: class prior.
  • $f_k(x)$: class-conditional density (multivariate Gaussian for LDA/QDA).
  • $\delta_k(x)$: discriminant score for class $k$. Bigger → class $k$ wins.
  • Decision boundary between classes $k$ and $\ell$: locus where $\delta_k(x) = \delta_\ell(x)$.

Formula(s) to know cold

LDA discriminant (1D):

$\delta_k(x) = x \cdot \dfrac{\mu_k}{\sigma^2} - \dfrac{\mu_k^2}{2\sigma^2} + \log \pi_k$

LDA discriminant (multivariate):

$\delta_k(x) = x^\top \Sigma^{-1} \mu_k - \tfrac{1}{2}\, \mu_k^\top \Sigma^{-1} \mu_k + \log \pi_k$

QDA discriminant (multivariate):

$\delta_k(x) = -\tfrac{1}{2}\,(x - \mu_k)^\top \Sigma_k^{-1} (x - \mu_k) - \tfrac{1}{2} \log |\Sigma_k| + \log \pi_k$

Decision boundary (binary, classes 0 and 1):

$\delta_0(x) = \delta_1(x)$

LDA → linear in $x$ (a hyperplane). QDA → quadratic in $x$ (a conic).
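
As a sanity check on these formulas, here is a minimal numpy sketch of the multivariate LDA and QDA scores; the means, covariance, and test point are invented for illustration only:

```python
import numpy as np

def lda_score(x, mu, Sigma, prior):
    # delta_k(x) = x^T Sigma^{-1} mu_k - 1/2 mu_k^T Sigma^{-1} mu_k + log pi_k
    Si = np.linalg.inv(Sigma)
    return x @ Si @ mu - 0.5 * mu @ Si @ mu + np.log(prior)

def qda_score(x, mu, Sigma, prior):
    # delta_k(x) = -1/2 (x-mu_k)^T Sigma_k^{-1} (x-mu_k) - 1/2 log|Sigma_k| + log pi_k
    Si = np.linalg.inv(Sigma)
    d = x - mu
    return -0.5 * d @ Si @ d - 0.5 * np.log(np.linalg.det(Sigma)) + np.log(prior)

# toy 2D setup -- all numbers made up
mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 2.0])
Sigma = np.eye(2)
x = np.array([1.6, 1.6])                         # closer to mu1
lda = [lda_score(x, m, Sigma, 0.5) for m in (mu0, mu1)]
qda = [qda_score(x, m, Sigma, 0.5) for m in (mu0, mu1)]
print(int(np.argmax(lda)), int(np.argmax(qda)))  # → 1 1
```

With a shared covariance the two scores differ only by a $k$-free constant, so LDA and QDA pick the same class here.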

Derivation recipe

The standard play, applied for either LDA or QDA:

  1. Start from Bayes’ rule: $p(y = k \mid x) = \pi_k f_k(x) / \sum_\ell \pi_\ell f_\ell(x)$ (the denominator is $k$-independent, drop it).
  2. Take logs: $\log \pi_k + \log f_k(x)$.
  3. Plug in the Gaussian density: $f_k(x) = (2\pi)^{-p/2} |\Sigma_k|^{-1/2} \exp\!\big(-\tfrac{1}{2}(x - \mu_k)^\top \Sigma_k^{-1} (x - \mu_k)\big)$ (use $\Sigma_k = \Sigma$ for LDA).
  4. Drop everything not depending on $k$. The $-\tfrac{p}{2}\log(2\pi)$ goes. In LDA, $-\tfrac{1}{2}\log|\Sigma|$ also goes (no $k$); in QDA it stays. The piece $-\tfrac{1}{2}\, x^\top \Sigma_k^{-1} x$: in LDA it has no $k$, drop; in QDA it has $\Sigma_k$, keep (this is where the quadratic comes from).
  5. Expand the surviving cross term: $-\tfrac{1}{2}(x - \mu_k)^\top \Sigma^{-1}(x - \mu_k) \to x^\top \Sigma^{-1} \mu_k - \tfrac{1}{2}\, \mu_k^\top \Sigma^{-1} \mu_k$ (after dropping the $k$-free $-\tfrac{1}{2}\, x^\top \Sigma^{-1} x$ piece in LDA).
  6. What’s left is $\delta_k(x)$.
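
The claim behind steps 4–6, that dropping $k$-free terms shifts every score by the same constant and so never changes the winner, can be checked numerically (all parameter values below are made up):

```python
import numpy as np

def log_joint(x, mu, Sigma, prior):
    # full log(pi_k f_k(x)) with every constant kept (steps 1-3)
    p = len(x)
    d = x - mu
    Si = np.linalg.inv(Sigma)
    return (np.log(prior)
            - 0.5 * p * np.log(2 * np.pi)
            - 0.5 * np.log(np.linalg.det(Sigma))
            - 0.5 * d @ Si @ d)

def lda_disc(x, mu, Sigma, prior):
    # steps 4-6: only the k-dependent terms survive (shared Sigma)
    Si = np.linalg.inv(Sigma)
    return x @ Si @ mu - 0.5 * mu @ Si @ mu + np.log(prior)

mus = [np.array([0.0, 0.0]), np.array([2.0, 1.0])]   # made-up means
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])           # shared covariance (made up)
priors = [0.4, 0.6]
x = np.array([1.2, 0.4])

full = [log_joint(x, m, Sigma, p) for m, p in zip(mus, priors)]
disc = [lda_disc(x, m, Sigma, p) for m, p in zip(mus, priors)]
print(np.argmax(full) == np.argmax(disc))                # → True (same winner)
print(np.isclose(full[0] - disc[0], full[1] - disc[1]))  # → True (same constant gap)
```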

For the boundary, set $\delta_0(x) = \delta_1(x)$ and solve for $x$. For LDA in two classes:

$x^\top \Sigma^{-1}(\mu_1 - \mu_0) = \tfrac{1}{2}\big(\mu_1^\top \Sigma^{-1} \mu_1 - \mu_0^\top \Sigma^{-1} \mu_0\big) + \log(\pi_0 / \pi_1)$

A linear equation in $x$ → a hyperplane.
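
In 1D the same solve has a closed form. A small sketch assuming the 1D LDA discriminant, with illustrative numbers only:

```python
import numpy as np

def lda_threshold_1d(mu0, mu1, sigma, pi0, pi1):
    # solve delta_0(x) = delta_1(x):
    #   x (mu1 - mu0)/sigma^2 = (mu1^2 - mu0^2)/(2 sigma^2) + log(pi0/pi1)
    # => x* = (mu0 + mu1)/2 + sigma^2 log(pi0/pi1) / (mu1 - mu0)
    return 0.5 * (mu0 + mu1) + sigma**2 * np.log(pi0 / pi1) / (mu1 - mu0)

# equal priors: exactly the midpoint (illustrative numbers)
print(lda_threshold_1d(-1.0, 3.0, 1.5, 0.5, 0.5))        # → 1.0
# raising pi_1 pulls the threshold toward mu_0, enlarging class 1's region
print(lda_threshold_1d(-1.0, 3.0, 1.5, 0.3, 0.7) < 1.0)  # → True
```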

Worked example: the prof’s recurring template

Setup: two classes in 2D with means $\mu_0$, $\mu_1$, shared covariance $\Sigma$, equal priors.

Compute $\delta_1(x) - \delta_0(x)$ term by term. The cross term is $x^\top \Sigma^{-1}(\mu_1 - \mu_0)$; its coefficient vector is $\Sigma^{-1}(\mu_1 - \mu_0)$.

The intercept terms:

  • From class 1: $-\tfrac{1}{2}\, \mu_1^\top \Sigma^{-1} \mu_1$.
  • From class 0: $+\tfrac{1}{2}\, \mu_0^\top \Sigma^{-1} \mu_0$.

Sum: $\tfrac{1}{2}\big(\mu_0^\top \Sigma^{-1} \mu_0 - \mu_1^\top \Sigma^{-1} \mu_1\big)$. Equal priors → $\log(\pi_1/\pi_0) = 0$, so the prior term vanishes.

Boundary: $x^\top \Sigma^{-1}(\mu_1 - \mu_0) = \tfrac{1}{2}\big(\mu_1^\top \Sigma^{-1} \mu_1 - \mu_0^\top \Sigma^{-1} \mu_0\big)$, i.e. a line $a x_1 + b x_2 = c$ in the plane.
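
The template can be run end to end numerically; the means and covariance below are stand-ins, not the lecture’s actual numbers:

```python
import numpy as np

# stand-in parameters -- NOT the lecture's actual numbers
mu0, mu1 = np.array([1.0, 1.0]), np.array([3.0, 2.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Si = np.linalg.inv(Sigma)

w = Si @ (mu1 - mu0)                          # cross-term coefficient vector
c = 0.5 * (mu1 @ Si @ mu1 - mu0 @ Si @ mu0)   # intercepts (equal priors: no log term)

# boundary w[0]*x1 + w[1]*x2 = c, rewritten as an explicit line x2 = m*x1 + b
m, b = -w[0] / w[1], c / w[1]

# sanity check: a point on the line scores equally under both classes
s = lambda mu: np.array([0.0, b]) @ Si @ mu - 0.5 * mu @ Si @ mu
print(np.isclose(s(mu0), s(mu1)))             # → True
```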

“I didn’t ask for a line. I just equated the two things and then I solved for them and then it became a line. I didn’t tell the math to give me a line.” - L09-classif-3

The prof flubbed this live (claimed two terms cancelled when they didn’t), came back from break to fix it. Lesson:

“When I was doing my PhD, one of my co-authors said he doesn’t make a habit of doing algebra in public. I’ve heard his voice in my head a million times. It’s a bit of a fool to do public algebra: your brain shuts off, you look like an idiot.” - L09-classif-3

Show your work step-by-step on the exam.

Insights & mental models

  • Boundary is a consequence, not a parameter. In logistic regression you fit the boundary’s slope directly. In LDA/QDA you fit class densities and the boundary falls out from $\delta_0(x) = \delta_1(x)$. - L09-classif-3
  • Equal priors, equal $\sigma$, two classes, 1D: boundary at $x = (\mu_0 + \mu_1)/2$, the midpoint. Verifiable shortcut.
  • Priors literally move the boundary. Increasing $\pi_k$ slides the boundary away from $\mu_k$ (more area is classified as $k$). Compute via the $\log(\pi_0/\pi_1)$ term in the boundary equation.
  • In multivariate LDA, the boundary depends on $\mu_0$, $\mu_1$, and $\Sigma$ jointly. Specifically, the boundary’s normal direction is $\Sigma^{-1}(\mu_1 - \mu_0)$, not just $\mu_1 - \mu_0$. Σ rotates and rescales.
  • For QDA, the cleanest representation is the compact form $\delta_k(x) = -\tfrac{1}{2}(x - \mu_k)^\top \Sigma_k^{-1}(x - \mu_k) - \tfrac{1}{2}\log|\Sigma_k| + \log \pi_k$: it’s just $-\tfrac{1}{2}\times$ (squared Mahalanobis distance) + (volume penalty) + (log-prior).
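
The normal-direction point is easy to verify: with a correlated $\Sigma$ (values made up), $\Sigma^{-1}(\mu_1 - \mu_0)$ and $\mu_1 - \mu_0$ point in genuinely different directions:

```python
import numpy as np

mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 0.0])
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])     # correlated features (made up)

naive = mu1 - mu0                              # tempting but wrong normal
w = np.linalg.inv(Sigma) @ naive               # the actual boundary normal

cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(np.isclose(cos(w, naive), 1.0))          # → False: Sigma rotates the normal
# with Sigma = I (or a scalar multiple) the two directions coincide
print(np.isclose(cos(np.linalg.inv(np.eye(2)) @ naive, naive), 1.0))  # → True
```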

Exam signals

“And that’s often an exam question. Or that would be a typical exam question. So you would, given an LDA setting and here are the values for the parameters: you have the pies, you have the mu’s, and you’d have a value for the standard deviation, and then you would solve for where is the decision point.” - L09-classif-3

“This would be another kind of question that you could ask on an exam: find the equation for the decision boundary between these two categories. Then you solve for the equation for the line or the plane or whatever it happens to be depending on the dimension of the X’s.” - L09-classif-3

“That would be another question one could ask if I was so inspired: show that this leads to this thing. I don’t know if that’s a very interesting question to ask, but one could ask it.” - L09-classif-3 (re: deriving $\delta_k(x)$ itself)

CE1 problem 3e is exactly this exam pattern in compulsory form: derive $\delta_k(x)$, solve $\delta_0(x) = \delta_1(x)$, get a linear equation in $x$, solve for the boundary line.

Pitfalls

  • Public-algebra slips. The prof himself made one in lecture: write each line carefully on the exam, don’t try to do steps in your head.
  • Forgetting to drop the $-\tfrac{1}{2}\log|\Sigma|$ in LDA. It’s $k$-independent there; drop it. In QDA, it depends on $k$ via $\Sigma_k$; keep it.
  • Mixing $\Sigma$ with $\Sigma^{-1}$. The discriminant uses the inverse (the precision matrix). Don’t drop the inverse.
  • Forgetting the $-\tfrac{1}{2}\log|\Sigma_k|$ term in QDA. It survives because $\Sigma_k$ depends on $k$.
  • Sign error on the prior term. It’s $+\log \pi_k$: bigger prior → bigger $\delta_k$ → more area classified as $k$.
  • Treating the boundary normal as $\mu_1 - \mu_0$ in 2D. No: it’s $\Sigma^{-1}(\mu_1 - \mu_0)$ that gives the normal direction. Only when $\Sigma = I$ (or a scalar multiple) do they coincide.
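
The “where does the quadratic come from” point can be made concrete: with unequal class covariances the score gap $\delta_1(x) - \delta_0(x)$ has a nonzero second difference, i.e. it is genuinely quadratic in $x$ (toy 1D numbers, made up):

```python
import numpy as np

def qda_score(x, mu, Sigma, prior):
    # -1/2 (x-mu)^T Sigma^{-1} (x-mu) - 1/2 log|Sigma| + log pi
    Si = np.linalg.inv(Sigma)
    d = x - mu
    return -0.5 * d @ Si @ d - 0.5 * np.log(np.linalg.det(Sigma)) + np.log(prior)

mu0, mu1 = np.array([0.0]), np.array([2.0])
S0, S1 = np.array([[1.0]]), np.array([[4.0]])  # unequal variances (made up)

gap = lambda t: (qda_score(np.array([t]), mu1, S1, 0.5)
                 - qda_score(np.array([t]), mu0, S0, 0.5))

# second difference of the gap: zero for a linear (LDA) boundary,
# a nonzero constant when the x^2 terms fail to cancel (QDA)
d2 = gap(0.0) - 2 * gap(1.0) + gap(2.0)
print(abs(d2) > 1e-9)                          # → True
```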

Scope vs ISLP

  • In scope: Discriminant-score derivation from $\log\big(\pi_k f_k(x)\big)$, why LDA is linear and QDA quadratic, decision-boundary derivation, the $\mu_0$ / $\mu_1$ / $\Sigma$ worked example, prior-shift effect.
  • Look up in ISLP: §4.4.1 (1D LDA derivation, eq. 4.18), §4.4.2 (multivariate, eq. 4.24), §4.4.3 (QDA, eq. 4.28). pp. 145–155.
  • Skip in ISLP: Fisher’s eigenvalue derivation (slide deck “Optional”, prof never lectured); detailed naive-Bayes-as-LDA-with-diagonal-Σ algebra (covered abstractly in naive-bayes).

Exercise instances

  • CE1 problem 3e: full procedural application: derive $\delta_k(x)$ from Bayes’ rule, solve $\delta_0(x) = \delta_1(x)$ for the boundary in explicit line form, plot it.

(All other CE1.3 / Exercise4.2 / Exercise4.6 problems on LDA/QDA implicitly require this derivation as a sub-step; see linear-discriminant-analysis / quadratic-discriminant-analysis for the full per-method exercise lists.)

How it might appear on the exam

  • The flagged pattern (twice): Given $\pi_k$, $\mu_k$, $\sigma$ (or $\Sigma$), derive the boundary equation. Solve in the form $a x_1 + b x_2 = c$ (2D) or just the threshold $x^*$ (1D).
  • Discriminant-score derivation: “Show that for LDA, $\delta_k(x) = x \mu_k / \sigma^2 - \mu_k^2/(2\sigma^2) + \log \pi_k$.” Step-by-step: take the log of $\pi_k f_k(x)$, drop $k$-free terms, expand the cross term. Emphasizes why the $-x^2/(2\sigma^2)$ drops (no $k$).
  • “Where does the quadratic come from in QDA?”: the contrast question. Walk through what cancels in LDA but doesn’t in QDA.
  • Prior-shift question: “If $\pi_1$ increases from 0.5 to 0.7, in which direction does the boundary move?” → away from $\mu_1$, enlarging class 1’s region.
  • Hand-classification: Given $\delta_0$ and $\delta_1$ values for a specific test point, pick the larger.