Paced learning guide excerpt
Gradient Descent: First Principles
A public excerpt from the current Peras sample matrix showing a first-principles path into gradient descent.
We now shift from comparing training failure cases to understanding their underlying causes. So far, you have seen two distinct patterns in how a model's loss and gradient norms evolve during training: one steady and promising, the other erratic and concerning. The next step is to go beyond telling them apart and explain why each one arises.
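To make the two patterns concrete, here is a minimal sketch that reproduces a steady run and an unstable run on a toy problem. It is an illustration under stated assumptions, not the guide's own experiment: the objective f(w) = w**2, the function name `run_gradient_descent`, the starting point, and both learning rates are hypothetical choices, and an oversized step is only one possible cause of unstable behavior.

```python
def run_gradient_descent(learning_rate, steps=20, start=5.0):
    """Minimize the toy objective f(w) = w**2 with plain gradient descent.

    Returns the loss and the gradient norm recorded at every step.
    All names and values here are illustrative assumptions.
    """
    w = start
    losses, grad_norms = [], []
    for _ in range(steps):
        loss = w * w          # f(w) = w^2
        grad = 2.0 * w        # f'(w) = 2w
        losses.append(loss)
        grad_norms.append(abs(grad))
        w -= learning_rate * grad  # gradient descent update
    return losses, grad_norms


# Small step: each update scales w by (1 - 2 * 0.1) = 0.8, so w shrinks.
steady_losses, steady_norms = run_gradient_descent(learning_rate=0.1)

# Oversized step: each update scales w by (1 - 2 * 1.05) = -1.1,
# so w flips sign every step and its magnitude grows.
unstable_losses, unstable_norms = run_gradient_descent(learning_rate=1.05)

for step in range(0, 20, 5):
    print(
        f"step {step:2d}  "
        f"steady loss={steady_losses[step]:.4f} grad={steady_norms[step]:.4f}  "
        f"unstable loss={unstable_losses[step]:.4f} grad={unstable_norms[step]:.4f}"
    )
```

In the first run the loss and gradient norm shrink together; in the second they grow together while the parameter overshoots the minimum on every step. On this toy problem the only difference between the two runs is the learning rate, which is what makes the cause easy to isolate.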