TL;DR

Regularization is a way to reduce overfitting by penalizing complexity and reducing variance.

  Loss_total = Loss_data + λ · R(w)

  • λ is the regularization strength
  • R(w) is the penalty term that discourages complex models
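As a minimal numpy sketch of this idea (the function name and `alpha` parameter are illustrative, not from any particular library), the total objective is just the data loss plus a scaled penalty:

```python
import numpy as np

def regularized_loss(w, X, y, alpha, penalty="l2"):
    """Total loss = data loss (MSE) + alpha * penalty term.

    alpha is the regularization strength; the penalty term
    discourages complex models by penalizing large weights.
    """
    mse = np.mean((X @ w - y) ** 2)
    if penalty == "l1":
        reg = np.sum(np.abs(w))   # L1: sum of absolute weights
    else:
        reg = np.sum(w ** 2)      # L2: sum of squared weights
    return mse + alpha * reg
```

Larger `alpha` pushes the optimum toward smaller (simpler) weights at the cost of a worse fit to the training data.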

L1 Regularization (Lasso)

  • Adds absolute value of weights to the loss
  • Encourages sparsity — many weights become exactly zero
  • Performs feature selection by eliminating irrelevant features
  • Not differentiable at 0 → uses subgradient methods or coordinate descent
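The sparsity and the non-differentiability at zero can both be seen in a small proximal-gradient (ISTA) sketch, one standard workaround alongside subgradient methods and coordinate descent; the function names and the step-size choice below are illustrative assumptions, not a reference implementation:

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the L1 norm: shrinks each entry toward zero
    # and sets small entries *exactly* to zero -- the source of sparsity.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, alpha, n_iter=500):
    """Minimize (1/2n)||Xw - y||^2 + alpha * ||w||_1 by proximal
    gradient descent (ISTA), sidestepping the non-differentiability
    of |w| at 0 via the soft-thresholding step."""
    n, d = X.shape
    w = np.zeros(d)
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n   # gradient of the smooth data term
        w = soft_threshold(w - grad / L, alpha / L)
    return w
```

Run on data whose true weights are mostly zero, the irrelevant coefficients land at exactly zero, which is what makes Lasso act as a feature selector.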

L2 Regularization (Ridge)

  • Adds squared magnitude of weights to the loss
  • Tends to spread weight across features
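Unlike L1, the squared penalty keeps the objective smooth, and ridge regression even has a closed-form solution. A minimal numpy sketch (the function name is illustrative):

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge solution: w = (X^T X + alpha*I)^{-1} X^T y.
    The alpha*I term keeps the system well conditioned and shrinks
    all weights smoothly rather than zeroing any of them out."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)
```

With two identical (perfectly correlated) features, ridge splits the weight evenly between them instead of arbitrarily picking one, illustrating how it spreads weight across features.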