TL;DR
Regularization reduces overfitting by penalizing model complexity, which lowers variance at the cost of a small increase in bias.
The regularized loss has the general form L_reg(w) = L(w) + λ·R(w), where:
- λ is the regularization strength
- R(w) is the penalty term that discourages complex models
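A minimal NumPy sketch of this general form, assuming a mean-squared-error base loss; the names `regularized_loss`, `l1`, and `l2` are illustrative, not from the original notes:

```python
import numpy as np

def regularized_loss(w, X, y, lam, penalty):
    """L(w) + λ·R(w): squared-error base loss plus a weighted penalty."""
    base_loss = np.mean((X @ w - y) ** 2)  # L(w): mean squared error
    return base_loss + lam * penalty(w)    # add λ·R(w)

l1 = lambda w: np.sum(np.abs(w))  # R(w) for Lasso
l2 = lambda w: np.sum(w ** 2)     # R(w) for Ridge
```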
L1 Regularization (Lasso)
- Adds the sum of absolute values of the weights to the loss: R(w) = Σᵢ |wᵢ|
- Encourages sparsity — many weights become exactly zero (see the sketch after this list)
- Performs feature selection by eliminating irrelevant features
- Not differentiable at 0 → uses subgradient methods or coordinate descent
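A quick sketch of the sparsity effect using scikit-learn's `Lasso` (its `alpha` parameter plays the role of λ; the synthetic data here is purely illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Only the first two features actually influence y.
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)  # the irrelevant coefficients are driven to exactly 0.0
```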
L2 Regularization (Ridge)
- Adds the squared magnitudes of the weights to the loss: R(w) = Σᵢ wᵢ²
- Tends to spread weight across correlated features; coefficients shrink toward zero but rarely become exactly zero, so no feature selection (see the sketch below)
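For contrast with L1, the squared penalty keeps the loss differentiable everywhere, so ridge regression even has a closed-form solution. A minimal sketch (the helper name `ridge_fit` is illustrative):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (XᵀX + λI)⁻¹ Xᵀ y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)
```

Unlike Lasso, the coefficients returned here are typically all nonzero, just shrunk toward zero.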