Summary
A technique that helps deep neural networks learn more effectively by letting each layer focus on “what’s different” rather than having to learn everything from scratch.
Fail
In a deep network, layers are stacked on top of each other to learn complex patterns. However, when we add too many layers, networks sometimes struggle to learn effectively, and accuracy may even start to degrade instead of improve. Ideally, extra layers should be able to simply “do nothing” when they don’t need to transform the input, i.e., learn an identity function that passes the input through unchanged. In practice, deep layers often struggle to learn this identity function naturally.
Important
- Residual connections, or skip connections, address this problem by letting each layer learn only the difference (or residual) between its desired output and its input.
- Instead of learning the entire output from scratch, each layer only has to learn what it should add to (or subtract from) the input to get closer to the target, as the short derivation below shows. This is much easier and faster for the network to learn, especially in deep networks.
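In symbols, this is the standard residual-learning formulation (the symbols H, F, and x are generic, not notation taken from this note): if H(x) is the mapping a block would ideally compute, the block is trained to learn only the residual F(x) = H(x) − x.

```latex
% Residual learning, standard formulation (H, F, x are illustrative symbols).
% H(x): the mapping the block would ideally compute
% F(x): what the block actually learns -- the residual
\[
  F(x) := H(x) - x
  \qquad\Longrightarrow\qquad
  H(x) = F(x) + x
\]
% If H is close to the identity, the target residual F(x) is close to zero,
% which is far easier to learn than the identity mapping itself.
```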
Important
- In practice, a residual block passes its output plus the original input forward. Mathematically, a residual block for input x and learned function F(x) can be written as y = F(x) + x, where y is the block’s output.
- This shortcut connection, the “+ x” term, enables each layer to modify only what’s necessary while leaving the rest of the input unchanged (see the code sketch below).
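A minimal code sketch of such a block, assuming PyTorch; the class name `ResidualBlock`, the feature size, and the two-linear-layer choice of F are illustrative assumptions, not something specified in this note.

```python
# Minimal residual block sketch (assumes PyTorch is installed).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # F(x): the transformation the block actually has to learn
        self.f = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = F(x) + x  -- the "+ x" is the shortcut (skip) connection
        return self.f(x) + x

x = torch.randn(8, 64)      # a batch of 8 inputs with 64 features
block = ResidualBlock(64)
y = block(x)                # same shape as x: the block only adds a correction
print(y.shape)              # torch.Size([8, 64])
```

Note that the output of F must have the same shape as x for the addition to work; if F learns to output zeros, the block reduces to the identity.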
Info
Residual connections make it easier for networks to learn patterns because keeping the input unchanged only requires outputting a residual of zero, which is useful when extra layers don’t need to make major changes.
Note
Residual connections also allow gradients to flow through the network more easily, preventing them from becoming too large or too small and thus mitigating the exploding and vanishing gradients problem. This helps even the deeper layers train just as effectively as the layers closer to the output, as the short derivation below illustrates.
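To see why the gradient cannot die at a residual block, differentiate the block output under the formulation above, y = F(x) + x (a sketch with generic symbols L for the loss and I for the identity matrix, not notation from this note):

```latex
% Backward pass through one residual block y = F(x) + x.
% L is the training loss; I is the identity matrix.
\[
  \frac{\partial y}{\partial x} = \frac{\partial F(x)}{\partial x} + I
  \qquad\Longrightarrow\qquad
  \frac{\partial L}{\partial x}
    = \frac{\partial L}{\partial y}\,\frac{\partial F(x)}{\partial x}
    + \frac{\partial L}{\partial y}
\]
% Even if the Jacobian of F is very small, the second term passes the
% upstream gradient through unchanged, so it does not vanish at this block.
```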