TL;DR
UNet is a convolutional neural network architecture designed for image segmentation. It has an encoder-decoder structure with skip connections, which lets it combine high-level context with high-resolution features for precise localization. The architecture is particularly effective in biomedical image analysis, where it has been widely adopted.
Model
Architecture Overview
- Contracting Path (Encoder)
- Series of convolutional layers with ReLU activation followed by max pooling.
- Extracts progressively abstract, high-level features.
- Each downsampling step increases (typically doubles) the number of feature channels, allowing the network to encode more complex representations as spatial resolution shrinks.
- Expanding Path (Decoder)
- Uses up-convolutions (transposed convolutions) to increase spatial resolution.
- Mirrors the contracting path to reconstruct the segmentation map.
- Skip Connections
- Feature maps from each encoder level are concatenated with decoder layers at corresponding resolutions.
- These connections reintroduce fine spatial detail that is lost during downsampling, which improves localization accuracy.
- Channel Expansion
- A distinguishing feature is the large number of feature channels in the decoder, which lets the network propagate rich contextual information to the higher-resolution layers.
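
To make the data flow above concrete, the following is a minimal sketch that traces feature-map shapes through a UNet-style network without any deep learning library. It rests on several assumptions that these notes do not state: padded 3x3 convolutions (so only pooling and up-convolution change the spatial size), a base width of 64 channels that doubles at each downsampling step, four levels, and a square input. The names `Shape` and `trace_unet` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    channels: int
    height: int
    width: int


def trace_unet(in_channels=1, size=256, base_channels=64, depth=4, num_classes=2):
    """Return (stage name, Shape) pairs for a UNet-style network.

    Assumes padded convolutions, 2x2 max pooling, and 2x up-convolutions.
    All default values are illustrative assumptions.
    """
    if size % (2 ** depth) != 0:
        raise ValueError("size must be divisible by 2**depth")

    trace = [("input", Shape(in_channels, size, size))]
    skips = []
    channels, hw = base_channels, size

    # Contracting path: conv block, remember the feature map, then pool.
    for level in range(depth):
        block = Shape(channels, hw, hw)
        trace.append((f"enc{level}", block))
        skips.append(block)   # saved for the matching decoder level
        hw //= 2              # 2x2 max pooling halves the resolution
        channels *= 2         # the next block doubles the channel count

    trace.append(("bottleneck", Shape(channels, hw, hw)))

    # Expanding path: up-convolve, concatenate the skip, then conv block.
    for level in reversed(range(depth)):
        hw *= 2               # up-convolution doubles the resolution
        channels //= 2        # and halves the channel count
        skip = skips.pop()
        if (skip.height, skip.width) != (hw, hw):
            raise RuntimeError("skip and decoder resolutions differ")
        # Concatenation along the channel axis adds the channel counts.
        merged = Shape(channels + skip.channels, hw, hw)
        trace.append((f"up{level}+skip", merged))
        trace.append((f"dec{level}", Shape(channels, hw, hw)))

    # A final 1x1 convolution maps to one channel per class.
    trace.append(("output", Shape(num_classes, hw, hw)))
    return trace


if __name__ == "__main__":
    for name, shape in trace_unet():
        print(f"{name:10s} {shape.channels:5d} x {shape.height} x {shape.width}")
```

With the default arguments, the bottleneck holds 1024 channels at 16 x 16, and each concatenated decoder input carries twice the channels of the encoder block it mirrors. This shows why the decoder stays wide: the skip connection and the upsampled features each contribute half of the channels entering every decoder block.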
Applications
- Remote sensing (land cover classification, urban area segmentation)
- Biomedical image segmentation (cell segmentation, tumor detection)