TL;DR

UNet is a convolutional neural network architecture designed for image segmentation. It pairs an encoder-decoder structure with skip connections, which lets it combine high-level context with high-resolution features for precise localization. The architecture is particularly effective in biomedical image analysis, where it has been widely adopted.

Model

Architecture Overview

  1. Contracting Path (Encoder)
  • Series of convolutional layers with ReLU activation followed by max pooling.
  • Extracts progressively abstract, high-level features.
  • Each downsampling step increases the number of feature channels (doubling it in the original design), allowing the network to encode more complex representations as spatial resolution shrinks.
  2. Expanding Path (Decoder)
  • Uses up-convolutions (transposed convolutions) to increase spatial resolution.
  • Mirrors the contracting path to reconstruct the segmentation map.
  3. Skip Connections
  • Feature maps from each encoder level are concatenated with the decoder feature maps at the matching resolution.
  • These connections reintroduce fine spatial detail that would otherwise be lost during downsampling, which improves localization accuracy.
  4. Channel Expansion
  • A distinguishing feature is the use of a large number of feature channels in the decoder, enabling the propagation of rich contextual information to higher-resolution layers.
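
To make the four points above concrete, the following sketch traces how channel counts and spatial sizes might evolve through a UNet-style network. It is shape bookkeeping only, not a trainable model. The function name unet_shape_trace and its parameters are hypothetical. The defaults (64 base channels, 4 levels, channel doubling per level) and the use of size-preserving "same" padding are assumptions chosen for illustration; the original paper used unpadded convolutions, which shrink the feature maps slightly and require cropping before concatenation.

```python
def unet_shape_trace(height, width, in_channels=1, base_channels=64, depth=4):
    """Trace (stage, channels, height, width) through a UNet-style network.

    Assumptions (illustrative, not taken from the text above):
    - convolutions use 'same' padding, so they keep the spatial size;
    - 2x2 max pooling halves the resolution at each encoder level;
    - each up-convolution doubles the resolution and halves the channels.
    """
    if height % (2 ** depth) or width % (2 ** depth):
        raise ValueError("height and width must be divisible by 2**depth")

    trace = [("input", in_channels, height, width)]
    skips = []  # encoder outputs kept for the skip connections
    channels, h, w = base_channels, height, width

    # Contracting path: conv block, remember its output, then downsample.
    for level in range(depth):
        trace.append((f"enc{level}", channels, h, w))
        skips.append((channels, h, w))
        h, w = h // 2, w // 2   # max pooling halves the resolution
        channels *= 2           # channel count doubles per level

    trace.append(("bottleneck", channels, h, w))

    # Expanding path: upsample, concatenate the matching skip, conv block.
    for level in reversed(range(depth)):
        h, w = h * 2, w * 2     # up-convolution doubles the resolution
        channels //= 2          # and halves the channel count
        skip_channels, skip_h, skip_w = skips[level]
        # Concatenation is only possible when resolutions match.
        assert (skip_h, skip_w) == (h, w)
        trace.append((f"concat{level}", channels + skip_channels, h, w))
        trace.append((f"dec{level}", channels, h, w))

    return trace


# Print the trace for a hypothetical 256x256 single-channel input.
for stage in unet_shape_trace(256, 256):
    print(stage)
```

In this trace each concat stage has twice as many channels as the decoder block that follows it, because the skip connection contributes as many channels as the upsampled features. That is one way to see both the skip connections and the wide decoder at work.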

Applications

  • Remote sensing (land cover classification, urban area segmentation)
  • Biomedical image segmentation (cell segmentation, tumor detection)