R-CNN

for object detection and instance segmentation

Identifies multiple objects in an image and locates each one with a bounding box. With a mask head (as in Mask R-CNN), it also provides pixel-level segmentation: each detected object gets a mask that outlines its shape.

TL;DR

Backbone Network (Feature Extraction)

Typically a CNN such as ResNet

Region Proposal Network

Takes in the feature map from the backbone, and generates region proposals

torchvision provides an implementation in *torchvision.models.detection.rpn*

The RPN slides over the feature map and proposes regions with object-like features
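
The sliding step can be pictured as placing a fixed set of reference boxes (anchors) at every feature-map cell. A minimal sketch of only that anchor grid, in plain Python; the function name `generate_anchors` and the stride, scales, and ratios are illustrative assumptions, and a real RPN would additionally score each anchor and regress offsets for it:

```python
from itertools import product

def generate_anchors(feat_h, feat_w, stride, scales=(128, 256), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors in image coordinates (illustrative helper)."""
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Center of this feature-map cell, mapped back to image pixels.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            # ratio is height / width; the anchor area stays scale ** 2.
            w = scale / ratio ** 0.5
            h = scale * ratio ** 0.5
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

# Assumed toy sizes: a 2 x 3 feature map with stride 16.
anchors = generate_anchors(feat_h=2, feat_w=3, stride=16)
print(len(anchors))  # 2 * 3 cells * 6 anchors per cell = 36
```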

Region of Interest Alignment

Resizes the regions of interest to a fixed size while preserving spatial details (important for segmentation)
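
Spatial detail is preserved because the region is sampled at fractional coordinates with bilinear interpolation instead of being rounded to whole cells. A simplified sketch under stated assumptions: a single-channel feature map stored as a list of lists, one sample per output bin (real implementations usually average several), and box coordinates already in feature-map units. The names `bilinear` and `roi_align` are illustrative, not a library API:

```python
def bilinear(feature, y, x):
    """Interpolate a 2D list `feature` at a continuous (y, x) position."""
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1)
    x = min(max(x, 0.0), w - 1)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(feature, box, out_size=2):
    """Pool box = (x1, y1, x2, y2) to an out_size x out_size grid."""
    x1, y1, x2, y2 = box
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    # Sample each bin at its center; no coordinate is rounded.
    return [
        [bilinear(feature, y1 + (i + 0.5) * bin_h, x1 + (j + 0.5) * bin_w)
         for j in range(out_size)]
        for i in range(out_size)
    ]

# Toy 4 x 4 feature map whose value at (row, col) is 4 * row + col.
feature = [[float(4 * r + c) for c in range(4)] for r in range(4)]
pooled = roi_align(feature, box=(0.25, 0.25, 2.25, 2.25))
print(pooled)
```

Whatever the box size, the output is always `out_size` by `out_size`, which is what lets the later heads take a fixed-size input.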

Heads for Detection

Object classification and bounding box regression
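
A sketch of what the two head outputs mean for a single region: class logits become probabilities through a softmax, and four regression deltas shift and rescale the proposal box. It assumes the common (dx, dy, dw, dh) parameterization, where the center offsets are relative to the box size and the size changes are in log space; the helper names and the example numbers are illustrative:

```python
import math

def softmax(logits):
    """Turn class logits into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_box(proposal, deltas):
    """Apply (dx, dy, dw, dh) to a proposal given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = proposal
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + w / 2, y1 + h / 2
    dx, dy, dw, dh = deltas
    new_cx, new_cy = cx + dx * w, cy + dy * h           # shift the center
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)   # rescale the size
    return (new_cx - new_w / 2, new_cy - new_h / 2,
            new_cx + new_w / 2, new_cy + new_h / 2)

# Made-up head outputs for one region: 3 classes and one set of deltas.
probs = softmax([2.0, 0.5, -1.0])
label = probs.index(max(probs))
refined = decode_box((0.0, 0.0, 10.0, 20.0), (0.1, 0.0, 0.0, 0.0))
print(label, refined)
```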

Region Proposal:

Generate candidate regions of the input image that are likely to contain objects. This is done by external methods such as Selective Search or EdgeBoxes.

Region Classification: For each proposed region, a CNN extracts features, which are then used to classify the object. Proposal generation and classification are separate stages, so the proposer is not trained jointly with the classifier.
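
The two-stage flow can be sketched as a skeleton. Every function here is a hypothetical toy stand-in (image quadrants instead of Selective Search, mean intensity instead of CNN features, a threshold instead of a trained classifier); only the control flow is the point:

```python
def propose_regions(image):
    """Hypothetical stand-in for an external proposer such as Selective Search."""
    h, w = len(image), len(image[0])
    # Toy proposals: the four image quadrants as (x1, y1, x2, y2).
    return [(0, 0, w // 2, h // 2), (w // 2, 0, w, h // 2),
            (0, h // 2, w // 2, h), (w // 2, h // 2, w, h)]

def extract_features(image, box):
    """Hypothetical stand-in for the CNN: mean intensity of the crop."""
    x1, y1, x2, y2 = box
    pixels = [image[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return [sum(pixels) / len(pixels)]

def classify(features, threshold=0.5):
    """Hypothetical stand-in for the region classifier."""
    return "object" if features[0] > threshold else "background"

def detect(image):
    # Stage 1: proposals come from a method independent of the classifier.
    # Stage 2: features are extracted and classified once per proposal.
    return [(box, classify(extract_features(image, box)))
            for box in propose_regions(image)]

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
detections = detect(image)
print(detections)
```

Because feature extraction runs once per proposal, the cost grows with the number of proposals.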

Brayden Zhang
