Deep Learning · Bachelor's Thesis · FIB-UPC

Polyp Detection with Generative Data Augmentation

Train Faster R-CNN, RetinaNet, and SSD detectors on colonoscopy data, augmented with synthetic images from CycleGAN and SPADE generative models. Hyperparameter search via Optuna & Ray Tune.

End-to-End Pipeline
1. LDPolypVideo: colonoscopy frames + bbox annotations
2. CycleGAN / SPADE: mask → synthetic polyp images
3. Augmented dataset: real + generated training data
4. Faster R-CNN: object detection training
5. Optuna / Ray Tune: hyperparameter optimization
6. COCO evaluation: AP / AR / F1 metrics
PyTorch 2.1 · Faster R-CNN · CycleGAN · SPADE · Optuna · Ray Tune · COCO eval · LDPolypVideo
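
The COCO evaluation step of the pipeline reports AP, AR, and F1. As a rough illustration of what sits underneath those numbers, the sketch below computes IoU between boxes and a precision / recall / F1 triple at a single IoU threshold. It is a simplified stand-in and not the COCO protocol itself, which averages precision over several IoU thresholds. The function names, the greedy matching rule, and the 0.5 threshold are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(predictions, ground_truth, iou_threshold=0.5):
    """Greedy one-to-one matching in descending score order (an assumption).

    predictions: list of ((x1, y1, x2, y2), score)
    ground_truth: list of (x1, y1, x2, y2)
    """
    unmatched = list(ground_truth)
    true_pos = 0
    for box, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        # Pick the still-unmatched ground-truth box with the highest overlap.
        best = max(unmatched, key=lambda gt: iou(box, gt), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)
            true_pos += 1
    false_pos = len(predictions) - true_pos
    false_neg = len(unmatched)
    precision = true_pos / (true_pos + false_pos) if predictions else 0.0
    recall = true_pos / (true_pos + false_neg) if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

With one ground-truth polyp and two predictions, one of which overlaps it well, this yields a precision of 0.5 and a recall of 1.0.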

Simulated inference

Browser mock · Faster R-CNN

Simulates Faster R-CNN inference on a colonoscopy frame.
The bounding boxes are a browser mock, not model output.
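
For contrast with the mock, the sketch below shows the kind of post-processing a real detection result usually needs before it can be drawn: dropping low-confidence boxes and rescaling coordinates to the display size. The dict layout with "boxes" and "scores" mirrors what torchvision detection models return, but plain lists are used here so the example runs without PyTorch. The function names, the 0.5 threshold, and the frame sizes in the usage note are hypothetical.

```python
def filter_detections(output, score_threshold=0.5):
    """Keep only detections whose confidence is at least score_threshold."""
    kept = [
        (box, score)
        for box, score in zip(output["boxes"], output["scores"])
        if score >= score_threshold
    ]
    return {
        "boxes": [box for box, _ in kept],
        "scores": [score for _, score in kept],
    }


def scale_box(box, src_size, dst_size):
    """Rescale an (x1, y1, x2, y2) box from src (width, height) to dst (width, height)."""
    scale_x = dst_size[0] / src_size[0]
    scale_y = dst_size[1] / src_size[1]
    x1, y1, x2, y2 = box
    return (x1 * scale_x, y1 * scale_y, x2 * scale_x, y2 * scale_y)
```

For example, a box at (100, 80, 220, 200) in a 560×480 frame maps to (50.0, 40.0, 110.0, 100.0) on a 280×240 canvas.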

CycleGAN translation

Unpaired image-to-image

CycleGAN learns unpaired mask ↔ polyp translation. SPADE uses spatially-adaptive normalization for mask → polyp synthesis.
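
To make the phrase "spatially-adaptive normalization" concrete, here is a minimal sketch of the idea on a single feature channel: activations are normalized, then scaled and shifted per pixel according to the mask label at that pixel. Several things are simplified assumptions for illustration. In SPADE the scale and shift maps are produced by learned convolutions on the mask and the statistics come from a normalization layer over a batch, whereas here `gamma` and `beta` are plain per-label lookup tables and the statistics are taken over one map. The function name and arguments are hypothetical.

```python
import math


def spade_modulate(features, mask, gamma, beta, eps=1e-5):
    """Normalize one feature channel, then modulate each pixel by its mask label.

    features: 2D list of floats (one channel)
    mask:     2D list of integer labels, same shape as features
    gamma:    dict label -> scale (stand-in for a learned conv on the mask)
    beta:     dict label -> shift (stand-in for a learned conv on the mask)
    """
    values = [v for row in features for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + eps)
    return [
        [gamma[label] * (v - mean) / std + beta[label]
         for v, label in zip(feature_row, mask_row)]
        for feature_row, mask_row in zip(features, mask)
    ]
```

Because the modulation depends on the mask at every pixel, the polyp region and the background can be pushed toward different feature statistics, which is what lets the mask steer the synthesized image.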

Binary mask
Generated polyp

Illustrations are schematic. Real CycleGAN outputs are photorealistic colonoscopy images.
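
The page does not say how generated images receive their detection labels. One plausible route, assumed here purely for illustration, is to derive the bounding box from the same binary mask that conditioned the generator. The sketch below does that for a mask given as a 2D list and returns the box in COCO's [x, y, width, height] convention. The function name is hypothetical.

```python
def mask_to_bbox(mask):
    """Return (x_min, y_min, width, height) of the nonzero region, or None if empty.

    mask: 2D list of 0/1 values, indexed as mask[y][x].
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # no foreground pixels, so no box
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    x_min, x_max = cols[0], cols[-1]
    y_min, y_max = rows[0], rows[-1]
    # +1 because both min and max pixel indices are inside the region.
    return (x_min, y_min, x_max - x_min + 1, y_max - y_min + 1)
```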

Model comparison

10 Faster R-CNN configurations from Optuna HPO

Configuration          Score
bs=4 lr=5.3e-3 ep=3    0.1156
bs=2 lr=1.7e-3 ep=7    0.1111
bs=4 lr=1.6e-4 ep=9    0.1024
bs=4 lr=2.3e-3 ep=1    0.0928
bs=8 lr=2.7e-3 ep=5    0.0739
bs=2 lr=3.1e-2 ep=2    0.0694
bs=2 lr=8.0e-3 ep=1    0.0357
bs=8 lr=6.0e-5 ep=3    0.0267
bs=4 lr=2.0e-5 ep=3    0.0126
bs=4 lr=3.8e-2 ep=1    0.0032

All ten configurations use Faster R-CNN (ResNet-50 FPN). Metrics come from COCO evaluation on the LDPolypVideo test set.
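
The comparison above can be regrouped programmatically. The sketch below parses each configuration label into its fields and picks the best-scoring configuration per batch size. The values are copied from the list above. Reading `bs`, `lr`, and `ep` as batch size, learning rate, and epochs is an assumption about the abbreviations, and the helper names are hypothetical.

```python
import re

# (configuration label, score) pairs copied from the comparison above.
RESULTS = [
    ("bs=4 lr=5.3e-3 ep=3", 0.1156),
    ("bs=2 lr=1.7e-3 ep=7", 0.1111),
    ("bs=4 lr=1.6e-4 ep=9", 0.1024),
    ("bs=4 lr=2.3e-3 ep=1", 0.0928),
    ("bs=8 lr=2.7e-3 ep=5", 0.0739),
    ("bs=2 lr=3.1e-2 ep=2", 0.0694),
    ("bs=2 lr=8.0e-3 ep=1", 0.0357),
    ("bs=8 lr=6.0e-5 ep=3", 0.0267),
    ("bs=4 lr=2.0e-5 ep=3", 0.0126),
    ("bs=4 lr=3.8e-2 ep=1", 0.0032),
]

LABEL_PATTERN = re.compile(r"bs=(\d+) lr=([0-9.eE+-]+) ep=(\d+)")


def parse_config(label):
    """Turn 'bs=4 lr=5.3e-3 ep=3' into a dict of typed hyperparameters."""
    match = LABEL_PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized configuration label: {label!r}")
    return {
        "batch_size": int(match.group(1)),
        "lr": float(match.group(2)),
        "epochs": int(match.group(3)),
    }


def best_per_batch_size(results):
    """Map each batch size to its highest-scoring (config, score) pair."""
    best = {}
    for label, score in results:
        config = parse_config(label)
        key = config["batch_size"]
        if key not in best or score > best[key][1]:
            best[key] = (config, score)
    return best
```

Calling `best_per_batch_size(RESULTS)` returns one entry each for batch sizes 2, 4, and 8, which makes it easier to compare learning rates within a batch size than the flat ranking does.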

Detection architectures

Model          Backbone            Notes
Faster R-CNN   ResNet-50 FPN       Primary detector, best results
RetinaNet      ResNet-50 FPN v2    Single-stage anchor-based
SSD Lite       MobileNet V3        Lightweight / mobile
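
The three rows above correspond closely to builders that ship with torchvision. The sketch below maps each row to the matching builder name and instantiates an untrained model on request. Whether the thesis code uses these exact torchvision builders is an assumption, as is the choice of `num_classes=2` (background plus polyp). The helper name `build_detector` is hypothetical. torchvision is imported lazily, so the mapping itself can be inspected without it installed.

```python
import importlib

# Table row -> builder name in torchvision.models.detection.
# ASSUMPTION: the page does not confirm these builders are the ones used.
BUILDERS = {
    "Faster R-CNN": "fasterrcnn_resnet50_fpn",
    "RetinaNet": "retinanet_resnet50_fpn_v2",
    "SSD Lite": "ssdlite320_mobilenet_v3_large",
}


def build_detector(name, num_classes=2):
    """Instantiate an untrained detector by table name (requires torchvision).

    num_classes counts the background class, so 2 means background + polyp.
    """
    builder_name = BUILDERS[name]  # raises KeyError for unknown names
    detection = importlib.import_module("torchvision.models.detection")
    builder = getattr(detection, builder_name)
    # weights=None and weights_backbone=None avoid downloading pretrained weights.
    return builder(weights=None, weights_backbone=None, num_classes=num_classes)
```

Keeping the mapping in one dictionary means a hyperparameter search can treat the architecture as just another categorical choice.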

Local Dashboard

The project includes a Streamlit dashboard for interactive model exploration and inference. Clone the repo, change into the code/ directory, and run: streamlit run src/app.py

Author
Pol Casacuberta · FIB-UPC