MACHINE LEARNING | PROJECT
PolypVision AI
Automated Polyp Detection
Computer-aided polyp detection for clinicians
Vinod Kumar Prajapat | BT23ECE122 · Apoorv Deshmukh | BT23ECE013
Dept. of Electronics & Communication | ML Project
9,035
Dataset Images
2.3M
YOLOv11n Params
INTRODUCTION
Introduction & Objective
The Clinical Problem
20–25%
Polyp miss rate in routine colonoscopies.
Our Objective
Build an AI system that detects polyps in colonoscopy images and flags them for the clinician.
"A secondary AI eye in the room — so no polyp goes unnoticed."
DATA
Data Collection & Preprocessing
Data Collection
Images from multiple sources provide diversity in morphology, lighting, and equipment.
[Roboflow] polyp-kntak
[Roboflow] polyp-detection-xdae2
9,035
Total Images (100%)
6,502
Training Set (~72%)
902
Validation Set (~10%)
1,631
Test Set (~18%)
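The split percentages above can be checked with a few lines of arithmetic. This is a minimal sketch using only the counts stated on the slide; the helper name `split_shares` is hypothetical and not part of the project code.

```python
# Split sizes as reported on the slide.
splits = {"train": 6502, "val": 902, "test": 1631}
total = sum(splits.values())


def split_shares(counts):
    """Return each split's share of the total, in percent."""
    n = sum(counts.values())
    return {name: 100.0 * c / n for name, c in counts.items()}


shares = split_shares(splits)
for name, pct in shares.items():
    print(f"{name:5s} {splits[name]:5d} images  {pct:5.1f} %")
```

The three splits add up to the stated total, and the rounded shares match the ~72 / ~10 / ~18 figures.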
Data Pipeline
Input
RGB images, 224×224 to 1920×1080
Letterbox
Resize to 640×640, pad gray 114
Augmentation
HSV jitter; flip 50%
Images are letterboxed to 640×640 with gray padding (114), preserving aspect ratio. Training uses HSV jitter every batch and 50% horizontal flip.
~884K augmented training samples: 136 epochs × 6,502 training images, each re-augmented on every pass.
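The letterbox step in the pipeline above comes down to a small amount of geometry. The sketch below computes the resized dimensions and the padding for a 640×640 target and maps a bounding box into the letterboxed frame. It assumes the padding is split evenly between both sides, which the slides do not state; the function names `letterbox_params` and `map_box` are hypothetical.

```python
PAD_VALUE = 114  # gray fill value used for the padded border


def letterbox_params(width, height, target=640):
    """Compute an aspect-preserving resize plus padding to target x target.

    Returns (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).
    Assumption: padding is centered (split evenly between both sides).
    """
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_w = target - new_w
    pad_h = target - new_h
    pad_left = pad_w // 2
    pad_top = pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top


def map_box(box, width, height, target=640):
    """Map an (x1, y1, x2, y2) box from the source image to the letterboxed one."""
    scale = min(target / width, target / height)
    _, _, pad_left, pad_top, _, _ = letterbox_params(width, height, target)
    x1, y1, x2, y2 = box
    return (x1 * scale + pad_left, y1 * scale + pad_top,
            x2 * scale + pad_left, y2 * scale + pad_top)


# Largest and smallest input sizes mentioned on the slide.
print(letterbox_params(1920, 1080))
print(letterbox_params(224, 224))
```

A 1920×1080 frame is scaled to 640×360 and padded with 140 gray rows above and below; a square 224×224 image is simply upscaled with no padding.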
ARCHITECTURE
YOLOv11n Architecture — Built From Scratch
~2.3M parameters — a lightweight detection backbone for medical imagery.
TRAINING
Training Pipeline
Optimization Strategy
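Loss Functions
Task | Loss Function
Classification | BCE (binary cross-entropy), with targets chosen by the TAL (task-aligned) assigner
Bounding Box | CIoU (Complete IoU) + DFL (Distribution Focal Loss)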
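To make the bounding-box term concrete, here is a minimal sketch of the standard CIoU loss for a single pair of boxes in pure Python. It follows the usual published definition (IoU, minus a normalized center-distance penalty, minus an aspect-ratio penalty) and is not taken from the project's training code; the function name `ciou_loss` and the `eps` stabilizer are assumptions.

```python
import math


def ciou_loss(pred, target, eps=1e-7):
    """CIoU loss for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Squared center distance, normalized by the squared diagonal
    # of the smallest box enclosing both inputs.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - (iou - rho2 / c2 - alpha * v)


print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))      # identical boxes
print(ciou_loss((0, 0, 10, 10), (20, 20, 30, 30)))    # disjoint boxes
```

Identical boxes give a loss of essentially zero, while disjoint boxes give a loss above 1 because the center-distance penalty still applies when the IoU is zero. That property is what keeps the gradient informative for non-overlapping predictions.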
Hardware: NVIDIA RTX GPU
RESULTS
Evaluation Results — Test Set Performance
Evaluated on 1,631 independent test images held out during development.
0.9343
mAP@0.5
Mean Average Precision at IoU 0.5
0.9749
Precision
Share of predicted polyps that are true polyps
0.9299
Recall
Share of actual polyps that were detected
0.9519
F1 Score
Precision-recall balance
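The F1 score follows directly from the reported precision and recall as their harmonic mean. The sketch below recomputes it from the slide's figures; the detection counts in the second example are hypothetical and serve only to show how precision and recall are derived from true positives, false positives, and false negatives.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported on the slide.
reported_p, reported_r = 0.9749, 0.9299
f1 = f1_score(reported_p, reported_r)
print(f"F1 from reported precision/recall: {f1:.4f}")

# Hypothetical counts, for illustration only (not from the project).
p, r = precision_recall(tp=93, fp=2, fn=7)
print(f"precision={p:.4f} recall={r:.4f} f1={f1_score(p, r):.4f}")
```

Recomputing from the reported precision and recall reproduces the 0.9519 F1 shown above.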
CPU Inference
~10.48 FPS
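A throughput figure like this can be converted to per-frame latency, and measured with a simple timing loop. The sketch below is an illustration under assumptions: `infer` stands in for whatever callable runs the model on one frame, and the warm-up count is arbitrary. The slides do not describe how the FPS figure was actually measured.

```python
import time


def latency_ms(fps):
    """Average time per frame, in milliseconds, for a given throughput."""
    return 1000.0 / fps


def measure_fps(infer, frames, warmup=1):
    """Time `infer` over `frames` and return frames per second.

    `infer` is a hypothetical single-frame inference callable.
    """
    for frame in frames[:warmup]:
        infer(frame)  # warm-up runs are excluded from the timing
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(frames) / elapsed


print(f"{latency_ms(10.48):.2f} ms per frame at 10.48 FPS")

# Dummy workload standing in for a model call.
dummy_fps = measure_fps(lambda frame: sum(range(1000)), list(range(100)))
print(f"dummy workload: {dummy_fps:.0f} FPS")
```

At roughly 10.48 FPS each frame takes about 95 ms on CPU, which is the budget any added post-processing would have to fit within.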
FUTURE WORK
Future Directions
Model Pruning
Cut parameters for mobile edge deployment and point-of-care use without GPU infrastructure.
Grad-CAM Explainability
Show saliency heatmaps that highlight the image regions driving each prediction.
Multi-Class Extension
Classify polyp subtypes—adenomatous vs. hyperplastic—for better risk stratification.
Multi-View Frame Fusion
Fuse sequential frames to reduce flicker and improve temporal consistency in live colonoscopy.
CONCLUSION
Summary & Impact
Built from Scratch
Created a full YOLOv11n pipeline independently, with complete architectural control.
Clinical Results
Reached 0.9343 mAP@0.5 and 0.9749 precision on 1,631 held-out test images of real colonoscopy data.
Clinical Value
Supports early colorectal cancer detection by helping ensure polyps are not missed.
"A high-accuracy model for early detection."
Vinod (BT23ECE122) · Apoorv (BT23ECE013) | Dept. of Electronics & Communication | ML Project