Carcinoma Classification
OxML 2023 Cases
Dr. M. Singh
Mr. A. Asgharpoor
Dataset
Imbalanced Dataset
186 Biopsy Slides
62 Labeled Images
- Normal: 36 (58.1%)
- Benign: 14 (22.6%)
- Malignant: 12 (19.4%)
Methods to Handle Imbalanced Datasets
An overview of methods to handle imbalanced datasets
Oversampling: Increasing the number of instances in the minority class
Undersampling: Reducing the number of instances in the majority class
Class Weighting: Giving more importance to the minority class
Ensemble Methods: Combine multiple classifiers to improve performance
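As an illustration of class weighting, the sketch below derives inverse-frequency weights from the label counts reported on the dataset slide. The formula `n_samples / (n_classes * class_count)` is a common convention and an assumption here, not necessarily the weighting used in this project; the function name is hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Label counts taken from the dataset slide (62 labeled images).
labels = ["normal"] * 36 + ["benign"] * 14 + ["malignant"] * 12
weights = inverse_frequency_weights(labels)

for cls, weight in weights.items():
    print(f"{cls}: {weight:.3f}")
```

Rarer classes receive larger weights, so errors on benign and malignant samples would count for more in a weighted loss.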
Dataset Preprocessing
Data Augmentation
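The `noise_std` hyperparameter listed later in the deck suggests that additive Gaussian noise may have been one of the augmentations. Under that assumption, a minimal sketch looks like the following; the function name, the nested-list image representation, and the clipping to [0, 1] are illustrative choices rather than details taken from the slides.

```python
import random


def add_gaussian_noise(pixels, noise_std=0.1, seed=None):
    """Add zero-mean Gaussian noise to a 2D image with values in [0, 1]."""
    rng = random.Random(seed)
    return [
        # Clip each noisy value back into the valid intensity range.
        [min(1.0, max(0.0, value + rng.gauss(0.0, noise_std))) for value in row]
        for row in pixels
    ]


# Tiny hypothetical grayscale image, intensities already scaled to [0, 1].
image = [[0.0, 0.5], [1.0, 0.25]]
noisy = add_gaussian_noise(image, noise_std=0.1, seed=0)
```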
K-Fold Cross-Validation
What is K-fold Cross-Validation?
Why Stratified K-fold cross-validation?
Benefits of using Stratified K-fold:
Training set
Test set
K Iterations
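To make the idea of stratified splitting concrete, here is a small standard-library sketch that deals the samples of each class round-robin across the folds, so every fold keeps roughly the original class proportions. It is an illustration only: the function name is hypothetical, and in practice a library routine such as scikit-learn's `StratifiedKFold` would normally be used.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_indices, test_indices) for k stratified folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # continue dealing where the previous class stopped
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[(offset + position) % k].append(idx)
        offset += len(indices)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Label counts from the dataset slide, with k = 8 as in the hyperparameters.
labels = ["normal"] * 36 + ["benign"] * 14 + ["malignant"] * 12
splits = list(stratified_k_fold(labels, k=8))
```

With only 12 malignant images, a plain random split into 8 folds could leave some test folds without a malignant sample; dealing per class avoids that.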
Weighted Sampling
Why should it be used?
Imbalanced Dataset:
Applying a naive classifier would lead to bias toward the majority class
Minority Class Importance:
It is crucial to correctly identify samples from the minority classes.
Performance Improvement:
Mitigating the issue of class imbalance
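One common way to realize weighted sampling is to draw each training sample with probability inversely proportional to the size of its class, so that every class is seen about equally often. The sketch below assumes that scheme; the helper names are hypothetical, and the slides do not specify the exact sampler that was used.

```python
import random
from collections import Counter


def sample_weights(labels):
    """Weight each sample by 1 / (number of samples in its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def draw_balanced_indices(labels, num_draws, seed=0):
    """Draw sample indices with replacement, balancing the classes on average."""
    rng = random.Random(seed)
    weights = sample_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_draws)


# Label counts from the dataset slide.
labels = ["normal"] * 36 + ["benign"] * 14 + ["malignant"] * 12
drawn = draw_balanced_indices(labels, num_draws=6000)
drawn_counts = Counter(labels[i] for i in drawn)
print(drawn_counts)
```

Because the per-sample weights within each class sum to the same total, each class is expected to make up about one third of the drawn samples, even though the normal class makes up more than half of the dataset.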
With vs Without
Few Shot Learning
Credit: IARAI
Ensemble Learning
Dataset
ResNet50
Inception V3
GoogLeNet
EfficientNet V2
Combine
Check Point
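The slides do not state how the "Combine" step merges the four networks. One simple possibility is soft voting, which averages the class probabilities from each model and picks the class with the highest mean. The sketch below assumes that rule; the function name and the probability values are invented for illustration and are not results from this project.

```python
def soft_vote(per_model_probs):
    """Average class probabilities across models; return (class index, averages)."""
    num_models = len(per_model_probs)
    num_classes = len(per_model_probs[0])
    averaged = [
        sum(probs[c] for probs in per_model_probs) / num_models
        for c in range(num_classes)
    ]
    predicted = max(range(num_classes), key=averaged.__getitem__)
    return predicted, averaged


class_names = ["normal", "benign", "malignant"]

# Hypothetical softmax outputs of the four models for a single image.
per_model_probs = [
    [0.6, 0.3, 0.1],  # ResNet50
    [0.2, 0.5, 0.3],  # Inception V3
    [0.3, 0.4, 0.3],  # GoogLeNet
    [0.1, 0.6, 0.3],  # EfficientNet V2
]
predicted, averaged = soft_vote(per_model_probs)
print(class_names[predicted], averaged)
```

In this made-up case one model votes for "normal", but the averaged probabilities favor "benign", showing how the ensemble can override a single model's prediction.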
Hyperparameters
Here are the hyperparameter values we used:
- `noise_std`: 0.1
- `max_height`: value determined by finding the maximum height among the images
- `max_width`: value determined by finding the maximum width among the images
- `num_classes`: 3
- `batch_size`: 8
- `k_folds`: 8
- `num_epochs`: 5
- `learning_rate` (for each optimizer):
  - `optimizer_resnet`: 0.001
  - `optimizer_efficientnet`: 0.001
  - `optimizer_inception`: 0.001
  - `optimizer_googlenet`: 0.001
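For reference, the values above can be gathered into a single configuration object, with `max_height` and `max_width` computed from the image sizes as described. This is a sketch only: the `Hyperparameters` class, the `from_image_sizes` helper, and the example sizes are hypothetical, and a single `learning_rate` field stands in for the four optimizers because they all share the same value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparameters:
    max_height: int
    max_width: int
    noise_std: float = 0.1
    num_classes: int = 3
    batch_size: int = 8
    k_folds: int = 8
    num_epochs: int = 5
    learning_rate: float = 0.001  # shared by all four optimizers


def from_image_sizes(sizes):
    """Build the config from an iterable of (height, width) pairs."""
    sizes = list(sizes)
    return Hyperparameters(
        max_height=max(height for height, _ in sizes),
        max_width=max(width for _, width in sizes),
    )


# Hypothetical image sizes; the real values come from the biopsy images.
example_sizes = [(480, 640), (512, 600), (300, 700)]
config = from_image_sizes(example_sizes)
print(config)
```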
Limitations
1. Small Training Data
2. Imbalanced Dataset
3. Preprocessing Challenges
4. Limited Model Training
5. Evaluation Metric
6. Limited Experimentation
Future Work
1. Use Multi-Model Approach:
- Carcinoma: ⊖ OR ⊕
- If ⊕ : Benign OR Malignant
2. White Padding Approach:
- We only tried black padding (low contrast with cancer cells)
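Both ideas can be sketched briefly. The first function chains two binary decisions, and the second pads an image to a target size with a configurable fill color. Everything here is hypothetical: the stage classifiers are stand-in callables, treating a negative first stage as "normal" is an assumption, and centering the image on the canvas is an illustrative choice rather than the padding layout used in this project.

```python
def cascade_predict(image, detects_carcinoma, detects_malignancy):
    """Two-stage prediction: carcinoma or not, then benign vs. malignant."""
    if not detects_carcinoma(image):
        return "normal"  # assumption: a negative first stage means normal tissue
    return "malignant" if detects_malignancy(image) else "benign"


def pad_image(image, target_height, target_width, fill=255):
    """Center a 2D grayscale image (list of rows) on a fill-colored canvas."""
    height, width = len(image), len(image[0])
    if height > target_height or width > target_width:
        raise ValueError("image is larger than the target canvas")
    top = (target_height - height) // 2
    left = (target_width - width) // 2
    canvas = [[fill] * target_width for _ in range(target_height)]
    for row_index, row in enumerate(image):
        canvas[top + row_index][left:left + width] = row
    return canvas


# fill=255 gives white padding for 8-bit grayscale; fill=0 gives black.
small = [[10, 20], [30, 40]]
white_padded = pad_image(small, target_height=4, target_width=4, fill=255)
black_padded = pad_image(small, target_height=4, target_width=4, fill=0)
```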
Any Questions?
Thank you for your time and attention 🙂