Automatic Segmentation of Nasopharyngeal Carcinoma on CT Images Using Efficient UNet-2.5D Ensemble with Semi-Supervised Pretext Task Pretraining
Domoguen JKL, Manuel JA, Cañal JPA and Naval PC Jr (2022) Automatic segmentation of nasopharyngeal carcinoma on CT images using efficient UNet-2.5D ensemble with semi-supervised pretext task pretraining. Front. Oncol. 12:980312. doi: 10.3389/fonc.2022.980312
ABSTRACT
Nasopharyngeal carcinoma (NPC) is primarily treated with radiation therapy, for which accurate delineation of target volumes and organs at risk is critical. However, manual delineation is time-consuming, variable, and subjective, depending on the experience of the radiation oncologist. This work explores deep learning methods to automate segmentation of the NPC primary gross tumor volume (GTVp) in planning computed tomography (CT) images. A total of 63 patients diagnosed with NPC were included in this study. To tackle the problem of limited data, we propose two sequential approaches. First, we propose a much simpler architecture that follows the UNet design but uses 2D convolutions for 3D segmentation (UNet-2.5D), and show its efficacy by achieving significantly higher performance than more popular, modern architectures. To further improve performance, we train the model on a multi-scale dataset to create an ensemble of models. Building on this architecture, we employ semi-supervised learning, combining 3D rotation and 3D relative-patch-location pretext tasks to pretrain the feature extractor. With semi-supervised pretraining, the feature extractor can be frozen afterwards, making the model more efficient in terms of trainable parameters. The approach is also data-efficient: the pretrained model, trained with only a portion of the labelled data, achieved performance very close to that of the model trained with the full labelled dataset.
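To make the 2.5D idea concrete, the sketch below shows one plausible way to realize it: neighbouring CT slices are stacked along the channel axis of a 2D UNet-style encoder-decoder, so some through-plane context is captured without 3D convolutions. The layer widths, slice count, and module names are illustrative assumptions, not the authors' released implementation.

```python
# Minimal UNet-2.5D sketch (assumed layout, not the paper's exact network).
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class UNet25D(nn.Module):
    def __init__(self, n_slices=3, n_classes=1, base=32):
        super().__init__()
        # n_slices adjacent CT slices enter as input channels (the "2.5D" trick)
        self.enc1 = conv_block(n_slices, base)
        self.enc2 = conv_block(base, base * 2)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(base * 2, base * 4)
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, kernel_size=2, stride=2)
        self.dec2 = conv_block(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec1 = conv_block(base * 2, base)
        self.head = nn.Conv2d(base, n_classes, kernel_size=1)  # GTVp mask for the centre slice

    def forward(self, x):  # x: (batch, n_slices, H, W)
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)

# Example: a batch of 4 crops, 3 adjacent slices each, 128x128 pixels
logits = UNet25D()(torch.randn(4, 3, 128, 128))  # -> (4, 1, 128, 128)
```

Because only the segmentation head and decoder need labelled supervision, an encoder of this kind is what the pretext tasks (3D rotation and relative patch location) would pretrain and subsequently freeze.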
CONTACT
Joie Cañal, MD – jdadevosocanal@up.edu.ph, +639178387569
RESEARCH HIGHLIGHTS
Training images: normal, unlabelled, and labelled
Data augmentation: rotation, flipping, and transposition (see the sketch after this list)
Multi-scale training
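The sketch below illustrates the listed augmentations applied jointly to a CT volume and its GTVp mask; the rotation range, axes, and probabilities are assumptions for illustration, not the study's exact settings.

```python
# Hedged sketch of rotation, flipping, and transposition augmentation
# applied consistently to image and label (assumed parameters).
import numpy as np
from scipy.ndimage import rotate

def augment(volume, mask, rng):
    """volume, mask: (D, H, W) arrays; returns augmented copies of both."""
    # Random in-plane rotation (nearest-neighbour for the mask to keep labels discrete)
    angle = rng.uniform(-15, 15)
    volume = rotate(volume, angle, axes=(1, 2), reshape=False, order=1)
    mask = rotate(mask, angle, axes=(1, 2), reshape=False, order=0)

    # Random flips along the in-plane axes
    for axis in (1, 2):
        if rng.random() < 0.5:
            volume = np.flip(volume, axis=axis)
            mask = np.flip(mask, axis=axis)

    # Random transposition of the in-plane axes (H <-> W)
    if rng.random() < 0.5:
        volume = np.transpose(volume, (0, 2, 1))
        mask = np.transpose(mask, (0, 2, 1))

    return volume.copy(), mask.copy()

rng = np.random.default_rng(0)
vol, msk = augment(np.zeros((32, 128, 128)), np.zeros((32, 128, 128)), rng)
```

For multi-scale training, the same pipeline can simply be run at several crop or resampling resolutions, with one model trained per scale and their predictions combined as an ensemble.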