Differentiating malignant and benign eyelid lesions using deep … – Nature.com

Study participants and data extraction

Patients with eyelid lesions were identified through a retrospective review of electronic medical records at Seoul National University Hospital (SNUH), using International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) diagnosis codes (Supplementary Table S1) or operation codes. The diagnosis was annotated based on histopathological or clinical findings. All lesions were initially diagnosed by an experienced oculoplastic surgeon (SIK) based on their clinical characteristics. For biopsied lesions, the diagnosis was annotated from the histopathological report. For lesions that had not been biopsied, the clinical diagnosis was made by agreement: a second experienced oculoplastic surgeon (MJL) reviewed the eyelid photographs at the time of enrollment, and the diagnosis was confirmed only when it agreed with the initial diagnosis.

Eyelid or whole-face photographs of these patients, taken from October 2004 to April 2020, were included in this study. Photographs were retrieved from the SNUH photograph database in JPEG format with a minimum pixel resolution of 1183 × 690. Poor-quality photographs, photographs inadequate for clinical diagnosis, and postoperative photographs were excluded from the database. Cases in which the clinical diagnosis was inconsistent between the two oculoplastic surgeons were also excluded. For whole-face images, two eyelid photographs were created by cropping a square region around each eyelid separately. For patients with unilateral eyelid lesions, photographs of the contralateral eyelid without any significant lesion were annotated as no-lesion photographs. This study was approved by the Institutional Review Boards of SNUH (No. 1805-175-949) and Hallym University Sacred Heart Hospital (No. 2020-03-026). The protocol of this study adhered to the tenets of the Declaration of Helsinki. Informed consent was waived by the Institutional Review Board of Seoul National University Hospital in view of the retrospective nature of the study and the de-identification of patients' data. Signed statements of informed consent to publish patient photographs were obtained from all identifiable persons.

All eyelid photographs were classified into three categories based on a histopathologic diagnosis or a clinical diagnosis with expert agreement: malignant lesion, benign lesion, and no lesion. The entire dataset was divided into a training dataset and a test dataset at a ratio of 9:1 by random sampling. The split was performed separately for each class. Random selection used the patient ID as the key, so that images of the same patient could not appear in both the test and training datasets for a given class. The training dataset was then further divided into a proper training dataset and a tuning dataset at a ratio of 8:1. In the test dataset, only one image per patient was randomly selected for each class. The proper training/tuning split was repeated three times, and the CNN model was trained and evaluated three times independently, using three different training and tuning datasets.
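The splitting procedure can be illustrated with a minimal sketch. It is not the authors' code: the record layout `(patient_id, label, image)`, the function names, and the toy data are hypothetical, and the sketch assumes the 9:1 and 8:1 ratios are applied to the number of patients in each class, which the text does not state explicitly.

```python
import random
from collections import defaultdict


def split_by_patient(records, holdout_fraction, seed):
    """Split (patient_id, label, image) records class by class.

    Within each class, all images of a patient go to the same side,
    so a patient never appears in both partitions for that class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(lambda: defaultdict(list))
    for patient_id, label, image in records:
        by_class[label][patient_id].append(image)

    kept, held_out = [], []
    for label in sorted(by_class):
        patient_ids = sorted(by_class[label])
        rng.shuffle(patient_ids)
        # Assumption: the ratio is applied to the patient count per class.
        n_holdout = max(1, round(len(patient_ids) * holdout_fraction))
        holdout_ids = set(patient_ids[:n_holdout])
        for pid in patient_ids:
            target = held_out if pid in holdout_ids else kept
            target.extend((pid, label, img) for img in by_class[label][pid])
    return kept, held_out


def one_image_per_patient(records, seed):
    """Keep one randomly chosen image per (patient, class) pair."""
    rng = random.Random(seed)
    grouped = defaultdict(list)
    for pid, label, image in records:
        grouped[(pid, label)].append(image)
    return [(pid, label, rng.choice(imgs))
            for (pid, label), imgs in sorted(grouped.items())]


# Toy data (hypothetical): 30 patients per class, two images each.
records = [
    (f"{label}-{i:02d}", label, f"{label}_{i:02d}_{k}.jpg")
    for label in ("malignant", "benign", "no_lesion")
    for i in range(30)
    for k in range(2)
]

# 9:1 training/test split, then one test image per patient and class.
train_all, test_all = split_by_patient(records, holdout_fraction=0.1, seed=0)
test_set = one_image_per_patient(test_all, seed=0)

# Three independent proper-training/tuning splits at 8:1 (tuning = 1/9).
folds = [split_by_patient(train_all, holdout_fraction=1 / 9, seed=s)
         for s in (1, 2, 3)]
```

Splitting on patient IDs rather than on individual images is what prevents near-duplicate photographs of one patient from leaking between the training and test partitions.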

All images were resized to a single format with a pixel resolution of 690 × 460 and then normalized using the means ([0.485, 0.456, 0.406]) and standard deviations ([0.229, 0.224, 0.225]) of the ImageNet dataset [14]. Contrast enhancement was applied using contrast-limited adaptive histogram equalization [15,16,17] in all channels of the image. The numbers of images in the malignant lesion and no lesion classes of the training dataset were much smaller than that of the benign lesion class, and thus the images in the malignant lesion and no lesion classes of the training dataset were augmented four times by zooming in at 5%, 10%, and 20%. The entire training dataset was then further augmented twice through horizontal flipping (Supplementary Fig. S1). The Python libraries opencv (version 4.1.2) and imgaug (version 0.4.0; available at https://github.com/aleju/imgaug; accessed on November 3, 2020) were used for image preprocessing and augmentation.
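The arithmetic behind the normalization and augmentation steps can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' pipeline, which used opencv and imgaug: the function names are hypothetical, "zooming in at 5%" is assumed to mean a centered crop that magnifies the image by a factor of 1.05 once resized back, and contrast-limited adaptive histogram equalization is omitted because it needs an image library.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple((value / 255.0 - mean) / std
                 for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


def zoom_crop_box(width, height, zoom):
    """Centered crop box (left, top, right, bottom).

    Resizing this crop back to width x height magnifies the image by
    (1 + zoom). Assumption: this is one reading of 'zooming in'.
    """
    crop_w = round(width / (1 + zoom))
    crop_h = round(height / (1 + zoom))
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


def augmentation_plan(label,
                      minority=("malignant", "no_lesion"),
                      zooms=(0.05, 0.10, 0.20)):
    """List the (zoom, flipped) variants generated for one training image.

    Minority classes: original + 3 zooms (4x), each also flipped (8x).
    Other classes: original and its horizontal flip only (2x).
    """
    zoom_levels = (0.0,) + tuple(zooms) if label in minority else (0.0,)
    return [(zoom, flipped)
            for zoom in zoom_levels
            for flipped in (False, True)]


plan_malignant = augmentation_plan("malignant")  # 8 variants per image
plan_benign = augmentation_plan("benign")        # 2 variants per image
box_10_percent = zoom_crop_box(690, 460, 0.10)   # crop used for 10% zoom
```

Under this reading, each malignant or no-lesion training image yields eight variants and each benign image yields two, which narrows the class imbalance without touching the tuning or test data.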

We designed two types of eyelid image classification: a ternary classification (malignant lesion versus benign lesion versus no lesion) and a binary classification (malignant lesion versus benign lesion). For each design, two different CNN architectures, DenseNet-161 and EfficientNetV2-M, were adopted: the former as a widely used model for medical images and the latter as a state-of-the-art model. The detailed features of the CNN architectures are described elsewhere [18,19,20]. Briefly, DenseNet is characterized by the dense block, which concatenates the feature maps of the previous layers [18]. EfficientNetV2, like EfficientNet, finds the optimal CNN architecture using neural architecture search, and uses progressive learning, which changes the augmentation magnitude based on the image size [19,20]. The pre-trained models were downloaded from the PyTorch website, and the links are as follows: https://pytorch.org/vision/stable/models/generated/torchvision.models.densenet161.html#torchvision.models.DenseNet161_Weights and https://pytorch.org/vision/stable/models/generated/torchvision.models.efficientnet_v2_m.html#torchvision.models.EfficientNet_V2_M_Weights (accessed on Aug 15, 2022).

DenseNet-161 and EfficientNetV2-M were pre-trained with the ImageNet dataset and then fine-tuned with no layers frozen, so the weights of every layer were updated. Categorical cross-entropy was used as the loss function, and the Adam optimizer was applied [21]. The batch size was 3 because it was the maximum batch size that the GPU memory of our server could handle with the EfficientNetV2 model. The learning rate was initially 1e-4 and was then reduced by multiplying by 0.1 every 10 epochs until it reached 1e-7. Early stopping was applied after the 30th epoch with a patience value of 30, based on the loss on the tuning dataset (the validation loss). During training, at epochs when the validation loss was greater than the minimum validation loss so far, the saved model was not updated. Thus, the model saved at the epoch with the minimum validation loss was selected as the final model to prevent overfitting. The training server was equipped with six NVIDIA GTX 1080 Ti graphics processing units, dual Intel Xeon E5-2690 central processing units, 128 GB RAM, and a customized water-cooling system.
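The learning-rate schedule and the checkpoint rule can be written out as a small, framework-free sketch. The function names and the toy loss curve are hypothetical. The sketch assumes a starting rate of 1e-4 with a floor of 1e-7, and it reads "early stopping after the 30th epoch with a patience value of 30" as: stopping is only considered once 30 epochs have passed, and training then halts when the validation loss has not improved for 30 epochs. Other readings of that sentence are possible.

```python
def learning_rate(epoch, initial=1e-4, factor=0.1, step=10, floor=1e-7):
    """Step decay for a 0-indexed epoch: multiply by `factor` every
    `step` epochs, never going below `floor`."""
    return max(initial * factor ** (epoch // step), floor)


def select_checkpoint(val_losses, min_epochs=30, patience=30):
    """Return (best_epoch, stop_epoch), both 1-indexed.

    The checkpoint is overwritten only when the validation loss improves,
    so the saved model is always the one with the minimum loss so far.
    """
    best_loss, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch  # save checkpoint here
        elif epoch > min_epochs and epoch - best_epoch >= patience:
            return best_epoch, epoch             # patience exhausted
    return best_epoch, len(val_losses)


# Toy validation curve (hypothetical): improves for 20 epochs, then plateaus.
toy_losses = [1.0 - 0.01 * i for i in range(20)] + [0.9] * 50
best_epoch, stop_epoch = select_checkpoint(toy_losses)

# Learning rates at the start of each decade of epochs.
rates = [learning_rate(e) for e in (0, 10, 20, 30, 40)]
```

With this toy curve the best checkpoint is the one from epoch 20 and training stops at epoch 50, 30 epochs after the last improvement.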

After constructing the CNN models using the training dataset, the diagnostic performance of the models was evaluated using the test set. The main outcomes were the discrimination performance of the established CNN models for ternary or binary classification.

The diagnostic performances of the established CNN models and nine clinicians were compared. A panel of human clinicians was constructed, including three oculoplastic specialists, three board-certified ophthalmologists, and three ophthalmology residents.

To visualize the pixels of interest, a saliency map was created using a gradient-weighted class activation map (Grad-CAM). Grad-CAM uses the gradient information flowing into the last convolutional layer and is applicable without altering the CNN architecture [22]. It produces a localization heatmap overlaid on the original image, and its visualization outperforms previous approaches on interpretability and faithfulness to the original model.
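For reference, the usual Grad-CAM formulation is reproduced here; it comes from the general Grad-CAM literature rather than from this study's text, and the notation is an assumption of this sketch. Here $A^k$ is the $k$-th feature map of the last convolutional layer, $y^c$ is the score for class $c$ before the softmax, and $Z$ is the number of spatial positions in the feature map.

```latex
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A_{ij}^k},
\qquad
L_{\text{Grad-CAM}}^c = \operatorname{ReLU}\!\left( \sum_{k} \alpha_k^c A^k \right)
```

The weights $\alpha_k^c$ average the gradients over each feature map, and the ReLU keeps only the regions that contribute positively to class $c$, which is why the heatmap can be computed without modifying or retraining the network.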

The area under the receiver operating characteristic curve (AUC) of the CNN model was calculated and compared using the DeLong test. In addition, the sensitivity, specificity, positive predictive value, and negative predictive value for the binary classification were calculated at the point where Youden's J statistic was maximized. Analyses were conducted using IBM SPSS Statistics version 24.0 (IBM Co., New York, USA) and MedCalc version 19.0.4 (MedCalc Software Ltd., Ostend, Belgium).
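The operating-point metrics can be illustrated with a short sketch. The study itself used SPSS and MedCalc; the function names and toy scores here are hypothetical, and the DeLong test is not implemented. The AUC is computed with the rank (Mann-Whitney) formulation, and Youden's J is defined as sensitivity + specificity - 1.

```python
def auc_rank(scores, labels):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def _ratio(numerator, denominator):
    """Safe division: undefined ratios are reported as NaN."""
    return numerator / denominator if denominator else float("nan")


def youden_operating_point(scores, labels):
    """Scan every observed score as a threshold (positive if score >= t)
    and return the metrics at the threshold maximizing Youden's J."""
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        sensitivity = _ratio(tp, tp + fn)
        specificity = _ratio(tn, tn + fp)
        j = sensitivity + specificity - 1
        if best is None or j > best["j"]:
            best = {"threshold": t, "j": j,
                    "sensitivity": sensitivity, "specificity": specificity,
                    "ppv": _ratio(tp, tp + fp), "npv": _ratio(tn, tn + fn)}
    return best


# Toy predictions (hypothetical): 1 = malignant, 0 = benign.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]

auc = auc_rank(toy_scores, toy_labels)
operating_point = youden_operating_point(toy_scores, toy_labels)
```

Note that the positive and negative predictive values depend on the prevalence of malignancy in the evaluated set, so values computed on a test set need not carry over to a population with a different case mix.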

Presented as an e-poster at the American Academy of Ophthalmology 2020 Virtual Meeting, November 2020.

Presented as an oral presentation at the American Society of Ophthalmic Plastic and Reconstructive Surgery 51st Annual Fall Scientific Symposium.
