Utilizing deep learning from mobile phone photos for early detection of horizontal strabismus: a screening approach

Abstract

We developed and validated an artificial-intelligence pipeline for binary screening of horizontal strabismus versus orthotropia using smartphone-acquired facial images and geometric landmark analysis. The two-stage system combines a Real-Time Detection Transformer (RT-DETR), which localizes nine ocular landmarks per eye across three gaze directions (left, center, right), with supervised machine-learning classifiers. A feature set of five biometric ratios was derived from landmark coordinates, including the canthi, limbi, and corneal light reflexes. The model was trained on facial images from 150 participants (96 with strabismus and 54 controls). To address class imbalance and improve generalizability, the Synthetic Minority Oversampling Technique (SMOTE) and 4-fold cross-validation were applied. RT-DETR achieved an intersection over union of 0.62 and a mean center-point error of 6.52 pixels in landmark localization. The Random Forest classifier achieved an accuracy of 0.95, sensitivity of 0.96, specificity of 0.94, positive predictive value of 0.97, and negative predictive value of 0.92. This study demonstrates the feasibility of combining transformer-based landmark detection with geometric ratios for strabismus screening. The framework performs well under controlled conditions; while the biometric ratios allow feature-level inspection, further research is required to establish full clinical interpretability and performance in uncontrolled environments.
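The classification stage described above (five ratio features, SMOTE oversampling inside 4-fold cross-validation, Random Forest) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature values are synthetic stand-ins for the unpublished biometric ratios, the SMOTE step is a hand-rolled nearest-neighbour interpolation rather than a library call, and all hyperparameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

def smote(X_min, n_new, k=5, rng=rng):
    """Minimal SMOTE: interpolate between each minority sample and a
    random one of its k nearest minority-class neighbours."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)
    samples = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = idx[i][rng.integers(1, k + 1)]  # skip column 0 (the sample itself)
        lam = rng.random()
        samples.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.vstack(samples)

# Synthetic stand-in data: 5 biometric ratios per subject,
# 96 strabismus cases (label 1) vs 54 controls (label 0), as in the study.
X = np.vstack([rng.normal(1.0, 0.15, size=(96, 5)),
               rng.normal(0.7, 0.15, size=(54, 5))])
y = np.array([1] * 96 + [0] * 54)

accs = []
for tr, te in StratifiedKFold(n_splits=4, shuffle=True, random_state=0).split(X, y):
    X_tr, y_tr = X[tr], y[tr]
    # Oversample the minority class (controls) inside the training fold only,
    # so the held-out fold stays untouched.
    X_min = X_tr[y_tr == 0]
    n_new = int((y_tr == 1).sum() - len(X_min))
    X_bal = np.vstack([X_tr, smote(X_min, n_new)])
    y_bal = np.concatenate([y_tr, np.zeros(n_new, dtype=int)])
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_bal, y_bal)
    accs.append(clf.score(X[te], y[te]))

print(f"mean 4-fold accuracy: {np.mean(accs):.2f}")
```

Applying SMOTE within each training fold, rather than before splitting, avoids leaking synthetic copies of test-fold subjects into training, which would inflate the reported accuracy.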
