Diagnostics of ophthalmological and systemic diseases from fundus images using a multimodal transformer model
https://doi.org/10.21516/2072-0076-2025-18-3-supplement-8-11
Abstract
Purpose. To evaluate the potential for diagnosing ophthalmological and systemic diseases from fundus images using a multimodal transformer model trained on an open dataset.
Material and methods. The open RFMiD dataset, containing 3200 fundus images annotated across 29 disease classes, was used for training and validation. A pre-trained multimodal transformer model was fine-tuned on this dataset.
Results. The model converged stably and showed no signs of overfitting, achieving a test AUC of 0.9155 in identifying the 29 disease classes from fundus images.
Conclusion. The results indicate that the multimodal transformer-based model performs well on multiclass disease classification from fundus images.
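For readers unfamiliar with the metric reported in Results, the sketch below illustrates one common way an AUC over many disease classes can be computed: a per-class ROC AUC, averaged across classes (macro averaging). The abstract does not state which averaging scheme the authors used, so macro averaging and the multi-hot label layout are assumptions here, as are the function names `binary_auc` and `macro_auc`. The toy labels and scores are invented for illustration and are not data from the study.

```python
from typing import List, Optional, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> Optional[float]:
    """ROC AUC for one class as the fraction of (positive, negative) pairs
    in which the positive image gets the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when the class has no positives or no negatives
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(y_true: List[List[int]], y_score: List[List[float]]) -> float:
    """Unweighted mean of per-class AUCs (assumed averaging scheme).
    Classes whose AUC is undefined are skipped."""
    n_classes = len(y_true[0])
    per_class = []
    for c in range(n_classes):
        auc = binary_auc([row[c] for row in y_true], [row[c] for row in y_score])
        if auc is not None:
            per_class.append(auc)
    if not per_class:
        raise ValueError("AUC is undefined for every class")
    return sum(per_class) / len(per_class)


# Toy example: 4 images, 3 classes (hypothetical values, not study data).
y_true = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_score = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.2], [0.6, 0.4, 0.3], [0.7, 0.1, 0.9]]
print(round(macro_auc(y_true, y_score), 4))
```

With 29 classes the same procedure applies, with one column per disease class. A micro-averaged AUC, which pools all image-class pairs before ranking, would generally give a different number, so the averaging scheme matters when comparing reported values.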
About the Authors
K. D. Aksenov
Russian Federation
Kirill D. Aksenov — CEO; researcher
Admiral Serebryakov Emb., 49, Novorossiysk, Krasnodar Region, 353905
20, Karl Marx St., Novorossiysk, Krasnodar Region, 353900
L. E. Aksenova
Russian Federation
Lyubov E. Aksenova — researcher
Admiral Serebryakov Emb., 49, Novorossiysk, Krasnodar Region, 353905
20, Karl Marx St., Novorossiysk, Krasnodar Region, 353900
References
1. Abràmoff M, Garvin M, Sonka M. Retinal imaging and image analysis. IEEE Rev Biomed Eng. 2010; 3: 169–208. doi: 10.1109/RBME.2010.2084567
2. Issa P, Troeger E, Finger R, et al. Structure-function correlation of the human central retina. PLoS One. 2010 Sep 22; 5 (9): e12864. doi: 10.1371/journal.pone.0012864
3. Patton N, Aslam T, MacGillivray T, et al. Retinal image analysis: concepts, applications and potential. Prog Retin Eye Res. 2006 Jan; 25 (1): 99–127. doi: 10.1016/j.preteyeres.2005.07.001
4. Li LY, Isaksen AA, Lebiecka-Johansen B, et al. Prediction of cardiovascular markers and diseases using retinal fundus images and deep learning: a systematic scoping review. Eur Heart J Digit Health. 2024 Sep 10; 5 (6): 660–9. doi: 10.1093/ehjdh/ztae068
5. Ting DSW, Cheung CY, Lim G, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA. 2017 Dec 12; 318 (22): 2211–23. doi: 10.1001/jama.2017.18152
6. Pachade S, Porwal P, Thulkar D, et al. Retinal fundus multi-disease image dataset (RFMiD). IEEE Dataport. 2020 Nov 25. doi: 10.21227/s3g7-st65
7. Zhou Y, Chia MA, Wagner SK, et al. A foundation model for generalizable disease detection from retinal images. Nature. 2023; 622: 156–63.
Review
For citations:
Aksenov K.D., Aksenova L.E. Diagnostics of ophthalmological and systemic diseases from fundus images using a multimodal transformer model. Russian Ophthalmological Journal. 2025;18(3):8-11. (In Russ.) https://doi.org/10.21516/2072-0076-2025-18-3-supplement-8-11


























