Russian Ophthalmological Journal

Diagnostics of ophthalmological and systemic diseases from fundus images using a multimodal transformer model

https://doi.org/10.21516/2072-0076-2025-18-3-supplement-8-11

Abstract

Purpose: to evaluate the potential for diagnosing ophthalmological and systemic diseases from fundus images using a multimodal transformer model trained on an open dataset.

Material and methods. The open RFMiD dataset, containing 3200 fundus images annotated across 29 disease classes, was used for training and validation. A pre-trained multimodal transformer architecture was fine-tuned on this dataset.

Results. The model demonstrated stable convergence and high accuracy in identifying 29 disease classes from fundus images, achieving a test AUC of 0.9155 without signs of overfitting.
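For readers unfamiliar with the metric, the following is a minimal sketch of how a single AUC figure can be derived for a multi-label classifier. It assumes macro averaging (the unweighted mean of per-class AUCs); the abstract does not state which averaging scheme the authors used. The function names `binary_auc` and `macro_auc` and the toy data are illustrative and are not taken from the paper.

```python
from typing import List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC for one class via the rank-sum (Mann-Whitney U) formulation.

    Tied scores receive the average of the ranks they span.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined without both positives and negatives")

    # Sort sample indices by ascending score, then assign 1-based ranks.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


def macro_auc(y_true: List[List[int]], y_score: List[List[float]]) -> float:
    """Unweighted mean of per-class AUCs (assumed averaging scheme)."""
    n_classes = len(y_true[0])
    per_class = []
    for c in range(n_classes):
        labels = [row[c] for row in y_true]
        scores = [row[c] for row in y_score]
        per_class.append(binary_auc(labels, scores))
    return sum(per_class) / len(per_class)


# Toy example: 4 images, 2 hypothetical disease classes.
y_true = [[0, 1], [0, 0], [1, 1], [1, 0]]
y_score = [[0.10, 0.9], [0.40, 0.2], [0.35, 0.8], [0.80, 0.1]]
print(macro_auc(y_true, y_score))
```

With 29 classes, the same loop would simply run over 29 label columns. A class with no positive (or no negative) test images has no defined AUC, which is why the sketch raises an error rather than silently skipping it.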

Conclusion. The results indicate that the multimodal transformer-based model performs well at multiclass disease classification from fundus images.

About the Authors

K. D. Aksenov
PREDICT SPACE LLC; Novorossiysk Polytechnic Institute (branch) of the Federal State Budgetary Educational Institution of Higher Professional Education “KubSTU”
Russian Federation

Kirill D. Aksenov — CEO; researcher

Admiral Serebryakov Emb., 49, Novorossiysk, Krasnodar Region, 353905

20, Karl Marx St., Novorossiysk, Krasnodar Region, 353900



L. E. Aksenova
PREDICT SPACE LLC; Novorossiysk Polytechnic Institute (branch) of the Federal State Budgetary Educational Institution of Higher Professional Education “KubSTU”
Russian Federation

Lyubov E. Aksenova — researcher 

Admiral Serebryakov Emb., 49, Novorossiysk, Krasnodar Region, 353905

20, Karl Marx St., Novorossiysk, Krasnodar Region, 353900



References

1. Abràmoff M, Garvin M, Sonka M. Retinal imaging and image analysis. IEEE Rev Biomed Eng. 2010; 3: 169–208. doi: 10.1109/RBME.2010.2084567

2. Issa P, Troeger E, Finger R, et al. Structure-function correlation of the human central retina. PLoS One. 2010 Sep 22; 5 (9): e12864. doi: 10.1371/journal.pone.0012864

3. Patton N, Aslam T, MacGillivray T, et al. Retinal image analysis: concepts, applications and potential. Prog Retin Eye Res. 2006 Jan; 25 (1): 99–127. doi: 10.1016/j.preteyeres.2005.07.001

4. Li LY, Isaksen AA, Lebiecka-Johansen B, et al. Prediction of cardiovascular markers and diseases using retinal fundus images and deep learning: a systematic scoping review. Eur Heart J Digit Health. 2024 Sep 10; 5 (6): 660–9. doi: 10.1093/ehjdh/ztae068

5. Ting DSW, Cheung CY, Lim G, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA. 2017 Dec 12; 318 (22): 2211–23. doi: 10.1001/jama.2017.18152

6. Pachade S, Porwal P, Thulkar D, et al. Retinal fundus multi-disease image dataset (RFMiD). IEEE Dataport. 2020 Nov 25. doi: 10.21227/s3g7-st65

7. Zhou Y, Chia MA, Wagner SK, et al. A foundation model for generalizable disease detection from retinal images. Nature. 2023; 622: 156–63.


For citations:


Aksenov K.D., Aksenova L.E. Diagnostics of ophthalmological and systemic diseases from fundus images using a multimodal transformer model. Russian Ophthalmological Journal. 2025;18(3):8-11. (In Russ.) https://doi.org/10.21516/2072-0076-2025-18-3-supplement-8-11


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2072-0076 (Print)
ISSN 2587-5760 (Online)