A NOVEL TRANSFORMER METHOD PRETRAINED WITH MASKED AUTOENCODERS AND FRACTAL DIMENSION FOR DIABETIC RETINOPATHY CLASSIFICATION, Fractals


Fractals (IF 4.7), Pub Date: 2024-03-27, DOI: 10.1142/s0218348x24500609
YAOMING YANG¹, ZHAO ZHA², CHENNAN ZHOU¹, LIDA ZHANG¹, SHUXIA QIU¹, PENG XU¹,³


Diabetic retinopathy (DR) is one of the leading causes of blindness in a significant portion of the working population, and the damage it causes to vision is irreversible. Rapid diagnosis of DR is therefore crucial for saving patients' eyesight. Because Transformers have shown superior performance to Convolutional Neural Networks (CNNs) in computer vision, they have been applied to computer-aided diagnosis of DR. However, Transformers lack inductive bias, so they need a large number of training images. Retinal vessels have been shown to follow a self-similar fractal scaling law, and the fractal dimension measured in DR patients differs evidently from that in people without DR. Based on this, the fractal dimension is introduced into Transformers as a prior to mitigate the adverse effect of the missing inductive bias on model performance. This paper proposes a new Transformer method pretrained with Masked Autoencoders and fractal dimension (MAEFD). Experiments on the APTOS dataset show that the proposed MAEFD substantially improves DR classification performance. Additionally, the model pretrained with 100,000 retinal images outperforms the one pretrained with 1 million natural images on DR classification.
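The abstract does not say how the fractal dimension is computed. As a rough illustration only, the sketch below assumes box counting, one common estimator: cover a binary image with boxes of side s, count the boxes N(s) that contain foreground pixels, and take the slope of log N(s) against log(1/s). The function names and the toy inputs are hypothetical and are not taken from the paper; a real input would be a binary vessel segmentation mask.

```python
import math


def box_count(pixels, size):
    """Count boxes of side `size` that contain at least one foreground pixel."""
    return len({(r // size, c // size) for r, c in pixels})


def box_counting_dimension(pixels, sizes):
    """Least-squares slope of log N(s) against log(1/s).

    `pixels` is a set of (row, col) foreground coordinates and `sizes`
    is a list of box side lengths. Box counting is an assumed estimator
    here, not necessarily the one used by the MAEFD authors.
    """
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    n = len(sizes)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Toy shapes with known dimension, standing in for a vessel mask.
n = 64
line = {(0, c) for c in range(n)}                      # a straight line
square = {(r, c) for r in range(n) for c in range(n)}  # a filled square
sizes = [1, 2, 4, 8, 16]

line_dim = box_counting_dimension(line, sizes)
square_dim = box_counting_dimension(square, sizes)
print(round(line_dim, 3), round(square_dim, 3))
```

A straight line and a filled square are used because their dimensions are known to be 1 and 2, which makes them a convenient sanity check. A branching vessel tree would be expected to fall somewhere between those two values, which is the kind of scalar prior the abstract describes.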


Updated: 2024-03-27
