From Resolution to Explanation: Real-ESRGAN and LIME Analysis of Vision Transformers and CNNs for Brain Tumor MRI Classification

Authors

  • Md. Mirazul Hasan
  • Md. Hasan Al Mahmud Nafis
  • Sikhul Islam Shihab
  • Marishat Tasmim
  • Md. Mazid-Ul-Haque
  • Abhijit Bhowmik
  • S M Abdullah Shafi

DOI:

https://doi.org/10.53799/jaay3z78

Keywords:

Vision Transformers, Convolutional Neural Networks, ResNet50, Real-ESRGAN, LIME

Abstract

This study compares the performance of Vision Transformers (ViTs) and Convolutional Neural Networks (CNNs) for brain tumor classification on MRI scans from the Kaggle Brain Tumor MRI dataset, establishing a comprehensive benchmark for evaluation. To enhance visual fidelity without altering spatial resolution, Real-Enhanced Super Resolution Generative Adversarial Networks (Real-ESRGAN) were employed for preprocessing. After applying Real-ESRGAN, a significant improvement in classification accuracy and feature clarity was observed across all models, indicating the importance of high-quality input in medical imaging tasks. Five transformer-based models (Swin-Tiny, ViT, DeiT, Mobile-ViT, and PiT) were benchmarked against five CNN architectures (ResNet50, EfficientNet-B0, VGG16, AlexNet, and DenseNet-121). Building on these results, a modified late-fusion ensemble combining ResNet50 and Vision Transformer was developed to integrate both global and local feature extraction capabilities. The proposed hybrid architecture achieved superior classification performance, outperforming all individual ViT and CNN models. Furthermore, Explainable AI techniques were applied using Local Interpretable Model-agnostic Explanations (LIME) to visualize decision patterns, revealing that ViTs and the late-fusion ensemble exploit broader contextual regions for tumor localization, while CNNs concentrate on more confined spatial areas. Together, Real-ESRGAN enhancement, late-fusion ensembling, and LIME-based interpretation advance both accuracy and explainability, offering a promising direction for reliable and interpretable brain tumor diagnosis in clinical applications.
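
The late-fusion idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so this example assumes one common choice: a weighted average of the two models' softmax probabilities. The function names, the `cnn_weight` parameter, and the four example logits are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(cnn_logits, vit_logits, cnn_weight=0.5):
    """Fuse two classifiers at the decision level.

    ASSUMPTION: weighted averaging of class probabilities. The paper's
    "modified" late-fusion scheme may differ from this.
    """
    if len(cnn_logits) != len(vit_logits):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= cnn_weight <= 1.0:
        raise ValueError("cnn_weight must lie in [0, 1]")

    p_cnn = softmax(cnn_logits)
    p_vit = softmax(vit_logits)
    fused = [
        cnn_weight * a + (1.0 - cnn_weight) * b
        for a, b in zip(p_cnn, p_vit)
    ]
    # Index of the class with the highest fused probability.
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


# Hypothetical per-class logits for one image (illustrative values only).
cnn_logits = [2.0, 0.5, 0.1, 0.1]   # e.g. output of a ResNet50 head
vit_logits = [0.2, 3.0, 0.1, 0.1]   # e.g. output of a ViT head

fused, predicted = late_fusion(cnn_logits, vit_logits)
print(predicted, [round(p, 3) for p in fused])
```

In this sketch each network is run independently and only the class probabilities are combined, which is what distinguishes late fusion from merging intermediate feature maps. Setting `cnn_weight` to 1.0 or 0.0 recovers the single-model prediction of the CNN or the ViT respectively.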

Published

04/30/2026

How to Cite

[1]
“From Resolution to Explanation: Real-ESRGAN and LIME Analysis of Vision Transformers and CNNs for Brain Tumor MRI Classification”, AJSE, vol. 24, no. 1, pp. 75–85, Apr. 2026, doi: 10.53799/jaay3z78.
