Multiscale Fusion at What Cost? Quantifying Efficiency-Accuracy Trade-offs in Hybrid Models

Thomas Kinyanjui Njoroge, Kelvin Mugoye, Rachael Kibuku

Abstract

Multiscale feature fusion enhances deep vision models but often introduces computational overhead, a cost that remains under-quantified in hybrid CNN-Transformer architectures, especially for edge-based agricultural deployments. This study proposes an adaptive hybrid framework combining MobileNetV2, EfficientNetV2, and Transformers, trained on 76 classes spanning 22 crop diseases using Kaggle and field-sourced images. To address the efficiency-accuracy trade-off, we incorporate Squeeze-and-Excitation (SE) blocks (under 1% parameter increase), gating mechanisms that reduce scale bias and improve small-object detection at marginal FLOPs cost, and hierarchical fusion, which raises FLOPs by 15% but yields diminishing returns on high-resolution data. The model converged strongly (training: 0.9957, validation: 0.9868) and reached 97.97% accuracy on 249 unseen field images. Final metrics (accuracy: 0.992, AUC: 0.999998) surpassed standalone CNNs and Transformers, but only when scale diversity was present. Statistical validation via confidence-variance analysis and Kruskal-Wallis testing (H = 597.40, p = 8.48e-126) showed that the proposed model had the lowest variance (0.000010), confirming stable predictions; most pairwise comparisons were significant at p < 0.05. ANOVA and bootstrapping further confirmed that fusion costs scale non-linearly. We demonstrate Pareto-efficient frontiers along which hybrid models outperform their standalone counterparts only under certain conditions. This work challenges the notion that "more fusion is better" and advocates context-aware fusion: fusion is viable for cloud/server systems but must be pruned for edge deployment. We offer design guidelines for building cost-efficient, high-accuracy vision models in resource-constrained agricultural environments.
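
To make the fusion components named in the abstract concrete, below is a minimal Keras sketch of an SE block and a gated two-scale fusion head. The feature-map shapes, channel widths, reduction ratio of 16, and fusion point are illustrative assumptions, not the authors' released implementation; the full MobileNetV2/EfficientNetV2/Transformer hybrid and the hierarchical fusion stage are omitted for brevity.

import tensorflow as tf
from tensorflow.keras import layers

def se_block(x, reduction=16):
    """Squeeze-and-Excitation: recalibrates channels with a small bottleneck MLP.
    For typical channel counts this adds well under 1% of backbone parameters."""
    channels = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)                # squeeze to (batch, C)
    s = layers.Dense(channels // reduction, activation="relu")(s)
    s = layers.Dense(channels, activation="sigmoid")(s)   # per-channel excitation gates
    s = layers.Reshape((1, 1, channels))(s)
    return layers.Multiply()([x, s])                      # reweight the feature map

def gated_fusion(fine, coarse, channels=256):
    """Learned gate that blends a fine-grained and a coarse stream,
    letting the network damp scale bias instead of simply concatenating."""
    a = layers.Conv2D(channels, 1, padding="same")(fine)
    b = layers.Conv2D(channels, 1, padding="same")(coarse)
    b = layers.Resizing(a.shape[1], a.shape[2])(b)        # match spatial resolution
    gate = layers.Conv2D(channels, 1, activation="sigmoid")(
        layers.Concatenate()([a, b]))
    inv_gate = layers.Lambda(lambda g: 1.0 - g)(gate)
    return layers.Add()([layers.Multiply()([gate, a]),
                         layers.Multiply()([inv_gate, b])])

# Hypothetical feature maps standing in for shallow and deep backbone stages.
fine_in = layers.Input(shape=(56, 56, 96))
coarse_in = layers.Input(shape=(14, 14, 320))
fused = gated_fusion(se_block(fine_in), se_block(coarse_in))
head = layers.GlobalAveragePooling2D()(fused)
outputs = layers.Dense(76, activation="softmax")(head)    # 76 classes, as in the abstract
model = tf.keras.Model([fine_in, coarse_in], outputs)
model.summary()

Because the gate is a single 1x1 convolution over the concatenated streams, its FLOPs cost is marginal relative to the backbones, which is the efficiency-accuracy trade-off the abstract highlights for edge deployment.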
