A Novel Hybrid CNN-Mamba Framework with DySample-Enhanced YOLOv11 for Automated Pediatric Wrist Fracture Detection
Keywords:
Wrist Fractures Detection, Object Localization, CNN-Mamba Framework, Hybrid Deep Learning, Medical Imaging, Feature Fusion
Abstract
Wrist fractures, particularly of the distal radius and ulna, are among the most common injuries in pediatric populations. Early and accurate detection is critical for preventing long-term complications, yet pediatric wrist radiographs remain difficult to interpret because some abnormalities are subtle. To address this, we propose a novel hybrid framework for automated detection in medical images that combines convolutional neural networks (CNNs) with Mamba-based encoders to capture both local and global feature dependencies. To fuse features from these two distinct architectures, we introduce the Feature Aggregation Attention Module (FAAM), which dynamically combines their feature maps into a more robust representation. We also enhance the YOLOv11 framework by replacing the conventional upsampling in the neck with DySample, which improves feature propagation and refinement. We evaluate our method on GRAZPEDWRI-DX, a comprehensive dataset of pediatric wrist trauma X-rays, and observe significant improvements in fracture detection. Our approach achieves an mAP@0.5 of 69.12% and an mAP@0.95 of 48.4%, showing its effectiveness in both general and challenging detection scenarios.
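For readers unfamiliar with the reported metrics, the threshold in mAP@0.5 refers to the intersection over union (IoU) between a predicted box and a ground-truth box: a prediction counts as a match only when the IoU reaches the threshold. The following is a minimal sketch of that matching rule, not the evaluation code used in this work; the function names `iou` and `is_match` and the box coordinates are hypothetical and chosen only for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    # Hypothetical ground-truth fracture box and two candidate predictions.
    gt = (10, 10, 50, 50)
    loose = (20, 20, 60, 60)   # shifted by 10 pixels in each direction
    tight = (12, 12, 52, 52)   # shifted by 2 pixels in each direction

    for name, pred in (("loose", loose), ("tight", tight)):
        print(name, round(iou(pred, gt), 3),
              is_match(pred, gt, 0.5), is_match(pred, gt, 0.95))
```

Raising the threshold makes the criterion stricter: a box that is accepted at 0.5 can still be rejected at 0.95, which is why scores at higher thresholds are typically lower.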
License
Copyright (c) 2025 Mahdi Zarrin (Author); Jafar Tanha; Haniyeh Nikkhah (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.