Title: An Integrated Approach using YOLOv8 and ResNet, SeResNet & Vision Transformer (ViT) Algorithms based on ROI Fracture Prediction in X-ray Images of the Elbow
Volume: 20
Author(s): Taukir Alam, Wei-Cheng Yeh*, Fang Rong Hsu*, Shia Wei-Chung, A. Robert Singh, Taimoor Hassan, Wenru Lin, Hong-Ye Yang and Tahir Hussain
Affiliation:
- Department of Information Engineering and Computer Science, Feng Chia University, Taichung 407, Taiwan
- Department of Medical Imaging, Chang Bing Show Chwan Memorial Hospital, Diagnostic Radiology Specialist, Changhua, Taiwan
Keywords:
YOLO, ROI, ResNet, SeResNet, Vision transformer, Fracture, X-ray, Medical images.
Abstract:
Introduction:
In this study, we combined three state-of-the-art algorithms to improve elbow fracture prediction from X-ray images.
Using the YOLOv8 (You Only Look Once) algorithm, we first identified Regions of Interest (ROI) within the X-ray images, which improved the
accuracy of the subsequent fracture prediction.
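As a rough illustration of the ROI step, the sketch below crops a region from a grayscale image given one detector box. It is not the authors' code: the function name `crop_roi`, the nested-list image representation, and the `(x1, y1, x2, y2)` pixel box format are assumptions made for this example, and the detector itself is not shown.

```python
from typing import List, Sequence

# Assumption: a grayscale image is a list of rows of integer intensities.
Image = List[List[int]]


def crop_roi(image: Image, box: Sequence[float]) -> Image:
    """Crop the region described by box = (x1, y1, x2, y2) in pixel coordinates.

    The box is clamped to the image bounds so that a detection that
    extends slightly past the border still yields a valid crop.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    x1, y1, x2, y2 = box

    # Truncate to integer pixel indices and clamp to the image extent.
    left = max(0, int(x1))
    top = max(0, int(y1))
    right = min(width, int(x2))
    bottom = min(height, int(y2))

    if right <= left or bottom <= top:
        raise ValueError("box does not overlap the image")

    return [row[left:right] for row in image[top:bottom]]


# Toy 4x5 image whose pixel value encodes its position (row * 10 + column).
toy_image = [[r * 10 + c for c in range(5)] for r in range(4)]
roi = crop_roi(toy_image, (1.2, 0.0, 3.9, 2.0))
```

In a full pipeline, the cropped region rather than the whole radiograph would be passed to the classifier.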
Methods:
Subsequently, we integrated and compared the ResNet, SeResNet (Squeeze-and-Excitation Residual Network), and ViT (Vision Transformer)
algorithms for fracture prediction. Furthermore, to improve precision, we applied a series of refinements. These
included recalibrating the ROI regions to enable finer-grained identification of diagnostically significant areas within the X-ray images. Additionally,
image enhancement techniques were applied to improve the visual quality and structural clarity of the X-ray images.
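The abstract does not name the enhancement techniques used. As one common example of contrast enhancement for radiographs, the sketch below applies global histogram equalization to an 8-bit grayscale image. The choice of technique, the function name `equalize_histogram`, and the nested-list image representation are all assumptions made for illustration.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of integers in [0, levels - 1].
Image = List[List[int]]


def equalize_histogram(image: Image, levels: int = 256) -> Image:
    """Spread pixel intensities over the full range using the cumulative histogram."""
    # Count how often each intensity occurs.
    hist = [0] * levels
    for row in image:
        for pixel in row:
            hist[pixel] += 1

    total = sum(hist)
    if total == 0:
        return [row[:] for row in image]  # empty image: nothing to do

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [row[:] for row in image]  # constant image: leave unchanged

    # Lookup table mapping each old intensity to its equalized value.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]

    return [[lut[pixel] for pixel in row] for row in image]


# Toy low-contrast patch: intensities are squeezed into the range 50..80.
patch = [[50, 60], [70, 80]]
enhanced = equalize_histogram(patch)
```

After equalization the darkest pixel in the patch maps to 0 and the brightest to 255, so faint intensity differences become easier to see.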
Results:
Together, these methodological enhancements substantially improved the overall accuracy of our fracture predictions.
The dataset used for training, validation, testing, and comprehensive evaluation consisted exclusively of elbow X-ray images. On this dataset,
the three algorithms achieved the following fracture prediction results: ResNet50, accuracy 0.97, precision 1.00, recall 0.95; SeResNet50, accuracy 0.97,
precision 1.00, recall 0.95; and ViT-B/16, the highest accuracy at 0.99, with the same precision of 1.00 and recall of 0.95.
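For readers less familiar with the reported metrics, the sketch below computes accuracy, precision, and recall from binary labels, where 1 denotes a fracture. The function name and the toy labels are assumptions made for illustration and are not the study's data. The toy case is chosen so that a classifier with no false positives and one missed fracture out of twenty gives precision 1.0 and recall 0.95.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, and recall for binary labels (1 = fracture)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fractures found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normals confirmed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fractures missed

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels (not the study's data): 20 fractures, 20 normals, one fracture missed.
toy_true = [1] * 20 + [0] * 20
toy_pred = [1] * 19 + [0] + [0] * 20
toy_metrics = classification_metrics(toy_true, toy_pred)
```

A precision of 1 alongside a recall below 1 means every predicted fracture was real, while some true fractures went undetected.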
Conclusion:
This approach has the potential to increase diagnostic precision, reduce the workload of radiologists, integrate readily into current medical
imaging systems, and support clinical decision-making, all of which could lead to better patient care and health outcomes.