Novel transfer learning based bone fracture detection using radiographic images

Alam, Aneeza; Al-Shamayleh, Ahmad Sami; Thalji, Nisrean; Raza, Ali; Morales Barajas, Edgar Anibal; Thompson, Ernesto Bautista; de la Torre Diez, Isabel; Ashraf, Imran

doi:10.1186/s12880-024-01546-4

Research
Open access
Published: 03 January 2025

Aneeza Alam¹,
Ahmad Sami Al-Shamayleh²,
Nisrean Thalji³,
Ali Raza⁴,
Edgar Anibal Morales Barajas^5,6,7,
Ernesto Bautista Thompson^5,6,8,
Isabel de la Torre Diez⁹ &
…
Imran Ashraf¹⁰

BMC Medical Imaging volume 25, Article number: 5 (2025)

Abstract

A bone fracture is a medical condition characterized by a partial or complete break in the continuity of the bone. Fractures are primarily caused by injuries and accidents, affecting millions of people worldwide. The healing process for a fracture can take anywhere from one month to one year, leading to significant economic and psychological challenges for patients. The detection of bone fractures is crucial, and radiographic images are often relied on for accurate assessment. An efficient neural network method is essential for the early detection and timely treatment of fractures. In this study, we propose a novel transfer learning-based approach called MobLG-Net for feature engineering purposes. Initially, the spatial features are extracted from bone X-ray images using a transfer model, MobileNet, and then input into a tree-based light gradient boosting machine (LGBM) model for the generation of class probability features. Several machine learning (ML) techniques are applied to the subsets of newly generated transfer features to compare the results. K-nearest neighbor (KNN), LGBM, logistic regression (LR), and random forest (RF) are implemented using the novel features with optimized hyperparameters. The LGBM and LR models trained on proposed MobLG-Net (MobileNet-LGBM) based features outperformed others, achieving an accuracy of 99% in predicting bone fractures. A cross-validation mechanism is used to evaluate the performance of each model. The proposed study can improve the detection of bone fractures using X-ray images.

Introduction

The human body contains 206 bones of different shapes, sizes, and strengths. Detecting internal injuries, especially minor ones, is a challenging task for orthopedists. Machine learning (ML) is becoming prominent in analyzing radiographic images to detect bone injuries. The features extracted from X-ray images help distinguish fractured from healthy bones [1]. These features help doctors diagnose and treat ailments in a timely manner.

In 2019, the Global Burden of Disease (GBD) study reported 178 million new fractures, an increase of 33.4% over 1990. There were also 445 million acute or prevalent cases of fractures [2]. Fractures impact patients’ quality of life and put an economic burden on the health system. The average global cost of one year of health and social care for a hip fracture is $43,669 [3]. These high treatment costs emphasize the importance of timely recovery, as prolonged illness places a significant financial burden on patients.

Traditionally, X-ray reports are examined by trained medical practitioners, mostly radiologists [4]. This way of diagnosing fractures is prone to human error and inconsistencies, and there is a real chance that a fracture may be overlooked. The traditional method is also time-consuming, and delayed diagnosis may lead to improper handling. The accuracy of these diagnoses depends entirely on the medical professional’s experience, and years of practice are needed for accurate interpretation. In addition, one practitioner can attend to only a limited number of patients, which delays diagnosis and treatment. So, there is a need for a more reliable and quicker way of identifying fractures in radiographs.

ML in medical fields has revolutionized healthcare by improving accuracy, treatment, and personal care [5, 6]. Image processing and ML techniques have eased disease and drug prediction, improving clinical predictions [7]. Using ML techniques promises a bright future with better diagnoses, personalized treatments, and improved patient outcomes [8]. Recently, transfer learning-based feature engineering has become popular for improving the prediction accuracy of image datasets [9]. Pre-trained neural networks are used in transfer learning to get the benefits of the best features from the image dataset.

The main contributions of this study using transfer learning-based feature extraction are given as:

A novel transfer learning-based MobLG-Net feature extraction approach is introduced, which gives the most relevant features that are further used for training ML models.
Five ML models are applied to both spatial and MobLG-Net features to get an unbiased comparison of both techniques.
Hyperparameter tuning is done to get the most optimized results from ML models from both sets of features.
A comparison of applied models and state-of-the-art techniques is presented to give a better view of how important the research findings are.

The remainder of the paper is organized as follows: the Literature review section covers previous investigations on detecting bone fractures using X-ray images. The proposed novel feature extraction and the applied machine learning models are discussed in the Proposed methodology section. The Results and discussions section reports detailed performance metrics of the applied methods on both sets of features, along with a comparison with previous studies. The conclusion and the prospects of the study are discussed in the final section.

Literature review

Artificial intelligence has been used to detect fractures in both clinical and publicly available, open-source radiographic bone image datasets [10].

Guan et al. [11] achieved an average precision of 62.04% using 4,000 arm X-ray images. They handled a low-quality, noisy dataset whose images had a dark background. Although the precision achieved is not very high, it is better than that of many modern deep learning (DL) methods.

Kim and MacKinnon [12] achieved an accuracy of 95% using a dataset of 11,112 radiographic images of wrists. The dataset was divided into training, validation, and testing groups in an 80:10:10 ratio. They trained a CNN in TensorFlow (1.0) for 20,000 iterations. The testing phase concluded that transfer learning using a deep CNN can be applied to radiographic fracture detection. Training on plain X-ray images is highly transferable and can be used in medical imaging to reduce clinical risks.

Tanzi et al. [13] studied bone fracture detection and classification using 95 images augmented to create a dataset of 4476 images. They applied R-CNN to classify and find the location of the fracture. They achieved an accuracy of 96% with a precision of 0.866 in finding the precise location of the fracture. They used the VGG16 neural network to predict and locate fractures. The accuracy to identify the fracture is 95%.

An average accuracy of 86.76% is achieved by Lee et al. using a dataset of 786 images [14]. The images were resized to 512 x 512 pixels using Bi-LSTM decoded. The models applied were GoogLeNet-inception v3 and two proposed M1 and M2 models. The accuracy achieved by the proposed methods is not that high, and the dataset size is small.

Wrist fracture detection using X-ray images was done by Hardalac et al. [15] utilizing 26 different artificial intelligence models. Twenty distinct fracture identification processes were carried out on the wrist X-ray images dataset from Gazi University Hospital. The experiment is conducted using object detection models with different backbones. An average precision of 86.39% is achieved with a uniquely designed wrist fracture detection-combo (WFD-C) model.

The study [16] proposed DL techniques for fracture detection. A DL-based convolutional neural network was developed using plain biplanar X-ray images. The training and validation datasets consisted of 3245 fractured and 3210 normal wrist images. An accuracy of 98.0% was achieved using a CNN based on the VGG16 model. The method identifies the fracture and points to its location using a heat map. This diagnosis method reduced the error of fracture detection to 47% in the trauma centers. As the dataset used for the study comes only from adults above 18 years of age, the method cannot be relied on for fracture detection in children, as their bones have intact epiphyseal lines.

The study conducted by Sahin [17] used 176 radiographic images, 105 normal and 71 cracked. The images were converted to grayscale using color space conversion, and Canny edge detection was then applied. For the classification of normal and fractured bone, 12 different machine-learning models were utilized. The highest accuracy of 89% was achieved by the LDA classifier, followed by 86% for logistic regression and 85% for the RF classifier.

The study conducted by Ahmed et al. [18] used a small dataset of 270 lower-leg X-ray images. The dataset was pre-processed in several stages, i.e., noise cancellation, contrast improvement, and edge detection. For feature extraction, a Gray Level Co-occurrence Matrix (GLCM) was computed with five properties (correlation, dissimilarity, energy, homogeneity, and contrast), four distances, and seven angles. This technique extracted 140 features (5 x 4 x 7) for each image. Five different machine-learning algorithms were used after splitting the dataset 80:20. The highest accuracy, 92.85%, was achieved by SVM.

A recent study [19] on long bone fracture detection uses a dataset of 3000 X-ray images of fractured and normal bones. The dataset was divided into training, validation, and test sets in a 7:2:1 ratio for applying machine-learning algorithms. The highest accuracy, 96.5%, was achieved by the fine-tuned ResNet50 model with a loss score of 0.16. Classification into four classes achieved an accuracy of 87.7%, which is low compared to other models. This may be due to the reduced number of images per class after the dataset was split into four classes.

Table 1 shows a summary of the analyzed works. Numerous studies have been conducted to recognize fractures using radiographic images. A thorough look into the literature has revealed some limitations that need to be addressed.

Table 1 The analyzed literature review comparison

We found that most of the studies used a specific region of the body to identify the fracture. Despite the existing studies, fracture detection accuracy remains low. In addition, feature engineering using transfer learning is under-investigated. Few studies can generalize to fractures from all body parts. So, there is a need for a study that can identify common fractures so that the findings can be practically implemented in medical care centers.

Proposed methodology

Figure 1 shows the methodology followed for fracture detection using radiographic images. Two types of features are extracted from the existing X-ray image dataset. We used a CNN and the proposed novel MobLG-Net method to get useful features from the given dataset. Each set of features is used to train machine learning models. The extracted features are split into 80% training and 20% test datasets. The better-performing approach is then used to detect fractures in bones using the X-ray image dataset.

Radiographic images dataset

The publicly available benchmark bone fracture multi-region X-ray images dataset is utilized in this study [20]. The dataset contains 9,463 X-ray images of fractured and normal bones from all body parts, including knees, limbs, hips, lumbar region, etc. The dataset consists of two classes, i.e., fractured and non-fractured, as shown in Fig. 2. The number of training and test dataset images is 8,863 and 600, respectively.

Image preprocessing and formation

To get reliable results from machine learning models, it is necessary to have a clean dataset without any noise or null values [21]. The dataset is first imported using OpenCV, and basic pre-processing, including image resizing, is applied to standardize the input data for further analysis. The first step is to resize the images to a uniform 224 x 224 pixels so that they are compatible with the neural network architecture. Then, the dataset labels are encoded as ’fractured’:0 and ’non-fractured’:1 to ease supervised learning. Using a balanced dataset for training is advisable. Figure 3 shows the number of images in each class. The dataset is balanced, as the number of images in the two classes is nearly equal.
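The label encoding and balance check described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the helper names `encode_labels` and `class_balance` and the toy label list are assumptions, and only the mapping of 'fractured' to 0 and 'non-fractured' to 1 comes from the text.

```python
from collections import Counter

# Mapping stated in the text; everything else here is illustrative.
LABEL_MAP = {"fractured": 0, "non-fractured": 1}

def encode_labels(names):
    """Convert class-name strings to the integer codes used for training."""
    return [LABEL_MAP[name] for name in names]

def class_balance(encoded):
    """Return the fraction of samples that falls in each class."""
    counts = Counter(encoded)
    total = len(encoded)
    return {label: counts[label] / total for label in sorted(counts)}

# Toy labels standing in for the class folders of the X-ray images.
toy_names = ["fractured", "non-fractured", "fractured", "non-fractured"]
toy_encoded = encode_labels(toy_names)
toy_balance = class_balance(toy_encoded)
print(toy_encoded, toy_balance)
```

A dataset would be considered balanced in this sense when the two fractions are close to 0.5 each.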

Exploratory data analysis

The dataset used for training and testing the ensemble fracture detection model consists of 10,580 images with a uniform resolution of 224 x 224 pixels. The average image size is 12 KB, with an average resolution of 160 dpi. The shape of the training set is (8863, 224, 224, 3).
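To give a sense of scale, the memory needed to hold a training array of the reported shape can be estimated with a few lines of arithmetic. The sketch below assumes the images are stored densely, either as 8-bit pixels or as 32-bit floats; the text does not state which data type was used.

```python
from math import prod

# Training-set shape reported in the text: (images, height, width, channels).
TRAIN_SHAPE = (8863, 224, 224, 3)

def array_bytes(shape, bytes_per_value):
    """Memory needed to hold a dense array of the given shape."""
    return prod(shape) * bytes_per_value

uint8_gib = array_bytes(TRAIN_SHAPE, 1) / 2**30    # raw 8-bit pixels (assumed)
float32_gib = array_bytes(TRAIN_SHAPE, 4) / 2**30  # after casting to float32 (assumed)
print(f"uint8: {uint8_gib:.2f} GiB, float32: {float32_gib:.2f} GiB")
```

Under these assumptions the array takes roughly 1.2 GiB as 8-bit pixels and about four times that as 32-bit floats, which is one practical reason to work with compact extracted features rather than raw pixels in the downstream ML models.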

Novel transfer learning-based feature engineering

Using the radiographic images dataset, we propose a novel transfer learning-based feature engineering method, MobLG-Net, to detect fractured and healthy bones. The feature extraction mechanism is illustrated in Fig. 4. Three methods are used to get spatial features from the radiographic images dataset. First, a sequential model is implemented whose first layer is a Conv2D layer with 64 filters of size 3 x 3. This layer is followed by a linear stack of max pooling 2D, dropout, flatten, and dense layers. A total of 790,337 trainable features are extracted. Then we applied MobileNet, which is based on CNN, with pre-trained ImageNet weights [22]. The pre-trained model gave better results than the sequential model. Finally, the novel MobLG-Net model, a sequential model with the first layer of MobileNet (based on CNN), is implemented.

The uniqueness of the proposed MobLG-Net lies in its approach to feature extraction. Unlike traditional methods, MobLG-Net combines the strengths of sequential modeling and transfer learning. Initially, a sequential model extracts a rich set of 790,337 trainable features using convolutional layers, max pooling, dropout, and dense layers. While effective, this approach is outperformed by MobileNet, a lightweight convolutional neural network pre-trained on the ImageNet dataset, which leverages pre-trained weights to capture high-level spatial features with greater precision. By integrating the first layer of MobileNet with a custom sequential model, MobLG-Net draws on the strengths of both feature extraction techniques. This combination enables the extraction of more nuanced spatial features from radiographic images, leading to superior performance across various machine learning classifiers.

Algorithm 1 shows the sequential flow of the proposed transfer learning method. Algorithm 2 contains the pseudocode for the MobLG-Net method.
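The two-stage data flow described in this section (spatial features from a MobileNet-based extractor, then class probability features from LGBM) can be sketched as below. This is an illustrative outline under stated assumptions, not the authors' algorithm: `moblg_features` is a hypothetical helper, and `MeanIntensityBackbone` and `ThresholdBooster` are toy stand-ins so that the sketch runs without TensorFlow or LightGBM. In practice the backbone could be a pre-trained Keras MobileNet without its classification head and the booster a LightGBM classifier, both of which expose the `predict`, `fit`, and `predict_proba` methods assumed here.

```python
import math

def moblg_features(images, backbone, booster, labels=None):
    """Two-stage feature engineering: spatial features, then class probabilities.

    backbone: object with .predict(images) returning one feature vector per image
              (assumed: a pre-trained MobileNet-based extractor).
    booster:  object with .fit(X, y) and .predict_proba(X)
              (assumed: an LGBM classifier or anything with the same interface).
    labels:   pass labels to fit the booster (training); omit them at test time.
    """
    spatial = backbone.predict(images)       # stage 1: spatial features
    if labels is not None:
        booster.fit(spatial, labels)         # fit stage 2 on training data only
    return booster.predict_proba(spatial)    # stage 2: class-probability features

class MeanIntensityBackbone:
    """Toy stand-in: maps each image (a flat list of pixels) to its mean."""
    def predict(self, images):
        return [[sum(img) / len(img)] for img in images]

class ThresholdBooster:
    """Toy stand-in: logistic score around the midpoint of the class means.
    Assumes a single feature that is larger for class 1."""
    def fit(self, X, y):
        m0 = [x[0] for x, t in zip(X, y) if t == 0]
        m1 = [x[0] for x, t in zip(X, y) if t == 1]
        self.cut = (sum(m0) / len(m0) + sum(m1) / len(m1)) / 2
        return self

    def predict_proba(self, X):
        probs = []
        for x in X:
            p1 = 1 / (1 + math.exp(-10 * (x[0] - self.cut)))
            probs.append([1 - p1, p1])
        return probs

# Toy data: four tiny "images" with two pixels each.
train_images = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.7]]
train_labels = [0, 0, 1, 1]
backbone, booster = MeanIntensityBackbone(), ThresholdBooster()
train_feats = moblg_features(train_images, backbone, booster, train_labels)
test_feats = moblg_features([[0.85, 0.95]], backbone, booster)
```

The resulting probability vectors (`train_feats`, `test_feats`) play the role of the new feature set on which the final classifiers (KNN, LGBM, LR, RF) are trained and evaluated.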

Applied artificial intelligence approaches

Most studies use ML and DL methods for image classification tasks [23]. Convolutional neural networks are the most widely adopted DL algorithm for image detection and prediction [24]. They apply numerous layers of filters to extract useful features from the dataset and learn hidden patterns in the data. Besides classical CNNs, transfer-based feature extraction is used to get the most useful features from the radiographic images. We have applied various ML techniques to both spatial and deep features extracted using CNNs and transfer learning. Advanced ML models have performed well in the detection and prediction of cracks in bones when trained and tested on image datasets [25].

CNN is a widely used neural network whose architecture is designed to process visual data such as images. The network captures different levels of abstraction by using a sequence of convolutional layers, from simple edges and textures in early layers to more complex patterns, such as bone structures or fractures, in deeper ones. This hierarchical feature extraction makes CNNs well suited for medical imaging tasks such as detecting bone fractures in X-ray images [26]. When properly trained, a CNN can classify new images correctly by identifying familiar patterns, thus helping in decision-making for diagnosis.
MobLG-Net is the proposed novel feature extraction approach. It uses a sequential implementation of MobileNet and LGBM to get the most relevant and effective features from the radiographic images dataset. The proposed method is lightweight and is beneficial where resources like processing power and memory are limited. This is achieved through the depthwise separable convolution introduced by MobileNet. A standard convolution filters and combines all input channels in a single step. A depthwise separable convolution splits this into two steps: each input channel is first convolved separately with its own filter (depthwise convolution), and a pointwise (1 x 1) convolution then combines these outputs. This substantially reduces both the number of parameters and the computational complexity while losing very little accuracy.
KNN is widely applied for classification tasks, especially involving image datasets [27]. KNN classifies data points based on the proximity to their closest neighbor. KNN is a non-parametric ML model based on Euclidean distance to calculate the nearest neighbors. Fracture detection involves local patterns or irregularities within X-ray images. KNN can easily identify such irregular patterns as the classification relies on local neighbors [28]. This makes KNN a good choice for fracture detection in bones using X-ray images.
LGBM is an influential ML algorithm with excellent speed and prediction power. LightGBM uses gradient boosting to detect complex patterns within the data without losing efficiency. LightGBM can handle large datasets and sparse features, which makes it an excellent choice for detecting fractures in bones.
LR is mostly used in binary classification tasks. LR models the probability of a certain class by analyzing input features such as pixel intensities or extracted features [29]. This helps identify whether a fracture is present in a given image. LR is well suited for classifying fractures from X-ray images due to its simplicity and capability to handle linearly separable data.
RF is an ensemble model that combines the estimations of several decision trees to make a final prediction. By constructing several decision trees on different portions of the training dataset, RF averages their output, reducing the risk of overfitting [30]. Using multiple trees enhances the generalization ability of the ML model, making it highly effective for fracture classification tasks. RF is known for its capability to handle noisy data and complex patterns, giving more accurate results.
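The parameter saving of the depthwise separable convolution mentioned in the list above can be checked with simple arithmetic. The sketch below counts weights only (bias terms are ignored), and the layer size used, a 3 x 3 kernel with 64 input and 128 output channels, is an illustrative assumption rather than a layer taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise

# Illustrative layer size (an assumption, not taken from the paper).
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
ratio = sep / std              # equals 1/c_out + 1/k**2
print(std, sep, round(ratio, 3))
```

For this layer the separable form needs about 12% of the weights of the standard convolution, and the ratio 1/c_out + 1/k² shows that the saving grows with the number of output channels and the kernel size.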

Hyperparameter tuning

Choosing optimized parameters for training ML models is crucial, as parameters can affect the reliability of the research findings [23]. Hyperparameter tuning involves trying combinations of parameters to find the one that gives the best performance when testing the trained models [31]. Parameter optimization is performed using randomized search with cross-validation, which improved the predictive capabilities of the ML models [32]. We applied each technique twice: once with spatial features extracted using CNNs and once with the novel MobLG-Net features. The hyperparameters used in the applied ML techniques are given in Table 2.
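A randomized search with cross-validation can be sketched in plain Python as follows. This is a self-contained illustration of the idea, not the tuning code used in the study: the toy 1-D KNN model, the search space, and the synthetic data are all assumptions. Ready-made tools such as scikit-learn's `RandomizedSearchCV` provide the same mechanism for real models.

```python
import random

def knn_fit_predict(X_train, y_train, X_val, n_neighbors=3):
    """Toy 1-D KNN: majority vote among the n_neighbors closest training points."""
    preds = []
    for x in X_val:
        nearest = sorted(range(len(X_train)), key=lambda i: abs(X_train[i] - x))
        votes = [y_train[i] for i in nearest[:n_neighbors]]
        preds.append(1 if sum(votes) * 2 > len(votes) else 0)
    return preds

def cross_val_accuracy(fit_predict, X, y, params, k=5):
    """Mean validation accuracy over k contiguous folds."""
    n = len(X)
    bounds = [i * n // k for i in range(k + 1)]
    scores = []
    for lo, hi in zip(bounds, bounds[1:]):
        X_train, y_train = X[:lo] + X[hi:], y[:lo] + y[hi:]
        preds = fit_predict(X_train, y_train, X[lo:hi], **params)
        correct = sum(p == t for p, t in zip(preds, y[lo:hi]))
        scores.append(correct / (hi - lo))
    return sum(scores) / k

def randomized_search(fit_predict, X, y, space, n_iter=5, k=5, seed=0):
    """Sample n_iter random configurations and keep the best by CV accuracy."""
    rng = random.Random(seed)
    best_params, best_score = None, -1.0
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = cross_val_accuracy(fit_predict, X, y, params, k)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy, shuffled 1-D data: label 1 when the feature exceeds 0.5.
data_rng = random.Random(42)
X = [i / 20 for i in range(20)]
data_rng.shuffle(X)
y = [1 if x > 0.5 else 0 for x in X]

space = {"n_neighbors": [1, 3, 5, 7]}   # hypothetical search space
best_params, best_score = randomized_search(knn_fit_predict, X, y, space)
print(best_params, best_score)
```

Unlike an exhaustive grid search, the randomized variant evaluates only `n_iter` sampled configurations, which keeps the tuning cost bounded when the search space is large.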

Table 2 The optimized hyperparameters used for DL and ML techniques

Results and discussions

This section reviews the performance of the applied ML techniques. It details the experimental setup and compares the results of the classical CNN approach with those of the proposed MobLG-Net approach. Accuracy, precision, recall, and F1 score are used for the comparative analysis of the models. The section also describes the effectiveness of ML and DL models in detecting fractures using radiographic images.

Experimental setup

The experiments are conducted using state-of-the-art Python libraries such as sklearn and TensorFlow. ML models are trained using a powerful GPU with Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz 1.90 GHz processor and 15.9 GB RAM. Programs are written in Python 3 using Google Colaboratory (a free cloud-based platform for Python) for training and validation of the ML models. Performance indicators such as accuracy (Acc), recall, precision, and F1 score are computed to compare the results of the applied ML models.

In the context of machine learning for bone fracture detection, Acc measures the overall correctness of the model by calculating the proportion of correctly classified instances among all predictions. Recall focuses on the model’s ability to correctly identify fractures, ensuring that as many true cases as possible are detected. Precision assesses the reliability of fracture predictions, indicating the proportion of correct fracture predictions out of all predicted fractures. Finally, the F1 score combines precision and recall into a single metric, providing a balanced measure of the model’s performance, especially useful in cases of class imbalance.
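The four metrics described above follow directly from the counts of a binary confusion matrix. The sketch below treats 'fractured' as the positive class, which is an assumption about the reporting convention, and the counts are hypothetical values for illustration only, not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts.

    tp/fn: fractured images predicted as fractured / as normal.
    fp/tn: normal images predicted as fractured / as normal.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for illustration only (not results from the paper).
m = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(m)
```

Because the F1 score is the harmonic mean of precision and recall, it stays low whenever either of the two is low, which is why it is informative under class imbalance.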

Results with ML models

The ML models are trained and tested on two sets of features, i.e., spatial features and features extracted by the novel MobileNet-based approach. Time series analysis shows the variation in learning rates during the training time [33]. Spatial features are extracted using CNNs, and Fig. 5 shows the training and validation accuracy using CNNs. The training accuracy improved with each epoch, and the highest accuracy of 91% was achieved. The validation results, on the other hand, are weaker: the highest validation accuracy is 81%, which is reached in epoch seven and decreases after that.

Figure 6 shows the training and validation loss across 10 epochs during the training of CNNs. Figure 6a shows that the training loss drops sharply from around 60 to almost zero after just one epoch and remains near zero for the rest of training. Such a rapid drop of the training loss to nearly zero indicates that the model is overfitting the training data.

The validation loss starts at 1.6 and decreases until around epoch 5. After this, the validation loss fluctuates between 1 and 1.2 and does not drop as far as the training loss. The model initially improved in handling unseen data, but the fluctuations after epoch 5 indicate that it started to overfit the training data.

Results with MobileNet

The results obtained with the MobileNet approach are given in Fig. 7. The training accuracy is initially between 78% and 87%. The accuracy improved with time and reached 90% during epoch 2. The training accuracy reached 95% during epoch nine and remained constant during epoch 10. The validation accuracy fluctuates throughout training. It started at 91% and improved to 95% from epoch 1 to epoch 4. The validation accuracy dropped sharply to 86% at epoch 5, indicating that model performance decreased significantly on the validation data. After epoch 5, the validation accuracy rose again, up to 96%, which is 1% higher than the training accuracy.

The training and validation loss scores are illustrated in Fig. 8. The training loss starts at a high value of 0.85 and gradually decreases from epoch 1 to 4. The training loss then remains low, between 0.4 and 0.2, until the end of training. This consistently low value indicates that the model is learning well from the training data.

The validation loss is given in Fig. 8b. Its initial value is 0.3, which is low compared to the starting value of the training loss. The loss gradually decreases from epoch 1 to 4. Mirroring the fluctuations in validation accuracy, the validation loss rises during epoch 5. It then decreases again and reaches a minimum value of 0.0 around epochs 9 and 10.

Classification results using CNN and MobileNet

Table 3 gives the classification results for fractured and normal bone images. The CNN model achieves an accuracy of 81%, showing acceptable performance. MobileNet shows a high classification accuracy of 97% and clearly outperforms the CNN across all performance metrics. High accuracy, precision, recall, and F1-score values suggest that MobileNet is suitable for classifying fractured and non-fractured bones. The ability of MobileNet to classify correctly makes it reliable. However, there is still room for performance improvement.

Table 3 The classification of fractured and non-fractured bone images using CNN and MobileNet

Results with only spatial features

The performance of ML models on these classically extracted spatial features is given in Table 4. The highest accuracy of 93% is achieved with the LGBM model, followed by random forest having 89% accuracy. The results of the classical approach are acceptable, and reasonable values of accuracies have been achieved, yet there is room for improvement. So, we applied the ML models to the features extracted from the proposed novel approach. The results are discussed in the next section.

Table 4 The evaluation of the effectiveness of ML techniques using spatial features on test data

Results with proposed approach

The optimized features extracted by the proposed MobLG-Net are used for training the machine learning models. With the same hyperparameters used for the classical approach, the novel transfer learning-based features performed well. LGBM and LR outperformed KNN and RF by 1%: both achieved an accuracy of 99%, followed by 98% for KNN and RF. The precision, recall, and F1 scores are also above 97%, showing the generalization ability of the applied models on the novel extracted features. Table 5 details each target label’s performance metrics. The proposed transfer learning approach extracted better features than the classical CNN approach. The proposed method outperformed the classical approach by 5%, proving the pre-trained models’ effectiveness.

Table 5 The performance evaluation of applied ML techniques on features extracted using novel MobLG-Net

Figure 9 provides the confusion matrix analysis of the applied methods. Each matrix shows the correct and incorrect predictions made by the corresponding model and illustrates its strengths and weaknesses. KNN and LGBM have fewer false predictions (22 samples) than RF and LR, although the accuracy of LR is high. The RF model has a high number of false predictions, 37 samples, and the same is the case with the LR model. This confusion matrix analysis shows that the applied ML models have performed well on transfer-based features.

Computational complexity analysis

Computational complexity is assessed here as the time each applied ML model takes to train on the spatial and MobLG-Net features. As shown in Table 6, RF takes 70 seconds, the longest training time on spatial features, while KNN and LR took less than a second to predict fractures in radiographic images. Overall, performance with the spatial features is average, with LGBM and RF taking considerable time. With the novel extracted features, the proposed approach also has a lower computational cost. The longest time, 3.9 seconds, is taken by RF, followed by KNN, LGBM, and LR, each taking a fraction of a second. Moreover, LGBM and LR have the highest accuracy with the lowest computational cost, making them the best models for generalization.
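Runtime measurements of this kind can be taken with a small timing wrapper around each model's training call. The sketch below is one possible way to do this, not the authors' measurement code: `timed` is a hypothetical helper, and `toy_fit` is a stand-in for a real call such as a model's `fit` method.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

def toy_fit(n):
    """Stand-in for a model's training call: some deterministic busy work."""
    return sum(i * i for i in range(n))

result, seconds = timed(toy_fit, 100_000)
print(f"training took {seconds:.4f} s")
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations; wall-clock times measured this way still vary between runs and machines.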

Table 6 The runtime computational complexity analysis of applied ML models

Performance analysis using cross-validation

To ensure the reliability and robustness of the proposed methods, we conducted a comprehensive cross-validation-based performance analysis. The results, summarized in Table 7, demonstrate that the models achieved high accuracy with minimal standard deviation. Among the applied methods, LR and LGBM emerged as the top performers, both achieving a k-fold accuracy of 0.985 with standard deviations of 0.0034 and 0.0035, respectively. The RF and KNN models also exhibited strong performance, with accuracies of 0.977 and 0.976 and standard deviations of 0.0043 and 0.0042, respectively. These results highlight the effectiveness of the selected models and the robustness of our approach to detecting bone fractures.
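The k-fold accuracy and standard deviation reported in such an analysis are summary statistics over the per-fold scores. The sketch below shows the calculation with hypothetical fold accuracies; the values are illustrative and are not the per-fold results of the study, and the choice of the sample standard deviation is an assumption (some libraries default to the population form).

```python
from statistics import mean, stdev

def summarize_folds(fold_accuracies):
    """Return (mean accuracy, sample standard deviation) over the folds."""
    return mean(fold_accuracies), stdev(fold_accuracies)

# Hypothetical per-fold accuracies for a 5-fold run (illustration only).
folds = [0.98, 0.99, 0.985, 0.99, 0.98]
cv_mean, cv_std = summarize_folds(folds)
print(f"k-fold accuracy: {cv_mean:.3f} (+/- {cv_std:.4f})")
```

A small standard deviation relative to the mean indicates that the model's accuracy does not depend strongly on which part of the data is held out.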

Table 7 The cross-validation-based performance analysis of applied methods

Comparison with state-of-the-art approaches

The comparative analysis of the proposed approach with previous state-of-the-art studies is presented in Table 8. For a fair comparison, we have taken the most recent studies, published between 2020 and 2024. The highest accuracy achieved by previous research is 98%; that study used only hand-wrist images and therefore has very limited application, and it used a classical CNN model to reach this accuracy. The remaining studies did not report notably high performance. The proposed novel feature extraction has yielded good results, and with 99% accuracy, the approach stands out in detecting fractures in radiographic images.

Table 8 The contrast between the proposed and other state-of-the-art studies in predicting fractures using radiographic images

Ablation study

The ablation study provides a comprehensive analysis of the performance improvements achieved by the proposed MobLG-Net method compared to classical approaches. Table 9 highlights the accuracy achieved using both techniques across various machine learning methods. The classical combination of MobileNet and LGBM demonstrated solid performance, with accuracy values ranging from 0.64 to 0.93 depending on the method. However, the proposed MobLG-Net approach significantly enhanced these results, achieving near-perfect accuracy values between 0.98 and 0.99. Notably, methods like LR and LGBM saw the most dramatic improvement, with accuracy jumping from 0.64 to 0.99 and 0.93 to 0.99, respectively. This analysis clearly underscores the efficacy of MobLG-Net, demonstrating its potential to set a new benchmark for robust and accurate feature extraction in machine learning pipelines.

Table 9 Performance comparison of ML models and proposed approach as ablation study analysis

Study limitations

While our study demonstrates the effectiveness of the proposed MobLG-Net method for classifying fractured and normal bones using a dataset of 9,463 X-ray images, it is not without limitations. One significant challenge is the dataset size, which, while substantial, may not fully capture the diversity of fracture patterns across different populations. This limitation could potentially affect the model’s generalizability, particularly when applied to underrepresented groups, such as pediatric patients, whose bone structure and fracture characteristics differ from those of adults. Additionally, our study focuses on binary classification, distinguishing between fractured and normal bones. Expanding this work to include multi-class classification of different types of fractures, such as greenstick, comminuted, and spiral fractures, could enhance the clinical applicability of the model.

Conclusion and future work

This study presents fracture detection from X-ray images using transfer learning-based feature extraction. Spatial features are extracted using CNNs, and MobLG-Net features are extracted using the proposed MobLG-Net method. In the experiments, both sets of extracted features are fed to various ML methods, and their performance is compared. The results show low performance with the spatial features, whereas the models trained on the proposed MobLG-Net features achieved high performance scores. We analyzed the performance metrics of the applied ML methods and found that the proposed approach gives satisfactory results on all metrics. A time complexity analysis measured the time taken by each applied model and shows that the models trained using MobLG-Net features required less training time. Performance is also validated using a cross-validation mechanism, and the results are compared with other state-of-the-art studies.

The fracture detection study yielded satisfactory results and generalized well to unseen data. In future work, we aim to enlarge the research dataset with more radiographic images so that more robust predictions can be made. Further, instead of binary classification (i.e., fracture or normal), we will introduce multi-class classification to identify the type of fracture. Addressing these areas will allow this research to be applied in practice and improve the clinical diagnosis of fractures.
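The cross-validation mechanism mentioned in this conclusion can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the data are synthetic, the nearest-centroid classifier is a toy stand-in rather than one of the study's models, and the fold count of 5 is an arbitrary choice.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Compute the per-class mean vector (a toy stand-in classifier)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest in squared distance."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )


def cross_validate(X, y, k=5):
    """Return the held-out accuracy for each of the k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Synthetic two-class data: class 0 centred at 0.0, class 1 centred at 3.0.
rng = random.Random(42)
X = [[rng.gauss(3.0 * label, 1.0) for _ in range(4)]
     for label in (0, 1) for _ in range(50)]
y = [label for label in (0, 1) for _ in range(50)]

scores = cross_validate(X, y, k=5)
print("fold accuracies:", scores)
print("mean accuracy:", mean(scores))
```

Each sample is held out exactly once, so the mean of the fold accuracies estimates performance on data the model did not see during fitting.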

Data availability

The dataset used in this study is publicly available at the following link: https://www.kaggle.com/datasets/devbatrax/fracture-detection-using-x-ray-images/data.

Funding

This study is supported by the European University of Atlantic.

Author information

Authors and Affiliations

Faculty of Computer Science and Information Technology, Khwaja Fareed University of Engineering & Information Technology, Rahim Yar Khan, Pakistan
Aneeza Alam
Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Al-Ahliyya Amman University, Amman, 19328, Jordan
Ahmad Sami Al-Shamayleh
Faculty of Computer Studies, Arab Open University, Amman, Jordan
Nisrean Thalji
Department of Software Engineering, University of Lahore, Lahore, 54000, Pakistan
Ali Raza
Universidad Europea del Atlantico, Santander, 39011, Spain
Edgar Anibal Morales Barajas & Ernesto Bautista Thompson
Universidad Internacional Iberoamericana, Campeche, 24560, Mexico
Edgar Anibal Morales Barajas & Ernesto Bautista Thompson
Universidad de La Romana, La Romana, República Dominicana
Edgar Anibal Morales Barajas
Universidade Internacional do Cuanza, Cuito, Bie, Angola
Ernesto Bautista Thompson
Department of Signal Theory, Communications and Telematics Engineering, University of Valladolid, Paseo de Belen, 15, 47011, Valladolid, Spain
Isabel de la Torre Diez
Department of Information and Communication Engineering, Yeungnam University, Gyeongsan, 38541, Republic of Korea
Imran Ashraf

Contributions

AA conceptualization, data curation, writing manuscript. ASAS conceptualization, formal analysis, writing manuscript. MNS methodology, formal analysis, data curation. AR software, project administration, methodology. EAMB funding acquisition, visualization, investigation. EBT visualization, formal analysis, software. IdlTZ investigation, resources, validation. IA supervision, validation, writing and editing the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Imran Ashraf.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Alam, A., Al-Shamayleh, A.S., Thalji, N. et al. Novel transfer learning based bone fracture detection using radiographic images. BMC Med Imaging 25, 5 (2025). https://doi.org/10.1186/s12880-024-01546-4

Received: 13 September 2024
Accepted: 23 December 2024
Published: 03 January 2025
DOI: https://doi.org/10.1186/s12880-024-01546-4
