Classification of Cervical Disc Herniation Disease using Muscle Fatigue based surface EMG signals by Artificial Neural Networks

This study presents the classification of cervical disc herniation patients and healthy persons by using muscle fatigue information. Cervical disc herniation patients suffer from neck pain and muscle fatigue in the neck increases these aches. Neck pain is the most common pain type encountered after back pain. The discomforts that occur in the neck region affect the daily quality of life, so the number of researches done in this area is increasing. In this study surface Electromyography (EMG) signals were used to examine muscle fatigue. EMG signals were obtained from Trapezius and Sternocleidomastoid (SCM) muscles in the cervical region of 10 control subject and 10 cervical disc herniation patients. Surface EMG was preferred because it is a noninvasive method. In the first step of this study, EMG signals were filtered and adapted for analysis. In the second step, muscle fatigue was determined using Median and Mean frequency values obtained by Fourier Transform and Welch methods. Feature extraction was the third step which was performed by Short Time Fourier Transform (STFT), Discrete Wavelet Transform (DWT) and Autoregressive method (AR). Finally, Artificial Neural Network (ANN) was used for classification. Training and test data were created by using feature vectors to classify patients with ANN. According to the results, the superior feature extraction method was investigated on patient classification using muscle fatigue information. The best results were obtained by AR method with %99 classification accuracy. Also, the best results were obtained by DWT with %100 classification accuracy for SCM muscle. This study has contributed that AR and DWT are a suitable feature extraction methods for surface EMG signals by providing high accuracy classification with artificial intelligence methods for cervical disc herniation disease. Besides, it is shown that muscle fatigue distinguishes cervical disc herniation patients from healthy people.


Introduction
The electromyogram is defined as the electrical activity that occurs when muscles are resting and contracting. Movements and positions of limbs are controlled by the conduction of electrical signals between the muscles and the central and peripheral nervous system. When a pathological condition occurs in the motor system (spinal cord, neurons, muscles, neuromuscular connections), the characteristics of the electrical signals in the muscles change. The recording of electrical signals in the muscles provides important information in the diagnosis of anomalies in both muscles and motor system. Electromyography (EMG) is the recording and interpretation of muscle action potentials [1]. Surface electromyography (sEMG) is a non-invasive technique that recognizes signals with time-related characteristics and it is very useful for understanding stimulus responses of muscles. Surface electrodes are placed on the skin and measure the combined activity from many motor units [2]. sEMG is widely used for muscle strength estimation, muscle fatigue measurement and ergonomics, sports physiology, and diagnostic tools in rehabilitation medicine [3]. It is also a non-invasive indication of muscle activation level, therefore, it can be used directly to identify weak muscles [4]. EMG signals are also used to detect and measure muscle fatigue [5], [6], [7], [8]. There are differences between the sEMG signals obtained from a muscle with local muscle fatigue and the signals obtained whereas there is no fatigue. The two most general changes are the shift in the frequency component of the signal toward the end of the power spectrum and increasing in the amplitude [9]. Spectral analysis methods are used to show the presence of fatigue in the majority of studies. The results of analysis have shown that the frequency shift and the increase in amplitude in the spectrum are related to muscle fatigue. Muscle fatigue produces a maximum reduction in power and the power spectrum generally shift towards low frequencies [10], [11], [12]. The studies on 'muscle fatigue' show that the changes in the spectral characteristic and amplitude are caused by physiological reasons. In addition to this, almost all of the shift in the frequency is dependent on the propagation speed of action potentials [13]. Mean Frequency (MNF) and Median Frequency (MDF) are spectral variables commonly used in muscle fatigue studies [10], [14], [15]. Both variables have some disadvantages. MDF is less sensitive to noise whereas MNF is more sensitive and stable to changes in the spectrum [16]. In addition to the shift in the power spectrum, muscle fatigue causes an increase in the amplitude of the sEMG signal [12], [10], [17]. Muscle fatigue can occur in various muscles and it can be measured. Researches in the neck muscles confirm the changes in the spectral variables. Mousavi et al. investigated the relationship between muscle fatigue and muscle's functional role in Trapezius muscle during isotonic and isometric contractions. In their studies, EMG signals were simultaneously recorded from upper Trapezius and middle deltoid muscles with surface electrodes from 8 healthy subjects. For each second of recorded EMG signals; square root means square (RMS) and mean power frequency (MPF) values were calculated. The results show that the MPF is shifting to lower frequencies and RMS values increase when the Trapezius is in equilibrium and decreases when it is attractive [18]. McLean et al. recorded EMG signals from the lumbar and cervical regions with surface electrodes to examine muscle fatigue during prolonged postural contractions. The records were obtained from six subjects sitting in a seat without waist and armlet for two hours. In their studies, they used MDF and MNF which are myoelectric spectral parameters. As a result of this study, they showed that both the mean frequency value shifts towards lower frequencies as fatigue duration increases and an increase in the amplitude of the signal occurs [19]. Subaşı ve Kıymık aimed to determine muscle fatigue in biceps by time-frequency methods and independent component analysis (ICA). For this purpose; the EMG signals were obtained from 14 normal young people during phasic voluntary movements. EMG signals were analyzed in the time-frequency domain for the detection of muscle fatigue. In the study, muscle fatigue, which occurs on the upper extremity, is determined using multi-layered artificial neural networks (MLPP). Feature extraction operations were performed with STFT, Wigner-Ville and Continuous Wavelet Transform. The dimensions of the extracted signals are then reduced by ICA and the unknown EMG signal is classified by Levenberg-Marquart (LM) and Gradient Descent (GD) algorithms. According to the results, it is observed that the specificity and sensitivity values are found to be over 90% and the MLPP classifier using the LM algorithm could be used for muscle fatigue studies [20]. In this study, determination of local muscle fatigue in cervical disc herniation patients was investigated. As mentioned in the literature; it was necessary to detect the shift of the power spectrum density towards low frequencies. For this purpose, MDF and MNF values were used to examine muscle fatigue in the neck region. The power spectrum of the filtered EMG signals was obtained by the Welch method and MDF and MNF values for Resting/Working/Fatigue cases were calculated for each healthy/patient subject. The other part of this study is patient classification. It is aimed to differentiate patients from healthy subjects with EMG signals recorded during muscle fatigue. The main outlines of this work are pre-processing, feature extraction and classification. STFT, DWT and AR methods were used to extract the feature from the EMG segments. The power spectrum density (p) is used as a feature for STFT. Coefficients of CA6, CD6, CD5, CD4, CD3 obtained using Daubechies wavelets of dB2, dB3, dB4 were selected as the features for DWT. In AR method, the coefficients obtained with Burg, Yule-Walker and Covariance methods were used as features. In this study, a Multilayer Artificial Neural network (MANN) was used to classify sEMG signals. Multi-layered networks were created from three layers: the input layer, the hidden layer, and the output layer. 'Logarithmic sigmoid' network structure was used as an activation function for the hidden layer and output layer. This network structure was trained by momentum and adaptive learning rate backpropagation algorithm (Traingdx). For each of the classification stages, optimum values were sought in the test procedure in order to obtain the best result. The test error was taken into consideration after the optimum values were determined. Also, each classification period was recorded. Classification accuracy, sensitivity, and specificity were determined to evaluate the classification results with optimum values. This study compared the performance of different feature extraction methods in the same ANN architecture. It was observed that the STFT is not preferred for nonstationary signals due to the imbalance between time and frequency resolution. Generally, DWT which is time scaled method is preferred instead of STFT. The classification performance shows that STFT is more unsuccessful than other methods for this study. AR and DWT methods are more effective tools for non-stationary signals such as EMG.

Materials and methods
This study consists of the steps of recording sEMG signals, converting the signals to the appropriate form for analysis, filtering, muscle fatigue analysis, feature extraction, and classification. The sEMG data used in this study were recorded from 10 healthy subjects (4 male-6 female, age: 19-48) and 10 patients (8 male, 2 female, age: 17-67). Patients who participated in the study were diagnosed with a cervical disc herniation by a doctor. Healthy subjects were selected from those who did not have any complaints from the neck region. sEMG data were recorded by a doctor and a technician in the neurology clinic with the approval of Selcuk University Medical Faculty ethics committee. The recording process was performed from Trapezius and SCM muscles simultaneously using two channels of the multi-channel EMG device. Two Ag-AgCl surface electrodes were used for each muscle. In addition, the reference electrode was placed on the other fixed stationary arm to reduce the noise problem. In this study, the surface electrode was chosen instead of the needle electrode, so the recording process was carried out non-invasively. sEMG recordings were obtained in three steps for each subject. For the first step called resting state; the participant was asked to wait 20 seconds without moving the right arm in the upright sitting position. In the second step, the subject held a pound of weight in the right hand and turned his right arm full direction right to left for 20 s. Simultaneously a constant force was applied in the direction of the left arm to the neck of the subject while at the same time trying to rotate the neck to the right side for 20 seconds. Simultaneously a constant force was applied in the direction of the left arm to the neck of the subject while he was trying to rotate the neck to the right side for 20 seconds. This step is called working. The purpose of working step is to incorporate the Trapezius and SCM muscles in motion at the same moment. The third step is fatigue process. For this step, the subject was asked to hold his weight for 20 seconds, parallel to the right arm without moving. Fig.1. sEMG recording process [21] As a result of these operations, 60 sEMG data were recorded from 20 participants. sEMG signals were recorded with a sampling rate of 10 kHz. Figure 1 shows the sEMG recording system

Dataset and data pre-processing
Recording procedures of sEMG signals consist of three phases: resting-working-fatigue. Each stage takes 20 seconds. A timer was used to hold the time for recording. Records were obtained manually by pressing the start / stop buttons of the EMG device both the beginning and the end of 20 s. Each record typically contains 202496 samples. Since the timer was used manually, 2496 samples were obtained extra for 10 kHz in 20 seconds. After the extra samples were removed from the signal, filtering process was applied to remove the noise components. A Butterworth filter was used for the filtering the sEMG signals. In the second order Butterworth filter, a low pass and high pass filter were designed to use signals in the frequency range of 3-1000 Hz. Meaningful EMG signals are known to locate between 20 and 500 Hz of the frequency band. However, in this study; 3-1000 Hz band was used to protect the fatigue parameters. Figure 2 shows the frequency spectrum of original sEMG signal and its filtered frequency spectrum for a healthy subject. Figure 3 shows the frequency spectrum of original sEMG signal and its filtered frequency spectrum for a patient subject.

Examination of muscular fatigue in the neck region
The muscle fatigue occurs as a result of the muscular activity which is characterized by changes in EMG in the time or frequency domain. Generally, local muscle fatigue occurs after prolonged and relatively strong muscle activity. Because of the varying muscle characteristics of the p, there is no simple function of muscle load and timing that define muscle fatigue threshold [22]. The most common changes in muscle fatigue for EMG signal is the shift in the frequency component of the signal towards the end of the power spectrum and the increase in amplitude [9]. MDF and MNF are commonly used as spectral variables in muscle fatigue studies [10], [14], [15]. Analysis of the EMG signals in the frequency domain is done by measuring and calculating the parameters that determine the properties of the signals in the frequency spectrum. Usually, the Fast Fourier Transform (FFT) is used to determine the power spectrum density (PSD). Mean frequency expresses the average frequency of the power spectrum. The median frequency represents the frequency dividing the power spectrum into two parts of equal power. EMG signals are non-stationary signals. However, it can be regarded as stationary when examined in short time intervals [23]. FFT is not appropriate for non-stationary signals. For this reason, the Welch method is preferred for analysing the power spectrum of EMG signal. In the process of Welch method, overlapped (overlapping) segments are used and windowing is applied to all segments [24]. MDF and MNF values used to define muscle fatigue were calculated by the Welch method. 40 simultaneous recordings were obtained from Trapezius and SCM muscles of 20 subjects and MNF and MDF frequency values were calculated for resting-working-fatigue steps for each recording. According to the Table 1 and Table 2, it is observed that MNF values were insufficient to define muscle fatigue for the Trapezius muscle, however more suitable results were obtained for the SCM muscle. In addition, MDF provided the most stable results for both muscles. Through muscle fatigue analysis, it was determined that muscle fatigue occurred in both muscles by the experimental protocol which was developed for this study. So that, in the next step feature extraction and classification operations were performed using sEMG signals obtained during the fatigue phase.

Feature extraction methods for sEMG signals
Three different methods were used to extract features from filtered surface sEMG signals. These are STFT, DWT and AR methods. The extracted features were used as input vectors for ANN.

Short Time Fourier Transform (STFT)
Fast Fourier Transform is a method used to extract frequency characteristics from signals. Since it is not appropriate to use FFT on nonstationary signals, STFT which uses windowing method is recommended instead of FFT. The windowing method is used to handle a small piece of the signal in the time domain. Besides, it is used to express the sign in two dimensions as a function of time and frequency. In this transformation method, a certain part of the signal is assumed to be stationary, and Fourier analysis is performed by passing through a window [26]. There are two major problems with STFT. First, it is not possible to select the optimum window size for data segments with different properties. The second is; time-frequency imbalance. Increasing the time resolution reduces the frequency resolution. Also shortening of the data segments leads to the loss of low-frequency components. This imbalance between time-frequency resolutions can be solved by time-scale methods. In this study, hamming window structure with 512 lengths is used to perform STFT. The amount of shift in the original signal of the window is chosen 50%. The power spectrum density of each sEMG segment is calculated and used as a feature for ANN input. Thus, sEMG segments obtained for each subject were obtained as 257*395 for 512*395 of input data by calculating power vector. As a result, 257*3950 input data were generated for 10 healthy subjects. The same values are also valid for fatigue step obtained from patients.

Discrete Wavelet Transform (DWT)
The second feature extraction method is Discrete Wavelet Transform (DWT). The wavelet transform, a time-scaled method, is used to remove the STFT's resolution-related problems. DWT is based on the principle of providing more useful information from the original signal by dividing time and frequency into specific scales [24]. DWT uses a large window for areas where the bandwidth is narrow at low frequencies, a compressed, scalable window to analyse high-frequency details [27]. DWT is an effective tool for obtaining more variable time-frequency information from non-stationary signals such as EMG [28]. DWT provides high temporal resolution and low-frequency resolution at high frequencies, high-frequency resolution at low frequencies.
Thus, compared to the uniform time resolution of all frequencies of STFT low time resolution can be obtained. In the process of DWT method, a two-channel low-pass filter is used for signal processing. In the first order filtering, the signal is decomposed into low frequency components (A, Approximation) and high frequency components (D, Detail) according to an arithmetic rule [29]. After one level of decomposing, the whole signal is represented by half the number of samples, so the resolution decreases as well. However, the resolution of the frequency is increased because the frequency band of the obtained signal is half of the frequency band of the signal at the upper level. Thus, the uncertainty in the frequency is reduced by half. Better time resolution is obtained at high frequencies and better frequency resolution at low frequencies [30]. In analysing the signals with DWT, the number of appropriate wavelet selection and decomposition levels is very important. The number of decomposition levels is determined by the dominant frequency components [31]. In this study, decomposition level was set to six to protect fatigue-related values in the sEMG signals. Thus, each sEMG segment was divided into as detail sub bands D1 to D6 and Approximate band A6. In this study, features were obtained using 2nd level, 3rd level and 4th level Daubechies wavelets. For using 2 nd level Daubechies (dB2), the third, fourth, fifth and sixth level detail wavelet coefficients (66 + 34 + 18 + 10 coefficients) and sixth level approximation wavelet coefficients (10 coefficients) were calculated for each sEMG segment and a total of 138 wavelet coefficients were obtained. Looking at the wavelet frequency ranges, the first and second wavelet coefficients are eliminated because there is no significant sEMG signal in the D1 and D2 wavelets. Thus, 138*395 input data were obtained for each subject. As a result, 138*3950 input data were generated for 10 healthy subjects. For using 3rd level Daubechies (dB3), the sum of the wavelet coefficients (68 + 36 + 20 + 12 + 12 coefficients) was calculated as 148. Thus, 148*395 input data were obtained for each subject. As a result, 148*3950 input data were generated for 10 healthy subjects. For using 4th level Daubechies (dB4) using, the sum of the wavelet coefficients (70 + 38 + 22 + 14 + 14 coefficients) was calculated as 158. Thus, 158*395 input data was obtained for each subject. As a result, 158*3950 input data were generated for 10 healthy subjects. The same values are also valid for fatigue step obtained from patients.

Autoregressive Model (AR)
AR method is the third feature extraction method used in this study. Spectrum estimation methods are relatively easy to understand and can be easily calculated using the FFT algorithm. However, in order to obtain high-frequency resolution with these methods, it is necessary to work with long data records. Moreover, these methods are specific for finite data records and they are adversely affected by windows from spectral leakage effects. Often spectral leaks mask weak signals in the data. The model based approaches remove the requirement for window functions. As a result, parametric (model-based) power spectrum methods don't have the spectral leakage problem and provide better resolution than parametric methods [32]. In this study, AR method was used to handle the windowing problem. Fourth order coefficient vectors of yule-walker, Burg and Covariant estimators were selected as features for AR method. As a result of the calculations, 5*395 input data was obtained by three AR estimator for each subject. So that, 5*3950 input data were generated for 10 healthy subjects. The same values are also valid for fatigue step obtained from patients.

Artificial neural network (ANN)
In this study, multi-layer artificial neural network (MANN), which is widely used for estimation problems in engineering applications, was selected. Multilayer networks are composed of three layers: input layer, hidden layer and output layer. The hidden layer activation function and the output layer activation function were chosen as 'logarithmic sigmoid'. This network structure is trained by momentum and adaptive learning rate backpropagation algorithm (Traingdx). For each of the classification stages, optimum values were sought in the test procedure in order to obtain the best results. First, optimum hidden node number was searched by keeping the momentum constant (mc) and learning rate (lr) constant. Second, the optimum learning rate was calculated by keeping the number of optimum hidden nodes and mc constant. In the third step, optimum momentum constant was found according to the number of hidden node and learning rate. After finding optimum values, the artificial neural network was trained with input vectors and the average training error was calculated. Equation (1) shows the average training error (T.E), where training result is e(i) and training target is h1(i) [33]. (1) For Equation (1); k is the target data pattern number, m is the training data set number, and n is the network output number.
The test error has similar to the calculation of the training error in Equation (2). Equation (2) shows the average test error (Ts.E), where test result is t(i) and test target is h1(i). (2) For Equation y, k is the number of test data patterns, m the number of sets for test and n is the number of network outputs [33]. Confusion matrix is used to determine classification accuracy, sensitivity and specificity for evaluating the classification results. Table 2 shows the necessary information for the complexity matrix. For Table 2, the classification accuracy (CA) is calculated with equation (3), sensitivity (SE) with (4) and specificity (SP) with (5) [34]. (3)

Results
This study aimed to examine muscle fatigue in the cervical region and to show that local muscle fatigue of cervical disc herniation patients is different from healthy subjects. Simultaneous surface sEMG signals were obtained from the Trapezius and SCM muscles in the cervical region using surface electrodes from 10 healthy and 10 patient subjects. sEMG data were recorded from each subject for resting-working-fatigue steps. The recorded sEMG signals were filtered and adapted for analysis.  In this study, a multi-layer artificial neural network was used to classify sEMG signals. In order for ANN training to be more efficient, the set of input vectors normalized to [0, 1.0]. In this study, ANN has two output vectors as patient and healthy. The threshold values for the classification results in the ANN model were accepted as 0.9 and 0.1. So that, output vectors are assumed to be 1 when α≥0.9 and output vectors are assumed to be 0 when outputs are α ≤0.1. Table 3 presents the results of ANN according  to the features obtained with STFT, DWT and AR method for  Trapezius muscle. As shown in Table 3, dB3 wavelet and AR-Cov parameters used to classify healthy and patient subjects were more successful than others for Trapezius muscle. The AR-Cov method gave the best results with 99% classification accuracy. The shortest classification time was obtained by the AR-yule-walker. Table 4 shows the results of ANN according to the features obtained with STFT, DWT and AR method for SCM muscle. As shown in Table 4, dB3 wavelet and AR-Burg parameters used to classify healthy and patient subjects were more successful than others for SCM muscle. The dB3 coefficients gave the best results with 100% classification accuracy. The shortest classification time was obtained by the Ar-Burg method. This study compared the performance of different feature extraction methods for the same ANN architecture. According to the results; AR method provided the best classification accuracy for the Trapezius muscle and DWT gave the best classification accuracy for the SCM muscle.

Conclusion
This study was designed to examine muscle fatigue in the neck region and to classify cervical disc herniation patients using muscle fatigue. The recording procedure is non-invasive since EMG data is recorded with surface electrodes. In addition, simultaneous recordings were made using the multi-channel EMG device from two different muscle.
In the first step of the study, muscle fatigue was determined by MDF and MNF values obtained using the Welch methods. In the second step; feature vectors of sEMG signals were obtained with STFT, DWT and AR method. Training and test data were prepared for ANN using feature vectors. Finally, the superior feature extraction method for classification using muscle fatigue was investigated. According to the classification performance, STFT did not give sufficient results. In addition, AR and DWT methods were observed to be more effective tools for obtaining time-frequency information from nonstationary signals such as sEMG.

Discussion
In this study, sEMG signals from two different muscles were recorded with the multi-channel EMG device to examine muscle fatigue in the neck region. In addition, the experimental procedure was specifically developed to fatigue the neck muscles for this study. According to the experimental procedure, the Trapezius muscle participated muscle actions in more. SCM is in motion as long as a force is applied to it. In other studies, multichannel EMG can be used for more muscle, and muscle fatigue can be examined by changing the procedure of the experiment. So that, more muscle fatigue information can be obtained from more muscles in the neck region. In this study, sEMG records were obtained from volunteers who came to the neurology clinic. It is difficult and timeconsuming to collect data in the same registration procedure from each subject in policlinic. At this point, it is very important that the participants and the staff who registers the data are informed about the registration. In addition to this, increasing the number of healthy and patient participants can provide more stable results. This study shows that muscle fatigue characteristic is different in cervical disc herniation patients from healthy people.