1. Introduction
At present, pedicle screw fixation, which is the most popular technique in spinal fixation surgery [
1], is often used in vertebral dysfunction caused by spinal deformity, injury or pathology. In the screw fixation process, screw drilling is one of the most critical steps. There are also many organs, tissues and nerves distributed around the spine [
2]. To avoid the irreversible damages to them, and meanwhile ensure good stiffness as well as biomechanical properties of the spine [
3,
4], depth of the drilling bit into bone layers has to be accurately controlled in the operation while the angle between the drilling bit and the bone must be well chosen at the beginning of the process [
5]. Traditional spine surgery mainly depends on surgeons’ experience and skills in performing the surgery, so tiny mistakes of a doctor are likely to cause failure of the surgery.
During the drilling process, the invisibility of the osseous structure, along with consistent and stable friction between the pedicle and the cutting edge of the tool, makes it hard to directly measure the penetration depth. Therefore, it’s only possible to acquire different types of signals during the drilling process and extract features to estimate the depth of penetration. Signals which can be used for analysis are mainly image signals, force/torque signals, current signals and sound signals.
Systems using image signals for navigation can significantly improve accuracy of pedicle screw insertion and allow a surgeon to evaluate screw trajectory by using preoperative or intraoperative acquired images at the same time [
6,
7,
8]. However, lots of existing devices and methods rely on force and torque transducers to measure thrust forces and torques, and then detect bone layers and changes between the different bone layers from the detected signals. Online monitoring of the state of bone drilling is essential to improve operational safety and the surgeons’ operating skills. Based on the estimated cutting resistance that was detected by a motion sensor, the hand-held bone cutting system, which was proposed by Osa et al., could automatically stop moving and avoid the penetration of inner cortical bone [
9]. Lee et al. proposed a force controller to automatically stop at critical points. In addition, Hu et al. and Mohd et al. proposed a real-time force sensing algorithm based on force signals to identify different bone layers [
10,
11,
12]. By measuring force signals, Kaburlasos et al. estimated tarsal thickness in the surgery [
13] and Federspil et al. developed a force controlled robot to detect the dura in neurosurgery [
14]. Also, Kim et al. proposed a force feedback scheme to reduce the force exerted by positioning robot and provide force feedback to the surgeon through a dual moment sensor in long bone fracture operation [
15].
However, the force signal is noisy and filtering introduces a certain delay. The delay is determined by the filtering method and the equipment they used. Also, determination of the bone layers by force signals is influenced by the threshold value used. A high threshold will cause the identification to be too late while a low threshold will lead to false detections. Accini et al. proposed a novel algorithm to detect position of the drilling bit and realize the control of bone drilling only based on a position sensor [
16]. Dai et al. pointed out that bone status can be monitored by analyzing the vibration signals of bones and presented a non-contact system to collect and analyze the vibration signals by using a laser displacement sensor, and eventually the system achieved real-time detection of the drilling state in thoracic surgery, overcoming the shortcomings of any kinds of contact sensor [
17].
Some people have gradually turned their attention to the study of acoustic emission (AE) signals. AE signals are well studied and applied in the classification of composite material damage mechanisms [
18,
19], tool wear analysis [
20,
21] and detection of crack damage based on ultrasonic testing [
22]. However, in the aspect of identification and recognition of bone layers, the AE signals are still in a research phase. Boesnach et al. analyzed the AE signals in the process of spinal drilling and proposed that sound signals have a strong correlation with bone mineral density [
23]. Pohl et al. used a sound sensor to collect the AE signals to detect the state of penetrating of mice skulls [
24], and the bone layer was successfully identified. In 2014, Sun et al. analyzed the AE signals collected during drilling process through Fast Fourier Transform (FFT) and used Exponential Mean Amplitude and Hurst Exponent to verify energy characteristics and stabilities of the AE signals, and then developed a real-time algorithm to identify the bone layers by detecting the AE signals [
25]. Liao et al. analyzed histological structure and mechanical properties of the bone layers and studied the relationships between the AE signals and the aspects including formation of chips, depth of penetration and cutting faults in the drilling process. They declared that the AE signals contain much useful information and have good potential for studies and applications [
26].
The purpose of this study is to explore an innovative method of monitoring key points based on AE signals during pedicle screw drilling. Firstly, the AE signals in the bone drilling process are collected by a sound sensor and preprocessed by a recursive fast Fourier transformation (FFT), and then the preprocessing results will be further processed by a frequency distribution-based algorithm (FDB). The coefficients obtained from the FDB algorithm will be used in characterizing the shapes of fitting functions, which can indirectly characterize the recognition of the bone layers. Finally, a neural network is used to train and test to verify accuracy of the proposed method.
The structure of this paper is organized as follows:
Section 2 details bone layer analysis and AE signal acquisition.
Section 3 presents the proposed FDB algorithm in detail and briefly introduces the structure and parameters of the neural network. Experiments are carried out and the results are analyzed in
Section 4, then conclusions are given in
Section 5.
3. The Algorithm for the State Recognition
3.1. Pre-Processing with FFT
The characteristics of the AE signal in the time domain are not intuitive. To get more information, the signal needs to be transformed into the frequency domain for analysis. In order to realize real-time identification of the bone layers, it is necessary to analyze the frequency-domain features at each moment. Fast Fourier Transform (FFT) is an ideal way to achieve this.
Since the FFT reflects global features that can’t represent the spectral characteristics of features at each moment, the signal is processed by a recursive FFT. To reduce or eliminate spectral energy leakage and fence effects, different interception functions, also called window functions, can be used to truncate the signals. We choose a Hamming window with a frame size of 512 data points and a frame shift with 100 data points. The chosen area, which is denoted by a red dotted box and contains a certain number of data points in
Figure 3, can be considered as the original data for time
t, which is a moment randomly chosen for further analysis. By referring to the conclusions, which were obtained from many sets of experiments we conducted, the frequency of the drilling bone process is mainly distributed in range of 10 kHz and 15 kHz. The signals in the frequency of this range are less affected by noise interference for the reason that normal noise are not distributed in this frequency range. Further study of the signals in the selected frequency confirms it can well represent the drilling process. By referring to force signals collected simultaneously, we chose the moments of the signals for different bone layers and use the recursive FFT to process them. The frequency distributions of the chosen moments are shown in
Figure 4.
By comparing the distribution for amplitudes in
Figure 4a–d, it can been seen that the amplitude distribution for cortical bone layer is concentrated in high frequency bands while the distribution for cancellous bone layer is concentrated in lower frequency bands. However, compared with the former graphs, the amplitude distribution of the transition region is relatively uniform for the reason that the local maxima are smaller than the values in the former graphs and the distributions have little differences during the changes of values. In another words, the distribution can also be considered as be concentrated in intermediate frequencies for the transition region. This conclusion can be further analyzed and some certain characteristics can be found to describe the regularities.
3.2. Analysis of the Energy in Bone Layer
Different bone components have different sound transmissibility, and cortical bone is denser, so it can form a better reflection even for shorter wavelength signal components, so the transmittance is poor. Cancellous bone has a loose porous structure, so high-frequency components can be diffused within the bone through these pores and be attenuated by vibration to achieve better transmittance, so a medium with low transmittance will reflect more high frequency components, which include greater energy and can be collected by the sensor. Meanwhile, the medium with better transmittance will reflect a small amount of any high-frequency components, so the high-frequency energy received by the sensor is much smaller than the former.
For any moment
t, we select a fixed number of sampling points that include the current moment for the FFT transformation to obtain a frequency spectrum at this moment. The frequencies in the spectrum are distributed between 0–22,000 Hz. The above analysis shows that the cortical bone contains more high-frequency energy and the cancellous bone contains more low-frequency energy, while the energy is evenly distributed in the transitional region. Then we select s frequency between 10 kHz and 15 kHz from the spectrum signal, so
k pieces of data corresponding to moment
t can be obtained, that is:
where
xki are values of the amplitude and
Fi is the whole sequence of the moment
i. The energy of bone layers is proportional to the sum of square of the amplitudes.
3.3. Algorithm Based on Frequency Distribution
When drilling different layers of bone, the energy in different frequency bands will have different distributions, which means a large difference in the energy value of the signal in the selected frequency range. The most intuitive feature at this point is that the curve composed of Fi differs in the frequency domain and the shape of the curve varies when the layers of the bone are different. By describing the shapes of various curves, it is possible to show the characteristics of the curve to a certain extent and then reflect the energy distribution.
3.3.1. Normalization and Sorting
In order to eliminate the influence of the energy factor in the time domain and to unify different data on the same scale for subsequent analysis and processing, the number of
k data obtained above is normalized by Equation (2):
where
Fi is the chosen sequence and max (
Fi) is the largest value of
Fi, the result
Fni is a new sequence after normalization. To describe the sequence more precisely, a method that sorts the sequence
Fni from small to large in a certain order is discovered in Equation (3):
The result
Uni is an ascending curve in the unit domain and it can continuously reflect the relative distribution of data at various energy levels. After using the proposed method, the curves in
Figure 4 can be transformed separately into the curves in
Figure 5.
The curve corresponding to
y =
x, which is also the diagonal of the figure, is defined as an “original line”. When the energy distribution in the drilling process is relatively uniform, that is, no large transmission occurs, the distribution of energy in all frequency bands tends to be uniform as a whole. The ascending curve is distributed near the “original line”, which corresponds to the curve indicated by the dark green solid line with dots and the brown dot line in
Figure 5.
When the energy distribution in the process of drilling bone is concentrated in some certain frequency bands, after normalization and sorting processing, the curve deviates from the “original line”. However, the function curve can’t distinguish the signals of high-frequency energy bands from low-frequency energy bands. The reason is that when the energy is concentrated in low-frequency bands, after processing, the role of the low-frequency energy segment is more significant and will be concentrated in the latter part of the curve, which is the same as the situation when the energy is mainly distributed in high frequency bands. Just as the
Figure 5 shows, it’s hard to distinguish the red solid line which represents the cortical bone from the blue dotted line that represents the cancellous bone.
However, the main objective of this study is to identify the key points in transitional regions between cancellous and inner cortical bone. Compared with cortical and cancellous parts, the energy distribution in transitional regions is relatively uniform, so the transition area can be distinguished from the whole drilling process. However, there is still a problem to be solved. A function, whose coefficients can represent the characteristics of the curve and identification of the bone layer, is needed to fit the data.
3.3.2. Coordinate System Conversion
By studying a number of experimental data, it’s found that the ascending curve is distributed on both sides of the straight line
y = x in coordinate
C1, and the corresponding different layers of the bone exhibit a certain regularity. However, it is difficult to fit the curve in
Figure 5. It is possible to transform the curves distributed on both sides of
y = x to the distribution along the
x-axis by coordinate system conversion, which means the curve in
xoy coordinate system
C1 can be transformed into
x0oy0 coordinate system
C2 for analysis, just as
Figure 6 indicates.
For any point (
x1,
y1) on the curve, the relationship between the two coordinate systems can be established using Equation (4):
The solutions of Equation (5) are:
so the relationship between the coordinate
C1 and
C2 is:
where
A =
is the transformation matrix between the two coordinates
C1 and
C2.
By using Equation (6), the ascending curve representing the relationship between the amplitude and the frequency in
Figure 5 can be converted into a new curve, just as
Figure 7 shows.
3.3.3. Fitting Function Selection and Evaluation
As can be seen in
Figure 7, the curve is distributed on both sides of
x-axis and appears to be in the shape of a sine curve. Therefore, the fitting function should include a sine term that is across the points (0, 0) and (1, 0) and try to keep the function in a sine cycle. In addition, the change of coefficient of the sine function should also be able to control the intersection of curve and
x-axis, so the sine function can be expressed in the form:
where
c is the coefficient to control the location of the intersection point.
At a different time
t, the amplitude of the function curve will change accordingly. In addition, the amplitude of the curve increases when the value of
x increases, which is more obvious especially when the value of
x is large enough to make the curve be in the lower half of the
x-axis, so the amplitude of the function should also be related to the argument
x. An exponential relationship can be found between the variable
x and the amplitude. Thus the fitting function also has an item in form
aebx. Then the most intuitive performance is that the amplitude ratio between the amplitude below and above the
x-axis is related with the coefficient
b. To sum up, the function of the form (8) has a good fitting effect:
where
a,
b,
c and
d are coefficients of the fitting function, which are determined by the original data and fitting function together. The coefficients will be used to represent the identifications of different bone layers.
We selected two groups of data corresponding to different moments which represent the cortical bone layer and transition region for fitting and then evaluated the fitting process. The curves are shown in
Figure 8, and the results of the judging process and evaluation results are shown in
Table 1.
In
Table 1, SSE is the sum of squares due to error, so the closer the SSE is to zero, the better the effect of model selection and fitting is, and the more successful the data prediction is. R-square is the coefficient of determination and is used to characterize the goodness of fit and normality. The range of values is between 0 and 1, and the closer it is to 1, the better the model at fitting the data. RMSE is the root mean square error, which is also called fitting standard error of the regression system. It is the square root of the mean of squares of point errors of the prediction data and the original data. The smaller the value is, the better the fitting function is. It can be seen from the data in
Table 1 that the chosen fitting function has a good fitting effect and strong ability to explain the original curve. Therefore, the coefficient of the fitting function can well characterize the changes of the curve.
3.3.4. Analysis of the Algorithm’s Coefficients
The coefficients of the fitting function can well characterize the changes of the curve. When the drilling bit goes through the cortical bone and cancellous bone, the energy is more concentrated. After normalizing and sorting, the distribution of the latter part of the upward curve is more concentrated, while the distribution of the former part is sparser. Then for the rising curve, the number of the data above the “original line” is quite small. After the conversion of the coordinate system, the number of data above the x-axis is small, the most intuitive performance of which is that the intersection of the curve and x-axis is distributed in the front of the interval [0, 1], and the amplitude above the x-axis is small while the amplitude ratio mentioned above is quite large.
When the drilling bit is in the transitional region, the energy is distributed in the high-frequency and the low-frequency part at the same time. After processing, the distribution is scattered and relatively uniform around the “original line”, leading to the phenomenon that the intersection of the curve and x-axis is distributed in the middle part of the interval [0, 1] and the amplitude above x-axis is larger while the amplitude ratio is smaller.
It’s found from the study of fitting function (8) that the coefficients of the fitting function is mainly used to represent the amplitude of the curve above the
x-axis, so when crossing cortical and cancellous bone, the amplitude above the
x-axis is small, leading to a small value
a. Meanwhile, when passing through transitional region, the amplitude above
x-axis is larger, resulting in a bigger value
a. The coefficient
b shows a significant effect on the amplitude ratio of the curve. Term
b is quite large corresponding to cortical and cancellous bone, while it decreases significantly when it meets with the transitional region. Coefficients
c and
d jointly characterize the intersection of the curve with
x-axis. When drilling through cortical and cancellous bones, the intersection meets the front of the interval, the value of
c is large and the value of
d is small. In the transition region, the intersection point is in the middle of the interval where
c is small and
d is large. The situation of the coefficients corresponding to different bone layers is shown in
Table 2 and changes in shapes of the functions by different coefficients are displayed in
Figure 9.
In
Table 2, “s” represents small and “h” represents large. It can be seen clearly that the characteristics of the transition region differ a lot from the cortical and cancellous bone, which can be used as the results of the algorithm for bone layer identification.
Figure 9 indicates that coefficient
a mainly controls the amplitude of the curve and coefficient
b shows the ratio of the amplitude, while the coefficients
c and
d together control the intersection between the curve and the
x-axis.
To sum up, the whole process can be summarized in several simple steps. Firstly, we collect the AE signals and use the recursive FFT method to transfer the signals into the frequency domain for analysis. Then, we use the proposed FDB algorithm to deal with and get the sets of coefficients as results. Lastly, the bone layer identification can be realized by the coefficients. The whole process can be seen in
Figure 10.
3.4. Neural Network Model and Parameter Setting
In order to avoid accidental result and prove universal significance of the proposed method, a large amount of experimental data needs to be used to prove it. For the sake of generality, a part of the experimental data can be trained and then the results of the identification will be obtained by testing with other untrained data. The result of recognition rate can be used as an index to evaluate the effectiveness of the algorithm. Among many methods, neural network is a better choice.
When the coefficients of fitting function are taken as inputs and the recognition result of the bone layer is taken as the output, the input layer of network model has four inputs and the output layer only has one output. According to selection rules of network nodes, the error back propagation (BP) neural network with implicit layer containing three neurons is selected. The neural network model used to train and recognize data collected in experiments, is shown in
Figure 11.
In
Figure 11b,
x1–
x4 are input values of the eigenvectors,
wij is the weight corresponding to input
xi and
bj is the corresponding threshold or offset value. The weight and threshold value will continuously change according to the error feedback in the network training process to let the network achieve the best conditions. The fitting function coefficients are used as eigenvectors. In order to facilitate the processing and comparison of data, the eigenvectors need to be normalized to make the Euclidean norm equal to one. For the coefficients
a,
b,
c,
d, we use the maximum-minimum normalization (9) to process the signals:
where
x is replaced by the coefficients of the fitting function, respectively. The obtained coefficients are distributed in the interval [0, 1].
The transfer functions are selected as follows: the log-sigmoid-type transfer function “logsig”, whose output is between 0 and 1, is selected in hidden layer and the linear transfer function “purelin”, whose input and output values can take any value, is selected in the output layer. However, since the output of the hidden layer is the input of the output layer, the output value of the network is between 0 and 1. The “trainlm” function, which represents the “Levenberg–Marquardt” method and has the fastest training speed in the middle scale of a feed-forward network, is chosen as training function. We set the number training times of the network at 2000, and 10−6 as the convergence error and 0.001 as the learning rate. The neural network model can meet the requirements.
In addition, according to the collected force signal, the fitting coefficients at different times after treatment are marked differently. The time when layers are characterized as cortical or cancellous bone by force signal and the coefficients is marked as 0 while the other moments are marked as 1 for transition regions. Due to the fact a linear transfer function is used in output layer, the output value must be distributed between [0, 1] instead of only 0 and 1 values. In order to have a good measure of training and recognition results, we select a function to set the values between 0 and 0.1 to 0 and set the values between 0.9 and 1 to 1.