1. Introduction
BSR (Buzz, Squeak, Rattle) noise is a common quality issue in automotive interior parts, with over 50% of these issues occurring in panels, seats, and doors [1,2,3]. Addressing consumer complaints resulting from BSR noise requires substantial improvement costs. In the structural domain, BSR noise is linked to performance degradation and durability issues in components. BSR noise is classified into Buzz, Squeak, and Rattle problems. Buzz and squeak problems have a clear structural mechanism, which has allowed theoretical and analytical approaches to improvement to be established. In particular, studies on friction-induced noise focus on the dynamic instability mechanisms that occur in systems described by linear theory, leading to mechanism-based solutions.
Kang et al. [4] developed a comprehensive mathematical mechanism for dynamic instability in brakes, providing a theoretical understanding of friction-induced noise. Nam et al. [5] analyzed the mechanism of the point contact friction model for friction-induced noise using a pin-on-disk friction system and explained the cause of friction-induced noise by characterizing the friction curve experimentally. Although friction-induced noise has been studied extensively through various applications of linearization theory, analyzing the mechanism of rattle noise remains challenging because of its nonlinear characteristics.
Rattle can be described mathematically through an impact oscillator that includes Hertz's contact model. However, researching rattle is challenging owing to extreme nonlinearities such as chaos [6,7]. Shin et al. [8] introduced a dynamic stiffness analysis technique, a degradation BSR analysis technique, and a direct virtual method developed from the BSR perspective to ensure the robustness of the BIW (Body-in-White) body system, which acts as a load transfer medium, and the corresponding modules for each part. Lee et al. [9] improved the E-Line method, commonly used to predict BSR noise, by utilizing a statistical method to determine the tolerances between parts expressed as dispersion and dynamic deformations. To directly express the behavior of rattle noise in the seats of autonomous vehicles, Kim et al. [10] calibrated an analysis model based on sinusoidal wave experiments and described the location and characteristics of impact noise through explicit analysis. Although advances in hardware and software have enabled quicker simulation-based studies of rattle, these analyses still require significant time and cost. BSR mainly occurs in automotive interior parts and is evaluated during the final phase of performance verification of automotive seats as a measure of quality. Choi et al. [11] analyzed BSR characteristics after performing excitation and operating durability tests on automotive seats and tracked the major noise sources. In another study, they analyzed the BSR vibration characteristics of the seat cushion frame before and after durability tests to assess how changes in the stiffness of the frame affected the BSR characteristics [12]. Wan et al. [13] studied an efficient noise diagnostic method based on the STRE-VK method, which calculates measurement criteria for identifying various types of BSR by separating signal components, and demonstrated that BSR could be predicted based on signal processing.
Predicting BSR is challenging, and prediction alone does not provide a clear solution. Furthermore, comprehensive system analysis requires significant cost and time. Solutions are mainly pursued through experimental measurement, which requires complex systems, expensive equipment, and engineers with expertise because the relevant regulations and calculation methods are complex.
Deep learning has surpassed human cognitive abilities in various fields through its rapid advancements. Algorithms built on nonlinearity achieve highly accurate predictions for unstructured data. Predictions using deep learning do not require equations of motion and rely solely on data, encompassing uncertainty and nonlinearity without the need for complex calculations. Wiercioch et al. [14] proposed a novel deep neural network (DNN)-based model to predict the characteristics of molecules and demonstrated accurate prediction of chemical characteristics. Additionally, Yu et al. [15] proposed an MD-DNN (Multi-channel Decoupled DNN) model as a strategy for balancing the correlation between output variables through shared and separated parts. DNNs are widely used in various fields to predict nonlinear systems.
In vibration analysis, deep learning is used to predict and analyze vibrations in numerous applications. Nam et al. [16] visualized the chaos phenomenon, the most complex phenomenon in dynamical systems, using various signal processing methods. They also described how recurrence plots can be used with convolutional neural networks (CNNs) to classify chaos phenomena. Recurrence plots require a reconstructed phase space to address self-crossing issues, but it is challenging to reconstruct the ideal geometric dimensions of complex, noisy trajectories such as those of real-world phenomena [17]. Thus, predictions and classifications based on experimental data are expected to provide the most straightforward and purposeful direction for BSR studies. Huang et al. [18] proposed a theoretical architecture for diagnosing acoustic faults based on time-frequency analysis and machine learning using Support Vector Machine (SVM) techniques. They presented fault identification results based on signals measured using smartphones and discussed the accuracy of their results.
BSR is evaluated using Loudness N10, a quantitative metric. Since the calculation is complex and requires expensive equipment and specialized software, predicting and verifying BSR characteristics in the design and development phases of components is challenging. Moreover, substantial BSR data from similar systems are practically unobtainable except from the companies that develop them, and even these companies are unlikely to acquire substantial data through measurement. The sound source in the field also inherently includes variability and can differ from noise measured under ideal conditions. Therefore, predictions based on machine learning face the following challenges: insufficient data, irregular data such as noise, and data classification problems.
In this study, we simulated real-world noise to reconstruct seat noise and predict Loudness N10, a quantitative metric used to evaluate BSR noise. In particular, we aimed to estimate the BSR characteristics of the developed system by predicting the Loudness N10 that would be obtained in an anechoic chamber from simple field tests whose sound sources contain noise. Loudness N10 predictions are based on statistics, and in this study, we describe a method for predicting the quantitative metric solely from the characteristics of physical quantities, without requiring special equipment or calculations. We analyzed the physical quantities from a statistical perspective, examined their relationship with Loudness N10 through correlation analysis, and derived two significant physical quantities. Data augmentation was not utilized because it can distort the data, and simply increasing the amount of data was not adopted either, as it is a general-purpose means of enhancing model performance. Instead, we employed the K-fold cross-validation technique to address data limitations. Loudness N10 predictions were then made from the analyzed physical quantities using a DNN.
Figure 1 illustrates the flow diagram of the prediction procedure and performance verification of the proposed method.
2. Method
2.1. Construction of BSR Dataset and Physical Quantity Information
BSR measurements and Loudness N10 calculations were performed based on GMW 14011, as illustrated in Figure 2 [19]. The BSR data were extracted from positions 150 mm away from each point on the car seat, as shown in Figure 2a, in accordance with GMW 14011. A multi-axis silent shaker was used, as depicted in Figure 2b. The background noise of the anechoic chamber was within 30 dB(A) under the operating conditions of the shaker, and the environmental chamber allowed for temperature control from −40 °C to 50 °C. The dataset was measured at low temperature (−20 ± 5 °C), room temperature (23 ± 5 °C), and high temperature (50 ± 5 °C).
Loudness N10 was calculated using software (ArtemiS Classic V12) based on Zwicker Loudness. BSR data were measured using nine microphones across the 130 different seat models used, resulting in a total of 1170 data points. The data varied in environmental conditions, such as temperature and seat position, during the measurement process. Since this study aims to estimate Loudness N10, which requires complex calculations based on various physical quantities related to sound quality and acoustics, environmental conditions were not considered. However, the same test method was used for all measurement conditions. An exciter with operating background noise less than or equal to 30 dB(A) and a 300 Hz high-pass filter were utilized in the experiment. Loudness N10 estimation was performed by analyzing the characteristics of a total of ten physical quantities related to sound quality and acoustics. Each physical quantity was represented by its N10 value, that is, the lowest level within the top 10% of its values. The ten physical quantities were Loudness (M1), 3rd octave (M2), sound pressure level (M3), fluctuation strength (M4), Roughness (M5), Sharpness (M6), Tonality (M7), Harmonic distortion (M8), Speech intelligibility index (M9), and Articulation index (M10). Given the significance of magnitude in BSR, the physical quantities were selected to cover both sound pressure level and the quantities that determine emotional quality.
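The N10 statistic described above can be illustrated with a minimal sketch. It assumes that "the lowest level in the top 10%" corresponds to the 90th percentile of a metric's time history; the function name `n10` and the sample data are hypothetical and are not taken from the measurement software.

```python
import numpy as np

def n10(values):
    """Level exceeded by 10% of the samples, taken here as the 90th percentile.

    Assumption: this percentile reading of N10 is an illustration, not the
    exact procedure of the measurement software.
    """
    values = np.asarray(values, dtype=float)
    return float(np.percentile(values, 90))

# Hypothetical time history of a metric: 0, 1, ..., 100 (arbitrary units).
history = np.arange(101)
print(n10(history))  # 90.0
```

Taking a percentile rather than the maximum keeps the statistic less sensitive to a single isolated peak in the recording.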
The measured signals contain noise due to the external environment and structural issues. Data measurement involves considering the measurement process and analyzing the signals through a filtering process using specialized hardware and software for system characteristics. Noise can be implemented using various methods, but in numerical analysis methods, it is generally implemented using Gaussian noise. The probability density function of the noise applied to BSR sound sources is defined as follows:

$$ p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) \tag{1} $$

In this equation, $\sigma$ and $\mu$ represent the standard deviation and mean of the noise signal, respectively, and $x$ denotes the noise signal. Noise was implemented using a Gaussian random distribution, and the standard deviation was modeled at the 2/3 level of the basic data. The characteristics of the signals with noise are illustrated in Figure 3. Figure 3a,b illustrate the results in the time domain and frequency domain, respectively. Adding Gaussian noise to the raw data introduced components that were not present in the original signal. Particularly in the frequency domain, the characteristics of the added noise are exhibited across all frequencies except the fundamental frequency.
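The noise injection step can be sketched as follows. This is a minimal illustration, assuming that "the 2/3 level of the basic data" means a noise standard deviation equal to two thirds of the standard deviation of the raw signal; the function name, the 500 Hz test tone, and the sampling rate are hypothetical.

```python
import numpy as np

def add_gaussian_noise(signal, ratio=2.0 / 3.0, mean=0.0, seed=0):
    """Add Gaussian noise whose standard deviation is `ratio` times that of the signal."""
    signal = np.asarray(signal, dtype=float)
    rng = np.random.default_rng(seed)
    sigma = ratio * np.std(signal)  # assumed reading of "2/3 level"
    noise = rng.normal(loc=mean, scale=sigma, size=signal.shape)
    return signal + noise, noise

# Hypothetical clean signal: a 500 Hz tone sampled at 48 kHz for one second.
t = np.linspace(0.0, 1.0, 48000, endpoint=False)
clean = np.sin(2 * np.pi * 500 * t)
noisy, noise = add_gaussian_noise(clean)

# The ratio of standard deviations should be close to 2/3.
print(np.std(noise) / np.std(clean))
```

Because Gaussian noise is broadband, its spectrum is approximately flat, which is consistent with the added components appearing across all frequencies away from the tone.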
Each physical quantity was normalized because its absolute magnitude depends on the method used to calculate it. Common normalization methods include the min–max normalization method and the z-score normalization method. The z-score method is suitable for handling outlier problems and is sensitive to the mean and standard deviation of the data. However, the constructed data were measured at equal intervals using microphones with similar specifications at nine positions. Since the microphones had similar characteristics, the possibility of outliers is minimal. Thus, min–max normalization was performed. The normalization results are not in themselves a conclusion of this study; they were used to compare the relationship between Loudness (M1) and each metric intuitively on a common scale from the minimum value (0) to the maximum value (1). Although 130 data points were analyzed, only the results for representative samples are described. Table 1 lists the results of the samples containing normalized noise.
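The two normalization schemes mentioned above can be compared with a short sketch. The function names and the sample values are hypothetical and serve only to show how each scheme rescales the same data.

```python
import numpy as np

def min_max_normalize(x):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)  # constant input: no spread to rescale
    return (x - x.min()) / span

def z_score_normalize(x):
    """Center values on the mean and scale them by the standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

# Hypothetical metric values in arbitrary units.
sample = np.array([2.0, 4.0, 6.0, 10.0])
print(min_max_normalize(sample))  # [0.   0.25 0.5  1.  ]
print(z_score_normalize(sample))
```

Min–max scaling preserves the relative spacing of the values within a fixed range, which makes metrics with different units directly comparable on a 0 to 1 scale.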
As shown in the normalized results, Loudness exhibits characteristics very similar to those of the acoustic physical quantities M2, M3, M4, M5, and M6. Conversely, Loudness shows contrasting results with M7, M9, and M10. In other words, Loudness is presumably determined by the magnitude of the noise and the frequency of the sound. Although including many factors enables a more precise analysis of a system, it also complicates the resulting polynomial. Hence, it is necessary to exclude physical quantities with low impact. Therefore, covariance analysis and correlation analysis were performed to define the relationship between the physical quantities and derive the significant factors.
2.2. Variables for the Physical Quantity Correlation Analysis and Determination of the Variables
Physical quantities calculated using different methods represent different characteristics of sound quality. Thus, the correlation between the candidate variables was analyzed to derive the final physical quantities used in regression and deep learning. Correlation analysis examines the strength of the linearity between the physical quantities and identifies the presence of linear relationships as a statistical result. Covariance analysis, in turn, describes how two variables vary together, and it is applied here to normalized levels so that the relationship is defined regardless of units. A positive correlation exists between two variables when an increase in the value of one variable corresponds to an increase in the value of the other. Conversely, a negative correlation occurs when an increase in one variable results in a decrease in the value of the other. A covariance of zero indicates that the two variables have no linear relationship, although it does not by itself establish that they are independent. The results of the covariance analysis are listed in Table 2.
Since the results of the covariance analysis depend on the level of the variables, they were expressed based on the normalized physical quantities. As shown in the covariance analysis results, the physical quantities are mutually correlated. Similar to the normalized data analysis results, the covariance analysis results indicate a covariance of approximately 0.07 between the magnitude-based quantities M2 and M3 and the physical quantities corresponding to frequency characteristics M4, M5, and M6. From a speech perspective, BSR noise is an unclear signal, suggesting that voice-related metrics may exhibit a high negative correlation. Since covariance does not indicate the strength of the relationship (the degree of the relationship according to the level of two variables), the relationship between the variables was analyzed through correlation analysis. Because the preceding covariance analysis was performed on normalized data to minimize the error caused by differences in level, the correlation analysis results can be expected to exhibit characteristics similar to those of the normalized covariance analysis. The results are listed in Table 3.
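The relationship between covariance and correlation used in this section can be illustrated with a small sketch. The three synthetic columns below are hypothetical stand-ins for normalized metrics and are not the measured data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical normalized metrics in [0, 1]: a level-like quantity, a second
# quantity that follows it closely, and a third that moves in the opposite direction.
level = rng.uniform(0.0, 1.0, 200)
follower = 0.8 * level + 0.2 * rng.uniform(0.0, 1.0, 200)
opposite = 1.0 - level
metrics = np.column_stack([level, follower, opposite])

cov = np.cov(metrics, rowvar=False)        # sample covariance matrix (3 x 3)
corr = np.corrcoef(metrics, rowvar=False)  # Pearson correlation matrix (3 x 3)

# Correlation is covariance divided by the product of the standard deviations.
std = np.sqrt(np.diag(cov))
corr_from_cov = cov / np.outer(std, std)

print(np.round(cov, 3))
print(np.round(corr, 3))
```

The covariance entries depend on the spread of each column, whereas the correlation entries are bounded between −1 and 1, which is why correlation is the more convenient measure of the strength of a linear relationship.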
The correlation coefficients were analyzed using multiple correlation analyses for a total of ten physical quantities. The sample correlation coefficient indicates the linear correlation between variables. The results of the correlation analysis for each physical quantity demonstrated that the physical quantities related to the sound pressure level (M2, M3) exhibit the highest linear correlation, and the physical quantities corresponding to frequency characteristics (M4, M5, M6) also exhibit high linear correlation. Roughness and Sharpness demonstrated relatively high correlations, which are likely attributable to the low-frequency and high-frequency components introduced by the Gaussian noise rather than to the correlation of the pure system. Figure 4 illustrates the correlation analysis results for Loudness N10 of the data with and without noise.
As illustrated in Figure 4, metrics related to the sound pressure level exhibited equally high correlations regardless of the presence or absence of noise. However, Roughness (M5) and Sharpness (M6), which correspond to the frequency characteristics, showed relatively low correlations in the absence of noise and high correlations when noise was present. This outcome can be attributed to the characteristics of Gaussian noise, which adds noise across the entire frequency range. Hence, the correlations of Sharpness, which represents high-frequency characteristics, and Roughness, which represents low-frequency regions, increased. Therefore, sound pressure level (M2), which exhibits a high correlation with Loudness N10 regardless of the presence of noise, and fluctuation strength (M4), which can partially reflect the frequency characteristics, were selected as effective factors.
2.3. Method of K-fold Cross-Validation
K-fold cross-validation is a method that evaluates a model by randomly partitioning the dataset into k sub-groups. It uses one of the sub-groups as the test data and the remaining k−1 sub-groups as the training data. This is repeated k times so that each sub-group serves as the test data once. The model is evaluated based on the average prediction error over the iterations. Typically, five or ten is used as the value of k to balance the trade-off between the bias and variance of the regression model [20]. General regression models may overfit, in which case they reflect only the biased characteristics of the training data. The K-fold cross-validation method can mitigate this issue by randomly partitioning the dataset into training and test data and building and evaluating the model k times. Since BSR signals are collected during the final stage of the process, it is impractical to obtain a large amount of data. K-fold cross-validation is a representative method that leverages all data for both training and testing, thereby enabling the creation of a more generalized model and effective detection of overfitting and underfitting. Consequently, to address the issue of limited data, we employed K-fold cross-validation in this study, as illustrated in Figure 5.
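The partitioning procedure described above can be sketched as follows. This is a minimal NumPy illustration rather than the implementation used in the study; the function names, the RMSE error measure, the synthetic data, and the trivial mean predictor are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k randomly partitioned folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    folds = np.array_split(order, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

def cross_validate(X, y, fit, predict, k=5, seed=0):
    """Return the mean test RMSE over the k folds and the per-fold RMSE values."""
    errors = []
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        model = fit(X[train_idx], y[train_idx])
        pred = predict(model, X[test_idx])
        errors.append(float(np.sqrt(np.mean((pred - y[test_idx]) ** 2))))
    return float(np.mean(errors)), errors

# Hypothetical data with the same number of points as the dataset (1170).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (1170, 2))
y = 2.0 * X[:, 0] + rng.normal(0.0, 0.1, 1170)

# Trivial baseline model: always predict the training mean.
fit = lambda X_train, y_train: y_train.mean()
predict = lambda model, X_test: np.full(len(X_test), model)

mean_rmse, fold_errors = cross_validate(X, y, fit, predict, k=5)
print(mean_rmse, fold_errors)
```

With 1170 points and k = 5, each fold holds 234 test points and 936 training points, and every point is used for testing exactly once.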
Regression models can be divided into linear and nonlinear models, depending on the distribution of the data. No particular model is superior to the others. Rather, it is important to select the optimal model based on the type of data. In this study, the final model was selected by comparing the multiple linear regression model and a multiple nonlinear regression model.
Multiple linear regression is a regression analysis technique that models the linear relationships between a dependent variable and two or more independent variables. The multiple linear regression model is expressed using a linear equation, as shown in the equation below.

$$ y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \varepsilon \tag{2} $$

$x_{1}$ and $x_{2}$ are both independent variables. $\beta$ is a regression coefficient and represents the influence of each independent variable.
Linear regression uses the method of least squares, which minimizes the sum of the squares of the residuals, to estimate the regression coefficients. However, as the number of independent variables increases, multicollinearity may occur due to the correlations between the variables. In that case, the variance of the least squares estimates of the regression coefficients increases, which reduces the stability and prediction accuracy of the regression equation [21].
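A least squares fit of a two-variable linear model can be sketched as follows. The function name `fit_linear`, the synthetic data, and the coefficient values are hypothetical; the sketch only illustrates the estimation step described above.

```python
import numpy as np

def fit_linear(x1, x2, y):
    """Estimate (intercept, slope for x1, slope for x2) by ordinary least squares."""
    X = np.column_stack([np.ones_like(x1), x1, x2])  # design matrix with intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # minimizes the sum of squared residuals
    return beta

# Hypothetical noise-free data generated from known coefficients.
rng = np.random.default_rng(0)
x1 = rng.uniform(0.0, 1.0, 100)
x2 = rng.uniform(0.0, 1.0, 100)
y = 1.0 + 2.0 * x1 - 3.0 * x2

beta = fit_linear(x1, x2, y)
print(beta)  # approximately [ 1.  2. -3.]
```

When the two independent variables are strongly correlated, the columns of the design matrix become nearly collinear and the estimated coefficients become sensitive to small changes in the data, which is the multicollinearity issue noted above.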
In this study, a nonlinear regression model in the form of an exponential function was constructed through logarithmic transformation, as given in Equation (3). As in the linear model, $x_{1}$ and $x_{2}$ are the independent variables, and $\beta$ denotes the regression coefficients, which represent the influence of each independent variable.
When there is a nonlinear relationship between an independent variable and a dependent variable, logarithmic transformation can be used to express the relationship in linear form. The regression coefficients of the resulting linear model can then be derived by applying the least squares method. The logarithmic transformation linearizes the variables using natural logarithms, as shown in Equation (4).
Here, the regression model can be expressed as Equation (5) for i datasets through matrix transformation.
Assuming
, the least squares estimate can be expressed as shown in Equation (6) when
exists [
22]. The regression coefficient is determined through Equation (6). If Equation (6) is substituted into Equation (4), reverse exponential transformation can be performed to derive a multiple nonlinear regression equation similar to Equation (3).
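The log-linearization and back-transformation steps can be sketched in code. The exact functional form of the nonlinear model is not reproduced here, so this sketch assumes a power-law form, y = b0 · x1^b1 · x2^b2, whose logarithm is linear in ln x1 and ln x2; the function name `fit_power_law`, the synthetic data, and the coefficient values are likewise hypothetical.

```python
import numpy as np

def fit_power_law(x1, x2, y):
    """Fit y = b0 * x1**b1 * x2**b2 by least squares on the log-transformed variables.

    Assumed model form (illustration only):
        ln y = ln b0 + b1 * ln x1 + b2 * ln x2
    """
    X = np.column_stack([np.ones_like(x1), np.log(x1), np.log(x2)])
    Y = np.log(y)
    # Normal equations; requires X^T X to be invertible.
    coef = np.linalg.solve(X.T @ X, X.T @ Y)
    # Reverse (exponential) transformation of the intercept.
    return float(np.exp(coef[0])), float(coef[1]), float(coef[2])

# Hypothetical positive data generated from known parameters.
rng = np.random.default_rng(0)
x1 = rng.uniform(1.0, 2.0, 50)
x2 = rng.uniform(1.0, 2.0, 50)
y = 2.0 * x1 ** 1.5 * x2 ** -0.5

b0, b1, b2 = fit_power_law(x1, x2, y)
print(b0, b1, b2)  # approximately 2.0, 1.5, -0.5
```

The logarithm requires strictly positive inputs, so variables normalized to the range 0 to 1 would need an offset or a different transformation before this approach could be applied.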
2.4. Machine Learning Model
Since deep learning is data-driven, a substantial amount of data is generally required to improve accuracy. Accuracy typically improves as the network depth increases, and an optimized model can be constructed through careful tuning of hyperparameters. A DNN, also known as a feedforward neural network or a multi-layer perceptron, is a neural network that has two or more hidden layers [23,24,25]. As illustrated in Figure 6, the DNN described in the example has three input dimensions and five neurons in the hidden layers. The output of the hidden layers is expressed as follows:

$$ a_{j}^{1} = \sigma\left(\sum_{i=1}^{3} w_{ji}^{1} x_{i} + b_{j}^{1}\right) $$

In this equation, $a_{j}^{1}$ is the output of the $j$-th neuron of hidden layer 1. $\sigma$ denotes the activation function, and ReLU is typically used as the activation function [26,27]. $w_{ji}^{1}$ is the connection weight between the $i$-th input and the $j$-th neuron of hidden layer 1. Additionally, $b_{j}^{1}$ denotes the bias of the $j$-th neuron in hidden layer 1. Assuming there are $m$ neurons in layer $l-1$, the output of the $j$-th neuron in layer $l$ is calculated as follows:

$$ a_{j}^{l} = \sigma\left(\sum_{k=1}^{m} w_{jk}^{l} a_{k}^{l-1} + b_{j}^{l}\right) $$

The feedforward neural network performs computations using the outputs of preceding layers, beginning with the input layer and proceeding through to the output layer. The neural network uses a loss function to measure the deviation between the predictions made by the model and the actual values and utilizes the gradient descent method to update the weights and biases of each layer so that the predictions of the model gradually approach the actual values. The regression loss function for the $L$-th (output) layer is calculated as follows:

$$ J(W, b, x, y) = \frac{1}{2}\left\| a^{L} - y \right\|_{2}^{2} $$

In this equation, $a^{L}$ represents the predicted value of the DNN model, and $y$ denotes the actual value. The gradient of the loss function can be calculated as follows:

$$ \frac{\partial J}{\partial W^{L}} = \left[\left(a^{L} - y\right) \odot \sigma'\left(z^{L}\right)\right]\left(a^{L-1}\right)^{T}, \qquad \frac{\partial J}{\partial b^{L}} = \left(a^{L} - y\right) \odot \sigma'\left(z^{L}\right) $$

In this equation, $\odot$ denotes the Hadamard product. Assuming $z$ as the inactive output, that is, the output before the activation function is applied, the inactive output of the $l$-th layer, $z^{l}$, can be expanded as follows:

$$ z^{l} = W^{l} a^{l-1} + b^{l} $$

Subsequently, with $\delta^{l} = \partial J / \partial z^{l}$, the gradient of the loss function can be transformed as follows:

$$ \frac{\partial J}{\partial W^{l}} = \delta^{l}\left(a^{l-1}\right)^{T}, \qquad \frac{\partial J}{\partial b^{l}} = \delta^{l} $$

Through mathematical derivation, the relationship between $\delta^{l}$ and $\delta^{l+1}$ can be obtained as follows:

$$ \delta^{l} = \left(W^{l+1}\right)^{T} \delta^{l+1} \odot \sigma'\left(z^{l}\right) $$

The changes in the weights and biases of the $l$-th layer due to gradient descent are as follows:

$$ W^{l} \leftarrow W^{l} - \alpha\, \delta^{l}\left(a^{l-1}\right)^{T}, \qquad b^{l} \leftarrow b^{l} - \alpha\, \delta^{l} $$

In this equation, $\alpha$ denotes the step size. Optimization functions that use gradient descent include SGD (Stochastic Gradient Descent), RMSprop (Root Mean Square Propagation), and Adam (Adaptive Moment Estimation) [28,29,30]. Parameter optimization was not the focus of this study; hence, the architecture was kept simple, and Adam was used as the optimization function. The architecture for predicting Loudness is listed in Table 4.
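The forward pass, the backward recursion, and a single gradient descent update can be sketched in NumPy for one training sample. This is a minimal illustration under stated assumptions, not the architecture of Table 4: the layer sizes [3, 5, 5, 1] follow the three-input, five-neuron example, the output layer is assumed to be linear (so its activation derivative is 1), plain gradient descent is used instead of Adam, and all function names, the initial weights, and the sample values are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def relu_grad(z):
    return (z > 0).astype(float)

def init_params(sizes, seed=0):
    """Random (W, b) pairs for consecutive layer sizes (hypothetical initialization)."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.5, (n_out, n_in)), rng.normal(0.0, 0.1, (n_out, 1)))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return pre-activations z^l and activations a^l; the last layer is linear."""
    a, zs, acts = x, [], [x]
    for idx, (W, b) in enumerate(params):
        z = W @ a + b
        a = z if idx == len(params) - 1 else relu(z)
        zs.append(z)
        acts.append(a)
    return zs, acts

def loss(params, x, y):
    """Squared-error loss J = 0.5 * ||a^L - y||^2."""
    _, acts = forward(params, x)
    return 0.5 * float(np.sum((acts[-1] - y) ** 2))

def backward(params, x, y):
    """Backpropagate delta^l and return (dJ/dW^l, dJ/db^l) for every layer."""
    zs, acts = forward(params, x)
    delta = acts[-1] - y  # linear output layer, so sigma'(z^L) = 1
    grads = [None] * len(params)
    for l in range(len(params) - 1, -1, -1):
        grads[l] = (delta @ acts[l].T, delta)  # acts[l] is the input to layer l
        if l > 0:
            delta = (params[l][0].T @ delta) * relu_grad(zs[l - 1])
    return grads

params = init_params([3, 5, 5, 1])
x = np.array([[0.2], [0.5], [0.9]])  # hypothetical input sample
y = np.array([[1.0]])                # hypothetical target

grads = backward(params, x, y)
alpha = 0.001  # hypothetical step size
stepped = [(W - alpha * gW, b - alpha * gb)
           for (W, b), (gW, gb) in zip(params, grads)]
print(loss(params, x, y), loss(stepped, x, y))
```

Adaptive optimizers such as Adam use the same gradients but rescale each update with running estimates of the gradient moments, so only the update line would change.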
The activation and optimization functions were defined as the ReLU function and Adam (learning rate: 0.001), respectively. Various weight initialization methods, such as Xavier and He initialization, have been developed to minimize convergence problems and local minimum issues [31,32]. However, since the aim of this study was not to optimize machine learning models, initialization issues were not addressed, and hyperparameters were not optimized. The dataset was divided into three parts: the training, validation, and test datasets. The dataset split ratio and the number of data points used for training are listed in Table 5. Since the order of the data can also have a significant impact on the accuracy of training, the data were shuffled to prevent overfitting caused by sequential data. The input data were standardized to eliminate errors arising from differences in data scale. Training was run for up to 1000 epochs with early stopping, which halts training when the error does not improve for 20 consecutive epochs.
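The data handling and stopping rule described above can be sketched as follows. This is a minimal illustration: the 70/15/15 split ratio is a placeholder rather than the ratio in Table 5, a simple linear model trained by gradient descent stands in for the DNN, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def split_shuffle(X, y, ratios=(0.7, 0.15, 0.15), seed=0):
    """Shuffle the data and split it into training, validation, and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_train = int(ratios[0] * len(y))
    n_val = int(ratios[1] * len(y))
    parts = np.split(order, [n_train, n_train + n_val])
    return [(X[idx], y[idx]) for idx in parts]

def standardize(train_X, *others):
    """Standardize every set with the mean and standard deviation of the training set."""
    mean, std = train_X.mean(axis=0), train_X.std(axis=0)
    return [(A - mean) / std for A in (train_X, *others)]

def train_with_early_stopping(step, val_loss, max_epochs=1000, patience=20):
    """Run `step` once per epoch; stop after `patience` epochs without improvement."""
    best, wait = np.inf, 0
    for epoch in range(1, max_epochs + 1):
        step()
        current = val_loss()
        if current < best - 1e-12:
            best, wait = current, 0
        else:
            wait += 1
            if wait >= patience:
                break
    return epoch, best

# Hypothetical data: two inputs and one target, 1170 points.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (1170, 2))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0.0, 0.05, 1170)

(train_X, train_y), (val_X, val_y), (test_X, test_y) = split_shuffle(X, y)
train_X, val_X, test_X = standardize(train_X, val_X, test_X)

state = {"w": np.zeros(2), "b": 0.0}  # stand-in linear model parameters

def step(lr=0.1):
    err = train_X @ state["w"] + state["b"] - train_y
    state["w"] = state["w"] - lr * train_X.T @ err / len(err)
    state["b"] = state["b"] - lr * err.mean()

def val_loss():
    return float(np.mean((val_X @ state["w"] + state["b"] - val_y) ** 2))

epochs_run, best_val = train_with_early_stopping(step, val_loss)
print(epochs_run, best_val)
```

Computing the standardization statistics on the training set only keeps information from the validation and test sets out of the training process.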
4. Conclusions
In this study, we aimed to propose a deep learning method for predicting Loudness N10, a quantitative metric for BSR. For sound sources containing noise, this metric requires demanding measurement conditions and is complex to calculate from the physical quantities related to the acoustics and sound quality of automotive seats. Among the various physical quantities, sound pressure level and fluctuation strength were derived as significant factors based on the covariance and correlation analysis results. In addition, the traditional K-fold cross-validation method was utilized to derive linear and nonlinear regression equations. However, the prediction results showed relatively large errors, with values of 2.08 and 0.69. This outcome indicates that BSR cannot be predicted accurately using regression equations alone.
In contrast, the DNN evaluated with the hold-out method estimated Loudness more accurately, with a value of 0.55. We obtained numerous datasets from other studies. However, it is nearly impossible to acquire a large amount of data and various types of datasets from experiments. The K-fold cross-validation method can achieve maximum efficiency within a limited dataset, both for development purposes and from a methodological perspective. Therefore, we proposed applying K-fold cross-validation to a DNN as a method of predicting Loudness. Consequently, we attained the best-performing prediction model, within an error range of 0.54. Since extensive BSR noise datasets could not be acquired in a limited environment, we used the proposed DNN method to verify that the proposed model has relatively superior performance. The results suggest that the quantitative test index for BSR can be estimated using a few sound-quality physical quantities, even when noise is included. Therefore, it is feasible to estimate the results of complex noise and vibration experiments, including BSR experiments, with limited datasets. This demonstrates the significance of applying machine learning-based prediction methods to various engineering experiments that involve nonlinearity.
In future research, we aim to establish a methodology that utilizes several physical quantities to apply machine learning so that the BSR characteristics of the seat can be estimated from all positions in actual tests.