Article

Prediction of Operational Noise Uncertainty in Automotive Micro-Motors Based on Multi-Branch Channel–Spatial Adaptive Weighting Strategy

1 School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China
2 School of Intelligent Manufacturing, Chengdu Technological University, Chengdu 611730, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(13), 2553; https://doi.org/10.3390/electronics13132553
Submission received: 17 May 2024 / Revised: 27 June 2024 / Accepted: 27 June 2024 / Published: 28 June 2024
(This article belongs to the Special Issue Applications of Artificial Intelligence in Mechanical Engineering)

Abstract: The acoustic performance of automotive micro-motors directly impacts the comfort and driving experience of both drivers and passengers. However, various motor production and testing uncertainties can lead to noise fluctuations during operation. Thus, predicting the operational noise range of motors on the production line in advance becomes crucial for timely adjustments to production parameters and process optimization. This paper introduces a prediction model based on a Multi-Branch Channel–Spatial Adaptive Weighting Strategy (MCSAWS). The model includes a multi-branch feature extraction (MFE) network and a channel–spatial attention module (CSAM). It uses the vibration and noise data from micro-motors’ idle operations on the production line as input to efficiently predict the operational noise uncertainty interval of automotive micro-motors. The model employs the VAE-GAN approach for data augmentation (DA) and uses Gammatone filters to emphasize the noise at the commutation frequency of the motor. The model was compared with Convolutional Neural Networks (CNNs) and Multilayer Perceptrons (MLPs). Experimental results demonstrate that the MCSAWS method is superior to conventional methods in prediction accuracy and reliability, confirming the feasibility of the proposed approach. This research can help control noise uncertainty in micro-motors’ production and manufacturing processes in advance.

1. Introduction

As automobile electrification advances, the environment inside the car becomes quieter [1], exposing micro-motor noise that was previously masked by engine sounds. Automotive micro-motors are widely used throughout electric vehicles [2], driving wipers, air conditioners, electric seats, windows, and other components [3,4,5], so each car is equipped with dozens of them. Although their sound pressure is generally low, their large number makes them an important source of noise inside the vehicle. Because the permanent magnet DC motor contains no MOSFETs or IGBTs, the controller's switching frequency does not contribute to its noise; however, because of the brushes, the noise is highly noticeable around the motor's commutation frequency and affects passengers. The problem of operational noise from automotive micro-motors is therefore receiving increasing attention. At the same time, differences in equipment precision, operational inconsistencies during manufacturing and testing, component wear, temperature changes, and other factors contribute to noise fluctuations during operation. These factors collectively produce an operational noise uncertainty interval in automotive micro-motors [6]. Due to technical and practical limitations, comprehensive testing of all micro-motors is impossible. On the micro-motor production line, predicting the uncertainty interval of operating noise in advance enables real-time adjustments and optimizations, which are crucial for motor design, production, and quality control. During motor design optimization, considering the impact of uncertainty, it is essential to select production parameters that meet performance requirements while yielding smaller, more stable operating noise intervals.
Additionally, by analyzing the noise prediction results, production equipment and processes can be promptly adjusted, such as equipment tuning and maintenance, lubrication plan adjustments, and motor magnetizing equipment inspections. Specific process parameters such as cutting depth, cutting speed, tool selection, coolant usage, and machining path planning can also be adjusted accordingly to ensure that the motor’s noise interval meets the required standards during operation. Considering the differences in acoustic performance between idle and operational states of micro-motors, the production line must adopt intelligent predictive strategies to ensure the quality of the final product.

1.1. Works Related to Noise Prediction

The prediction of micro-motor operational noise involves the prediction and analysis of acoustic performance in automobile micro-motors (usually referring to micro-motors embedded within various electric drive systems of electric vehicles) [7]. This prediction considers sound pressure levels (SPLs) at designated points alongside statistical measures such as variance and mean. The prediction of operational noise generally consists of three main components: data acquisition and preprocessing, feature extraction and analysis, and the design of prediction models [8].
Feature extraction and analysis are crucial stages in noise prediction. In recent years, signal processing methods such as windowed Fourier transform, adaptive short-time Fourier transform, Hilbert–Huang transform (HHT), wavelet packet transform (WPT), and empirical mode decomposition (EMD) have been widely applied in noise feature extraction [9,10,11,12,13]. Particularly, the Mel filter, which simulates the frequency perception of the human auditory system, has been employed extensively. Yassin [14] utilized the Mel-Frequency Cepstral Coefficients of acoustic signals to classify vehicles. Despite its success in various fields, the Mel filter has limitations, such as deviations from human auditory perception and inadequate time–frequency resolution. Therefore, researchers have proposed Gammatone filters, which are designed based on the human cochlear model, making their frequency response closer to human auditory characteristics. Research has demonstrated the utility of Gammatone filters in enhancing the accuracy of sound frequency perception models and their applications in diverse fields, such as spatial audio event detection and rail crack detection [15,16]. In addition to feature extraction methods, selecting appropriate methods to build prediction models has also attracted extensive research.
Traditional automotive noise analysis methods, which largely depend on physical modeling and signal processing, often encounter problems such as high complexity, low accuracy, and limited generalization. Due to the large number of motors on the micro-motor production line and the vast amount of data requiring processing and analysis, traditional methods struggle to achieve fast and accurate noise prediction. This challenge is further compounded by various uncertainties in the production process [17]. Introducing data-driven methods can automatically learn and analyze large amounts of historical and real-time data [18], improving the accuracy of noise prediction. Furthermore, data-driven models can use the continually increasing test results on the production line, continuously optimizing the prediction strategy by receiving new data. This reduces dependence on experts, lowers labor costs and the risk of misjudgment, and meets the needs for efficient, adaptive, and accurate noise prediction. Li [19] utilized Elman neural networks to predict vehicle noise levels and optimize vehicle body structures. Huang [20] employed an improved Deep Belief Network (DBN) model to achieve a precise assessment of vehicle noise. Steinbach et al. [21] compared artificial neural network models to linear regression models based on psychological parameters regarding their predictive performance, revealing the superior predictive performance of neural networks due to their ability to manage complex nonlinear relationships.
The application of neural network methods in industrial domains is key to success. Industrial data often face challenges in data acquisition and high equipment costs. Moreover, the frequency of faults or abnormal events is low, resulting in sparse data samples that are insufficient to meet model training requirements. To address the problem of data scarcity, DA technology has become an effective solution. These methods enhance data diversity by transforming existing data through rotation, flipping, masking, or adding noise to create new samples [22]. This allows models to learn various noise characteristics over different timescales. Recently, DA methods based on generative models have gained significant attention, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), which can learn the distribution of noise data and generate new noise samples that are similar to real data [23,24]. Han et al. [25] proposed an improved GAN model for material image generation and DA. Islam [26] utilized autoencoders to generate additional samples, balancing imbalanced datasets. As DA techniques continue to evolve, their application in automotive noise analysis will become more widespread.
In fact, during the manufacturing, assembly, and testing processes of micro-motors, various uncertain factors influence their acoustic characteristics. In response to these challenges, various methods such as probability theory, fuzzy mathematics, and gray system theory have been used to address uncertainty problems [27,28]. James [29] utilized the evolution equation of the probability density function (PDF) to quantify uncertainties in the sound field caused by environmental, boundary, or initial conditions. Yin [30] proposed a unified uncertainty model based on evidence theory to handle mixed stochastic and cognitive uncertainties in structural acoustics problems. Interval theory is gaining attention as a robust method for handling incomplete or fuzzy data [31]. It allows parameters or variables to exist within specified intervals, enhancing modeling flexibility. Huang [32] proposed an improved interval analysis method to address and optimize the uncertainty of tire and road noise in electric vehicles. Dong [33] applied an interval model-based method to tackle internal noise uncertainties in electric vehicles. Although traditional methods based on data models have achieved some results in addressing uncertainty problems, they have limitations such as model assumption restrictions, difficulty in determining parameters, and limited ability to handle nonlinear relationships [34,35]. The rapid development of neural networks offers new perspectives for addressing uncertainty issues, potentially overcoming the limitations of traditional methods.

1.2. Analysis of Related Works and the Proposal

Based on the above analysis, researchers have conducted extensive studies on the operational noise of automotive micro-motors, achieving notable advancements. However, there are still three key challenges:
(1)
Challenges of Testing: The complex and cumbersome motor installation process, along with the limited interior space of the vehicle, makes it difficult to arrange test instruments and deploy all motors in their operational positions within the vehicle for testing.
(2)
Limitations of Noise Prediction Methods: Traditional analyses of automotive micro-motor noise rely on deterministic methods, often overlooking the various uncertainties inherent in the testing process. Moreover, the acoustic performance of micro-motors’ idle operation on the production line has a strong nonlinear relationship with their actual operational position inside the vehicle. These factors make it difficult for traditional prediction methods to accurately forecast the noise uncertainty interval of micro-motors in actual operation.
(3)
Limitations of Data Augmentation Models: The training process may lead to mode collapse [36], where the generator only produces a limited number of patterns, lacking diversity. Additionally, it tends to generate blurry or unrealistic samples, so the generated vibration and noise data lack authenticity and fail to cover the real data's complexity.
To solve the above-mentioned problems in the evaluation of operational noise from automotive micro-motors, and considering efficiency and accuracy, this paper proposes an MCSAWS model based on data-driven methods. It aims to predict the micro-motor’s operational noise uncertainty interval. The primary contributions of this paper are as follows:
(1)
Proposing a method for predicting the micro-motor’s operational noise uncertainty interval. This method is based on experimental data from the idling condition of motors on the production line and quantifies the noise uncertainty interval of operational noise. This effectively avoids the limitations of traditional testing and prediction methods.
(2)
Adjusting the VAE-GAN model to augment the data, generating more realistic and diverse vibration noise data and further improving the performance and generalization capabilities of the model.
The remainder of this paper is organized as follows: Section 2 introduces the proposed methodology, including preprocessing techniques for vibration and noise signals, and introduces the MCSAWS model. Section 3 discusses the experimental setup and dataset construction, along with the optimization of the VAE-GAN model for DA. Subsequently, Section 4 presents the construction process of the MCSAWS model and a comparison and discussion between different models. Finally, conclusions are drawn in Section 5.

2. System Framework and Methodology

Figure 1 illustrates the five key components of the proposed method: data collection, data preprocessing, data augmentation, model construction, and model evaluation.
(1)
Data Collection: Conducting vibration and noise experiments on micro-motors in a semi-anechoic chamber to collect time-domain data.
(2)
Data Preprocessing: The collected time-domain data undergo STFT and are further refined using Gammatone filters to reduce the interference of non-commutation frequency components in the data.
(3)
Data Augmentation: The VAE-GAN model augments the original dataset, with its loss function adjusted to increase sample diversity and comprehensiveness.
(4)
Model Construction: The MCSAWS model structure is developed experimentally, selecting the structural design with the best performance.
(5)
Model Evaluation: Comparing the proposed model against models such as CNN and MLP to explore the role of different modules.

2.1. Frequency Domain Analysis and Noise Reduction

Fourier transform is a signal-processing technique [37] that reflects only the frequency-domain characteristics of a signal and ignores time-domain information. To address this limitation, Gabor [38] proposed the short-time Fourier transform (STFT) in 1946, which analyzes how signal frequency content varies over time. STFT divides the signal into short segments by weighting it with a fixed-length sliding window [39]. When the center position of the window function is τ₀, the weighted result is:

y(t) = x(t) w(t − τ₀)

where x(t) is the signal and w(t) is the window function. Assuming the non-stationary signal is stationary within the short interval of the analysis window, each segment can be analyzed with a Fourier transform to obtain the local spectrum of the signal:

X(ω) = F(y(t)) = ∫₋∞^{+∞} x(t) w(t − τ₀) e^{−jωt} dt

Segment-by-segment analysis then reveals the spectral content of the signal in different time intervals. The short-time Fourier transform of signal x(t) is therefore defined as:

STFT(t, f) = ∫ x(τ) h(τ − t) e^{−jωτ} dτ
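The segment-wise windowing and transform can be sketched in NumPy; the Hann window, window length, and hop size below are illustrative choices, not the paper's settings:

```python
import numpy as np

def stft(x, fs, win_len=256, hop=128):
    """Naive STFT: slide a Hann window along x and take the FFT of each
    weighted segment, giving one local spectrum per frame."""
    w = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop : i * hop + win_len] * w
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)            # local spectra
    freqs = np.fft.rfftfreq(win_len, d=1 / fs)    # frequency axis in Hz
    times = (np.arange(n_frames) * hop + win_len / 2) / fs
    return times, freqs, spec

# a 1 kHz tone sampled at 8 kHz should peak at the 1 kHz bin in every frame
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)
times, freqs, spec = stft(x, fs)
peak = freqs[np.abs(spec[0]).argmax()]
```

In practice a library routine such as `scipy.signal.stft` would replace this loop; the sketch only makes the window-then-FFT structure of Equations (1)-(3) explicit.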
Given that permanent magnet DC motors generate the most significant vibration and noise at the commutation frequency, this paper focuses on the noise amplitude at this frequency. However, during actual loaded operation, the motor is influenced by the skeleton's frequency response function, causing the amplitude at non-commutation frequencies to rise as well, which complicates model training. To handle this complex vibration and noise data, the Gammatone filter was chosen [40]. These filters are designed based on the human auditory system, enabling accurate capture of signal frequency characteristics. Furthermore, Gammatone filters have nonlinear characteristics, making them suitable for the operational noise uncertainty interval prediction problem addressed in this paper. By setting the center frequency to the motor's commutation frequency, the interference of other frequency components is reduced, strengthening the model's ability to learn key features of the vibration and noise data. The frequency response of the Gammatone filter bank over the 0–8000 Hz range is shown in Figure 2, and the result after the signal passes through the STFT is shown in Figure 3a, with the commutation-frequency region highlighted in the red box.
The Gammatone filter is a standard cochlear auditory filter, and its time-domain impulse response is given by:
g_i(t) = A t^{n−1} exp(−2π b_i t) cos(2π f_i t + φ_i) U(t),  t ≥ 0, 1 ≤ i ≤ N
where A is the filter gain, f_i is the center frequency of the filter, U(t) is the unit step function, φ_i is the phase, n is the filter order, b_i = 1.019·ERB(F_i) is the decay factor determining the decay rate of the impulse response and is related to the bandwidth of the corresponding filter, and ERB(F_i) is the equivalent rectangular bandwidth.
ERB(F_i) = (F_i + a) exp(lg(a / (F_i + a))) − a
where a is a constant, and F i is the filter frequency. The time-domain impulse response of the Gammatone filter after Fourier transformation is superimposed with the short-time Fourier transform result as the input of the model. Gs is calculated by Equation (6):
G_s = FFT(g_i(t)) ⊙ STFT(t, f)
The commutation frequency F_c of the micro-motor used in the experiment can be expressed as:

F_c = n × p

where n is the operating rotation speed and p is the number of motor slots. The center frequency of the Gammatone filter is therefore set to F_c. The time–frequency data of the original signal after processing through the Gammatone filter are shown in Figure 3b.
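A minimal sketch of the Gammatone impulse response of Equation (4) follows. The bandwidth uses the Glasberg-Moore ERB approximation, which is an assumption here (the paper's ERB constants may differ), and the speed and slot count are made-up example values:

```python
import numpy as np

def gammatone_ir(fs, fc, order=4, duration=0.05):
    """Gammatone impulse response centred on fc. Bandwidth follows
    b = 1.019 * ERB(fc), with ERB(fc) from the Glasberg-Moore
    approximation (an assumption, not the paper's exact constants)."""
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000 + 1)           # equivalent rectangular bandwidth
    b = 1.019 * erb
    g = (t ** (order - 1) * np.exp(-2 * np.pi * b * t)
         * np.cos(2 * np.pi * fc * t))
    return g / np.abs(g).max()

# center the filter on the commutation frequency Fc = n * p:
# e.g. 50 rev/s and 10 slots give Fc = 500 Hz (illustrative numbers)
fc = 50 * 10
g = gammatone_ir(fs=8000, fc=fc)
H = np.abs(np.fft.rfft(g, 8192))                  # filter magnitude response
f = np.fft.rfftfreq(8192, d=1 / 8000)
peak_hz = f[H.argmax()]                           # peaks close to fc
```

Multiplying this magnitude response against the STFT frames then realizes the G_s weighting of Equation (6), attenuating non-commutation frequency content.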

2.2. Multi-Branch Channel–Spatial Adaptive Weighting Strategy (MCSAWS)

The spatial distribution of vibration and noise signals is uneven, and the importance of information varies across different areas. Traditional feature extraction methods make it difficult to distinguish these differences and fail to adaptively weight critical information, causing the model to be sensitive to noise. In contrast to conventional methods, the CSAM effectively integrates the global information in the input data by learning the correlation information between different channels and the global space. It can adaptively identify channels and spatial regions that contribute significantly to acoustic performance prediction and automatically assign weights to emphasize important information.
Vibration and noise signals can be influenced by various factors such as operating conditions, load variations, mechanical component movements, motor structural characteristics, and friction noise. This leads to notable differences and complex nonlinear relationships in the acoustic performance between idle and operational conditions [41]. Traditional neural network architectures usually rely on a singular feature extraction pathway, which limits their ability to fully explore the feature space and capture these intricate influencing factors. In contrast, MFE methods utilize multi-branch networks to comprehensively capture vibration and noise features from different perspectives. The different structures and depths of each branch enable the network to extract features at different depths and scales. This simplifies network complexity and reduces the risk of overfitting, thereby enhancing the model’s generalization capabilities.

2.2.1. Channel–Spatial Attention Module

CSAM implementation draws inspiration from human visual attention [42], where focus is concentrated on particular regions of the observed scene rather than being uniformly distributed across the entire area.
Channel Attention Module: This module employs global max-pooling and global average-pooling layers to emphasize critical feature channels. These layers compress the features along the spatial direction into vectors whose length equals the number of channels, each with a width and height of one. These two vectors are then fed into an MLP and an activation function to determine the importance weights for the different channels. Finally, the weight vector is used to weight the original feature map, highlighting or suppressing specific channels.
Spatial Attention Module: Similarly, the spatial attention module compresses features along the channel direction into a one-dimensional plane using global max-pooling and global average-pooling, reducing the number of channels to one while maintaining the original width and height. The plane then passes through a CNN to generate a weight map, which is applied to each spatial point of the original feature map to highlight or suppress different spatial positions. The two modules are expressed as:
M_C(F) = [σ(MLP(AvgPool(F_S))), σ(MLP(MaxPool(F_S)))]
M_S(F) = [σ(CNN(AvgPool(F_C))), σ(CNN(MaxPool(F_C)))]

where F_S denotes the original feature map pooled along the spatial direction and F_C along the channel direction. σ is the sigmoid activation function, which scales the weights to the range 0–1.
The proposed channel–spatial module’s overarching framework is shown in Figure 4. The addition of the feature maps weighted by the channel attention module and the spatial attention module to the original feature map ensures the retention of inherent information while assigning it importance-weighted data, similar to the concept of residual connections. This combination aids the network in better learning effective input features. Therefore, the feature map after CSAM is expressed as:
F′ = F ⊙ M_C(F)[1] + F ⊙ M_C(F)[2] + F ⊙ M_S(F)[1] + F ⊙ M_S(F)[2] + F
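A shape-level NumPy sketch of the two attention branches and the residual fusion above; the MLP and CNN are stood in for by a single weight matrix and a scalar purely for illustration, so this shows only the pooling, weighting, and fusion pattern, not a trained module:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def csam(F, W_mlp, w_conv):
    """Toy channel-spatial attention over F of shape (C, H, W).
    W_mlp (C, C) stands in for the shared MLP; w_conv (scalar) for the CNN."""
    # channel attention: pool over space, then weight whole channels
    avg_c = F.mean(axis=(1, 2))                   # (C,)
    max_c = F.max(axis=(1, 2))                    # (C,)
    mc1 = sigmoid(W_mlp @ avg_c)[:, None, None]
    mc2 = sigmoid(W_mlp @ max_c)[:, None, None]
    # spatial attention: pool over channels, then weight spatial positions
    avg_s = F.mean(axis=0)                        # (H, W)
    max_s = F.max(axis=0)                         # (H, W)
    ms1 = sigmoid(w_conv * avg_s)[None, :, :]
    ms2 = sigmoid(w_conv * max_s)[None, :, :]
    # residual-style fusion: weighted maps are added back onto F
    return F * mc1 + F * mc2 + F * ms1 + F * ms2 + F

rng = np.random.default_rng(0)
F = rng.normal(size=(4, 8, 8))
out = csam(F, W_mlp=np.eye(4), w_conv=1.0)        # same shape as F
```

The residual term at the end preserves the original information, so attention can only re-weight features, never erase them outright.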

2.2.2. Multi-Branch Feature Extraction Approach

Using an MFE strategy, multiple sub-networks extract features of different dimensions and types. At the end of each sub-network, an adaptive weighting module weights the feature maps extracted by that sub-network. The weighted feature maps of all sub-networks are then concatenated along the channel direction, and another adaptive weighting module assigns different weights to the sub-networks. Figure 5 shows the structure of the MFE methodology. Through the introduction of multiple distinct feature extraction paths and CSAMs, the robustness and accuracy of the model are improved.
(1)
Multiple parallel but structurally and depth-different basic network feature extraction networks (backbone) are utilized to extract high-dimensional scale features from input data.
(2)
An adaptive weighting module is introduced to strengthen the focus on important features, especially the feature dimensions and spatial locations that have a greater impact on acoustic performance.
(3)
Perform residual connections on the weighted features and reintroduce the adaptive weighting module to assign weights to different sub-networks, focusing on the characteristics of important subnetworks.
The employment of multiple parallel sub-networks augments the model’s capacity for data representation. This complementarity among sub-networks enhances resilience against input data noise, with certain sub-networks compensating for deficiencies in others’ handling of specific noise types. Notably, this paper designed four backbones, and the structures are shown in Table 1.
The basic feature extraction network consists of a series of modules to extract relevant features from the input data. This includes a convolution layer tasked with extracting local features, an activation function introducing nonlinearity to enhance model expressivity, a pooling layer aimed at diminishing feature dimensions, and a normalization layer fostering model stability (see Equations (10)–(13)). A fully connected layer maps the high-level features extracted by the convolutional layers to the final output categories or values.
y = σ(Wx + b)
σ(x) = max(0, x)
f_max = max(A_{hw})
z = (x − μ) / δ

3. Acquisition and Augmentation of Micro-Motor Noise

3.1. Experimental Acquisition of Micro-Motor Noise

This paper takes an electric seat micro-motor as the experimental object. The experiment is conducted within a semi-anechoic chamber with a background noise lower than 15 dB to avoid interference from the external environment. Laboratory conditions are controlled at a temperature of 25 °C and humidity of 50%. The sound pressure sensor model BSWA MA231 was used to record the noise level of the micro-motor, and a 356A15 three-axis accelerometer was used to record vibration acceleration. Following the Nyquist sampling theorem, the noise sampling frequency is set to 51,200 Hz, with the highest analyzable frequency capped at 25,600 Hz. Similarly, the vibration acceleration sampling frequency is fixed to 25,600 Hz, with the highest analyzable frequency set to 12,800 Hz. Data processing is performed using the Simcenter Testlab equipped with 16 channels.
The purpose of this experiment is to test the micro-motor’s performance under two main operating conditions: idle operation and operational running when assembled in the seat. The data under idle operating conditions will be used to analyze and predict the motor’s noise uncertainty interval during operation. To ensure the reliability and universality of the experimental results and meet the needs of practical applications, three different types of motors were designed, and their main design parameters are shown in Table 2. Multiple tests were performed on the same motor to analyze the operating noise uncertainty, employing the setup shown in Figure 6 to collect vibration and noise data for both operating conditions. The experimental steps for collecting vibration and noise samples from automotive micro-motors unfold in the following sequential steps:
Step 1: Place the micro-motor in the center of a sponge and conduct idle operation tests to measure its vibration and noise levels. Repeat the measurements five times under diverse idle conditions, including forward and reverse rotations, with working voltages spanning 13 V to 15 V in increments of 0.5 V.
Step 2: Assemble the micro-motor in the loaded position and test the noise level during translational operation. Repeat this test five times for each of the various translational motion scenarios of the seat while maintaining the same voltage range as in Step 1.
Step 3: Disassemble and reassemble the micro-motor five times, repeating the tests of Step 2 after each reassembly.
Microphones and accelerometers are installed at fixed positions. During idle operation, the accelerometer records the motor body’s vibration acceleration, and the microphone is located 500 mm directly above the motor to record the noise. Following Step 1, each sample lasts 4 s, collecting a total of 450 data points. Subsequently, during the operational noise testing, the microphone records noise levels at the passenger’s right ear. For the operations in Steps 2 and 3, each sample lasts 4 s, totaling 2250 data points. Figure 7 shows the time-domain vibration and noise spectrum of the micro-motor. In Figure 8, “I” denotes the vibration and noise datasets from five tests under idle conditions, and the average value is used as the model’s input. “O” denotes the results of five tests performed during each of the five disassembly and assembly cycles. These are used to analyze the noise uncertainty interval of the operation, serving as the model’s output. In the analysis, the working voltage and rotation direction were consistent between the samples under idle conditions and operation conditions, totaling 90 samples. Acquiring vibration and noise levels from multiple motor types, disassembly operations, and repetitions in the experiment facilitates later studies on uncertainty in micro-motor noise.

3.2. Data Augmentation for Micro-Motor Noise

GAN is a probabilistic model that includes a generator and a discriminator [43], which are commonly constructed using neural networks (such as feedforward neural networks). Variational Autoencoder Generative Adversarial Network (VAE-GAN) [44] enhances GAN by adding an encoder, and the synergy of the three improves image quality. The training process of VAE-GAN is divided into the following steps: (1) The encoder maps input samples to distribution parameters in the latent space, followed by resampling to generate latent vectors. (2) The generator generates samples from these latent space vectors. (3) The discriminator evaluates these samples’ authenticity.
The first module of VAE-GAN is the encoder, which is responsible for learning and encoding the representation of the input and mapping it to the parameter representation of the latent space (usually μ and σ ). Resampled vectors will gradually converge to the Gaussian distribution. Therefore, its loss function is defined as:
L_E = KL(N(μ, σ²) ‖ N(0, 1)) = ½(−log σ² + μ² + σ² − 1)
where K L is the divergence between the latent space vector and the standard Gaussian distribution. The second module of VAE-GAN is the decoder. The decoder’s role is to convert these vectors back to the original inputs. Therefore, its purpose is to restore inputs by minimizing the reconstruction loss. The loss function is defined as:
L_G = log(1 − D(G(Z))) + ‖D_c(X) − D_c(G(Z))‖²
where Z is the vector reconstructed by sampling from μ and σ²; G is the generator; D is the discriminator, which has the same function as the decoder in VAE; D_c is the network part before the fully connected layer in the discriminator; and X is the input of real samples. The third module of VAE-GAN is the discriminator. Its primary function is to classify images created by the generator, determining whether they are generated samples or real samples. Its loss function is defined as:
L_D = log D(X) + log(1 − D(G(Z)))
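The three losses can be computed directly from scalar network outputs. In this NumPy sketch the discriminator term is negated so that all three quantities are minimized, which is an assumption about the sign convention (the discriminator itself maximizes its objective):

```python
import numpy as np

def kl_loss(mu, log_var):
    """Encoder loss: KL between N(mu, sigma^2) and N(0, 1),
    summed over latent dimensions (Eq. (14))."""
    return 0.5 * np.sum(mu ** 2 + np.exp(log_var) - log_var - 1)

def generator_loss(d_fake, feat_real, feat_fake):
    """Eq. (15): fool the discriminator and match its internal
    features between real and generated samples."""
    return np.log(1 - d_fake) + np.sum((feat_real - feat_fake) ** 2)

def discriminator_loss(d_real, d_fake):
    """Eq. (16), negated so that lower is better: score real samples
    high and generated samples low."""
    return -(np.log(d_real) + np.log(1 - d_fake))

# at mu = 0, sigma = 1 the KL term vanishes, as expected
z0 = kl_loss(np.zeros(3), np.zeros(3))
d0 = discriminator_loss(0.9, 0.1)      # small loss for a good discriminator
```

In a real training loop `d_real`, `d_fake`, and the feature vectors would come from the discriminator network; here they are placeholder scalars.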
VAE-GAN uses an encoder to constrain the input to a standard normal distribution, thereby imposing constraints on the generator and reducing the gap between the generated data and the original data. This allows the generated images to better retain the features of the original images, while the adversarial module prompts the generator to better fit the original images. The model architecture is shown in Figure 9. By combining VAE and GAN, a balance point can be found between the realism and diversity of generated samples. To further improve its performance and enhance its interpretability, this paper will utilize an efficient dimensionality reduction technique—t-Distributed Stochastic Neighbor Embedding (t-SNE)—to aid in subsequent model optimization. t-SNE is a nonlinear dimensionality reduction technique [44] used to map high-dimensional data to low-dimensional space for easy visualization. It usually employs Gaussian or t-distributions to measure the similarity of feature vectors within the high-dimensional space, where the high-dimensional space similarity matrix is defined as:
P_{j|i} = \frac{\exp\left( -\left\| x_i - x_j \right\|^2 / 2\sigma_i^2 \right)}{\sum_{k \neq i} \exp\left( -\left\| x_i - x_k \right\|^2 / 2\sigma_i^2 \right)}
where $P_{j|i}$ represents the conditional probability of $x_j$ given $x_i$, and $\sigma_i$ controls the effective neighborhood size around each data point. The conditional probabilities are symmetrized to obtain the joint similarity matrix:
P_{ij} = \frac{P_{j|i} + P_{i|j}}{2N}
The positions of sample points in the low-dimensional space are adjusted via optimization algorithms such as gradient descent. The similarity matrix Q in the low-dimensional space is:
Q_{ij} = \frac{\left( 1 + \left\| y_i - y_j \right\|^2 \right)^{-1}}{\sum_{k \neq l} \left( 1 + \left\| y_k - y_l \right\|^2 \right)^{-1}}
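The two similarity matrices and the KL objective that t-SNE minimizes can be sketched as follows. This is a simplified illustration: a single fixed sigma replaces the per-point σ_i that t-SNE actually tunes via perplexity, and the natural log is used in the KL term:

```python
import math

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def hd_similarities(x, sigma):
    """Symmetrized joint similarities P_ij in the high-dimensional space."""
    n = len(x)
    p_cond = [[0.0] * n for _ in range(n)]
    for i in range(n):
        denom = sum(math.exp(-dist2(x[i], x[k]) / (2 * sigma ** 2))
                    for k in range(n) if k != i)
        for j in range(n):
            if j != i:
                p_cond[i][j] = math.exp(-dist2(x[i], x[j]) / (2 * sigma ** 2)) / denom
    # symmetrize: P_ij = (P_{j|i} + P_{i|j}) / (2N)
    return [[(p_cond[i][j] + p_cond[j][i]) / (2 * n) for j in range(n)]
            for i in range(n)]

def ld_similarities(y):
    """Student-t joint similarities Q_ij in the low-dimensional space."""
    n = len(y)
    w = [[1.0 / (1.0 + dist2(y[i], y[j])) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    total = sum(map(sum, w))
    return [[w[i][j] / total for j in range(n)] for i in range(n)]

def kl_divergence(p, q):
    """KL(P || Q), the quantity t-SNE minimizes over the low-dim layout."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row) if pij > 0)
```

Both matrices sum to one over all pairs, so the KL divergence is a proper (non-negative) mismatch measure between the two layouts.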
Finally, the $KL$ divergence between $P$ and $Q$ is calculated through Equation (14) to ensure that the similarity distribution in the low-dimensional space approximates that in the high-dimensional space, and the low-dimensional points then provide an intuitive visualization of the high-dimensional data. By applying t-SNE to both original and augmented samples to extract two-dimensional feature vectors, the mean distance (MD) is calculated using Equation (20) to express the similarity between samples. The MD is incorporated into the loss functions of both the encoder and generator to optimize model performance. To improve the effectiveness of the augmentation model, the following three strategic adjustments are proposed for the model's input and loss functions:
(1) Direct utilization of raw data: the model employs unprocessed raw audio signals as inputs.
(2) Application of Gammatone filtering: before input into the model, the original audio signals are processed through Gammatone filtering.
(3) Enhancement through distance metrics: building on the second strategy, the average distance between the original and generated samples within the two-dimensional feature space is computed and incorporated into the loss functions of both the encoder and decoder.
M D = i = 0 n 1 j = 0 n 1 d i j / n 2
where $d_{ij}$ represents the distance between points $i$ and $j$. A circle of radius 1 is drawn around each two-dimensional feature vector. The Area of Coverage (AOC) of the generated samples is positively correlated with their comprehensiveness and diversity, with overlapping regions counted only once. Because the area of a union of many circles is complex to compute analytically, a Monte Carlo simulation (MCS) is used to approximate it.
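Under these definitions, MD and the Monte Carlo estimate of AOC can be sketched as follows (hypothetical function names; the bounding box, sample count, and seed are illustrative choices):

```python
import random

def mean_distance(points):
    """MD = (1/n^2) * sum of Euclidean distances d_ij over all ordered pairs."""
    n = len(points)
    total = sum(((xi - xj) ** 2 + (yi - yj) ** 2) ** 0.5
                for (xi, yi) in points for (xj, yj) in points)
    return total / n ** 2

def aoc_monte_carlo(points, radius=1.0, n_samples=100_000, seed=0):
    """Estimate the area of the union of circles of the given radius centred
    on the points; overlapping regions are naturally counted only once."""
    rng = random.Random(seed)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, x1 = min(xs) - radius, max(xs) + radius
    y0, y1 = min(ys) - radius, max(ys) + radius
    hits = 0
    for _ in range(n_samples):
        x, y = rng.uniform(x0, x1), rng.uniform(y0, y1)
        # a sample counts once even if it falls inside several circles
        if any((x - px) ** 2 + (y - py) ** 2 <= radius ** 2 for px, py in points):
            hits += 1
    return hits / n_samples * (x1 - x0) * (y1 - y0)
```

For a single point the estimate converges to the circle area, which is a convenient sanity check before applying it to hundreds of sample points.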
Figure 10 displays the 2D visualization of original and augmented samples after applying different processing methods, and Table 3 shows the corresponding metrics. Initially, the original and augmented samples exhibit a scattered distribution. After implementing Gammatone filtering coupled with adjustments to the loss function, the MD decreases from 66.604 to 28.317, indicating enhanced similarity and correlation among the augmented samples. Moreover, the AOC expands from 1067.371 to 1146.098, signaling not only an increase in the volume of augmented samples but also improvements in their diversity and comprehensiveness. Through appropriate signal processing and adjustment of the loss function, the quality of the augmented samples is enhanced. In particular, the third method (Gammatone filtering with MD adjustments in the loss function) shows a clear advantage in improving sample similarity and reducing the uncertainty of the generated data. The results of this DA are visualized in Figure 11.

4. MCSAWS Prediction Model Establishment and Result Analysis

4.1. Architectural Design of MCSAWS

The paper focuses on exploring the uncertainty of micro-motors' acoustic performance by establishing a model to predict motor noise intervals. The following five parameters are selected to compare model performance: Average Overlap Ratio (AOR), Mean Width Error (MWE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Coefficient of Determination (R²). The calculation formulas are as follows:
AOR = \frac{1}{n} \sum_{i=1}^{n} \frac{\max\left( 0, \min(U_i, U_i^*) - \max(L_i, L_i^*) \right)}{\max(U_i, U_i^*) - \min(L_i, L_i^*)}
MWE = \frac{1}{n} \sum_{i=1}^{n} \left| (U_i^* - L_i^*) - (U_i - L_i) \right|
RMSE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( M_i - M_i^* \right)^2 }
MAE = \frac{1}{n} \sum_{i=1}^{n} \left| M_i - M_i^* \right|
R 2 = 1 i = 1 n ( M i M i * ) 2 i = 1 n ( M i M ¯ i ) 2
where the predicted interval is [L_i*, U_i*], the actual interval is [L_i, U_i], M = (U + L)/2 denotes the interval midpoint (mean), and M̄ = Σ M_i / n is the average of the interval midpoints. The design of the MCSAWS structure mainly involves the sub-network architecture and the number of branches. Flexible choices in these areas can make the model more effective at handling complex data, but too many branches or overly complex sub-networks increase the number of parameters and the difficulty of training. The design should therefore favor simple structures appropriate to the specific problem, avoiding unnecessary complexity. Because no mature method guides the selection of sub-networks and branch counts, this study adopts an experimental approach. Based on four backbone sets, a maximum limit of 20 branch networks is set, and the branch count is gradually increased from 2 to 20 while model performance is evaluated on a test set. The experiment uses all samples after DA, with parameters normalized to the range [−1, 1]. The samples are randomly divided into a 70% training set and a 30% test set, ensuring that the proportion of samples under different working conditions remains consistent. In addition to the five main performance parameters, this study also considers training time.
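The five metrics can be evaluated directly from the formulas above. A minimal pure-Python sketch (intervals given as (lower, upper) pairs; midpoints serve as the interval mean M):

```python
def interval_metrics(pred, actual):
    """AOR, MWE, RMSE, MAE and R^2 for predicted intervals [L*, U*]
    against actual intervals [L, U]; each interval is a (lower, upper) pair."""
    n = len(pred)
    # overlap of each interval pair divided by the span of their union
    aor = sum(max(0.0, min(u, us) - max(l, ls)) / (max(u, us) - min(l, ls))
              for (ls, us), (l, u) in zip(pred, actual)) / n
    # mean absolute difference of interval widths
    mwe = sum(abs((us - ls) - (u - l))
              for (ls, us), (l, u) in zip(pred, actual)) / n
    m_pred = [(ls + us) / 2 for ls, us in pred]   # predicted midpoints M*
    m_act = [(l + u) / 2 for l, u in actual]      # actual midpoints M
    rmse = (sum((m - ms) ** 2 for m, ms in zip(m_act, m_pred)) / n) ** 0.5
    mae = sum(abs(m - ms) for m, ms in zip(m_act, m_pred)) / n
    m_bar = sum(m_act) / n
    r2 = 1.0 - (sum((m - ms) ** 2 for m, ms in zip(m_act, m_pred))
                / sum((m - m_bar) ** 2 for m in m_act))
    return aor, mwe, rmse, mae, r2
```

A perfect prediction gives AOR = 1, MWE = RMSE = MAE = 0 and R² = 1, which makes the function easy to sanity-check.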
Figure 12 shows the performance of different branch counts within each backbone network. Exploring the impact of multi-branch feature extraction network configurations on model performance shows that increasing the number of branches generally improves performance, though with some fluctuation. The optimal number of branches varies with the backbone configuration: with Backbone1, the model performs best overall with 15 branches; for Backbone2, peak performance occurs with five branches; and for Backbone3 and Backbone4, peak performance occurs with four branches. This indicates that as the complexity of the backbone increases, fewer branches are typically needed to achieve optimal performance, likely because more complex basic feature networks have stronger feature extraction capability and thus require fewer branches to process the input data. In contrast, simpler feature extraction networks cannot handle complex problems on their own and require more branches to compensate for their limited feature representation. Selecting Backbone1 as the basic feature extraction structure with 15 branches, the performance metrics AOR, MWE, RMSE, MAE, and R² stand at 0.821, 0.857, 0.773, 0.608, and 0.954, respectively, with a training duration of 606.080 s. The model not only demonstrates excellent overall performance but also exhibits a shorter training time.
Furthermore, the study found that as the number of branches increases excessively, the model performance gradually weakens. Particularly within the Backbone3 and Backbone4 networks, when the number of branches surpasses nine and five, respectively, there is a gradual downturn in model performance. This suggests that an overabundance of branches may lead to excessive model complexity, thereby increasing the risk of overfitting and causing a downward trend in performance. Additionally, a linear relationship exists between the number of branches and model complexity. Therefore, when designing multi-branch network structures, striking a balance between branch count and performance is of paramount significance.
Figure 13 shows the weight allocation results for each branch when different backbones are equipped with optimal numbers of branches. Among them, the sixth branch of Backbone1, the fifth branch of Backbone2, and the fourth branches of both Backbone3 and Backbone4 exhibit the highest weights, while the fifteenth branch of Backbone1, the second branch of Backbone3, and the third branches of both Backbone2 and Backbone4 display the lowest weights. Further analysis indicates that branches with higher weights often correspond to more discriminative features. For instance, the sixth branch of Backbone1 may extract features closely related to the performance or intensity of micro-motor noise. Conversely, branches with lower weights may extract redundant or weakly relevant features and, hence, be suppressed by the model. This adaptive adjustment mechanism enables the model to focus more on key features, avoiding interference from irrelevant information. The performance differences and weight variations across different branches of the model demonstrate the efficacy of the MCSAWS method.

4.2. Model Verification and Comparative Analysis

This paper will evaluate the effectiveness of the constructed MCSAWS model and the proposed method through comparative experiments. The experiment will compare the performance of MLP, CNN, and CNN models augmented with the CSAM and the MFE module individually. The models will predict the operating noise interval of micro-motors to facilitate performance evaluation and comparative analysis. The MCSAWS model and the proposed method’s potential advantages in handling this problem will be verified.
CNN: Following experimentation and comparative analysis of the four backbone options outlined in Table 1, Backbone4 emerges as the preferred choice.
MLP: The model has three hidden layers: the first and third each contain 128 neurons, and the second contains 256. Each layer's output passes through ReLU activation to introduce nonlinearity, followed by max-pooling for dimensionality reduction and computational efficiency; the final output is produced by the third layer.
CSAM-CNN: The CSAM is added to the CNN model.
MFE-CNN: Multiple-branch feature extraction networks are adopted, using the remaining part after removing the CSAM from the MCSAWS model.
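For illustration, the comparison MLP described above might look as follows in pure Python. This is a sketch with untrained random weights: the input width of 32 is a stand-in, and since the paper does not state whether pooling also follows the third layer, this sketch omits it there:

```python
import random

def relu(v):
    return [x if x > 0.0 else 0.0 for x in v]

def maxpool1d(v, k=2):
    """Non-overlapping 1-D max-pooling with window size k."""
    return [max(v[i:i + k]) for i in range(0, len(v) - len(v) % k, k)]

def dense(v, layer):
    weights, biases = layer
    return [sum(wi * xi for wi, xi in zip(w, v)) + b
            for w, b in zip(weights, biases)]

def make_layer(n_in, n_out, rng):
    """Random untrained weights, for shape illustration only."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def mlp_forward(x, rng):
    """Forward pass of the described MLP: hidden layers of 128, 256 and 128
    neurons, the first two each followed by ReLU and 2-wide max-pooling."""
    h = maxpool1d(relu(dense(x, make_layer(len(x), 128, rng))))  # 128 -> 64
    h = maxpool1d(relu(dense(h, make_layer(len(h), 256, rng))))  # 256 -> 128
    return dense(h, make_layer(len(h), 128, rng))                # third layer
```

The pooling steps halve each hidden representation, which is how the 128/256/128 layer widths chain together into a single forward pass.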
To ensure the authenticity and reliability of the results, all models were run 100 times and the results averaged, with all values reported to three decimal places. The same training and test datasets were used for every run, with the same loss function (MSELoss) and optimizer (Adam), and each model was trained for 3000 epochs. Table 4 shows the evaluation metrics obtained from training on the original dataset. Table 5 presents the results after adding the augmented datasets.
Analysis revealed that after adding the CSAM and the MFE network separately, the R 2 of the CNN model improved from the original 0.864 to 0.878 and 0.902, respectively, and the performance of the other metrics also improved. This improvement shows that each module can effectively enhance the model’s data processing and feature analysis capabilities. On further combining both modules, the resulting MCSAWS model exhibits excellent overall performance. This integrated model not only inherits the advantages of individual modules but also improves the overall performance of the model through the synergistic interaction between modules, achieving an R 2 value of 0.932. When applied to augmented datasets, performance is further enhanced, with A O R increasing from 0.815 to 0.821 and R 2 increasing from 0.932 to 0.954.
Figure 14 shows the comparison between the predicted noise uncertainty interval of each model and the actual noise uncertainty interval, alongside relative errors when using the original samples. This demonstrates that the MCSAWS model can more accurately predict the noise uncertainty interval of the micro-motor. The MCSAWS model integrates individual module advantages and enhances overall performance through module interaction. This approach effectively handles complex data structures, leading to accurate and efficient predictions.
After adding augmented samples, the model’s performance metrics exhibited varying degrees of improvement. For example, the A O R of the CSAM-CNN model increased from 0.789 to 0.821, and the R 2 also improved from 0.878 to 0.916. The R 2 value of the MFE-CNN model increased from 0.902 to 0.921, indicating a more precise fit to the augmented data. The R M S E of the MCSAWS model decreased from 0.938 to 0.773, and the M A E decreased from 0.740 to 0.608. Figure 15 displays the comparison results after adding augmented samples. The results show that the introduction of augmented data improves the accuracy of all models, and the prediction error is generally reduced. Among them, the MCSAWS model, with its MFE network and CSAM mechanism, effectively enhances its ability to capture noise features. With the support of DA, the MCSAWS model demonstrates outstanding generalization performance and the highest overlap between the predicted intervals and the actual noise uncertainty interval. Compared to other models, the MCSAWS model achieves superior performance, with a decrease of 0.165 in R M S E and a decrease of 0.132 in M A E . The reduction in error indicates that augmented samples improve the model’s performance on unseen data by providing more diverse and representative data.

5. Conclusions

This paper focuses on the automotive micro-motor’s operational noise uncertainty interval. Initially, Gammatone filters are used to enhance noise in specific frequency bands of interest within the raw data and reduce interference in other frequency bands. Subsequently, due to limitations in sample quantity, this paper adopts the VAE-GAN method to augment original data. Its loss function is adjusted to ensure that the generated samples are closer to the original data in the two-dimensional feature space while covering a broader area, thereby obtaining a more comprehensive and diverse sample set. Finally, building upon this foundation, we proposed an MCSAWS method, which utilizes test data from motor idle operations as model inputs to predict the micro-motor’s operational noise uncertainty interval.
Analysis and comparison of experimental data show that the MCSAWS model comprehensively considers the effects of multiple uncertainties and improves performance with the addition of augmented samples. Specifically, the MCSAWS model achieves A O R , M W E , R M S E , M A E , and R 2 values of 0.821, 0.857, 0.773, 0.608, and 0.954, respectively, outperforming both MLP and CNN methods. In addition, by adding MFE and CSAM modules into conventional CNN models, overall CNN performance is bolstered, demonstrating the effectiveness of these modules. On the micro-motor production line, the proposed method can accurately control the operational noise uncertainty in advance, offering a reference for similar industrial production challenges.

Author Contributions

Conceptualization, H.H. and S.D.; Methodology, H.H. and Y.W.; Formal analysis, W.Y. and S.D.; Software, H.H.; Investigation, H.H. and Y.W.; Supervision, Y.H. and Y.W.; Validation, H.H. and Y.H.; Writing—original draft, H.H.; Writing—review and editing, H.H. and Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Natural Science Foundation of Sichuan Province (2022NSFSC1892), the Fundamental Research Funds for the Central Universities (X2021KZK054), and the Liuzhou Science and Technology Program (2022DAA0102).

Data Availability Statement

Data are available from authors on reasonable request.

Acknowledgments

The authors thank the China Automotive Technology & Research Center for supporting the completion of this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Nomenclature

AOC: Area of Coverage
AOR: Average Overlap Ratio
Backbone: Fundamental feature extraction network
CNN: Convolutional Neural Network
CSAW: Channel and Spatial Adaptive Weighting
MFE: Multi-branch feature extraction
DA: Data Augmentation
FFT: Fast Fourier Transform
GAN: Generative Adversarial Network
MAE: Mean Absolute Error
MCS: Monte Carlo simulation
MD: Mean distance
MLP: Multi-Layer Perceptron
MWE: Mean Width Error
NVH: Noise, vibration and harshness
PDF: Probability density function
R²: Coefficient of Determination
RMSE: Root Mean Square Error
STFT: Short-time Fourier transform
t-SNE: t-Distributed Stochastic Neighbor Embedding
VAE: Variational Autoencoder

References

  1. Huang, H.; Huang, X.; Ding, W.; Zhang, S.; Pang, J. Optimization of electric vehicle sound package based on LSTM with an adaptive learning rate forest and multiple-level multiple-object method. Mech. Syst. Signal Process. 2023, 187, 109932. [Google Scholar] [CrossRef]
  2. Zhao, T.; Ding, W.; Huang, H.; Wu, Y. Adaptive Multi-Feature Fusion for Vehicle Micro-Motor Noise Recognition Considering Auditory Perception. Sound Vib. 2023, 57, 133–153. [Google Scholar] [CrossRef]
  3. Min, D.; Jeong, S.; Yoo, H.H.; Kang, H.; Park, J. Experimental investigation of vehicle wiper blade’s squeal noise generation due to windscreen waviness. Tribol. Int. 2014, 80, 191–197. [Google Scholar] [CrossRef]
  4. Fu, T.H. Study on Mechanical Automation with Automatically Adjustable Seat Based on Mechanical Properties. Appl. Mech. Mater. 2013, 454, 3–6. [Google Scholar] [CrossRef]
  5. Hou, Q.; Jin, Y.P.; Zhou, Y.F. Electromagnetic Interference Testing and Suppression Methods for Automotive Window Lifter Motor. Appl. Mech. Mater. 2013, 433–435, 940–944. [Google Scholar] [CrossRef]
  6. Li, F.; Zhang, Y.; Li, J.; Yang, X.; Li, T.; Shang, W. Application of Measurement Uncertainty for Electric Motor Efficiency Evaluation. In Proceedings of the 2015 International Forum on Energy, Environment Science and Materials, Shenzhen, China, 25–26 September 2015; pp. 973–979. [Google Scholar]
  7. Dong, Q.; Liu, X.; Qi, H.; Zhou, Y. Vibro-acoustic prediction and evaluation of permanent magnet synchronous motors. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2020, 234, 2783–2793. [Google Scholar] [CrossRef]
  8. Feng, H.; Zhou, Y.; Zeng, W.; Ding, C. Review on metrics and prediction methods of civil aviation noise. Int. J. Aeronaut. Space Sci. 2023, 24, 1199–1213. [Google Scholar] [CrossRef]
  9. Li, L.; Cai, H.; Han, H.; Jiang, Q.; Ji, H. Adaptive short-time Fourier transform and synchrosqueezing transform for non-stationary signal separation. Signal Process. 2020, 166, 107231. [Google Scholar] [CrossRef]
  10. Wang, W.; Mao, X.; Liang, H.; Yang, D.; Zhang, J.; Liu, S. Experimental research on in-pipe leaks detection of acoustic signature in gas pipelines based on the artificial neural network. Measurement 2021, 183, 109875. [Google Scholar] [CrossRef]
  11. Beale, C.; Niezrecki, C.; Inalpolat, M. An adaptive wavelet packet denoising algorithm for enhanced active acoustic damage detection from wind turbine blades. Mech. Syst. Signal Process. 2020, 142, 106754. [Google Scholar] [CrossRef]
  12. Seid Ahmed, Y.; Arif, A.F.M.; Veldhuis, S.C. Application of the wavelet transform to acoustic emission signals for built-up edge monitoring in stainless steel machining. Measurement 2020, 154, 107478. [Google Scholar] [CrossRef]
  13. Amarnath, M.; Praveen Krishna, I.R. Empirical mode decomposition of acoustic signals for diagnosis of faults in gears and rolling element bearings. IET Sci. Meas. Technol. 2012, 6, 279–287. [Google Scholar] [CrossRef]
  14. Yassin, A.I.; Shariff, K.K.M.; Kechik, M.A.; Ali, A.M.; Amin, M.S.M. Acoustic Vehicle Classification Using Mel-Frequency Features with Long Short-Term Memory Neural Networks. TEM J. 2023, 12, 1490–1496. [Google Scholar] [CrossRef]
  15. Rosero, K.; Grijalva, F.; Masiero, B. Sound events localization and detection using bio-inspired gammatone filters and temporal convolutional neural networks. IEEE/ACM Trans. Audio Speech Lang. Process. 2023, 31, 2314–2324. [Google Scholar]
  16. Chang, Y.; Zhang, X.; Shen, Y.; Song, S.; Song, Q.; Cui, J.; Jie, H.; Zhao, Z. Rail Crack Detection Using Optimal Local Mean Decomposition and Cepstral Information Coefficient Based on Electromagnetic Acoustic Emission Technology. IEEE Trans. Instrum. Meas. 2024, 73, 9506412. [Google Scholar] [CrossRef]
  17. Huang, H.; Lim, T.C.; Wu, J.; Ding, W.; Pang, J. Multitarget prediction and optimization of pure electric vehicle tire/road airborne noise sound quality based on a knowledge-and data-driven method. Mech. Syst. Signal Process. 2023, 197, 110361. [Google Scholar] [CrossRef]
  18. Qian, K.; Shen, Z.; Tan, J.; Liu, K.; Wang, Y.; Li, H.; Zhao, J. Interior sound quality evaluation of high-speed trains-a literature review. Int. J. Rail Transp. 2024, 1–26. [Google Scholar] [CrossRef]
  19. Li, M.; Zhou, W.; Liu, J.; Zhang, X.; Pan, F.; Yang, H.; Li, M.; Luo, D. Vehicle Interior Noise Prediction Based on Elman Neural Network. Appl. Sci. 2021, 11, 8029. [Google Scholar] [CrossRef]
  20. Huang, H.B.; Li, R.X.; Yang, M.L.; Lim, T.C.; Ding, W.P. Evaluation of vehicle interior sound quality using a continuous restricted Boltzmann machine-based DBN. Mech. Syst. Signal Process. 2017, 84, 245–267. [Google Scholar] [CrossRef]
  21. Steinbach, L.; Altinsoy, M.E. Prediction of annoyance evaluations of electric vehicle noise by using artificial neural networks. Appl. Acoust. 2019, 145, 149–158. [Google Scholar] [CrossRef]
  22. Qi, Y.; Yang, Z.; Sun, W.; Lou, M.; Lian, J.; Zhao, W.; Deng, X.; Ma, Y. A Comprehensive Overview of Image Enhancement Techniques. Arch. Comput. Methods Eng. 2021, 29, 583–607. [Google Scholar] [CrossRef]
  23. Kusiak, A. Convolutional and generative adversarial neural networks in manufacturing. Int. J. Prod. Res. 2019, 58, 1594–1604. [Google Scholar] [CrossRef]
  24. Tran, N.T.; Tran, V.H.; Nguyen, N.B.; Nguyen, T.K.; Cheung, N.M. On Data Augmentation for GAN Training. IEEE Trans. Image Process. 2021, 30, 1882–1897. [Google Scholar] [CrossRef] [PubMed]
  25. Han, Y.; Liu, Y.; Chen, Q. Data augmentation in material images using the improved HP-VAE-GAN. Comput. Mater. Sci. 2023, 226, 112250. [Google Scholar] [CrossRef]
  26. Islam, Z.; Abdel-Aty, M.; Cai, Q.; Yuan, J. Crash data augmentation using variational autoencoder. Accid. Anal. Prev. 2021, 151, 105950. [Google Scholar] [CrossRef] [PubMed]
  27. Yao, W.; Chen, X.; Luo, W.; van Tooren, M.; Guo, J. Review of uncertainty-based multidisciplinary design optimization methods for aerospace vehicles. Prog. Aerosp. Sci. 2011, 47, 450–479. [Google Scholar] [CrossRef]
  28. Liu, S.; Józefczyk, J.; Forrest, J.; Vallee, R. Emergence and development of grey systems theory. Kybernetes 2009, 38, 1246–1256. [Google Scholar] [CrossRef]
  29. James, K.R.; Dowling, D.R. A probability density function method for acoustic field uncertainty analysis. J. Acoust. Soc. Am. 2005, 118, 2802–2810. [Google Scholar] [CrossRef]
  30. Yin, S.; Yu, D.; Ma, Z.; Xia, B. A unified model approach for probability response analysis of structure-acoustic system with random and epistemic uncertainties. Mech. Syst. Signal Process. 2018, 111, 509–528. [Google Scholar] [CrossRef]
  31. Huang, H.; Huang, X.; Ding, W.; Yang, M.; Yu, X.; Pang, J. Vehicle vibro-acoustical comfort optimization using a multi-objective interval analysis method. Expert Syst. Appl. 2023, 213, 119001. [Google Scholar] [CrossRef]
  32. Huang, H.; Huang, X.; Ding, W.; Yang, M.; Fan, D.; Pang, J. Uncertainty optimization of pure electric vehicle interior tire/road noise comfort based on data-driven. Mech. Syst. Signal Process. 2022, 165, 108300. [Google Scholar] [CrossRef]
  33. Dong, J.; Ma, F.; Gu, C.; Hao, Y. Uncertainty analysis of high-frequency noise in battery electric vehicle based on interval model. SAE Int. J. Veh. Dyn. Stab. NVH 2019, 3, 73–85. [Google Scholar] [CrossRef]
  34. Nicholas, N. The black swan: The impact of the highly improbable. J. Manag. Train. Inst. 2008, 36, 56. [Google Scholar]
  35. Klir, G.J.; Folger, T.A. Fuzzy Sets, Uncertainty, and Information; Prentice-Hall, Inc.: Saddle River, NJ, USA, 1987. [Google Scholar]
  36. Dai, Z.; Zhao, L.; Wang, K.; Zhou, Y. Mode standardization: A practical countermeasure against mode collapse of GAN-based signal synthesis. Appl. Soft Comput. 2024, 150, 111089. [Google Scholar] [CrossRef]
  37. Fourier, J.B.J. Théorie Analytique de la Chaleur; Gauthier-Villars: Paris, France, 1888; Volume 1. [Google Scholar]
  38. Gabor, D. Theory of communication. Part 1: The analysis of information. J. Inst. Electr. Eng. Part III Radio Commun. Eng. 1946, 93, 429–441. [Google Scholar] [CrossRef]
  39. Allen, J.B.; Rabiner, L.R. A unified approach to short-time Fourier analysis and synthesis. Proc. IEEE 1977, 65, 1558–1564. [Google Scholar] [CrossRef]
  40. Slaney, M. An efficient implementation of the Patterson-Holdsworth auditory filter bank. Apple Comput. Percept. Group Tech. Rep. 1993, 35, 795–811. [Google Scholar]
  41. Lin, F.; Zuo, S.-G.; Deng, W.-Z.; Wu, S.-L. Reduction of vibration and acoustic noise in permanent magnet synchronous motor by optimizing magnetic forces. J. Sound Vib. 2018, 429, 193–205. [Google Scholar] [CrossRef]
  42. Bengio, Y.; Ducharme, R.; Vincent, P. A neural probabilistic language model. In Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA, 1 January 2000; Volume 13. [Google Scholar]
  43. Creswell, A.; White, T.; Dumoulin, V.; Arulkumaran, K.; Sengupta, B.; Bharath, A.A. Generative adversarial networks: An overview. IEEE Signal Process. Mag. 2018, 35, 53–65. [Google Scholar] [CrossRef]
  44. Larsen, A.B.L.; Sønderby, S.K.; Larochelle, H.; Winther, O. Autoencoding beyond pixels using a learned similarity metric. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; pp. 1558–1566. [Google Scholar]
Figure 1. Framework and methodology of the proposed approach: (a) schematic overview; (b) general implementation procedure.
Figure 2. Frequency response characteristics of Gammatone filter.
Figure 3. (a) Spectrogram analysis of operating noise in micro-motors; (b) spectrogram of micro-motor operating noise post-Gammatone filtering.
Figure 4. Architecture of CSAM: (a) channel attention module; (b) spatial attention module; (c) integration with residual connection.
Figure 5. Architecture of the MCSAWS.
Figure 6. Experimental setup for data collection: (a) idle operation condition; (b) operational condition; (c) vibration and noise acquisition system.
Figure 7. Time-domain vibration and noise spectrum of micro-motors: (a) noise under idle conditions; (b) operational noise.
Figure 8. Correspondence between idle operation data and operational data.
Figure 9. The architecture of the VAE-GAN model.
Figure 10. t-SNE visualization of original and augmented samples: (a) none; (b) Gammatone filtering; (c) Gammatone filtering plus loss-function adjustment.
Figure 11. Comparison of original and augmented data samples: (a) original sample; (b) augmented sample.
Figure 12. Performance metrics for each branch of the basic feature extraction network: (a) AOR; (b) MWE; (c) RMSE; (d) MAE; (e) R²; (f) training time.
Figure 13. Optimal branch weight configurations across various backbones: (a) Backbone1; (b) Backbone2; (c) Backbone3; (d) Backbone4.
Figure 14. Comparison of predicted versus measured values for models trained with original dataset: (a) MLP; (b) CNN; (c) MFE-CNN; (d) CSAM-CNN; (e) MCSAWS.
Figure 15. Comparison of predicted versus measured values for models trained with augmented data: (a) MLP; (b) CNN; (c) MFE-CNN; (d) CSAM-CNN; (e) MCSAWS.
Table 1. Components of basic feature extraction network.

| Layer | Backbone1 | Backbone2 | Backbone3 | Backbone4 |
|---|---|---|---|---|
| 1 | Convolution 16/3/1/1 | Convolution 32/5/1/2 | Convolution 64/7/1/3 | Convolution 128/7/1/3 |
| 2 | BatchNormalization | BatchNormalization | BatchNormalization | BatchNormalization |
| 3 | ReLU | ReLU | ReLU | ReLU |
| 4 | Avgpooling 2/2 | Avgpooling 2/2 | Avgpooling 2/2 | Avgpooling 2/2 |
| 5 | Convolution 32/3/1/1 | Convolution 64/5/1/2 | Convolution 128/7/1/3 | Convolution 256/5/1/2 |
| 6 | BatchNormalization | BatchNormalization | BatchNormalization | BatchNormalization |
| 7 | ReLU | ReLU | ReLU | ReLU |
| 8 | Maxpooling 2/2 | Maxpooling 2/2 | Maxpooling 2/2 | Maxpooling 2/2 |
| 9 | Convolution 64/3/1/1 | Convolution 128/5/1/2 | Convolution 256/7/1/3 | Convolution 512/3/1/1 |
| 10 | BatchNormalization | BatchNormalization | BatchNormalization | BatchNormalization |
| 11 | ReLU | ReLU | ReLU | ReLU |
| 12 | Maxpooling 2/2 | Maxpooling 2/2 | Maxpooling 2/2 | Maxpooling 2/2 |
Table 2. Main structural parameters of micro-motors.

| Parameters | 1 | 2 | 3 |
|---|---|---|---|
| Number of Poles | 4 | 4 | 4 |
| Number of Slots | 12 | 12 | 12 |
| Offset (mm) | 0 | 2.5 | 4 |
Table 3. Impact of different processing methods on results.

| Method | MD | AOC |
|---|---|---|
| None | 66.604 | 1067.371 |
| Gammatone filtering | 32.757 | 1111.943 |
| Gammatone filtering + Loss_Ad | 28.317 | 1146.098 |
Table 4. Model predictions using the original dataset.

| Method | AOR | MWE | RMSE | MAE | R² |
|---|---|---|---|---|---|
| MLP | 0.767 | 0.813 | 1.544 | 0.971 | 0.867 |
| CNN | 0.781 | 0.869 | 1.626 | 1.029 | 0.864 |
| CSAM-CNN | 0.789 | 0.880 | 1.356 | 0.845 | 0.878 |
| MFE-CNN | 0.794 | 0.879 | 1.148 | 0.879 | 0.902 |
| MCSAWS | 0.815 | 0.899 | 0.938 | 0.740 | 0.932 |
Table 5. Model predictions incorporating augmented datasets.

| Method | AOR | MWE | RMSE | MAE | R² |
|---|---|---|---|---|---|
| MLP | 0.755 | 0.834 | 1.169 | 0.870 | 0.892 |
| CNN | 0.796 | 0.870 | 1.314 | 0.894 | 0.893 |
| CSAM-CNN | 0.821 | 0.889 | 1.259 | 0.790 | 0.916 |
| MFE-CNN | 0.801 | 0.876 | 0.957 | 0.755 | 0.921 |
| MCSAWS | 0.821 | 0.857 | 0.773 | 0.608 | 0.954 |

