1. Introduction
With the rapid development of electronic systems, circuit design requires high-performance microwave devices [1]. Most circuit designs are first simulated in software to obtain the desired circuit performance and then physically fabricated [2]. Simulations not only make it possible to find the appropriate circuit parameters faster but also save labor and production costs in the physical manufacturing process [3]. Therefore, highly accurate and efficient microwave device models play a very important role in circuit design simulation [4,5,6,7]. Improving device modeling accuracy and shortening the device design cycle have become major fields of research in microwave devices [8,9]. Traditional modeling methods consume a lot of human and computer resources through repeated trial and correction to obtain an accurate model [10,11]. In addition, due to the lack of degrees of freedom, the models built by the traditional modeling methods often fail to meet the required accuracy. To satisfy the requirements of fast simulation and high model accuracy [12,13], advanced modeling methods need to be investigated. Coupled microstrip lines are widely used in microwave semiconductor devices for their small size, simple structure, stability and reliability [14,15]. However, the existing models of coupled microstrip lines for device design rely heavily on simulation software [16,17], which limits the flexibility of the models. This paper focuses on a new modeling method for coupled microstrip lines to efficiently build an accurate and highly compatible model.
Artificial intelligence and deep learning are now widely used in the field of microwave device modeling. Research on, development of, and innovation in intelligent approaches for microwave devices have become popular topics in this field [18,19,20]. Artificial neural networks (ANNs) are among the earliest machine learning algorithms [21]. Modeling methods using ANNs are considered an effective alternative in the field of microwave device modeling [22,23]. ANNs have strong learning and generalization capabilities, and can learn the nonlinear relationship between the input and output of modeled microwave devices by optimizing weights. The trained model can accurately reflect the output responses of the modeled device. The more data that capture the nonlinear relationship within a certain range, the more accurate the prediction of the ANNs will be [24,25]. To avoid the large amount of data required for ANNs, neuro-space mapping (Neuro-SM) was proposed for modeling microwave devices [26,27,28]. Compared with traditional ANN modeling methods, the Neuro-SM method can effectively save training cost and improve the generalization capability of ANNs. Normally, accurate data obtained by the simulation or measurement of the modeled microwave devices are defined as the fine model, while the coarse model is expressed by empirical formulas that can match the fine model approximately. The Neuro-SM model consists of two parts: the coarse model and the mapping neural networks (MNNs). By adjusting the MNNs, the coarse model is gradually adapted to the characteristics of the fine model, enabling the Neuro-SM model to achieve both high accuracy and high simulation efficiency. This method has been widely used for device modeling in the microwave field [29,30,31,32,33,34].
The Neuro-SM modeling method was first presented in [35], which modifies the modeled microwave device behavior with new space mapping formulas. If the performance of the coarse model is similar to that of the modeled microwave device, the Neuro-SM model matches the fine model well by introducing the input MNNs. The frequency MNNs added to the Neuro-SM model can enhance the frequency characteristics of the coarse model [36]. In [29], both input and output MNNs are added to the coarse model to obtain a more accurate model. If there is a significant difference between the fine model and the empirical formulas of the coarse model, however, the existing Neuro-SM modeling methods cannot develop a precise model. Input MNNs require more degrees of freedom in their variables. Frequency MNNs can easily drive the training process into an overlearning state. For output MNNs, it is difficult to guarantee passivity in a passive device model. Therefore, novel modeling methods based on Neuro-SM are needed to cover the differences between the coarse and fine models.
This paper proposes an improved empirical formula modeling method for coupled microstrip lines. Correction values are added to the empirical formulas, and the modified empirical formulas are used as the coarse model. The MNNs adjust the correction values according to the input variables, improving the model's ability to learn and predict. In addition, an advanced training method is proposed to automatically optimize the weights of the MNNs in order to improve the efficiency of modeling. Modeling examples verify the effectiveness and feasibility of the improved empirical formula modeling method proposed in this paper.
2. Proposed Empirical Formula Modeling Method
The microstrip line is made on a dielectric substrate with height $h$ and relative permittivity $\varepsilon_r$. A conductor strip of length $l$, width $w$, and thickness $t$ is on one side, while a grounded metal plate is on the other side. The coupled microstrip lines consist of two parallel microstrip lines spaced $s$ apart from each other. Let the normalized width be defined as $u$, i.e., $u = w/h$. Let the normalized gap be defined as $g$, i.e., $g = s/h$.
Figure 1 shows the physical structure of the coupled microstrip lines.
The two microstrip lines in the coupled microstrip lines are close to each other, so electromagnetic signals couple between them during transmission. Because the coupled microstrip lines are surrounded by a nonuniform medium, the transmitted electromagnetic waves are hybrid modes with dispersion characteristics, which complicates the analysis. For convenience, the mode transmitted in the coupled microstrip lines is treated as a transverse electromagnetic (TEM) mode in this paper. Under different excitation sources, it can be decomposed into an odd mode and an even mode. In these operating states, the two parallel lines can be analyzed independently of each other even though they remain coupled. The two transmission states are separated mathematically and studied in terms of symmetry and antisymmetry. Even-mode excitation means that the incident sources at the symmetric ports have the same magnitude and phase, while odd-mode excitation means that they have the same magnitude and opposite phase. Because the coupled microstrip lines are analyzed mathematically, the model is independent of any particular simulation software and highly compatible.
2.1. Improved Empirical Formulas of the Coarse Model
The odd- and even-mode methods are used to develop the empirical formulas from geometric variables to generate responses. Some important empirical formulas are derived in this section. The effective permittivity for even-mode and odd-mode excitation is given by:
where
and
are parametric equations related to the geometrical variables, which are explained in [
38].
Equations (2) and (3) represent the characteristic impedance and the characteristic capacitance for even-mode and odd-mode excitation, respectively:
where
is defined as the wave impedance in vacuum,
expresses the impedance of the uniform microstrip line,
represents the speed of light, and
and
are parametric equations related to the
and
of the coupled microstrip line, which are explained in [
38].
The substrate dielectric of the coupled microstrip lines remains unchanged, while the medium around the conductor strip of the coupled microstrip lines is completely replaced by air. In this case, the characteristic capacitance of even- and odd-mode excitation is given by:
The mutual inductance and the self-inductance are represented by Equation (5), and the mutual capacitance and the self-capacitance can be expressed by Equation (6):
where
is defined as permeability of vacuum and
expresses the vacuum absolute permittivity.
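The chain from modal capacitances to impedances, inductances and self/mutual quantities can be sketched with the textbook quasi-TEM relations for a symmetric coupled pair. This is a minimal sketch under that assumption; the function names and the sample capacitance values are hypothetical, and the expressions are the standard lossless-line relations rather than a reproduction of Equations (1) to (6).

```python
import math

MU_0 = 4e-7 * math.pi        # vacuum permeability (H/m)
EPS_0 = 8.8541878128e-12     # vacuum permittivity (F/m)
C_LIGHT = 1.0 / math.sqrt(MU_0 * EPS_0)  # speed of light (m/s)


def mode_parameters(c_mode, c_mode_air):
    """Quasi-TEM quantities of one mode (even or odd).

    c_mode     : per-unit-length capacitance with the dielectric present (F/m)
    c_mode_air : same capacitance with the dielectric replaced by air (F/m)
    Returns (effective permittivity, characteristic impedance, inductance).
    """
    eps_eff = c_mode / c_mode_air
    z0 = 1.0 / (C_LIGHT * math.sqrt(c_mode * c_mode_air))
    l_mode = MU_0 * EPS_0 / c_mode_air
    return eps_eff, z0, l_mode


def self_and_mutual(c_even, c_odd, c_even_air, c_odd_air):
    """Self and mutual capacitance/inductance from the modal quantities.

    Assumed convention: C_even = C_self - C_mutual, C_odd = C_self + C_mutual,
    L_even = L_self + L_mutual, L_odd = L_self - L_mutual.
    """
    _, _, l_even = mode_parameters(c_even, c_even_air)
    _, _, l_odd = mode_parameters(c_odd, c_odd_air)
    return {
        "C_self": 0.5 * (c_even + c_odd),
        "C_mutual": 0.5 * (c_odd - c_even),
        "L_self": 0.5 * (l_even + l_odd),
        "L_mutual": 0.5 * (l_even - l_odd),
    }


if __name__ == "__main__":
    # Hypothetical capacitances, chosen only to exercise the functions.
    print(mode_parameters(90e-12, 40e-12))
    print(self_and_mutual(90e-12, 130e-12, 40e-12, 70e-12))
```

Other sign conventions for the mutual terms exist, so the convention stated in the docstring should be checked against the formulas actually used before comparing numbers.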
The empirical formulas derived by the odd- and even-mode methods roughly match the fine model of the coupled microstrip lines. When the operating frequency is too high or the input variables vary over a wide range, however, repeatedly trying and correcting the intermediate parameters in the empirical formulas takes a lot of time and computer resources, and in most cases the empirical formulas still fail to yield an accurate model. The proposed method uses the control-variable method to find the intermediate parameters in the empirical formulas that have a large impact on the response of the coarse model. The correction values are factors that multiply these intermediate parameters at the corresponding positions of the empirical formulas, which improves the flexibility of the model. The improved coarse model consists of the empirical formulas with correction values. The whole process of building the coarse model is performed in NeuroModelerPlus software. In the proposed method, correction values are added at the locations of relative permittivity, impedance, capacitance and inductance. For example, a selected intermediate parameter is multiplied by its correction value, and the product replaces that parameter in the improved empirical formula. By changing the correction value applied to the selected intermediate parameter, the response of the coarse model gradually approaches that of the fine model. The experimental results show that the method gives the coarse model enough degrees of freedom to match the fine model accurately.
2.2. Improved Neuro-SM Model Structure
To obtain an accurate model, the improved Neuro-SM model is proposed based on empirical formulas and correction values.
Figure 2 is a schematic diagram of the improved Neuro-SM model structure. The parameter analysis method was used to determine the input variables that have significant effects on the response characteristics of the coupled microstrip lines. The input variables for the improved Neuro-SM model are defined as
, which includes the geometrical variables
and frequency variable
.
are the inputs to both the coarse model and the MNNs. The vector
represents the correction values added to the empirical formulas. The output variables
are the responses of the improved Neuro-SM model.
MNNs with a simple structure can accurately represent the nonlinear relationship between
and
. The MNNs adjust the internal weights
according to the different
, and then change
. The adjusted
and
are fed into the coarse model, and
are finally generated. Through the training process of the MNNs, the outputs
of the coarse model match the outputs
of the fine model well. The nonlinear relationship between
and
, which is adjusted by the MNNs, is represented by
. The expression is shown as:
where
denotes the internal weights of the MNNs and
denotes a multilayer perceptron neural network that uses the sigmoid function as the activation function and can approximate the nonlinear relationship between input and output arbitrarily well [39].
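The structure described above, in which a sigmoid multilayer perceptron maps the input variables to correction values that scale intermediate parameters of the coarse model, can be sketched as follows. This is an illustrative sketch only: `toy_coarse_model` is a hypothetical stand-in with two made-up intermediate parameters, not the empirical formulas of Section 2.1, and `unit_weights` sets the weights analytically so that every correction value equals 1, an assumption made here for brevity.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mlp_corrections(x, weights):
    """Three-layer perceptron: inputs -> sigmoid hidden layer -> linear outputs."""
    w_hidden, b_hidden, w_out, b_out = weights
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]


def unit_weights(n_in, n_hidden, n_out, seed=0):
    """Weights giving correction values of exactly 1 for any input
    (zero output weights, output bias 1)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                for _ in range(n_hidden)]
    b_hidden = [0.0] * n_hidden
    w_out = [[0.0] * n_hidden for _ in range(n_out)]
    b_out = [1.0] * n_out
    return w_hidden, b_hidden, w_out, b_out


def toy_coarse_model(x, p):
    """Hypothetical coarse model: each intermediate parameter is multiplied
    by its correction value before it enters the response formula."""
    length, gap, freq = x
    a = p[0] * (length / (1.0 + gap))  # corrected intermediate parameter 1
    b = p[1] * freq                    # corrected intermediate parameter 2
    return [math.cos(a * b), math.sin(a * b)]


def neuro_sm_response(x, weights):
    """Inputs feed both the MNN and the coarse model; the MNN supplies p."""
    p = mlp_corrections(x, weights)
    return toy_coarse_model(x, p)


if __name__ == "__main__":
    w = unit_weights(n_in=3, n_hidden=4, n_out=2)
    x = [1.2, 0.3, 2.0]
    print(mlp_corrections(x, w))    # all correction values are 1
    print(neuro_sm_response(x, w))  # same as the uncorrected coarse model
```

With all correction values equal to 1 the combined model reproduces the coarse model, which gives training a well-defined starting point.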
2.3. The Proposed Training Method
In the proposed method, the most critical step in modeling is to obtain the internal weights of the MNNs so that the outputs of the developed model closely match those of the desired device. The training process determines not only the learning effect of the model but also the prediction effect. In this paper, the learning and prediction abilities of the model are evaluated in terms of the training error
and test error
, and can be formulated as:
where
and
are the responses of the improved Neuro-SM model and the responses of the fine model, respectively.
represents the maximum value of the absolute value of the fine model responses. The superscript
is the index of the output response, and
is the total number of output responses. The subscript
indicates the index of the modeled data
, where
is the total number of modeled data samples. During the optimization process, the internal weights of the MNNs are continuously adjusted by using different optimization algorithms to reduce the error until the error meets the accuracy requirements.
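An error measure consistent with the quantities named above (model responses, fine model responses, the maximum absolute fine model response, a sum over outputs and data samples) can be sketched as below. The exact functional form is an assumption: this sketch uses a mean absolute deviation normalized per output, and the function name and sample data are hypothetical.

```python
def normalized_error(model_responses, fine_responses):
    """Mean absolute deviation between model and fine responses.

    Each argument is a list of samples; each sample is a list of outputs.
    Every output is normalized by the largest |fine response| of that output
    (assumption: normalization is per output rather than global).
    """
    n_samples = len(fine_responses)
    n_outputs = len(fine_responses[0])
    peaks = [max(abs(sample[j]) for sample in fine_responses)
             for j in range(n_outputs)]
    if any(peak == 0.0 for peak in peaks):
        raise ValueError("an output is identically zero; cannot normalize")
    total = 0.0
    for y_row, d_row in zip(model_responses, fine_responses):
        for j in range(n_outputs):
            total += abs(y_row[j] - d_row[j]) / peaks[j]
    return total / (n_samples * n_outputs)


if __name__ == "__main__":
    fine = [[1.0, -2.0], [0.5, 1.0]]    # hypothetical fine model data
    model = [[1.1, -2.0], [0.5, 1.0]]   # hypothetical model predictions
    print(normalized_error(model, fine))
```

The same function serves for both the training error and the test error; only the data set passed in changes.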
To speed up the training process, the internal weights
of the MNNs are treated as optimization variables. The first-order derivative
is used to speed up the search for the optimal variables. Since the input variables for the fine and coarse models are the same, the subscript
is added to the signs of the input variable of the coarse model in Equation (9). The first-order derivative of the fine model outputs
with respect to the optimization variables
is given by:
where
is the derivative of the fine model outputs
with respect to the coarse model outputs
.
and
, respectively, represent the derivative of the coarse model outputs
with respect to the input variables of the coarse model
and the frequency variable of the coarse model
.
, and
represents the derivative of the input variables of the coarse model
and the frequency variable of the coarse model
with respect to the optimization variables
, respectively.
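Read term by term, the derivative described above has the shape of a chain rule through the coarse model. The symbol names used below ($\boldsymbol{y}_f$ for the fine model outputs, $\boldsymbol{y}_c$ for the coarse model outputs, $\boldsymbol{x}_c$ and $f_c$ for the coarse model input and frequency variables, $\boldsymbol{w}$ for the MNN weights) are assumptions introduced for illustration, so this should be taken as a sketch of the structure rather than the exact form of Equation (9):

```latex
\frac{\partial \boldsymbol{y}_f}{\partial \boldsymbol{w}}
  = \frac{\partial \boldsymbol{y}_f}{\partial \boldsymbol{y}_c}
    \left(
      \frac{\partial \boldsymbol{y}_c}{\partial \boldsymbol{x}_c}\,
      \frac{\partial \boldsymbol{x}_c}{\partial \boldsymbol{w}}
      +
      \frac{\partial \boldsymbol{y}_c}{\partial f_c}\,
      \frac{\partial f_c}{\partial \boldsymbol{w}}
    \right)
```

Each factor corresponds to one of the derivatives listed in the text, which is why a gradient-based optimizer can update the weights without differentiating the whole model numerically.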
2.4. The Whole Process of the Proposed Modeling Method
The flowchart of the whole process of the proposed modeling method is shown in
Figure 3. In the data generation section, the geometric variables that have significant effects on the responses of the coupled microstrip lines are first determined. The training and test data for the proposed model are generated using the design of experiments (DOE) method [40]. The DOE method generates geometric parameters with an orthogonal distribution, which ensures that the training data approximately represent the entire modeling range. The proposed Neuro-SM model can learn the nonlinear relationship between
and
with the training data, and the predictive capability of the model is verified using the test data. Test data are within the training data range and different from the training data.
The first step of the training process is to build a coarse model according to the empirical formulas. The odd- and even-mode methods derive all empirical formulas used for modeling, as described in
Section 2.1. If the test error of the initial coarse model is higher than a user-defined threshold, the process returns to the derivation of the empirical formulas. Otherwise, the development of the coarse model is preliminarily complete.
The second step is to construct and train the improved Neuro-SM model. Based on the improved empirical formulas for the coarse model in this paper, the correction values are added at appropriate locations. After that, the whole model is constructed according to the structure of the improved Neuro-SM model. Before the whole model is trained, unit MNNs are first developed. The test error of the Neuro-SM model with unit MNNs is identical to that of the coarse model. The input data of the unit MNNs are the input variables and the target outputs are all equal to 1. The number of MNN outputs is the same as the number of correction values. The initial data are randomly generated in MATLAB software and cover a larger range than the training data. The training data are used to adjust the internal weights of the MNNs, which affect the intermediate parameters through the correction values, so that the responses of the coarse model are consistent with the fine model. When the training error does not satisfy the user-defined threshold and the number of hidden neurons is fewer than 100, the number of hidden neurons is increased to raise the degree of nonlinearity of the proposed model. If the number of hidden neurons has reached 100, the process returns to increase the number of correction values and retrains the new model. The training process does not stop until the training error meets the user-defined threshold. This process focuses on finding the number of correction values and hidden neurons that minimize the training error of the proposed model. The effectiveness of the proposed model is demonstrated by obtaining the best results with the fewest correction values and hidden neurons.
In the third step, the model is evaluated on the test data to verify that it makes good predictions for untrained data in the modeling range. If the test error satisfies the user-defined threshold, the model development is complete. Otherwise, the training data are insufficient and the amount of training data needs to be increased for retraining. The smaller the test error, the better the generalization ability of the proposed model.
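The control flow of the second and third steps can be summarized in a short sketch. All names here (`develop_model`, `train_fn`, `test_fn`, `add_data_fn`) are hypothetical, and several details are assumptions: a single threshold is used for both errors, the hidden-neuron count is reset when a correction value is added, and the caps on correction values and data rounds exist only so that the loop terminates.

```python
def develop_model(train_fn, test_fn, add_data_fn, e_threshold,
                  n_corrections=1, n_hidden=5, hidden_step=5,
                  max_hidden=100, max_corrections=20, max_data_rounds=5):
    """Grow the model until training and test errors meet the threshold.

    train_fn(k, h) -> (model, training_error) for k correction values
                      and h hidden neurons
    test_fn(model) -> test_error
    add_data_fn()  -> enlarges the training set (side effect)
    """
    for _ in range(max_data_rounds):
        k, h = n_corrections, n_hidden
        while True:
            model, e_train = train_fn(k, h)
            if e_train <= e_threshold:
                break
            if h < max_hidden:
                # First try more hidden neurons (more nonlinearity).
                h = min(h + hidden_step, max_hidden)
            elif k < max_corrections:
                # Neuron limit reached: add a correction value and retrain.
                k, h = k + 1, n_hidden
            else:
                raise RuntimeError("training error threshold not reached")
        e_test = test_fn(model)
        if e_test <= e_threshold:
            return model, k, h, e_train, e_test
        # Poor prediction on unseen data: add training data and retrain.
        add_data_fn()
    raise RuntimeError("test error threshold not reached")


if __name__ == "__main__":
    # Hypothetical stand-ins: error shrinks as the model grows.
    result = develop_model(
        train_fn=lambda k, h: ((k, h), 2.0 / (k * h)),
        test_fn=lambda model: 0.01,
        add_data_fn=lambda: None,
        e_threshold=0.06,
    )
    print(result)
```

Passing the training, testing and data-generation routines as callables keeps the loop independent of any particular simulator or optimizer.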
3. Experimental Verification
In this section, the experimental verification is performed by modeling the coupled microstrip lines. The fine model is the coupled microstrip line structure built in Advanced Design System (ADS) simulation software and the coarse model is the model with empirical formulas and correction values. In this experiment, the line length
and coupled lines spacing
are used as geometric variables, while
is a frequency variable. In this paper, the DOE method is used to generate training and test data ranges for geometric and frequency variables, respectively. The data ranges are shown in
Table 1. The training and test data used for the proposed model are obtained by simulation in ADS software. To ensure that the test data are untrained, different starting and ending points are chosen for the training and test data with the same intervals.
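The idea of using the same interval with shifted start and end points, so that no test point coincides with a training point, can be illustrated for a single variable. This sketch shows only the offsetting idea; it does not reproduce the DOE sampling, the half-step offset is an assumption, and the numbers are hypothetical rather than taken from Table 1.

```python
def grid(start, stop, step):
    """Inclusive, evenly spaced points; the count is computed up front
    to avoid accumulating floating-point drift."""
    n = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(n)]


def make_train_test(train_start, train_stop, step, offset_fraction=0.5):
    """Training grid plus a test grid with the same interval, shifted
    inward so that test points lie strictly inside the training range."""
    train = grid(train_start, train_stop, step)
    shift = offset_fraction * step
    test = grid(train_start + shift, train_stop - shift, step)
    return train, test


if __name__ == "__main__":
    train, test = make_train_test(1.0, 5.0, 1.0)
    print(train)
    print(test)
```

Because the test points sit between training points, the test error measures interpolation inside the modeling range rather than recall of trained samples.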
During the training process, finding a suitable set of weights can effectively reduce the difference between the coarse model and the fine model. Therefore, it is essential to choose a suitable number of correction values. The choice of correction values depends on the influence of the intermediate parameters of the empirical formulas on the S-parameter results. The correction values as factors should be multiplied by the intermediate parameters, which significantly affect the S-parameters. When correction values are added to intermediate parameters that have less impact on the responses, it not only wastes computer resources but also results in slower modeling.
Table 2 shows the training and test errors for different numbers of correction values and hidden neurons for the proposed model. When the number of correction values is fixed, the number of hidden neurons is continuously varied to find the minimum number of hidden neurons, making the training error and test error satisfy the user-defined threshold. The training and test errors are compared at different correction values to find the most appropriate number of correction values. According to
Table 2, the result with only 15 hidden neurons and 12 correction values is much better than the result with 35 hidden neurons and 8 correction values. When the number of correction values is increased to 15, the result with 18 hidden neurons is not as good as the result with 12 correction values. The results show that too few correction values cannot capture the nonlinear relationship of the coupled microstrip lines, while too many correction values lead to high nonlinearity and a complex MNN structure, which lowers the accuracy.
At a fixed number of correction values of 12, the errors for different numbers of hidden neurons are shown in
Table 3. When the number of hidden neurons is 10, the training and test errors of the model are higher than those of the model with 15 hidden neurons. This indicates that the nonlinear relationship between the input and output of the proposed model cannot be accurately expressed when the number of hidden neurons is small. However, when the number of hidden neurons is increased to 20 or 25, the training errors decrease while the test errors increase significantly. Because the models in this case are in the overlearning state, the model needs to be retrained with fewer hidden neurons. It can be concluded that the best result of the proposed model is achieved when the number of correction values is 12 and the number of hidden neurons is 15.
The correction values at the three frequency points for the geometric variables
are shown in
Table 4. The results show that the correction values change with the frequency, making the coarse model responses at each frequency point fit the fine model responses well. Numerically, the correction values vary around 1, which proves that the choice of the correction values is appropriate. Small changes in the correction values can lead to large changes in response, thus ensuring the smoothness of the model output.
The proposed model is compared with two existing modeling approaches based on Neuro-SM. Model 1 adds the MNNs to the coarse model input, while model 2 adds the MNNs to both the coarse model input and output [32,38]. The proposed model achieves the lowest training and test errors, as shown in
Table 5. Comparing the results of 15 hidden neurons and 55 hidden neurons in model 1, it is found that increasing the number of hidden neurons did not reduce the training error, but significantly increased the test error. Model 1 is in the overlearning state. The error comparison of model 2 reveals that the accuracy of the model is not significantly improved only by increasing the number of hidden neurons in the input MNNs. By increasing the number of hidden neurons in the output MNNs, the training error of model 2 is reduced, but the test error fails to meet the requirement. This indicates that the amount of data used to train the model is not enough. Additional modules are needed to ensure model passivity due to the introduction of output MNNs. The above conclusions demonstrate that the proposed modeling method can accurately match the fine model while keeping the device passive.
The comparison of the
S-parameter responses among the fine model, the coarse model and the proposed model for the test data
is shown in
Figure 4. The fine model is shown as a red line, the coarse model as a magenta dashed line and the proposed model as a green down triangle. It can be seen that there is a certain gap between the coarse model and the fine model, while the
S-parameter responses of the proposed model and fine model are basically consistent. The matching results with the test data in the modeling range illustrate the feasibility of the improved empirical formula modeling method for coupled microstrip lines.
To verify the efficiency of the modeling method, computation time comparisons between the fine model in High-Frequency Structure Simulator (HFSS) software and the model built with the proposed method are shown in
Table 6. The proposed model is developed with 25 sets of EM data generated by the DOE method, and the modeling time is 19.7 min. The trained proposed model can be used instead of the EM model in circuit design, because the two models have the same characteristics in the modeling range. From
Table 6, it can be seen that the proposed model consumes less time than the fine model in the HFSS software when generating the same data. The more data that are needed, the more obvious the advantage of the proposed model. Applied to circuit design, the proposed model can significantly reduce the simulation time and thus shorten the device design cycle.