1. Introduction
Numerical modeling methods based on the finite element method (FEM) [1], finite difference method (FDM) [2], and finite volume method (FVM) [3] have made remarkable progress in solving physical problems. These methods are commonly used to discretize the partial differential equations (PDEs) that describe natural phenomena, relying on polynomials, piecewise polynomials, and other elementary functions for modeling. Although traditional methods are considered to provide efficient and reliable solutions, their applicability is limited as the dimensionality of the problem increases, a phenomenon known as the “curse of dimensionality” (CoD) [4]. The CoD means that computational operations and resource requirements grow exponentially as the problem dimension increases, limiting the scope of application of these methods. In addition, grid-based methods suffer from discretization errors: when the grid size is not small enough, they fail to capture the required resolution of the modeled system, leading to inaccurate results. These challenges make it more difficult to solve PDEs in high-dimensional problems. Further difficulties arise in physical problems with special properties, such as porous media flow coupled with chemical phase transitions [5], temperature field prediction coupled with the lattice Boltzmann method [6], and microscopic fluid flow [7]. Complex chemical phase transitions and physical variability between the macroscopic and microscopic levels can lead to a dramatic increase in the computational cost of traditional numerical modeling methods. Faced with these phase transition problems and the obstacles in solving PDEs, researchers are actively looking for potential alternatives.
The rapid development of machine learning in recent decades has introduced new opportunities in various fields, having a significant impact on science and engineering and opening the door to a shift in the solution paradigm. Machine learning offers brand-new possibilities for numerical modeling, prediction, and forward problem solving in the study of physics problems.
Deep learning is one of the most important branches of machine learning and has attracted considerable attention. Deep learning algorithms have been extremely effective in engineering and science, in fields such as computer vision, speech recognition, natural language processing, and assisted driving [8,9,10,11]. Based on the special structure of their networks, they can process large-scale data, extract high-level features, and achieve excellent performance and accuracy in tasks such as image classification, target detection, semantic segmentation, and machine translation [12,13,14,15]. Typical deep neural network (DNN) models include the multilayer perceptron (MLP) [16], the convolutional neural network (CNN) [17], and the recurrent neural network (RNN) [18].
Physics-informed neural networks (PINNs), a new class of deep learning algorithms, integrate physical law constraints with neural networks to solve PDEs. They provide new methods for solving PDEs beyond the traditional numerical modeling methods. PINNs predict the solutions of PDEs without constructing a detailed grid, thus overcoming the CoD problem experienced by traditional numerical modeling methods. Instead of using a grid for spatiotemporal stepping, PINNs collect points irregularly from the defined domain according to different sampling distributions [19]. The PDEs are embedded in PINNs in the form of a loss function, rather than as an algebraic matrix as in traditional numerical modeling methods. In PINNs, a gradient optimizer acts as a residual minimizer [20], which is quite different from the linear solver in traditional numerical modeling methods.
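To make this paradigm concrete, the sketch below shows how a soft-constraint PINN loss might be assembled in PyTorch (the back end used later in this work); the network `net`, the example diffusion residual, and the equal weighting of the two loss terms are illustrative assumptions rather than the exact implementation used here.

```python
import torch

def pde_residual(net, xt):
    """PDE residual at collocation points; u_t - 0.0001*u_xx is an example operator."""
    xt = xt.clone().requires_grad_(True)
    u = net(xt)
    grads = torch.autograd.grad(u, xt, torch.ones_like(u), create_graph=True)[0]
    u_x, u_t = grads[:, 0:1], grads[:, 1:2]
    u_xx = torch.autograd.grad(u_x, xt, torch.ones_like(u_x), create_graph=True)[0][:, 0:1]
    return u_t - 0.0001 * u_xx

def pinn_loss(net, xt_interior, xt_bc, u_bc):
    # The PDE enters as a residual term, not as an assembled algebraic matrix.
    loss_pde = pde_residual(net, xt_interior).pow(2).mean()
    # Boundary/initial conditions enter as soft penalty terms.
    loss_bc = (net(xt_bc) - u_bc).pow(2).mean()
    # A gradient optimizer (e.g., Adam) acts as the residual minimizer.
    return loss_pde + loss_bc
```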
The paradigm shift of PINNs provides a new approach to solving PDEs in practical physical applications. Raissi et al. [21,22] proposed PINNs and used them to solve several cases of PDEs; they applied PINNs to the Navier–Stokes (NS) equations to simulate the flow around a circular cylinder. The SRT-LBM-PINN model was proposed by Liu et al. [23], who combined PINNs with a single-relaxation-time lattice Boltzmann method to solve inverse problems in fluid mechanics, using PINNs to solve PDEs with unclear boundary and initial conditions. Lou et al. [24] solved forward and inverse problems in fluid dynamics by using PINNs in conjunction with the BGK equation. In addition, the performance of PINNs is extremely promising for applications in the engineering sciences. The stiff-PINN method developed by Ji et al. [25] utilizes the quasi-steady-state assumption (QSSA) to enable PINNs to solve stiff chemical kinetics, which opens up the possibility of applying PINNs to a wide variety of reaction–diffusion systems involving stiff kinetics. Huang et al. [26] applied PINNs to power systems. Zhong et al. [27] proposed two generalized artificial intelligence frameworks for low-temperature plasma simulations: the coefficient-subnetwork physics-informed neural network (CS-PINN) and the Runge–Kutta physics-informed neural network (RK-PINN). PINNs are also prominent in coastal storm surge prediction [28] and geophysics [29], where they provide functions including state analysis, parameter estimation, dynamic analysis, macroscopic physical quantity computation, anomaly detection and localization, and data synthesis for real-world problems by integrating specific physical information into state-of-the-art deep learning methods.
Although PINNs have been widely used in solving PDEs, they still face many challenges, such as poor interpretability, difficult parameter tuning, training difficulties, and unclear optimization strategies. Especially when dealing with higher-dimensional physical problems, PINNs consume a lot of time for training and debugging. For problems with nonunique solutions, the designs produced by traditional PINNs may not be smooth enough. The hard-constraint physics-informed neural network (HPINN) was proposed by Lu et al. [30] for solving topology optimization problems. Lu et al. found that, compared with traditional PDE-constrained optimization methods based on the adjoint method and numerical PDE solvers, HPINN achieves similar optimization objectives while producing smoother designs for problems with nonunique solutions.
HPINN takes advantage of recent advances in PINNs, which solve PDEs without the need for large-scale training datasets (generated by numerical PDE solvers), and imposes hard constraints using the penalty method and the augmented Lagrangian method (ALM). HPINN has advantages in terms of fast convergence, accuracy, and solution smoothness. However, the training and debugging of HPINN are still extremely challenging: HPINN abandons the flexible boundary and initial condition settings of traditional PINNs, making it extremely difficult to formulate an optimization strategy for HPINN.
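As a rough illustration of the two enforcement mechanisms named above, the fragment below sketches the penalty method versus the ALM for a scalar constraint residual; the variable names, values, and update schedule are assumptions for illustration only.

```python
# Sketch: enforcing a constraint c(theta) = 0 on top of an objective J(theta).
mu, lam = 10.0, 0.0  # penalty weight and Lagrange multiplier (illustrative values)

def penalty_loss(J, c):
    # Pure penalty method: the violation is squared and weighted.
    return J + mu * c ** 2

def augmented_lagrangian_loss(J, c):
    # The ALM adds a linear multiplier term to the quadratic penalty.
    return J + lam * c + 0.5 * mu * c ** 2

# After each outer optimization round, the ALM multiplier is updated:
#   lam <- lam + mu * c(theta), optionally increasing mu as well.
```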
In order to improve the performance of HPINN and develop an optimization strategy for it, we propose a hard-constraint wide-body PINN model (HWPINN). HWPINN introduces a wider neural network structure, which enhances the approximation ability of the network. By combining this with the highly constrained nature of HPINN, HWPINN further improves the accuracy of PINNs in solving PDEs while reducing the time and number of iterations required for training. This model provides a new idea for optimizing HPINNs from the point of view of changing the network model, offering the possibility of solving forward PDE problems with higher accuracy and speed.
In this manuscript, we focus on three PDE cases: the 1D Allen–Cahn (AC) equation, the 1D Burgers equation, and the 2D wave equation. The 1D AC equation and the 1D Burgers equation are common PDEs in phase transition dynamics and computational fluid dynamics, and the code for their TensorFlow versions is provided in reference [22]. To compare the performance of HWPINN across dimensions, the 2D wave equation is also provided in this paper. HWPINN is able to provide higher PDE solution accuracy and faster training than the traditional soft-constraint PINN. A series of numerical experiments was set up for the PDE cases of interest to compare the performance of HWPINN with that of the soft-constraint PINN (SPINN) and HPINN, with appropriate test hyperparameters applied in each experiment. It is worth mentioning that a soft-constraint PINN can also benefit from the wide-body structure while retaining its flexibly adjustable constraints (residual function); therefore, the soft-constraint wide-body PINN (SWPINN) was also used as a comparison in this study.
The rest of the manuscript is organized as follows:
Section 2 describes the experimental methodology and the design approach for PDEs combined with PINNs, as well as the network structure.
Section 3 shows the experimental results obtained with the four models, SPINN, HPINN, SWPINN, and HWPINN, to solve the three PDEs cases of interest.
Section 4 presents our conclusions and future work.
3. Results
In the results presented in this section, for computational convenience, all physical parameters were nondimensionalized, ensuring consistency in the flow and heat transfer criteria before and after nondimensionalization. In the experimental part, all the models were trained using the same convergence conditions, and training was stopped at 10,000 iterations. All modeling experiments were performed on machines equipped with NVIDIA TITAN X GPUs and Windows operating systems in Shanghai, China. The back end of all PINN designs was implemented using PyTorch; the code for the 1D AC equation and the 1D Burgers equation was converted from the TensorFlow code provided in reference [22].
3.1. One-Dimensional Allen–Cahn Equation
The Allen–Cahn equation, a prominent mathematical model in the realms of materials science and physics, captures the evolution of phase boundaries in various materials [39]. This PDE delineates the gradual transition between different phases within a material. The 1D AC equation is commonly used to study phase transition problems in linear structures, such as the movement of interfaces and the propagation of phase transitions [40].
The 1D AC equation case in this subsection is set up as follows:

$$u_t - 0.0001 u_{xx} + 5u^3 - 5u = 0, \quad x \in [-1, 1], \ t \in [0, 1] \tag{11}$$

The boundary conditions and initial conditions are set as follows:

$$u(x, 0) = x^2 \cos(\pi x), \quad u(-1, t) = u(1, t), \quad u_x(-1, t) = u_x(1, t)$$

The diffusion coefficient is set to 0.0001, where $x \in [-1, 1]$ and $t \in [0, 1]$. The 1D AC equation case is a PDE with periodic boundary conditions.
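The residual that the physics-informed part evaluates for this case can be computed with automatic differentiation; the following PyTorch sketch, written against the benchmark form of the equation given above, assumes a generic network `net` mapping $(x, t)$ to $u$.

```python
import torch

def allen_cahn_residual(net, x, t):
    """Residual of u_t - 0.0001*u_xx + 5u^3 - 5u = 0 via automatic differentiation."""
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    u_t = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    return u_t - 0.0001 * u_xx + 5.0 * u ** 3 - 5.0 * u
```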
Hard constraints are implemented by adding a hard-constraint function to the physics-informed part of the network architecture. The function is constructed from the boundary and initial conditions; $\hat{u}$ denotes the output of the hard-constraint function, and $u_{NN}$ is the output of the approximator neural network (Figure 1). The hard-constraint function mandatorily constrains the output of the neural network in terms of the boundary and initial conditions, making it smoother and closer to the exact solution of the PDE.
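Since the exact hard-constraint function is given by the original equation, which is not reproduced here, the sketch below uses a commonly adopted ansatz for this benchmark as a stand-in: it returns the initial condition exactly at $t = 0$ and scales the network correction by $t$. The specific form is an assumption for illustration, not necessarily the function used in this work.

```python
import math
import torch

def hard_constrained_output(net, x, t):
    """Hypothetical hard-constraint ansatz for the 1D AC case: at t = 0 the
    output reduces to the initial condition x^2*cos(pi*x) by construction,
    so the initial condition is satisfied exactly rather than penalized."""
    u_nn = net(torch.cat([x, t], dim=1))  # approximator network output
    return x ** 2 * torch.cos(math.pi * x) + t * u_nn
```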
In the case of the 1D AC equation, we designed a series of experiments to verify the performance of the HWPINN proposed in this paper. There were four neural network architectures used in the numerical experiments: HWPINN, hard-constraint PINN (HPINN), traditional PINN with soft constraint (SPINN), and a wide-body soft-constraint PINN (SWPINN).
As shown in Table 1, for the parameter settings of the neural network part, both HWPINN and SWPINN used the wide-body neural network architecture: the approximator neural network part was set to 8 hidden layers of 32 neurons, and the wide-body network part was likewise set to 8 layers of 32 neurons. For HPINN and SPINN, the network was set to 8 hidden layers of 64 neurons, and the maximum number of training iterations was set to 10,000. To verify that the wide-body model does more than simply widen the network, the total number of neurons per hidden layer was kept at 64 for all four PINN models.
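A minimal sketch of the wide-body idea described above is given below: two parallel 8-layer, 32-neuron branches whose total hidden width (32 + 32 = 64) matches the single 8x64 baseline. How the two branches are fused in the original model is not spelled out here, so the summation of their outputs is an assumption.

```python
import torch
import torch.nn as nn

def make_mlp(width, depth, in_dim=2, out_dim=1):
    """Plain fully connected network with tanh activations."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.Tanh()]
        d = width
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

class WideBodyNet(nn.Module):
    """Two parallel 8x32 branches; fusing by summation is an illustrative choice."""
    def __init__(self):
        super().__init__()
        self.body = make_mlp(32, 8)  # approximator neural network part
        self.wide = make_mlp(32, 8)  # wide-body network part

    def forward(self, xt):
        return self.body(xt) + self.wide(xt)
```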
Benefiting from the properties of PINNs in solving forward PDE problems, no specific dataset is required for training; a PINN randomly collects discrete training points within the space–time computational domain as training data. In this case, the density of the collection points in the domain was deliberately set very low, and the density of the collection points on the boundary was set similarly low. These collection points contain only the spatial coordinate information $x$ and the discrete temporal information $t$.
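Collocation sampling of this kind can be as simple as the following sketch over the AC domain $x \in [-1, 1]$, $t \in [0, 1]$; the point counts are placeholders, since the densities used in the experiments are specific to this study.

```python
import torch

def sample_collocation(n_interior=256, n_boundary=64):  # placeholder counts
    """Uniform random collocation points for x in [-1, 1], t in [0, 1]."""
    x = 2.0 * torch.rand(n_interior, 1) - 1.0   # spatial coordinates
    t = torch.rand(n_interior, 1)               # discrete temporal information
    t_b = torch.rand(n_boundary, 1)             # boundary-point times
    x_lo = -torch.ones(n_boundary, 1)           # x = -1 boundary
    x_hi = torch.ones(n_boundary, 1)            # x = +1 boundary
    return (x, t), (x_lo, t_b), (x_hi, t_b)
```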
From the training loss curves shown in Figure 3, it can be concluded that the HWPINN and HPINN models converge quickly under the set conditions; due to the very low density of the collection points, the training loss curves of the SWPINN and SPINN models show overfitting, which would need to be addressed by changing the training parameters. A comparison of the training loss curves shows that HWPINN can extract abstract features from the solution space of PDEs faster than the other models and is more effective in solving high-dimensional problems [30].
Figure 4 illustrates the FDM solution of the 1D AC equation (Equation (11)). The results predicted by the four neural network architectures are presented in Figure 5, which shows that HPINN and HWPINN outperform SPINN and SWPINN for a small number of iterations (maximum number of iterations = 10,000).
For a more intuitive comparison of the effects of HPINN, HWPINN, SPINN, and SWPINN, a visualization of the absolute error is provided in Figure 6. The absolute error $E_{\mathrm{abs}}$ is defined as follows:

$$E_{\mathrm{abs}} = \left| u_{\mathrm{FDM}} - u_{\mathrm{PINN}} \right|$$

where $u_{\mathrm{FDM}}$ denotes the results of the FDM, and $u_{\mathrm{PINN}}$ represents the results predicted using the PINN models.
Figure 6 visualizes the absolute errors of the four models' predictions with respect to the FDM results, demonstrating more intuitively the advantages of HWPINN over SWPINN, SPINN, and HPINN, especially given the low collection point density and the small number of iterations set in the experiments. Figure 6a,b present the absolute errors of HPINN and HWPINN; their comparison clearly shows the strength of HWPINN. Figure 6c,d show that the PINNs with soft constraints performed badly at the very low collection point density and number of iterations designed in this experiment. The loss functions of all four architectures converged when the maximum number of iterations (10,000) was reached.
Table 2 lists the mean square error (MSE), root mean square error (RMSE), R-squared (R²), and relative $L_2$ error of the prediction results of the four models: HPINN, HWPINN, SPINN, and SWPINN. The relative $L_2$ error is defined as follows:

$$\mathrm{Error}_{L_2} = \frac{\sqrt{\sum_{i=1}^{N} \left| u_{\mathrm{PINN}}(x_i, t_i) - u_{\mathrm{FDM}}(x_i, t_i) \right|^2}}{\sqrt{\sum_{i=1}^{N} \left| u_{\mathrm{FDM}}(x_i, t_i) \right|^2}}$$

where $N$ represents the number of predicted points, $u_{\mathrm{PINN}}$ denotes the results predicted by the neural networks, and $u_{\mathrm{FDM}}$ represents the results of the FDM.
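For reference, the four metrics reported in Table 2 can be computed from a prediction and the FDM reference as in the following sketch; the function name and tensor shapes are illustrative assumptions.

```python
import torch

def error_metrics(u_pinn, u_fdm):
    """MSE, RMSE, R-squared, and relative L2 error against the FDM reference."""
    mse = torch.mean((u_pinn - u_fdm) ** 2)
    rmse = torch.sqrt(mse)
    ss_res = torch.sum((u_fdm - u_pinn) ** 2)
    ss_tot = torch.sum((u_fdm - u_fdm.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rel_l2 = torch.linalg.norm(u_pinn - u_fdm) / torch.linalg.norm(u_fdm)
    return mse, rmse, r2, rel_l2
```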
By analyzing the data in Table 2, we conclude that HWPINN performed significantly better than the other PINN architectures under the experimental conditions we set. Owing to their architectural advantages, the computation times of HWPINN and HPINN were better than those of SPINN and SWPINN. HPINN already has many advantages in solving the 1D AC equation, and HWPINN complements the HPINN model, converging faster and giving slightly better predictions than HPINN when working with sparse training data.
3.2. One-Dimensional Burgers Equation
The 1D Burgers equation is typically used to study phenomena in fluid dynamics, nonlinear wave theory, and phase transition dynamics. This equation describes nonlinear wave behavior in an inviscid, incompressible fluid. Although its mathematical form is relatively simple, its analytical solution is often difficult to obtain due to the nonlinear terms, leading researchers to focus on numerical simulations, approximation methods, and mathematical analysis.
The numerical experimental case set up for the 1D Burgers equation is as follows:

$$u_t + u u_x - \nu u_{xx} = 0, \quad u(x, 0) = -\sin(\pi x), \quad u(-1, t) = u(1, t) = 0 \tag{16}$$

Equation (16) denotes the 1D Burgers equation with the coefficient of viscosity $\nu = 0.01/\pi$, together with the boundary and initial condition settings. Equation (16) can be obtained by rearranging the terms of Equation (2). The range is set to $x \in [-1, 1]$, $t \in [0, 1]$.
The hard-constraint function of the 1D Burgers equation is shown in Equation (17). The hard constraint is implemented in the same way as in the case of the AC equation, imposed according to the initial and boundary conditions; $u_{NN}$ is the output of the approximator neural network, and $\hat{u}$ is the output of the neural network part of the PINN with hard constraints imposed using the penalty method and the ALM.
The input of the neural network is $(x, t)$, and the output is $u(x, t)$. HWPINN and SWPINN used the same neural network parameter settings: the approximator neural network part was set to 8 hidden layers of 32 neurons, and the wide-body network part was set to 8 layers of 32 neurons. For HPINN and SPINN, the network was set to 8 hidden layers of 64 neurons, and the maximum number of iterations was set to 10,000. The density of the collection points was again kept sparse, with a similar density on the boundary; these collection points contained only the spatial coordinate information $x$ and the discrete temporal information $t$.
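Under the benchmark form of Equation (16) given above, the Burgers residual with a hard-constrained output can be sketched as follows; the specific ansatz $-\sin(\pi x) + t(1 - x^2)\,u_{NN}$, which satisfies the initial and boundary conditions by construction, is an illustrative assumption rather than the paper's exact Equation (17).

```python
import math
import torch

NU = 0.01 / math.pi  # viscosity coefficient of the benchmark setup

def burgers_residual(net, x, t):
    """Residual of u_t + u*u_x - nu*u_xx = 0 with a hypothetical hard constraint."""
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u_nn = net(torch.cat([x, t], dim=1))
    # Hard-constrained output: equals -sin(pi*x) at t = 0 and 0 at x = -1, 1.
    u = -torch.sin(math.pi * x) + t * (1.0 - x ** 2) * u_nn
    u_t = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    return u_t + u * u_x - NU * u_xx
```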
Figure 7 illustrates the training loss curves for HWPINN, HPINN, SPINN, and SWPINN. It can be inferred that the PINN models incorporating wide-body structures (HWPINN and SWPINN) behaved more smoothly in terms of the training loss curves. It is worth mentioning that the residual function was discarded when hard constraints were used, so the loss values do not match those of the traditional soft-constraint PINNs (SPINN and SWPINN) in scale; the results presented in Figure 7 therefore serve only as a reference for the smoothness of training. The accuracy of the prediction results was compared in terms of percentage error.
Figure 8 shows the FDM solution of the 1D Burgers equation set up in the numerical experiments described in this subsection.
Figure 9a–d display the results of the 1D Burgers equation predicted using HPINN, HWPINN, SPINN, and SWPINN; a comparison reveals that HWPINN, HPINN, and SWPINN all perform relatively well. SPINN performs poorly because very few sampling points were used for training, so the network, constrained only by the soft residual function, fails to capture the features of the solution space of the 1D Burgers equation. In order to demonstrate the prediction accuracy of the four models more intuitively, Figure 10 provides a visualization of the absolute error.
In the absolute error distribution plot of HWPINN shown in Figure 10b, the maximum absolute error of HWPINN does not exceed 0.4, which shows the superior advantage of the model proposed in this paper in converging quickly to high accuracy with a sparse collection point density. It is worth noting that, as the collection point density increases, the SWPINN model with the added wide-body structure is also able to capture the abstract features of the solution of the 1D Burgers equation with relatively low-density collection points. This confirms that the inclusion of the wide-body structure enhances the performance of the neural network in the PINN architecture.
The error evaluations of HWPINN, HPINN, SPINN, and SWPINN are presented in Table 3, where it can be observed that HWPINN has a significant advantage over the other models in solving the 1D Burgers equation for this experimental setup. The SWPINN model also obtained reasonably accurate predictions due to the increase in the number of collection points, which demonstrates the enhancement the wide-body structure brings to the approximator neural network in the PINN architecture. In terms of computational time, HWPINN retains an advantage over the traditional PINN.
3.3. Two-Dimensional Wave Equation
In this subsection of numerical experiments, we extended the PDE case from 1D to 2D, with the aim of validating the performance of the proposed HWPINN model on 2D problems. Extensibility across dimensions is one of the advantages provided by the PINN architecture in combination with strict physical laws.
The wave equation finds diverse applications in oceanography, including simulating waves, tides, and tsunamis; analyzing the responses of marine structures and ships to waves; and studying underwater acoustics, providing essential insights for marine engineering, safety, and environmental monitoring [41].
The 1D wave equation is a mathematical model describing waves propagating along a straight line, and the 2D wave equation is its generalization to two dimensions, introducing a second spatial variable $y$. For the definition of the 2D wave equation, see Equation (3). In the experimental setup for the 2D wave equation, $(x, y, t)$ is the input, $u(x, y, t)$ is the output of the neural network, and the wave velocity is $c = 1$. The physical setting can be understood as applying a peaked initial velocity to a membrane in a 2D region at moment $t = 0$. The membrane vibrates back and forth in time and, at the same time, the resulting wave propagates in all directions. The initialization of the 2D wave equation is shown in Figure 11. The 2D wave field is formed in a two-dimensional plane, and the solution to the 2D wave equation is a function of the wave field.
The implementation of the hard constraint follows the same pattern as in the 1D cases: the hard-constraint function is constructed from the initial and boundary conditions, $\hat{u}$ is the output constrained by the hard-constraint function, and $u_{NN}$ represents the output of the approximator neural network.
As for the network parameters, HWPINN and SWPINN used the same neural network settings: the approximator neural network part was set to 8 hidden layers of 32 neurons, and the wide-body network part was set to 8 layers of 32 neurons. For HPINN and SPINN, the network was set to 8 hidden layers of 64 neurons, and the maximum number of iterations was set to 10,000. The density of the collection points in the domain and on the boundary was set similarly to the previous cases; these collection points contained only the spatial coordinate information $(x, y)$ and the discrete temporal information $t$.
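Extending the residual computation from 1D to 2D only adds derivatives in the extra spatial variable, as the sketch below shows for the wave operator $u_{tt} - c^2(u_{xx} + u_{yy})$; the network `net` mapping $(x, y, t)$ to $u$ is an assumed interface.

```python
import torch

def wave_residual(net, x, y, t, c=1.0):
    """Residual of the 2D wave equation u_tt - c^2*(u_xx + u_yy) = 0."""
    x = x.clone().requires_grad_(True)
    y = y.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = net(torch.cat([x, y, t], dim=1))
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, ones, create_graph=True)[0]
    u_tt = torch.autograd.grad(u_t, t, ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, ones, create_graph=True)[0]
    u_y = torch.autograd.grad(u, y, ones, create_graph=True)[0]
    u_yy = torch.autograd.grad(u_y, y, ones, create_graph=True)[0]
    return u_tt - c ** 2 * (u_xx + u_yy)
```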
As shown in Figure 12, the convergence intervals of the training loss curves of HWPINN, HPINN, SPINN, and SWPINN are very close to each other in the case of the 2D wave equation, and the training curves of HWPINN and HPINN are noticeably smoother. It can be concluded that the network structure of HWPINN is more numerically stable than those of SPINN, SWPINN, and HPINN. It is worth noting that, since HWPINN adds physical constraints in a different way than traditional residual-based PINNs, its loss function differs in scale, which explains the different value ranges of the loss curves. The loss curves reveal the fast convergence of HWPINN.
Figure 13 illustrates the FDM results for the 2D wave equation, presented as three time slices in the computational time domain. Figure 14 shows the predictions of HWPINN, HPINN, SPINN, and SWPINN at the same three time slices. The absolute errors of the prediction results for these four models are shown in Figure 15 to give a more visual indication of the performance when solving the 2D wave equation. The comparison shows that HWPINN has the best performance in terms of absolute error, which does not exceed 0.02 over the entire time space of the computational domain. SPINN has the worst absolute error, which demonstrates the improvement in prediction stability provided by HWPINN with respect to the conventional PINN.
Figure 15a–c show the absolute errors of the HPINN prediction results, and Figure 15d–f show the absolute errors of the HWPINN prediction results. We conclude from this comparison that HWPINN provides more accurate predictions and improved stability relative to HPINN.
The error evaluation is provided in Table 4. The data in Table 4 show that HWPINN performs best and SPINN performs worst under the set conditions, reflecting the fact that SPINN is difficult to tune even though it can flexibly adapt to various boundary conditions. The comparison of the relative $L_2$ errors of HWPINN and HPINN demonstrates that HWPINN is a more accurate predictor than HPINN, improving the accuracy by a factor of about three. Changing from 1D PDEs to 2D PDEs significantly increases the complexity of the computation and the computation time. In particular, in this case, HWPINN and HPINN have a significant advantage in computation time over the conventional PINN due to the properties of the hard constraints.
4. Conclusions
In this paper, we proposed the HWPINN neural network architecture based on a wide-body structure and a hard-constraint method, and we set up a series of numerical experiments solving PDEs with PINNs. The numerical experiments were designed with sparser PINN collection point densities in conjunction with defined boundary conditions, and the results show that HWPINN converges faster and makes more accurate predictions than SPINN.
HWPINN is better than the soft-constraint PINN architecture in terms of the accuracy of the results and the speed of convergence. However, HWPINN still has disadvantages because it needs to be modeled with hard-constraint functions, which are inflexible and difficult to debug. Additionally, HWPINN does not perform consistently when dealing with some flexible boundary conditions. For example, when solving problems with Neumann boundaries, a residual function still needs to be constructed so that HWPINN can fit the constrained part of the process using traditional PINN techniques to increase its stability. In solving real physical problems, HWPINN needs a more refined design when facing complex boundary conditions. The flexibility and powerful approximation ability of the traditional PINN enable it to handle all kinds of PDEs, whether homogeneous or nonhomogeneous, linear or nonlinear, coupled systems or single equations; HWPINN trades away the flexibility of the PINN in boundary conditions to make the approximation smoother, converge more rapidly, be convenient to debug, and provide better overall performance. When solving higher-dimensional problems, such as porous media flow, phase transition dynamics, and supercritical airfoil optimization, the PINN is often difficult to tune and expensive to compute; the advantages of HWPINN, including ease of computation and smoothness of the fit, then allow the modeling to proceed normally.
For the PDE-solving problems with which we were concerned, we set up several cases: the 1D AC equation, the 1D Burgers equation, and the 2D wave equation. The experimental results for PDEs of different dimensions showed that HWPINN has better numerical stability; due to the more efficient network structure design, HWPINN can also converge to a suitable solution when the number of training iterations is small. The traditional soft-constraint PINN (SPINN) and the SWPINN with the added wide-body structure did not perform advantageously under the experimental conditions we set. The reason is that, although a PINN with soft constraints can adapt more flexibly to different problems, it requires more training iterations and more complex optimization strategies due to the complexity and instability of its constraints.
In future work, we envision utilizing HWPINN to solve a wider range of PDE problems, with a special emphasis on both forward and inverse PDE problems. In addition, we are keen to explore the synergies between HWPINN and physical models, with the intention of adopting this combined approach to efficiently solve PDEs in real-world problems. The potential applications in phase transition dynamics and CFD offer exciting prospects for pushing the limits of HWPINN's capabilities and refining its performance in a variety of complex physical scenarios. As our research progresses, our focus will extend beyond the traditional PDE cases explored in the current experiments, and we will explore ways to improve the stability of HWPINN in solving real-world physics problems. We anticipate that incorporating HWPINN into phase transition dynamics and CFD will not only enhance its versatility and provide solutions for high-dimensional PDEs but also deliver groundbreaking solutions to real-world problems involving complex physical processes.