GPU Accelerating Algorithms for Three-Layered Heat Conduction Simulations
Abstract
1. Introduction
1.1. Application Context
1.2. Core Scientific Question
How can GPU-accelerated solvers improve the computational efficiency and accuracy of heat conduction simulations in multi-layered solids, particularly when finite difference methods generate large, sparse linear systems? And which solver strategy, direct or iterative, better manages the stability and condition number challenges inherent in these models?
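To make the condition number concern concrete, the following minimal sketch uses a stand-in matrix, the standard second-difference matrix tridiag(-1, 2, -1). This matrix is an assumption chosen for illustration and is not the system matrix of the scheme studied here; the function names are hypothetical. The sketch shows that the 2-norm condition number of this matrix grows roughly with the square of the number of unknowns as the mesh is refined.

```python
import numpy as np

def second_difference_matrix(n):
    """Dense n x n matrix tridiag(-1, 2, -1) (illustrative stand-in only)."""
    return (2.0 * np.eye(n)
            - np.diag(np.ones(n - 1), k=1)
            - np.diag(np.ones(n - 1), k=-1))

def condition_numbers(sizes):
    """2-norm condition number of the stand-in matrix for each size."""
    return [float(np.linalg.cond(second_difference_matrix(n))) for n in sizes]

sizes = [10, 20, 40, 80]
conds = condition_numbers(sizes)
for n, c in zip(sizes, conds):
    # Closed form for this particular matrix: cot^2(pi / (2 (n + 1)))
    closed_form = 1.0 / np.tan(np.pi / (2 * (n + 1))) ** 2
    print(f"n = {n:3d}   cond = {c:10.2f}   closed form = {closed_form:10.2f}")
```

Doubling the number of unknowns multiplies the condition number of this stand-in matrix by close to four, which is one reason solver choice and floating-point precision can matter on fine meshes.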
1.3. Numerical Modelling
1.4. Computational Simulation
1.5. Main Contributions
2. Mathematical Model and Finite Difference Scheme
2.1. Heat Conduction Mathematical Model
2.2. Finite Difference Method to Approximate (9)–(12)
3. The GPU Accelerating Algorithm
- Steps 1 and 2: CPU Pre-Computation. Steps 1 and 2 are computed on the CPU (host) because their computational cost is minimal. They define the input data and discretize space and time. The resulting values are then stored in GPU (device) memory for further processing.
- Steps 3 and 4: GPU Pre-Computation. Steps 3 and 4 are executed on the GPU using a set of CUDA kernels invoked with PyCUDA. This stage allocates memory on the GPU and runs the CUDA kernels that perform the computations. Because the data stay on the device, the linear solver can access them directly, which reduces the overhead of transferring data between the CPU and GPU. At the end of this stage, data that are no longer needed are deallocated; only the matrices and arrays required to solve the linear system and compute the solution vector u are retained.
- Step 5: Compute the evolution of the system with GPU. In Step 5, a CPU function calls GPU functions implemented with CuPy to solve the linear systems with various numerical methods. Although the individual operations are GPU-accelerated, the time stepping itself is sequential: in each iteration, independent kernels are launched one after another to compute the right-hand side of the equation and solve the linear system, and the next iteration can begin only after iteration n has completed.
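The five steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme of this paper: it swaps in a backward Euler discretization of the one-dimensional heat equation with homogeneous Dirichlet boundaries, and the function name `run_pipeline` and its parameters are hypothetical. It uses NumPy on the host; the array calls used here (`linspace`, `eye`, `diag`, `linalg.solve`) are expected to have CuPy counterparts, so a device version would plausibly look similar.

```python
import numpy as np

def run_pipeline(len_x=51, N=20, dt=1e-3):
    # Steps 1-2 (host): input data and discretization of space and time
    x = np.linspace(0.0, 1.0, len_x)
    dx = x[1] - x[0]

    # Step 3: evaluate functions on the mesh (initial temperature, zero source)
    u = np.sin(np.pi * x[1:-1])          # interior nodes only
    source = np.zeros_like(u)

    # Step 4: assemble the matrix of the stand-in backward Euler scheme
    n = len_x - 2
    K = (2.0 * np.eye(n)
         - np.diag(np.ones(n - 1), k=1)
         - np.diag(np.ones(n - 1), k=-1))
    A = np.eye(n) + (dt / dx**2) * K

    # Step 5: sequential time loop; each step needs the previous result
    for _ in range(N):
        rhs = u + dt * source            # right-hand side for this step
        u = np.linalg.solve(A, rhs)      # one linear solve per time step
    return x, u

x, u = run_pipeline()
print("peak temperature after", 20, "steps:", u.max())
```

The loop in Step 5 cannot be parallelized across time steps, so any speedup has to come from the work done inside each step, which matches the description of sequential kernel calls above.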
Algorithm 1 Sequential Compute
Algorithm 2 Parallel Compute
Algorithm 3 Evolution of the system
4. Simulation Performance Using the GPU Accelerating Algorithm
4.1. Experimental Setup
- Hardware Configuration. Operating System: Ubuntu 22.04.3 LTS; Processor: 13th Gen Intel i5-13420H (12 cores); Memory: 16 GB DDR5 RAM; GPU: NVIDIA RTX 4050 (60 W); GPU Memory: 6 GB GDDR6.
- Software Configuration. The computations were accelerated using CUDA technology, and the following software tools and libraries were utilized: CUDA Compilation Tools: V12.3.107, PyCUDA Version: 2024.1, CuPy Version: 13.0.0, and Precision: float64.
4.2. Physical and Mesh Parameters
4.3. Selection of Linear Solver Method
4.4. Accelerating with GPU: PyCUDA Acceleration for Steps 3 and 4
4.5. Accelerating with GPU: Evolution of the System with CuPy
5. Discussion on Findings, Implications and Future Work
5.1. Findings
5.1.1. Efficient GPU Usage
5.1.2. Acceleration Factors
5.1.3. Accuracy and Stability
5.2. Implications
5.2.1. Real-Time Simulations in Thermal Modeling
5.2.2. Solver Selection for Systems with High Condition Numbers
5.2.3. Precision Considerations
5.2.4. Framework for GPU-Optimized Finite Difference Methods
5.3. Future Work
5.3.1. High Condition Numbers
5.3.2. Multidimensional Geometries
5.3.3. Validation by Parameter Identification
5.3.4. Outlook on Scalability
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Short Supplement on GPU Computing
Appendix A.1. Computing—Solver Level
- d_Efe = gpu.gpu_compute_efe(d_x, d_t, d_L, len_t, len_x)
- d_Nsource = gpu.gpu_compute_Nsource(Dt, d_Dx, ..., d_Efe, ..., N, len_x)
- d_Efe.free()
- cupy_lu_solver(d_u0, d_Nsource, d_hA, d_hB, d_A, d_B, d_C, N, len_x)
- u, residuals = switch_solver(s, d_u0, d_Nsource, ..., N, len_x)
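The solver-level calls above select a method and return a solution together with residuals. The following minimal sketch illustrates that pattern for the two direct methods compared in the speedup tables, LU and QR. It is not the implementation of `switch_solver` or `cupy_lu_solver`: the function name `solve_with_method`, its arguments, and the test matrix are hypothetical, and NumPy is used on the host (`linalg.solve` and `linalg.qr` are expected to exist in CuPy as well).

```python
import numpy as np

def solve_with_method(method, A, b):
    """Solve A x = b with the chosen direct method; return x and the residual norm."""
    if method == "lu":
        # numpy.linalg.solve relies on an LU factorization with partial pivoting
        x = np.linalg.solve(A, b)
    elif method == "qr":
        # A = Q R, so the system reduces to R x = Q^T b.
        # A triangular solver would exploit the structure of R;
        # a general solve keeps this sketch short.
        Q, R = np.linalg.qr(A)
        x = np.linalg.solve(R, Q.T @ b)
    else:
        raise ValueError(f"unknown method: {method!r}")
    residual = float(np.linalg.norm(A @ x - b))
    return x, residual

# Hypothetical, well-conditioned test system
rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)

x_lu, r_lu = solve_with_method("lu", A, b)
x_qr, r_qr = solve_with_method("qr", A, b)
print("LU residual:", r_lu, " QR residual:", r_qr)
```

Returning the residual alongside the solution makes it possible to compare accuracy across methods at each time step without recomputing it elsewhere.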
Appendix A.2. Computing—Matrix Level
- def gpu_compute_efe(d_x, d_t, d_L, len_t, len_x):
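The body of `gpu_compute_efe` is not reproduced here, so the following is not its implementation. It is a minimal sketch of what evaluating a function on the space-time mesh (Step 3) can look like for a three-layered solid, assuming a hypothetical source term with one amplitude per layer and an exponential decay in time. The function name, the interface positions at 1/3 and 2/3, and the amplitudes are all assumptions. NumPy broadcasting is used on the host; the same array expressions are expected to carry over to CuPy.

```python
import numpy as np

def evaluate_source_on_mesh(x, t, interfaces, amplitudes):
    """Evaluate a hypothetical layer-wise source f(x, t) on the whole mesh.

    Returns an array of shape (len_t, len_x).
    """
    # Layer index of each spatial node: 0, 1 or 2 for three layers
    layer = np.searchsorted(np.asarray(interfaces), x, side="right")
    per_node = np.asarray(amplitudes, dtype=float)[layer]   # shape (len_x,)
    # Broadcasting (len_t, 1) against (1, len_x) fills the mesh in one expression
    return np.exp(-t)[:, None] * per_node[None, :]

x = np.linspace(0.0, 1.0, 6)     # spatial nodes
t = np.linspace(0.0, 1.0, 4)     # time levels
F = evaluate_source_on_mesh(x, t, [1 / 3, 2 / 3], [1.0, 2.0, 3.0])
print(F.shape)
```

Evaluating the function for all nodes and time levels in a single array expression is the kind of data-parallel work that maps naturally onto one GPU thread per mesh point.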
Notation | Definition |
---|---|
Geometrical | |
width of ℓth-layer | |
left boundary | |
interface 1 | |
interface 2 | |
right boundary | |
interval denoting the ℓth-layer | |
space domain | |
time domain | |
space–time domain | |
Physical | |
the heat capacitance of ℓth-layer | |
heat flux phase lags of ℓth-layer | |
temperature gradient phase lags of ℓth-layer | |
thermal conductivity of ℓth-layer | |
some proportionality constants | |
Knudsen numbers | |
heat source function of ℓth-layer | |
initial distribution of the temperature | |
initial distribution of the temporal derivative of temperature | |
temperature flux at the left boundary of the solid | |
temperature flux at the right boundary of the solid |
under-diagonal | diagonal | upper-diagonal | ||
Step | Step Description | CPU/GPU Usage |
---|---|---|
1 | Definition of Input Data | CPU Pre-Computation |
2 | Discretization of Space and Time | CPU Pre-Computation |
3 | Evaluation of Functions on the Mesh | GPU Pre-Computation |
4 | Calculation of Matrices and Vectors | GPU Pre-Computation |
5 | Discretization of Equations | Compute evolution of the system with GPU |
Layer 1 | Layer 2 | Layer 3 | |
---|---|---|---|
1/3 | 1/3 | 1/3 | |
1 | 1 | 1 | |
1 | 1 | 1 | |
1 | 4 | 4/3 | |
4 | 1 | 6 |
N | 100 | 200 | 500 | 1000 | 2000 | 5000 | 10,000 | 100,000 | |
---|---|---|---|---|---|---|---|---|---|
M3 | |||||||||
96 | A | ||||||||
192 | A | ||||||||
384 | A | ||||||||
768 | A | ||||||||
1536 | A | ||||||||
3072 | A | ||||||||
6144 | A | ||||||||
Speedup in Step 3 | Speedup in Step 4 | |
---|---|---|
96 | ||
192 | ||
384 | ||
768 | ||
1536 | ||
3072 | ||
6144 |
LU Speedup | QR Speedup | |
---|---|---|
96 | ||
192 | ||
384 | ||
768 | ||
1536 | ||
3072 | ||
6144 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Murúa, N.; Coronel, A.; Tello, A.; Berres, S.; Huancas, F. GPU Accelerating Algorithms for Three-Layered Heat Conduction Simulations. Mathematics 2024, 12, 3503. https://doi.org/10.3390/math12223503