Enabling Parallel Performance and Portability of Solid Mechanics Simulations Across CPU and GPU Architectures
Abstract
1. Introduction
2. Solid Mechanics and Strategies for Acceleration
3. Modernizing Fortran Solid Mechanics Codes
3.1. Naming Conventions in MATAR
3.2. MATAR Implementation Examples
Listing 1. MATAR programming syntax compared with Fortran syntax for matrix allocation and addition. The C++ syntax with MATAR resembles Fortran. In (a), three 2D matrices are allocated on the device, which is a multi-core CPU or a GPU depending on the Kokkos backend used. The contents of the DO_ALL loop, shown in (b), are executed in parallel on the device. The serial Fortran code is shown in (c,d).
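The code for Listing 1 is not reproduced in this text. As a rough illustration of the idea in the caption (Fortran-like matrix indexing written in C++), the following is a minimal sketch in standard C++ only. The `Matrix2D` class and the `add_matrices` function are hypothetical stand-ins introduced here; they are not MATAR types, and the nested loop runs serially on the host rather than through DO_ALL on a device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Fortran-style 2D matrix (NOT a MATAR type):
// indices start at 1 and storage is column-major, as in Fortran.
class Matrix2D {
public:
    Matrix2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // 1-based, column-major element access: A(i, j)
    double& operator()(std::size_t i, std::size_t j) {
        return data_[(i - 1) + (j - 1) * rows_];
    }
    double operator()(std::size_t i, std::size_t j) const {
        return data_[(i - 1) + (j - 1) * rows_];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;
};

// Computes C = A + B element by element. The column index is the outer
// loop so that memory is traversed contiguously (column-major order).
void add_matrices(const Matrix2D& A, const Matrix2D& B, Matrix2D& C) {
    assert(A.rows() == C.rows() && A.cols() == C.cols());
    assert(B.rows() == C.rows() && B.cols() == C.cols());
    for (std::size_t j = 1; j <= C.cols(); ++j) {
        for (std::size_t i = 1; i <= C.rows(); ++i) {
            C(i, j) = A(i, j) + B(i, j);
        }
    }
}
```

In a performance-portable version, the loop body would stay the same while the loop construct and the storage type are replaced by library-provided parallel equivalents.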
Listing 2. MATAR programming syntax compared with Fortran syntax for a matrix–vector multiply. In (a), two vectors and a matrix are allocated on the device (e.g., a GPU). The contents of the DO_ALL loop, shown in (b), are executed in parallel on the device. This example uses nested parallelism. An alternative parallel implementation would place a serial loop inside a parallel DO_ALL loop, with the serial loop performing the addition. The serial Fortran code is shown in (c,d).
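The alternative mentioned in the caption, a serial summation loop inside a parallel loop over rows, can be sketched in standard C++ as follows. This is an illustrative sketch, not the MATAR code from the listing: the `matvec` function, the row-major 0-based layout, and the use of an OpenMP pragma in place of DO_ALL are all assumptions made here. If the compiler is not invoked with OpenMP enabled, the pragma is ignored and the outer loop simply runs serially.

```cpp
#include <cassert>
#include <vector>

// Computes y = A * x for a dense matrix A with n_rows x n_cols entries,
// stored row-major in a flat vector (hypothetical layout for this sketch).
std::vector<double> matvec(const std::vector<double>& A,
                           const std::vector<double>& x,
                           int n_rows, int n_cols) {
    assert(static_cast<int>(A.size()) == n_rows * n_cols);
    assert(static_cast<int>(x.size()) == n_cols);

    std::vector<double> y(n_rows, 0.0);

    // Outer loop over rows: iterations are independent, so they can run
    // in parallel. Each iteration writes only its own y[i].
    #pragma omp parallel for
    for (int i = 0; i < n_rows; ++i) {
        // Inner loop: serial accumulation of the dot product for row i.
        double sum = 0.0;
        for (int j = 0; j < n_cols; ++j) {
            sum += A[i * n_cols + j] * x[j];
        }
        y[i] = sum;
    }
    return y;
}
```

Because each row accumulates into a private `sum`, no reduction or atomic update is needed, which is the main appeal of this variant over nested parallelism when the rows are short.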
4. Fierro Mechanics Code
Listing 3. In this work, the user-defined material model (e.g., the VPSC-GMM implementation) in Fierro is called inside a parallel loop over all the elements in the mesh. The parallel for-loop syntax in the MATAR library is FOR_ALL. The code shown here runs in parallel (via the Kokkos library) on a multi-core CPU using OpenMP or pthreads, and on a GPU using CUDA for NVIDIA hardware or HIP for AMD hardware. Additional Kokkos backends are available beyond those mentioned here.
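The pattern described in the caption, one independent material-model call per mesh element inside a parallel loop, can be illustrated with the sketch below. It is written in standard C++ with an OpenMP pragma standing in for FOR_ALL, and every name in it is hypothetical: `ElementState`, `user_material_model`, and `update_all_elements` are not Fierro or MATAR identifiers, and the one-dimensional linear-elastic update is only a placeholder for a real constitutive model such as VPSC-GMM.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-element state for this sketch (not a Fierro type).
struct ElementState {
    double strain_increment;  // strain increment for the current step
    double stress;            // stress carried over from the previous step
};

// Placeholder material model: a 1D linear-elastic stress update.
// A real user-defined model would replace the body of this function.
void user_material_model(ElementState& elem, double modulus) {
    assert(modulus > 0.0);
    elem.stress += modulus * elem.strain_increment;
}

// Calls the material model once per element. Each call touches only its
// own element, so the iterations are independent and can run in parallel.
// Without OpenMP enabled, the pragma is ignored and the loop runs serially.
void update_all_elements(std::vector<ElementState>& elems, double modulus) {
    const long num_elems = static_cast<long>(elems.size());
    #pragma omp parallel for
    for (long e = 0; e < num_elems; ++e) {
        user_material_model(elems[e], modulus);
    }
}
```

The key requirement for this pattern is that the material model reads and writes only data belonging to its own element, so no synchronization between loop iterations is needed.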
5. Test Cases
5.1. Isotropic Hypoelastic–Plastic Model
5.2. Crystal Plasticity Model
5.3. VPSC-GMM
5.3.1. Stand-Alone VPSC
5.3.2. VPSC-GMM Code Coupled to the Fierro Code
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
CPU Company | Intel | Intel | IBM
Name | Haswell | Broadwell | Power9
Memory | 132 GB | 132 GB | 256 GB
Number of cores per CPU | 20 | 32 | 20
Clock speed | 2.6 GHz | 2.1 GHz | 3.45 GHz

GPU Company | NVIDIA | NVIDIA | NVIDIA | AMD
Name | Tesla V100s | A100 | Quadro RTX | Vega MI50
Memory | 32 GB | 40 GB | 48 GB | 16 GB
Number of multi-processors | 84 | 108 | 72 | 60
Number of CUDA cores (NVIDIA) or shading units (AMD) | 5376 | 6912 | 4608 | 3840
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Morgan, N.; Yenusah, C.; Diaz, A.; Dunning, D.; Moore, J.; Heilman, E.; Lieberman, E.; Walton, S.; Brown, S.; Holladay, D.; Marki, R.; Robey, R.; Knezevic, M. Enabling Parallel Performance and Portability of Solid Mechanics Simulations Across CPU and GPU Architectures. Information 2024, 15, 716. https://doi.org/10.3390/info15110716