Embedded GPU Implementation for High-Performance Ultrasound Imaging
Abstract
1. Introduction
2. System Design
2.1. Expansion Board
- 17 ports are reserved for the DSPs integrated on the FE and MC modules (1 port for each DSP).
- An M.2 PCIe3.0 ×4 slot (1 port) can host a common consumer NVMe (non-volatile memory express) SSD (solid-state drive) with a capacity of up to 2 TB.
- A PCIe3.0 ×16 connector (4 ports) connects to additional external PCIe resources.
- The Jetson AGX Xavier module is connected through a PCIe3.0 ×8 bus (2 ports).
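
The port allocation listed above can be tallied with a short sketch. The port counts are taken from the list; the 4-lanes-per-port figure is only an inference from the PCIe3.0 entries (×4 uses 1 port, ×8 uses 2, ×16 uses 4) and is not stated in the text, and the names `PORT_ALLOCATION`, `ports_for_link`, and `total_ports` are hypothetical.

```python
import math

# Port counts copied from the list above (ports per attached resource).
PORT_ALLOCATION = {
    "FE and MC DSPs (1 port each)": 17,
    "M.2 NVMe SSD, PCIe3.0 x4": 1,
    "PCIe3.0 x16 connector": 4,
    "Jetson AGX Xavier, PCIe3.0 x8": 2,
}

# ASSUMPTION: the PCIe3.0 entries (x4 -> 1, x8 -> 2, x16 -> 4 ports) suggest
# that one port carries up to 4 lanes; the list does not state this explicitly.
LANES_PER_PORT = 4


def ports_for_link(lanes: int) -> int:
    """Ports needed by a link of the given width, under the 4-lane assumption."""
    return math.ceil(lanes / LANES_PER_PORT)


def total_ports(allocation: dict) -> int:
    """Sum the ports used by all attached resources."""
    return sum(allocation.values())


if __name__ == "__main__":
    for name, ports in PORT_ALLOCATION.items():
        print(f"{name}: {ports} port(s)")
    print("Total ports in use:", total_ports(PORT_ALLOCATION))
```

Under these figures the four resource groups occupy 24 ports in total.
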
2.2. Root Complex Configurations
3. Results and Discussion
3.1. PCIe Bandwidths and Xavier AGX Performances
3.2. Application Examples
4. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest

PCIe Version | Encoding | Transfer Rate (GT/s) | Throughput ×1 (GB/s) | Throughput ×2 (GB/s) | Throughput ×4 (GB/s) | Throughput ×8 (GB/s) | Throughput ×16 (GB/s) |
---|---|---|---|---|---|---|---|
2.0 | 8b/10b | 5.0 | 0.5 | 1 | 2 | 4 | 8 |
3.0 | 128b/130b | 8.0 | 0.985 | 1.969 | 3.938 | 7.877 | 15.754 |
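
The throughput figures in this table follow from the transfer rate and the line encoding: each lane moves one bit per transfer, only the payload fraction of the encoding (8 of 10 bits, or 128 of 130 bits) is useful data, and 8 bits make a byte. A minimal sketch of that arithmetic, with a hypothetical function name `pcie_throughput_gb_s`:

```python
def pcie_throughput_gb_s(transfer_rate_gt_s: float,
                         payload_bits: int,
                         encoded_bits: int,
                         lanes: int = 1) -> float:
    """Theoretical one-direction PCIe throughput in GB/s.

    transfer_rate_gt_s: raw rate per lane (GT/s), one bit per transfer
    payload_bits / encoded_bits: encoding efficiency, e.g. 8/10 or 128/130
    lanes: link width (1, 2, 4, 8 or 16)
    """
    useful_gbit_s = transfer_rate_gt_s * payload_bits / encoded_bits
    return useful_gbit_s / 8 * lanes  # bits -> bytes, then scale by link width


if __name__ == "__main__":
    for version, rate, payload, encoded in (("2.0", 5.0, 8, 10),
                                            ("3.0", 8.0, 128, 130)):
        row = [round(pcie_throughput_gb_s(rate, payload, encoded, n), 3)
               for n in (1, 2, 4, 8, 16)]
        print(f"PCIe {version}: {row}")
```

These are theoretical upper bounds; protocol overhead such as packet headers is not modeled here.
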

Resources | Protocol | Number of Lanes | Throughput (GB/s) |
---|---|---|---|
FE DSPs | PCIe2.0 | ×2 per DSP (16 DSPs) | 16 |
MC DSP | PCIe2.0 | ×1 | 0.5 |
M.2 slot | PCIe3.0 | ×4 | 3.938 |
PCIe ×16 connector | PCIe3.0 | ×16 | 15.754 |
Jetson Xavier | PCIe3.0 | ×8 | 7.877 |
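
The per-resource throughputs in this table can be reproduced from the lane counts and the theoretical per-lane rates of each PCIe generation. The sketch below is illustrative only: the dictionary and function names are hypothetical, and results are rounded to three decimals, so the last digit may differ by one from a truncated figure.

```python
# Theoretical per-lane throughput in GB/s: rate (GT/s) * encoding efficiency / 8.
PER_LANE_GB_S = {
    "PCIe2.0": 5.0 * 8 / 10 / 8,     # 8b/10b encoding
    "PCIe3.0": 8.0 * 128 / 130 / 8,  # 128b/130b encoding
}

# (resource, protocol, total lanes) as listed in the table above.
RESOURCES = [
    ("FE DSPs", "PCIe2.0", 16 * 2),  # 16 DSPs, x2 link each
    ("MC DSP", "PCIe2.0", 1),
    ("M.2 slot", "PCIe3.0", 4),
    ("PCIe x16 connector", "PCIe3.0", 16),
    ("Jetson Xavier", "PCIe3.0", 8),
]


def resource_throughputs() -> dict:
    """Map each resource to its theoretical throughput in GB/s (3 decimals)."""
    return {name: round(PER_LANE_GB_S[proto] * lanes, 3)
            for name, proto, lanes in RESOURCES}


if __name__ == "__main__":
    for name, gb_s in resource_throughputs().items():
        print(f"{name}: {gb_s} GB/s")
```

One observation that follows from these numbers: the aggregate theoretical rate of the 16 FE DSP links (16 GB/s) is roughly twice what the ×8 link to the Jetson Xavier can carry.
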

Kernel Size z-dim (Samples) | Kernel Size x-dim (Samples) | Output Estimates z-dim (Estimates per Frame) | Output Estimates x-dim (Estimates per Frame) | Real-Time Frame Rate (Hz) |
---|---|---|---|---|
50 | 10 | 50 | 20 | 3512 |
100 | 10 | 50 | 20 | 3423 |
50 | 20 | 50 | 20 | 3339 |
100 | 20 | 50 | 20 | 3396 |
50 | 10 | 100 | 20 | 3203 |
100 | 10 | 100 | 20 | 3034 |
50 | 20 | 100 | 20 | 3104 |
100 | 20 | 100 | 20 | 2956 |
50 | 10 | 50 | 40 | 3200 |
100 | 10 | 50 | 40 | 3094 |
50 | 20 | 50 | 40 | 3092 |
100 | 20 | 50 | 40 | 3026 |
50 | 10 | 100 | 40 | 2717 |
100 | 10 | 100 | 40 | 2641 |
50 | 20 | 100 | 40 | 2586 |
100 | 20 | 100 | 40 | 2488 |
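
One way to read this table is to convert each row into an overall estimate rate. The sketch below assumes that the number of estimates per frame is the product of the z and x dimensions of the output matrix, which the table headers suggest but do not state outright; the function names are hypothetical, and the rows are copied from the table.

```python
# Rows: (kernel_z, kernel_x, out_z, out_x, frame_rate_hz), copied from the table.
ROWS = [
    (50, 10, 50, 20, 3512), (100, 10, 50, 20, 3423),
    (50, 20, 50, 20, 3339), (100, 20, 50, 20, 3396),
    (50, 10, 100, 20, 3203), (100, 10, 100, 20, 3034),
    (50, 20, 100, 20, 3104), (100, 20, 100, 20, 2956),
    (50, 10, 50, 40, 3200), (100, 10, 50, 40, 3094),
    (50, 20, 50, 40, 3092), (100, 20, 50, 40, 3026),
    (50, 10, 100, 40, 2717), (100, 10, 100, 40, 2641),
    (50, 20, 100, 40, 2586), (100, 20, 100, 40, 2488),
]


def estimates_per_second(out_z: int, out_x: int, frame_rate_hz: int) -> int:
    """ASSUMPTION: estimates per frame = out_z * out_x."""
    return out_z * out_x * frame_rate_hz


def frame_budget_us(frame_rate_hz: int) -> float:
    """Processing time available per frame, in microseconds."""
    return 1e6 / frame_rate_hz


if __name__ == "__main__":
    for kz, kx, oz, ox, fr in ROWS:
        rate = estimates_per_second(oz, ox, fr)
        print(f"kernel {kz}x{kx}, output {oz}x{ox}: "
              f"{rate / 1e6:.3f} M estimates/s, "
              f"{frame_budget_us(fr):.0f} us per frame")
```

Under that assumption, quadrupling the output matrix from 50 × 20 to 100 × 40 lowers the frame rate by less than a third, so the number of estimates produced per second rises substantially.
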

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Rossi, S.; Boni, E. Embedded GPU Implementation for High-Performance Ultrasound Imaging. Electronics 2021, 10, 884. https://doi.org/10.3390/electronics10080884