Compiler and Hardware Design Systems for High-Performance Computing

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Circuit and Signal Processing".

Deadline for manuscript submissions: 15 January 2025

Special Issue Editors


Guest Editor
School of Computer Science and Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
Interests: compiler optimization for high-performance computing; accelerated computing; machine learning

Guest Editor
Department of Computer Science and Engineering, Kongju National University, Cheonan 31080, Republic of Korea
Interests: mobile operating systems; resource management; robotics systems

Special Issue Information

Dear Colleagues,

Compilers and computer architecture are both critical components of modern computing systems, from mobile devices to supercomputers, because together they make software and hardware work efficiently and effectively. Compilers let developers write code in higher-level languages and translate it into machine code, making programming more accessible and efficient. Computer architecture, in turn, forms the foundation of hardware design, influencing performance, compatibility, scalability, energy efficiency, and parallelism in computing systems. Both therefore play a critical role in advancing modern computing technology, given that high-performance parallel architectures are used to solve critical challenges in a variety of areas.

This Special Issue aims to present the latest research results and new ideas in compiler and hardware design systems for high-performance computing. In this Special Issue, original research articles and reviews are welcome. Research areas may include (but are not limited to) the following:

  • High-performance computer architectures;
  • IoT, mobile, edge, and embedded architectures;
  • Compilers, runtimes, and programming languages for parallel computer systems;
  • Compilers and programming languages for novel architectures;
  • Heterogeneous computing accelerators;
  • Machine learning compilers and runtimes;
  • Programming languages for machine learning;
  • Specialized hardware for machine learning.

Dr. Jinsung Kim
Dr. Jaehwan Lee
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer architectures
  • compilers
  • parallel and distributed computing
  • programming languages
  • accelerated computing
  • machine learning for systems
  • parallel algorithms

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (1 paper)

Research

24 pages, 738 KiB  
Article
Tensor Core-Adapted Sparse Matrix Multiplication for Accelerating Sparse Deep Neural Networks
by Yoonsang Han, Inseo Kim, Jinsung Kim and Gordon Euhyun Moon
Electronics 2024, 13(20), 3981; https://doi.org/10.3390/electronics13203981 - 10 Oct 2024
Abstract
Sparse matrix–matrix multiplication (SpMM) is essential for deep learning models and scientific computing. Recently, Tensor Cores (TCs) on GPUs, originally designed for dense matrix multiplication with mixed precision, have gained prominence. However, utilizing TCs for SpMM is challenging due to irregular memory access patterns and a varying number of non-zero elements in a sparse matrix. To improve data locality, previous studies have proposed reordering sparse matrices before multiplication, but this adds computational overhead. In this paper, we propose Tensor Core-Adapted SpMM (TCA-SpMM), which leverages TCs without requiring matrix reordering and uses the compressed sparse row (CSR) format. To optimize TC usage, the SpMM algorithm’s dot product operation is transformed into a blocked matrix–matrix multiplication. Addressing load imbalance and minimizing data movement are critical to optimizing the SpMM kernel. Our TCA-SpMM dynamically allocates thread blocks to process multiple rows simultaneously and efficiently uses shared memory to reduce data movement. Performance results on sparse matrices from the Deep Learning Matrix Collection public dataset demonstrate that TCA-SpMM achieves up to 29.58× speedup over state-of-the-art SpMM implementations optimized with TCs.
(This article belongs to the Special Issue Compiler and Hardware Design Systems for High-Performance Computing)
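
For readers unfamiliar with the CSR format mentioned in the abstract, the following is a minimal reference sketch of row-wise SpMM over CSR arrays in plain Python. It is not the TCA-SpMM kernel from the paper and does not use Tensor Cores; the function name csr_spmm and the array names indptr, indices, and data are illustrative assumptions that follow common CSR naming. It only shows why the number of non-zeros per row, and hence the work per row, can vary.

```python
def csr_spmm(indptr, indices, data, B):
    """Reference SpMM: C = A @ B, with A in CSR form and B a dense list of rows.

    indptr[i]..indptr[i+1] delimits the non-zeros of row i of A;
    indices holds their column numbers and data their values.
    (Names are illustrative, not taken from the paper.)
    """
    n_rows = len(indptr) - 1
    k = len(B[0]) if B else 0
    C = [[0.0] * k for _ in range(n_rows)]
    for i in range(n_rows):
        row_c = C[i]
        # The trip count of this loop differs per row: the source of load imbalance.
        for p in range(indptr[i], indptr[i + 1]):
            a = data[p]
            row_b = B[indices[p]]  # irregular access into B, driven by column index
            for c in range(k):
                row_c[c] += a * row_b[c]
    return C


# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]]  stored as CSR
indptr = [0, 2, 3, 5]
indices = [0, 2, 2, 0, 1]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

C = csr_spmm(indptr, indices, data, B)
print(C)
```

In this small example the rows of A hold 2, 1, and 2 non-zeros, so a scheme that assigns one worker per row would already be unevenly loaded; the abstract describes dynamic thread-block allocation as the paper's response to that imbalance.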