FPGA Applications and Future Trends

A special issue of Micromachines (ISSN 2072-666X). This special issue belongs to the section "E:Engineering and Technology".

Deadline for manuscript submissions: closed (31 January 2024) | Viewed by 33499

Special Issue Editor


Dr. José de Jesús Rangel Magdaleno
Guest Editor

Special Issue Information

Dear Colleagues,

FPGAs (field-programmable gate arrays) are now used in many areas of engineering, including security, biomedical engineering, control, power systems, instrumentation, automotive systems, astronomy, and particle physics, to name a few.

The flexibility and reconfigurability of FPGAs make them well suited to rapid prototyping, and their parallel processing allows several systems to be controlled within the same device. Furthermore, technological advances have allowed microprocessor architectures to be integrated alongside the programmable logic, so that sequential tasks can be handled without giving up the advantages of the FPGA architecture.

This Special Issue provides a forum for the presentation of new and improved FPGA-based applications, including (but not limited to):

  • Cryptography;
  • Control;
  • Embedded systems;
  • Power systems;
  • Monitoring;
  • Signal processing;
  • Intelligent systems;
  • Image processing;
  • Biomedical applications;
  • Robotics;
  • Industrial applications;
  • Reconfigurable computing;
  • Particle physics;
  • Deep neural networks.

Dr. José de Jesús Rangel Magdaleno
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Micromachines is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • FPGAs
  • high-level synthesis
  • Verilog
  • VHDL
  • embedded systems
  • reconfigurable computing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (16 papers)

Research

18 pages, 7139 KiB  
Article
An FPGA-Based YOLOv5 Accelerator for Real-Time Industrial Vision Applications
by Zhihong Yan, Bingqian Zhang and Dong Wang
Micromachines 2024, 15(9), 1164; https://doi.org/10.3390/mi15091164 - 19 Sep 2024
Viewed by 1646
Abstract
The You Only Look Once (YOLO) object detection network has garnered widespread adoption in various industries, owing to its superior inference speed and robust detection capabilities. This model has proven invaluable in automating production processes such as material processing, machining, and quality inspection. However, as market competition intensifies, there is a constant demand for higher detection speed and accuracy. Current FPGA accelerators based on 8-bit quantization have struggled to meet these increasingly stringent performance requirements. In response, we present a novel 4-bit quantization-based neural network accelerator for the YOLOv5 model, designed to enhance real-time processing capabilities while maintaining high detection accuracy. To achieve effective model compression, we introduce an optimized quantization scheme that reduces the bit-width of the entire YOLO network—including the first layer—to 4 bits, with only a 1.5% degradation in mean Average Precision (mAP). For the hardware implementation, we propose a unified Digital Signal Processor (DSP) packing scheme, coupled with a novel parity adder tree architecture that accommodates the proposed quantization strategies. This approach efficiently reduces on-chip DSP utilization by 50%, offering a significant improvement in performance and resource efficiency. Experimental results show that the industrial object detection system based on the proposed FPGA accelerator achieves a throughput of 808.6 GOPS and an efficiency of 0.49 GOPS/DSP for YOLOv5s on the ZCU102 board, which is 29% higher than a commercial FPGA accelerator design (Xilinx’s Vitis AI). Full article
(This article belongs to the Special Issue FPGA Applications and Future Trends)
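The 4-bit quantization mentioned in the abstract above can be illustrated with a generic sketch. This is plain symmetric uniform quantization with a single shared scale, not the optimized scheme proposed in the article; the function names, the sample weights, and the choice of scale are all assumptions made for illustration.

```python
# Generic symmetric uniform 4-bit quantization (illustrative only; NOT the
# article's optimized scheme). All names and sample values are hypothetical.

def quantize_4bit(values, scale):
    """Map floats to signed 4-bit integer codes in [-8, 7] using one scale."""
    qmin, qmax = -8, 7
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights; the scale maps the largest magnitude to code 7.
weights = [0.42, -0.13, 0.91, -1.20, 0.05]
scale = max(abs(w) for w in weights) / 7

codes = quantize_4bit(weights, scale)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

With this choice of scale no value is clipped, so the rounding error of each weight is at most half a quantization step (scale / 2). Fewer bits per operand are what make it possible to pack more multiplications into one DSP block, which is the motivation the abstract gives for the DSP packing scheme.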
Show Figures

Figure 1

16 pages, 1447 KiB  
Article
High-Performance Reconfigurable Pipeline Implementation for FPGA-Based SmartNIC
by Xiaoyong Song, Rui Lu and Zhichuan Guo
Micromachines 2024, 15(4), 449; https://doi.org/10.3390/mi15040449 - 27 Mar 2024
Viewed by 1499
Abstract
As the key module of programmable switches or the SmartNIC card, the packet processing pipeline undertakes the task of packet forwarding and processing. However, the current pipeline for the FPGA-based SmartNIC is inflexible, and the related reconfigurable commercial device designs are closed-source. To solve this problem, this paper proposes a high-performance reconfigurable pipeline design, which has fully reconfigurable match-action units, supporting various network functions by its flexible reconfiguration. The fields of the match key and the size of the match table can be reconfigured without recompiling the HDL code or modifying the hardware. The processing rules and action instructions for the pipeline can be dynamically installed by the configuration module at runtime. We implement our design on the Xilinx Alveo U200 board with a Virtex UltraScale+ XCU200-2FSGD2104E FPGA and show that the designed pipeline supports fast reconfiguration to implement new network functions and that the throughput of the designed pipeline reaches 100 Gbps with low latency. Full article

14 pages, 1000 KiB  
Article
QGWFQS: A Queue-Group-Based Weight Fair Queueing Scheduler on FPGA
by Yunfei Guo, Zhichuan Guo, Xiaoyong Song and Mangu Song
Micromachines 2023, 14(11), 2100; https://doi.org/10.3390/mi14112100 - 14 Nov 2023
Viewed by 1876
Abstract
Weight Fair Queuing is an ideal scheduling algorithm to guarantee the bandwidth of different queues according to their configured Weights when the switching nodes of the network are congested. Many of the switching nodes based on FPGA in the current network support four physical ports or hundreds of virtual ports. Massive logic and storage resources would be consumed if each port implemented a WFQ scheduler. This paper proposes a Queue-Group-Based WFQ Scheduler (QGWFQS), which can support WFQ scheduling across multiple ports through the reuse of tag calculation and encoding circuits. We also propose a novel finish tag calculation algorithm to accommodate the variation in the link rate of each port. The remainder of integer division is also taken into account, which makes the bandwidth allocation fairer. Experimental results show that the proposed scheduler supports up to 512 ports, with 32 queues allocated on each individual port. The scheduler has the capability to operate at 200 MHz and the total scheduling capacity reaches 200 Mpps. Full article
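The finish tag idea in the abstract above can be sketched with the textbook weighted fair queueing rule, in which a packet's finish tag is the later of the current virtual time and the queue's previous finish tag, plus the packet length divided by the queue weight. The abstract does not say how the article handles the remainder of the integer division, so the carry-over used below is an assumption, as are all names.

```python
# Textbook-style WFQ finish tag with integer arithmetic (illustrative only).
# The remainder carry-over is an ASSUMED way of "taking the remainder into
# account"; the article's actual algorithm is not given in the abstract.

class QueueState:
    def __init__(self, weight):
        self.weight = weight      # configured weight (positive integer)
        self.last_finish = 0      # finish tag of the previous packet
        self.remainder = 0        # bytes not yet reflected in any tag


def finish_tag(queue, packet_len, virtual_time):
    """Return the finish tag for a newly arrived packet on this queue."""
    start = max(virtual_time, queue.last_finish)
    # Add the leftover from earlier divisions before dividing again, so the
    # truncation error does not accumulate across packets.
    increment, queue.remainder = divmod(packet_len + queue.remainder,
                                        queue.weight)
    queue.last_finish = start + increment
    return queue.last_finish


q = QueueState(weight=3)
tags = [finish_tag(q, packet_len=100, virtual_time=0) for _ in range(3)]
```

In this example three 100-byte packets on a queue of weight 3 end with a finish tag of exactly 100 (300 / 3). Plain truncating division would give 99, and the shortfall would keep growing with every packet. A scheduler would then serve the packet with the smallest finish tag across queues.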

16 pages, 1361 KiB  
Article
An FPGA-Based High-Performance Stateful Packet Processing Method
by Rui Lu and Zhichuan Guo
Micromachines 2023, 14(11), 2074; https://doi.org/10.3390/mi14112074 - 8 Nov 2023
Viewed by 1651
Abstract
Compared to a stateless data plane, a stateful data plane offloads part of state information and control logic from a controller to a data plane to reduce communication overhead and improve packet processing efficiency. However, existing methods for implementing stateful data planes face challenges, particularly maintaining state consistency during packet processing and improving throughput performance. This paper presents a high-performance, FPGA (Field Programmable Gate Array)-based stateful packet processing approach, which addresses these challenges utilizing the PHV (Packet Header Vector) dynamic scheduling technique to ensure flow state consistency. Our experiments demonstrate that the proposed method could operate at 200 MHz while adding 3–12 microseconds latency. The method we proposed also provides a considerable degree of programmability. Full article

14 pages, 926 KiB  
Article
Optimizing the Performance of the Sparse Matrix–Vector Multiplication Kernel in FPGA Guided by the Roofline Model
by Federico Favaro, Ernesto Dufrechou, Juan P. Oliver and Pablo Ezzatti
Micromachines 2023, 14(11), 2030; https://doi.org/10.3390/mi14112030 - 31 Oct 2023
Cited by 2 | Viewed by 1609
Abstract
The widespread adoption of massively parallel processors over the past decade has fundamentally transformed the landscape of high-performance computing hardware. This revolution has recently driven the advancement of FPGAs, which are emerging as an attractive alternative to power-hungry many-core devices in a world increasingly concerned with energy consumption. Consequently, numerous recent studies have focused on implementing efficient dense and sparse numerical linear algebra (NLA) kernels on FPGAs. To maximize the efficiency of these kernels, a key aspect is the exploration of analytical tools to comprehend the performance of the developments and guide the optimization process. In this regard, the roofline model (RLM) is a well-known graphical tool that facilitates the analysis of computational performance and identifies the primary bottlenecks of a specific software when executed on a particular hardware platform. Our previous efforts advanced in developing efficient implementations of the sparse matrix–vector multiplication (SpMV) for FPGAs, considering both speed and energy consumption. In this work, we propose an extension of the RLM that enables optimizing runtime and energy consumption for NLA kernels based on sparse blocked storage formats on FPGAs. To test the power of this tool, we use it to extend our previous SpMV kernels by leveraging a block-sparse storage format that enables more efficient data access. Full article
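The classic roofline model referred to in the abstract above bounds attainable performance by the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. The sketch below applies it to a rough SpMV estimate. The platform figures are hypothetical, the traffic estimate counts only the matrix values and column indices, and the energy-aware extension proposed in the article is not reproduced.

```python
# Classic roofline bound (illustrative only). Platform numbers are
# HYPOTHETICAL and the SpMV byte count is a deliberately rough estimate.

def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound: min(peak compute, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity)


# Rough SpMV estimate per stored nonzero: one multiply and one add, reading
# an 8-byte value and a 4-byte column index (vector and row-pointer traffic
# are ignored here).
FLOPS_PER_NONZERO = 2
BYTES_PER_NONZERO = 8 + 4
spmv_intensity = FLOPS_PER_NONZERO / BYTES_PER_NONZERO  # flop per byte

PEAK_GFLOPS = 100.0     # hypothetical peak compute rate
BANDWIDTH_GBS = 20.0    # hypothetical memory bandwidth

bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, spmv_intensity)
ridge_point = PEAK_GFLOPS / BANDWIDTH_GBS  # intensity where the roofs meet
```

Because the estimated intensity (about 0.17 flop/byte) lies far below the ridge point, the kernel is memory-bound under these assumptions. That is why storage formats that reduce the bytes moved per nonzero, such as the blocked formats the abstract mentions, are a natural optimization target.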

14 pages, 577 KiB  
Article
Dynamically Scalable NoC Architecture for Implementing Run-Time Reconfigurable Applications
by Qaiser Ijaz, Hiliwi Leake Kidane, El-Bay Bourennane and Gilberto Ochoa-Ruiz
Micromachines 2023, 14(10), 1913; https://doi.org/10.3390/mi14101913 - 7 Oct 2023
Cited by 2 | Viewed by 1935
Abstract
The paper proposes two architectures for a dynamically scalable network-on-chip (NoC) for dynamically reconfigurable intellectual properties (IPs) to save power. The first architecture is a run-time scalable column-based NoC, where the columns of the NoC are scaled up and down at run-time depending on the demands to connect reconfigurable IPs. The second architecture is an extension of the first, where both the rows and columns of the NoC are dynamically scaled up and down on demand. A robust control manager is developed to control the IP and sub-NoC reconfigurations by optimizing the reconfiguration costs. The proposed architectures have been implemented and tested in actual prototypes on a Virtex 6 FPGA mounted on the ML605 board. The results show that dynamically scalable architectures are capable of significant power reduction as compared to traditional static architectures for the same size of the NoC. It is anticipated that the scalable NoC can be very useful for sharing the FPGA resources among IPs at runtime. Full article

29 pages, 12275 KiB  
Article
A High-Performance and Cost-Effective Field Programmable Gate Array-Based Motor Drive Emulator
by Julio Hernandez, Jose de Jesus Rangel-Magdaleno and Roberto Morales-Caporal
Micromachines 2023, 14(10), 1864; https://doi.org/10.3390/mi14101864 - 28 Sep 2023
Cited by 1 | Viewed by 1177
Abstract
This work presents a hardware-based digital emulator capable of digitally driving a permanent magnet synchronous machine electronic setup. The aim of this work is to present a high-performance, cost-effective, and portable complementary solution when new paradigms of electronic drive design are generated, such as machine early failure detection, fault-tolerant drive, and high-performance control strategy implementations. In order to achieve the high performance required by the digital emulator, the electronic drive models (permanent-magnet synchronous machine, voltage-source inverter, motor-control strategy) are digitally described in Verilog hardware description language and implemented on a field programmable gate array (FPGA) digital platform using two approaches: parallel and sequential methods. The results obtained show the effectiveness of the digital emulator design, and the resources used by the solution presented can be implemented on a low-cost digital platform that reveals a cost-effective operation of the solution presented. Full article

17 pages, 1605 KiB  
Article
Modeling of Particulate Pollutants Using a Memory-Based Recurrent Neural Network Implemented on an FPGA
by Julio Alberto Ramírez-Montañez, Jose de Jesús Rangel-Magdaleno, Marco Antonio Aceves-Fernández and Juan Manuel Ramos-Arreguín
Micromachines 2023, 14(9), 1804; https://doi.org/10.3390/mi14091804 - 21 Sep 2023
Viewed by 1115
Abstract
The present work describes the training and subsequent implementation on an FPGA board of an LSTM neural network for the modeling and prediction of the exceedances of criteria pollutants such as nitrogen dioxide (NO2), carbon monoxide (CO), and particulate matter (PM10 and PM2.5). Understanding the behavior of pollutants and assessing air quality in specific geographical regions is crucial. Overexposure to these pollutants can cause harm to both natural ecosystems and living organisms, including humans. Therefore, it is essential to develop a solution that can accurately evaluate pollution levels. One potential approach is to implement a modified LSTM neural network on an FPGA board. This implementation obtained an 11% improvement compared to the original LSTM network, demonstrating that the proposed architecture is able to maintain its functionality despite reducing the number of neurons in its initial layers. It shows the feasibility of integrating a prediction network into a limited system such as an FPGA board, but easily coupled to a different system. Importantly, this implementation does not compromise the prediction accuracy for both 24 h and 72 h time frames, highlighting an opportunity for further enhancement and refinement. Full article

16 pages, 4987 KiB  
Article
Implementation of Wavelet-Transform-Based Algorithms in an FPGA for Heart Rate and RT Interval Automatic Measurements in Real Time: Application in a Long-Term Ambulatory Electrocardiogram Monitor
by José Alberto García Limón, Frank Martínez-Suárez and Carlos Alvarado-Serrano
Micromachines 2023, 14(9), 1748; https://doi.org/10.3390/mi14091748 - 7 Sep 2023
Cited by 2 | Viewed by 1658
Abstract
Cardiovascular diseases are currently the leading cause of death worldwide. Thus, there is a need for non-invasive ambulatory (Holter) ECG monitors with automatic measurements of ECG intervals to evaluate electrocardiographic abnormalities of patients with cardiac diseases. This work presents the implementation of algorithms in an FPGA for beat-to-beat heart rate and RT interval measurements based on the continuous wavelet transform (CWT) with splines for a prototype of an ambulatory ECG monitor of three leads. The prototype’s main elements are an analog–digital converter ADS1294, an FPGA of Xilinx XC7A35T-ICPG236C of the Artix-7 family of low consumption, immersed in a low-scale Cmod-A7 development card integration, an LCD display and a micro-SD memory of 16 Gb. A main state machine initializes and manages the simultaneous acquisition of three leads from the ADS1294 and filters the signals using a FIR filter. The algorithm based on the CWT with splines detects the QRS complex (R or S wave) and then the T-wave end using a search window. Finally, the heart rate (60/RR interval) and the RT interval (from R peak to T-wave end) are calculated for analysis of its dynamics. The micro-SD memory stores the three leads and the RR and RT intervals, and an LCD screen displays the beat-to-beat values of heart rate, RT interval and the electrode connection. The algorithm implemented on the FPGA achieved satisfactory results in detecting different morphologies of QRS complexes and T wave in real time for the analysis of heart rate and RT interval dynamics. Full article
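The two beat-to-beat quantities defined in the abstract above, heart rate (60 / RR interval) and RT interval (from R peak to T-wave end), reduce to simple arithmetic once the fiducial points have been detected. The sketch below assumes the R-peak and T-wave-end times are already available in seconds; the wavelet-based detection itself is not shown, and the function name and sample times are hypothetical.

```python
# Beat-to-beat heart rate and RT interval from already-detected fiducial
# points (illustrative only; detection via the CWT is not shown).

def beat_metrics(r_peaks, t_ends):
    """Return (RR intervals in s, heart rates in bpm, RT intervals in s).

    r_peaks: times of successive R peaks, in seconds.
    t_ends:  time of the T-wave end for each beat, in seconds.
    """
    rr = [later - earlier for earlier, later in zip(r_peaks, r_peaks[1:])]
    heart_rate = [60.0 / interval for interval in rr]
    rt = [t_end - r_peak for r_peak, t_end in zip(r_peaks, t_ends)]
    return rr, heart_rate, rt


# Hypothetical detections: one beat every 0.8 s, T-wave ending 0.3 s after R.
r_peaks = [0.0, 0.8, 1.6]
t_ends = [0.3, 1.1, 1.9]
rr, heart_rate, rt = beat_metrics(r_peaks, t_ends)
```

With an RR interval of 0.8 s the heart rate is 60 / 0.8 = 75 beats per minute, and each RT interval is about 0.3 s.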

17 pages, 2873 KiB  
Article
The Design of a Dynamic Configurable Packet Parser Based on FPGA
by Ying Sun and Zhichuan Guo
Micromachines 2023, 14(8), 1560; https://doi.org/10.3390/mi14081560 - 5 Aug 2023
Cited by 1 | Viewed by 2480
Abstract
To meet the evolving demands of programmable networks and address the limitations of traditional fixed-type protocol parsers, we propose a dynamic and configurable low-latency parser implemented on an FPGA. The architecture consists of three protocol analysis modules and a TCAM-SRAM. Latency is reduced by optimizing the state machine and parallel extraction matching. At the same time, we introduce the chain mapping idea and container concept to formulate the matching and extraction rules of table entries and enhance the extensibility of the parser. Furthermore, our system supports dynamic configuration through SDN control, allowing flexible adaptation to diverse scenarios. Our design has been verified and simulated with a cocotb-based framework. The resulting architecture is implemented on Xilinx Ultrascale+ FPGAs and supports a throughput of more than 80 Gbps, with a maximum latency of only 36 nanoseconds for L4 protocol parsing. Full article

24 pages, 11422 KiB  
Article
Implementation of Field-Programmable Gate Array Platform for Object Classification Tasks Using Spike-Based Backpropagated Deep Convolutional Spiking Neural Networks
by Vijay Kakani, Xingyou Li, Xuenan Cui, Heetak Kim, Byung-Soo Kim and Hakil Kim
Micromachines 2023, 14(7), 1353; https://doi.org/10.3390/mi14071353 - 30 Jun 2023
Cited by 3 | Viewed by 1982
Abstract
This paper investigates the performance of deep convolutional spiking neural networks (DCSNNs) trained using spike-based backpropagation techniques. Specifically, the study examined temporal spike sequence learning via backpropagation (TSSL-BP) and surrogate gradient descent via backpropagation (SGD-BP) as effective techniques for training DCSNNs on the field programmable gate array (FPGA) platform for object classification tasks. The primary objective of this experimental study was twofold: (i) to determine the most effective backpropagation technique, TSSL-BP or SGD-BP, for deeper spiking neural networks (SNNs) with convolution filters across various datasets; and (ii) to assess the feasibility of deploying DCSNNs trained using backpropagation techniques on low-power FPGA for inference, considering potential configuration adjustments and power requirements. The aforementioned objectives will assist in informing researchers and companies in this field regarding the limitations and unique perspectives of deploying DCSNNs on low-power FPGA devices. The study contributions have three main aspects: (i) the design of a low-power FPGA board featuring a deployable DCSNN chip suitable for object classification tasks; (ii) the inference of TSSL-BP and SGD-BP models with novel network architectures on the FPGA board for object classification tasks; and (iii) a comparative evaluation of the selected spike-based backpropagation techniques and the object classification performance of DCSNNs across multiple metrics using both public (MNIST, CIFAR10, KITTI) and private (INHA_ADAS, INHA_KLP) datasets. Full article

15 pages, 1410 KiB  
Article
High-Speed Hardware Architecture Based on Error Detection for KECCAK
by Hassen Mestiri and Imen Barraj
Micromachines 2023, 14(6), 1129; https://doi.org/10.3390/mi14061129 - 27 May 2023
Cited by 9 | Viewed by 1755
Abstract
The hash function KECCAK integrity algorithm is implemented in cryptographic systems to provide high security for any circuit requiring integrity and protect the transmitted data. Fault attacks, which can extricate confidential data, are one of the most effective physical attacks against KECCAK hardware. Several KECCAK fault detection systems have been proposed to counteract fault attacks. The present research proposes a modified KECCAK architecture and scrambling algorithm to protect against fault injection attacks. Thus, the KECCAK round is modified so that it consists of two parts with input and pipeline registers. The scheme is independent of the KECCAK design. Iterative and pipeline designs are both protected by it. To test the resilience of the suggested detection system approach fault attacks, we conduct permanent as well as transient fault attacks, and we evaluate the fault detection capabilities (99.9999% for transient faults and 99.999905% for permanent faults). The KECCAK fault detection scheme is modeled using VHDL language and implemented on an FPGA hardware board. The experimental results show that our technique effectively secures the KECCAK design. It can be carried out with little difficulty. In addition, the experimental FPGA results demonstrate the proposed KECCAK detection scheme’s low area burden, high efficiency and working frequency. Full article

16 pages, 6624 KiB  
Article
An Improved Blind Zone Channelization Structure and Rapid Implementation Method
by Ziliang Jia and Hongxia Liu
Micromachines 2023, 14(5), 1091; https://doi.org/10.3390/mi14051091 - 22 May 2023
Viewed by 1293
Abstract
The paper proposes an enhanced design for broadband digital receivers that aims to improve signal capture probability, real-time performance, and the hardware development cycle. To overcome the issue of false signals in the blind zone channelization structure, this paper introduces an improved joint-decision channelization structure that reduces channel ambiguity during signal reception. Xilinx’s high-level synthesis (HLS) tools are used for accelerated algorithm implementation, and techniques such as pipelining and loop parallelization are employed to reduce system latency. The entire system is implemented on FPGA. The simulation results demonstrate that the proposed solution effectively eliminates channel ambiguity, improves algorithm implementation speed, and meets the design requirements. Full article

15 pages, 6731 KiB  
Article
FPGA Implementation for Elliptic Curve Cryptography Algorithm and Circuit with High Efficiency and Low Delay for IoT Applications
by Deming Wang, Yuhang Lin, Jianguo Hu, Chong Zhang and Qinghua Zhong
Micromachines 2023, 14(5), 1037; https://doi.org/10.3390/mi14051037 - 12 May 2023
Cited by 10 | Viewed by 2419
Abstract
The Internet of Things requires greater attention to the security and privacy of the network. Compared to other public-key cryptosystems, elliptic curve cryptography can provide better security and lower latency with shorter keys, rendering it more suitable for IoT security. This paper presents a high-efficiency and low-delay elliptic curve cryptographic architecture based on the NIST-p256 prime field for IoT security applications. A modular square unit utilizes a fast partial Montgomery reduction algorithm, demanding just a mere four clock cycles to complete a modular square operation. The modular square unit can be computed simultaneously with the modular multiplication unit, consequently improving the speed of point multiplication operations. Synthesized on the Xilinx Virtex-7 FPGA platform, the proposed architecture completes one PM operation in 0.08 ms using 23.1 k LUTs at 105.3 MHz. These results show significantly better performance compared to that in previous works. Full article

16 pages, 6717 KiB  
Article
A Real-Time FPGA-Based Metaheuristic Processor to Efficiently Simulate a New Variant of the PSO Algorithm
by Esteban Anides, Guillermo Salinas, Eduardo Pichardo, Juan G. Avalos, Giovanny Sánchez, Juan C. Sánchez, Gabriel Sánchez, Eduardo Vazquez and Linda K. Toscano
Micromachines 2023, 14(4), 809; https://doi.org/10.3390/mi14040809 - 31 Mar 2023
Cited by 1 | Viewed by 1751
Abstract
Nowadays, high-performance audio communication devices demand superior audio quality. To improve the audio quality, several authors have developed acoustic echo cancellers based on particle swarm optimization algorithms (PSO). However, its performance is reduced significantly since the PSO algorithm suffers from premature convergence. To overcome this issue, we propose a new variant of the PSO algorithm based on the Markovian switching technique. Furthermore, the proposed algorithm has a mechanism to dynamically adjust the population size over the filtering process. In this way, the proposed algorithm exhibits great performance by reducing its computational cost significantly. To adequately implement the proposed algorithm in a Stratix IV GX EP4SGX530 FPGA, we present for the first time, the development of a parallel metaheuristic processor, in which each processing core simulates the different number of particles by using the time-multiplexing technique. In this way, the variation of the size of the population can be effective. Therefore, the properties of the proposed algorithm along with the proposed parallel hardware architecture potentially allow the development of high-performance acoustic echo canceller (AEC) systems. Full article

Review

24 pages, 3083 KiB  
Review
Digital Electronic System-on-Chip Design: Methodologies, Tools, Evolution, and Trends
by Marcian Cirstea, Khaled Benkrid, Andrei Dinu, Romeo Ghiriti and Dorin Petreus
Micromachines 2024, 15(2), 247; https://doi.org/10.3390/mi15020247 - 7 Feb 2024
Cited by 2 | Viewed by 3915
Abstract
This paper reviews the evolution of methodologies and tools for modeling, simulation, and design of digital electronic system-on-chip (SoC) implementations, with a focus on industrial electronics applications. Key technological, economic, and geopolitical trends are presented at the outset, before reviewing SoC design methodologies and tools. The fundamentals of SoC design flows are laid out. The paper then exposes the crucial role of the intellectual property (IP) industry in the relentless improvements in performance, power, area, and cost (PPAC) attributes of SoCs. High abstraction levels in design capture and increasingly automated design tools (e.g., for verification and validation, synthesis, place, and route) continue to push the boundaries. Aerospace and automotive domains are included as brief case studies. This paper also presents current and future trends in SoC design and implementation including the rising, evolution, and usage of machine learning (ML) and artificial intelligence (AI) algorithms, techniques, and tools, which promise even greater PPAC optimizations. Full article