Article

An Efficient Parallel Reverse Conversion of Residue Code to Mixed-Radix Representation Based on the Chinese Remainder Theorem

by Mikhail Selianinau and Yuriy Povstenko *
Department of Mathematics and Computer Sciences, Faculty of Science and Technology, Jan Dlugosz University in Czestochowa, al. Armii Krajowej 13/15, 42-200 Czestochowa, Poland
* Author to whom correspondence should be addressed.
Entropy 2022, 24(2), 242; https://doi.org/10.3390/e24020242
Submission received: 8 January 2022 / Revised: 30 January 2022 / Accepted: 3 February 2022 / Published: 5 February 2022
(This article belongs to the Special Issue Theory and Applications of Information Processing Algorithms)

Abstract

In this paper, we deal with the critical problems in residue arithmetic. The reverse conversion from a Residue Number System (RNS) to positional notation is a main non-modular operation, and it constitutes a basis of other non-modular procedures used to implement various computational algorithms. We present a novel approach to the parallel reverse conversion from the residue code into a weighted number representation in the Mixed-Radix System (MRS). In our proposed method, the calculation of mixed-radix digits reduces to a parallel summation of the small word-length residues in the independent modular channels corresponding to the primary RNS moduli. The computational complexity of the developed method concerning both required modular addition operations and one-input lookup tables is estimated as $O(k^2/2)$, where $k$ equals the number of used moduli. The time complexity is $O(\lceil \log_2 k \rceil)$ modular clock cycles. In pipeline mode, the throughput rate of the proposed algorithm is one reverse conversion in one modular clock cycle.

1. Introduction

Along with the improvement of computer technology, the development and implementation of new effective approaches to the organization and realization of computational tasks are some of the main ways to increase the data processing speed. At present, high-performance computing is developing extremely rapidly. These reasons lead to qualitatively new requirements imposed on number-theoretic methods and computational algorithms. Practically, all well-known approaches to high-performance computing use certain parallel forms of data representation and processing. In recent decades, special consideration has been given to the so-called modular computational structures. Their arithmetic foundation is the Residue Number System (RNS), whose ideological roots go back to the classic topics of number theory and abstract algebra. The RNS is a non-positional number system with inherent parallelism and occupies a place of particular importance due to its carry-free properties, which provide a high potential for accelerating arithmetic operations.
As is well known, the RNS has some advantages over a conventional Weighted Number System (WNS) in the design and implementation of high-performance computing applications, devices, and systems. From its appearance in the mid-1950s to the present, RNS arithmetic has attracted the constant attention of researchers in computer technology [1,2], number-theoretic methods [3,4,5], digital signal and image processing [2,5,6,7,8], communications systems [5,9], cryptography [2,8,10,11], and other fields [10].
The main advantage of RNS is its unique ability to decompose the large word-length numbers into a set of smaller word-length residues, which are processed in parallel in the independent modular channels. The inherent parallelism of RNS enables avoiding the carry-overs obtained in addition, subtraction, and multiplication, which are usually time-consuming in the WNS. In this regard, the modularity and carry-free properties make computation fast and efficient. Therefore, the RNS presents one of the most efficient means for increasing data processing speed.
Due to its carry-free property, the residue arithmetic is exceptionally suitable for a broad class of applications in which addition and multiplication are the dominant arithmetic operations. In any case, it has excellent potential for many substantial applications in such areas as digital signal processing, cryptography, distributed information and communication systems, information security systems, fault tolerance, cloud computing, and others. Moreover, these RNS applications may be effectively embedded in processor platforms functioning according to the conventional information-processing approach [2,5,8]. For the reasons mentioned above, residue arithmetic represents an efficient mathematical tool for the high-speed implementation of various computational tasks.
The reverse conversion and base extension are the most critical topics in residue arithmetic. In contrast to the conventional WNS, these operations, like other central non-modular procedures such as magnitude comparison, sign determination, overflow detection, general division, and scaling, are relatively hard to implement. They are time consuming and costly because their structure is more complicated than that of modular operations.
As is known, to perform non-modular operations, it is necessary to carry out the binary reconstruction of the integer by its residue code, which in general is hampered by the non-weighted nature of the RNS. This circumstance negates to a substantial extent the main advantages of residue arithmetic.
Therefore, the development of novel approaches and methods for fast reconstruction of a number from its residue code is of significant importance in high-performance computing based on the parallel algorithmic structures of RNS, especially for the high-speed implementation of digital signal processing applications and public-key cryptosystems. This should enable the extensive use of residue arithmetic in many priority areas of science and technology.
In this paper, we present a novel approach to the parallel reverse conversion from the residue code into the mixed-radix representation. In the proposed method, the calculation of mixed-radix digits reduces to a parallel summation of the small word-length residues in the independent modular channels corresponding to the primary RNS moduli.
The paper is structured as follows. Section 2 and Section 3 discuss the basic theoretical concepts of the research. Section 4 describes the mathematical background of the proposed reverse conversion method. Section 5 and Section 6 present a numerical example and an analysis of the computational cost, respectively. Section 7 provides discussion, and Section 8 concludes the paper.

2. The Basic Concepts of the Residue Arithmetic

Abstract algebra and number theory form the theoretical basis of residue arithmetic [12,13].
An RNS is defined by an ordered set $\{m_1, m_2, \ldots, m_k\}$ of $k$ pairwise relatively prime moduli, where each modulus $m_i \geq 2$ $(i = 1, 2, \ldots, k)$, and the greatest common divisor of $m_i$ and $m_j$ equals 1, i.e., $\gcd(m_i, m_j) = 1$ for $i \neq j$. For convenience, we assume that the default order of moduli is ascending, i.e., $m_1 < m_2 < \cdots < m_k$.
In the given RNS, it is possible to represent $M_k$ integer numbers, where $M_k$ is the product of all moduli, $M_k = \prod_{i=1}^{k} m_i$. Therefore, the set $\mathbb{Z}_{M_k} = \{0, 1, \ldots, M_k - 1\}$ is usually used as an RNS dynamic range.
Every number $X \in \mathbb{Z}_{M_k}$ has a unique representation in the form of a $k$-tuple of small integers $(\chi_1, \chi_2, \ldots, \chi_k)$, which is called a residue code, where $\chi_i$ is the least non-negative remainder of the division of $X$ by $m_i$ $(i = 1, 2, \ldots, k)$. We write this relation as $\chi_i = |X|_{m_i}$, where $\chi_i \in \mathbb{Z}_{m_i} = \{0, 1, \ldots, m_i - 1\}$.
The main advantage of residue arithmetic over conventional binary arithmetic is that addition, subtraction, and multiplication are carried out in parallel at the level of small word-length residues. The modular operations $\circ \in \{+, -, \times\}$ on integers $A = (\alpha_1, \alpha_2, \ldots, \alpha_k)$ and $B = (\beta_1, \beta_2, \ldots, \beta_k)$ are performed independently in each modular channel in compliance with the computational rule:
$$A \circ B = (\alpha_1, \alpha_2, \ldots, \alpha_k) \circ (\beta_1, \beta_2, \ldots, \beta_k) = \left( |\alpha_1 \circ \beta_1|_{m_1}, |\alpha_2 \circ \beta_2|_{m_2}, \ldots, |\alpha_k \circ \beta_k|_{m_k} \right), \tag{1}$$
where $\alpha_i = |A|_{m_i}$ and $\beta_i = |B|_{m_i}$, $i = 1, 2, \ldots, k$.
In other words, the arithmetic operations on long-word operands are decomposed into modular channels with operands that are no larger than the corresponding modulus. Moreover, all the modular channels are entirely independent of each other. The carry-free nature of modular operations (1) is one of the most attractive features of residue arithmetic [1,3,8].
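The channel-wise rule (1) can be illustrated with a short sketch. The helper names `to_rns` and `rns_op` are hypothetical and chosen only for this illustration; the moduli set is the one used later in Example 1.

```python
import operator
from math import prod

MODULI = (5, 7, 9, 11)  # pairwise coprime moduli, as in Example 1


def to_rns(x, moduli=MODULI):
    """Forward conversion: residue code (x mod m_1, ..., x mod m_k)."""
    return tuple(x % m for m in moduli)


def rns_op(a, b, op, moduli=MODULI):
    """Apply op (+, -, *) independently in every modular channel, Eq. (1)."""
    return tuple(op(ai, bi) % m for ai, bi, m in zip(a, b, moduli))


if __name__ == "__main__":
    M = prod(MODULI)  # dynamic range M_k
    x, y = 123, 27
    for op in (operator.add, operator.sub, operator.mul):
        channel_result = rns_op(to_rns(x), to_rns(y), op)
        # The channel-wise result is the residue code of the result mod M_k.
        assert channel_result == to_rns(op(x, y) % M)
        print(op.__name__, channel_result)
```

No channel ever needs information from another one, which is the carry-free property referred to above.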
Therefore, compared with the conventional WNS, the RNS simplifies and speeds up the addition and multiplication operations. This fundamental advantage of the residue arithmetic strongly appears in the case of implementing computational procedures, which mainly contain long segments consisting of only sequences of modular arithmetic operations. In this case, the primary moduli set is chosen so that the final results of the computational procedure always belong to the used dynamic range for any allowed values of input operands. At the same time, the intermediate results can even exceed the boundaries of the dynamic range.
Along with the carry-free modular operations, there are also the so-called non-modular operations such as residue-to-binary conversion, base extension, magnitude comparison, sign determination, overflow detection, general division, scaling, etc. These operations are complicated and quite time consuming, and their significant computational complexity limits the applications of the residue arithmetic and restricts its widespread usage for high-speed computing.
To perform the non-modular operations, it is required to consider all residues in the $k$-tuple $(\chi_1, \chi_2, \ldots, \chi_k)$. Furthermore, it is necessary to determine the integer value of the number by its residue code, which in general is hampered by the non-positional nature of the RNS. The crucial problem of efficient implementation of non-modular operations is constantly receiving considerable attention from researchers [2,5,8].
The applicability of residue arithmetic is mainly determined by the computational complexity and feasibility of non-modular operations, which are used as a basis for implementing more complex computational algorithms in RNS. At the same time, the fundamental problem of residue arithmetic, which remains unresolved, is to reduce the computational complexity of non-modular operations. Due to a lack of efficient methods and algorithms for implementing non-modular operations, residue arithmetic is mainly suitable when modular additions and multiplications make up the bulk of the required computations. In this case, the number of non-modular operations used is relatively small. This circumstance restricts the use of the RNS to a narrow class of specific tasks.

3. Reverse Conversion of the Residue Code to Conventional Representation

The root problem of residue arithmetic is that the weighted value of the integer $X$ depends on all the residues $\chi_1, \chi_2, \ldots, \chi_k$. The reconstruction of an integer by its residue code, i.e., the reverse conversion, is one of the most difficult non-modular operations in residue arithmetic. Moreover, this operation underlies all the other non-modular procedures.
Despite the extensive current studies on residue arithmetic and its applications, there is a need to develop novel efficient approaches and methods for reconstructing an integer from its residue code. This should enable the extensive use of residue arithmetic for high-speed computing in many priority fields, above all in various digital signal processing and cryptographic applications.
There are two canonical techniques of reverse conversion: the canonical method based on the Chinese Remainder Theorem (CRT) and the residue code conversion to a weighted representation in the Mixed-Radix System (MRS) [1,2,5,8,14,15,16,17,18]. In general, all other conversion methods represent different variants of these two methods.
Below, we describe the mathematical background of these methods.

3.1. CRT-Base Conversion Method

When the moduli $m_1, m_2, \ldots, m_k$ are pairwise relatively prime, the integer number $X$ and its residue code $(\chi_1, \chi_2, \ldots, \chi_k)$ are related by the equation:
$$X = \left| \sum_{i=1}^{k} M_{i,k}\, \chi_{i,k} \right|_{M_k}, \tag{2}$$
where $M_{i,k} = M_k / m_i$, $\chi_{i,k} = \left| M_{i,k}^{-1} \chi_i \right|_{m_i}$ is a normalized residue modulo $m_i$ $(i = 1, 2, \ldots, k)$, and $\left| Y^{-1} \right|_{m}$ denotes the multiplicative inverse of an integer $Y$ modulo $m$.
In essence, Equation (2) represents the CRT [10,19,20].
In recent decades, considerable efforts have been directed to reducing the complexity of the CRT implementation and to making it applicable in high-speed computing [2,5,8,21,22,23]. The main idea of these methods is to replace the inner multiplications and additions modulo $M_k$ with simpler operations (see (2)).
Consider the CRT-number
$$X_k = \sum_{i=1}^{k} M_{i,k}\, \chi_{i,k}. \tag{3}$$
As follows from (2), the difference $X_k - X$ is a multiple of $M_k$. Therefore, the following exact integer equality holds:
$$X = X_k - \rho_k(X)\, M_k. \tag{4}$$
The unique integer number $\rho_k(X)$ is a normalized rank (or, briefly, rank) of the number $X$ [3,4,7].
Equation (4) is called a rank form of the integer $X$. In essence, the rank $\rho_k(X)$ is a reconstruction coefficient that indicates how many times the dynamic range $M_k$ is exceeded when converting the residue code $(\chi_1, \chi_2, \ldots, \chi_k)$ to the integer $X$.
In contrast to (2), Equation (4) does not contain a very time-consuming reduction modulo $M_k$. Therefore, when an efficient method for computing the rank $\rho_k(X)$ is available, the reverse conversion algorithm constructed on the basis of (4) has a substantial lead over the canonical CRT implementation (2).
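The relation between the CRT-number (3), the rank (4), and the canonical CRT (2) can be checked numerically. The following is a minimal sketch, not an efficient rank algorithm: here the rank is simply obtained by an integer division, which is exactly the costly reduction that practical methods try to avoid. The function names are hypothetical, and `pow(x, -1, m)` for the modular inverse assumes Python 3.8 or later.

```python
from math import prod


def crt_number(residues, moduli):
    """Return (X_k, M_k): the unreduced CRT sum of Eq. (3) and the range M_k."""
    M = prod(moduli)
    total = 0
    for chi, m in zip(residues, moduli):
        M_i = M // m                                 # M_{i,k} = M_k / m_i
        chi_norm = (pow(M_i, -1, m) * chi) % m       # normalized residue chi_{i,k}
        total += M_i * chi_norm
    return total, M


def crt_reverse(residues, moduli):
    """Return (X, rank) so that X = X_k - rank * M_k, as in Eq. (4)."""
    X_k, M = crt_number(residues, moduli)
    rank, X = divmod(X_k, M)
    return X, rank


if __name__ == "__main__":
    # Residue code and moduli of Example 1.
    print(crt_number((3, 6, 4, 2), (5, 7, 9, 11)))   # (6943, 3465)
    print(crt_reverse((3, 6, 4, 2), (5, 7, 9, 11)))  # (13, 2)
```

For this residue code the CRT-number is 6943 = 2 · 3465 + 13, so the dynamic range is exceeded twice and the rank equals 2.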

3.2. MRS-Base Conversion Method

In the MRS defined by a set $\{m_1, m_2, \ldots, m_k\}$ of pairwise relatively prime moduli, the integer $X \in \mathbb{Z}_{M_k}$ is represented by the $k$-tuple $(x_k, x_{k-1}, \ldots, x_1)$ of mixed-radix digits, resulting in
$$X = x_1 + x_2 M_1 + x_3 M_2 + \cdots + x_k M_{k-1} = \sum_{i=1}^{k} x_i M_{i-1}, \tag{5}$$
where $x_i \in \mathbb{Z}_{m_i}$ $(i = 1, 2, \ldots, k)$ [1,2,8].
It is well known that the MRS surpasses the RNS when performing non-modular operations such as magnitude comparison, sign determination, and overflow detection. Therefore, the mixed-radix representation has found the widest application in the implementation of non-modular procedures, along with the other generally accepted integral characteristics of the residue code such as the rank of a number, core function, interval index, parity function, diagonal, and quotient functions [3,4,7,24,25,26,27,28,29,30,31,32,33].
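The RNS-to-MRS reverse conversion establishes an association between the residue code $(\chi_1, \chi_2, \ldots, \chi_k)$ of the number $X$ and its mixed-radix representation $(x_k, x_{k-1}, \ldots, x_1)$. The mixed-radix digits $x_i$ $(i = 1, 2, \ldots, k)$ in (5) are computed according to the following calculation relations [1]:
$$\begin{aligned}
x_1 &= \chi_1, \\
x_2 &= \left| (\chi_2 - x_1) \left| m_1^{-1} \right|_{m_2} \right|_{m_2}, \\
x_3 &= \left| \left( (\chi_3 - x_1) \left| m_1^{-1} \right|_{m_3} - x_2 \right) \left| m_2^{-1} \right|_{m_3} \right|_{m_3}, \\
&\;\;\vdots \\
x_k &= \left| \left( \cdots \left( (\chi_k - x_1) \left| m_1^{-1} \right|_{m_k} - x_2 \right) \left| m_2^{-1} \right|_{m_k} - \cdots - x_{k-1} \right) \left| m_{k-1}^{-1} \right|_{m_k} \right|_{m_k}.
\end{aligned} \tag{6}$$
This sequential calculation procedure, called a chained algorithm, can be written in the general form
$$x_i = \left| X^{(i)} \right|_{m_i}, \tag{7}$$
where
$$X^{(i)} = \begin{cases} X, & \text{if } i = 1, \\ \left( X^{(i-1)} - x_{i-1} \right) m_{i-1}^{-1}, & \text{if } i = 2, 3, \ldots, k. \end{cases}$$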
From (6) and (7), it follows that the considered computational process requires two modular operations: subtraction and multiplication by the multiplicative inverse. Thus, the most crucial advantage of this algorithm is its high modularity. However, its strictly sequential nature prevents general use for the construction of appropriate high-performance parallel computing procedures.
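The chained algorithm of (6) and (7) can be sketched as follows. This is only an illustrative software model under the assumption that the residues of $X^{(i)}$ are updated in all remaining channels after each digit is found; the function names are hypothetical, and `pow(x, -1, m)` assumes Python 3.8 or later.

```python
def mixed_radix_chained(residues, moduli):
    """Sequential RNS-to-MRS conversion; returns digits [x_1, ..., x_k]."""
    work = list(residues)          # residues of X^(i) in every channel
    digits = []
    for i, m_i in enumerate(moduli):
        x_i = work[i]              # x_i = |X^(i)|_{m_i}, Eq. (7)
        digits.append(x_i)
        # Subtract the digit and multiply by the inverse of m_i in each
        # remaining channel j > i; these are the two modular operations
        # required by the algorithm.
        for j in range(i + 1, len(moduli)):
            m_j = moduli[j]
            work[j] = ((work[j] - x_i) * pow(m_i, -1, m_j)) % m_j
    return digits


def from_mixed_radix(digits, moduli):
    """Evaluate Eq. (5): X = sum of x_i * M_{i-1}."""
    value, weight = 0, 1
    for x, m in zip(digits, moduli):
        value += x * weight
        weight *= m
    return value


if __name__ == "__main__":
    digits = mixed_radix_chained((3, 6, 4, 2), (5, 7, 9, 11))
    print(digits, from_mixed_radix(digits, (5, 7, 9, 11)))  # [3, 2, 0, 0] 13
```

Digit $x_i$ cannot be produced before $x_{i-1}$ is known, which is the sequential dependency noted above.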

4. A Novel CRT-Base RNS-to-MRS Reverse Conversion Method

We now describe the proposed method for calculating the mixed-radix digits $x_1, x_2, \ldots, x_k$ of the number $X$ by its residue code $(\chi_1, \chi_2, \ldots, \chi_k)$.
Consider the CRT-number $X_k$. According to (3), we have
$$X_k = \sum_{i=1}^{k-1} M_{i,k-1}\, m_k\, \chi_{i,k} + M_{k-1}\, \chi_{k,k}. \tag{8}$$
By Euclid's Division Lemma, the integer $m_k \chi_{i,k}$ can be written as
$$m_k \chi_{i,k} = \chi_{i,k-1} + \left\lfloor \frac{m_k \chi_{i,k}}{m_i} \right\rfloor m_i, \tag{9}$$
where
$$\chi_{i,k-1} = \left| m_k \chi_{i,k} \right|_{m_i} = \left| m_k \left| M_{i,k}^{-1} \chi_i \right|_{m_i} \right|_{m_i} = \left| m_k M_{i,k}^{-1} \chi_i \right|_{m_i} = \left| M_{i,k-1}^{-1} \chi_i \right|_{m_i},$$
and $\lfloor x \rfloor$ denotes the largest integer less than or equal to $x$.
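Substituting (9) into (8), we obtain
$$X_k = X_{k-1} + M_{k-1}\, S_k(X), \tag{10}$$
where
$$X_{k-1} = \sum_{i=1}^{k-1} M_{i,k-1}\, \chi_{i,k-1}, \tag{11}$$
$$S_k(X) = \sum_{i=1}^{k} R_{i,k}(\chi_i), \tag{12}$$
$$R_{i,k}(\chi_i) = \left\lfloor \frac{m_k \chi_{i,k}}{m_i} \right\rfloor \quad (i = 1, 2, \ldots, k). \tag{13}$$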
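Taking into account (9), we have
$$R_{i,k}(\chi_i) = \frac{m_k \chi_{i,k} - \chi_{i,k-1}}{m_i}.$$
Since $R_{i,k}(\chi_i) \in \mathbb{Z}_{m_k}$, we can reduce the right side of this equality modulo $m_k$.
Hence, the residue $R_{i,k}(\chi_i)$ can be calculated as
$$R_{i,k}(\chi_i) = \left| -\frac{\chi_{i,k-1}}{m_i} \right|_{m_k} = \left| -\frac{\left| M_{i,k-1}^{-1} \chi_i \right|_{m_i}}{m_i} \right|_{m_k} \quad (i = 1, 2, \ldots, k-1), \tag{14}$$
where the division by $m_i$ modulo $m_k$ means multiplication by $\left| m_i^{-1} \right|_{m_k}$.
At the same time, from (13) it follows that
$$R_{k,k}(\chi_k) = \chi_{k,k} = \left| M_{k,k}^{-1} \chi_k \right|_{m_k} = \left| M_{k-1}^{-1} \chi_k \right|_{m_k}. \tag{15}$$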
Similarly, taking into account Equations (10)–(13), the numbers $X_i$ $(i = k-1, k-2, \ldots, 1)$ can be written in turn as
$$\begin{aligned}
X_{k-1} &= X_{k-2} + M_{k-2}\, S_{k-1}(X), \\
X_{k-2} &= X_{k-3} + M_{k-3}\, S_{k-2}(X), \\
&\;\;\vdots \\
X_2 &= X_1 + M_1\, S_2(X), \\
X_1 &= M_0\, S_1(X),
\end{aligned}$$
where $M_0 = 1$, $S_1(X) = \chi_1$, and the integers $S_l(X)$ $(l = 2, 3, \ldots, k)$ are calculated according to (12)–(15) with the index $k$ replaced by $l$.
Finally, substituting the above equations for $X_l$ $(l = k-1, k-2, \ldots, 1)$ in turn into (10), we obtain
$$X_k = \sum_{l=1}^{k} M_{l-1}\, S_l(X). \tag{16}$$
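As a sanity check on Equation (16), the sums $S_l(X)$ can be evaluated directly from the floor definition in (12) and (13) and compared with the CRT-number (3). The sketch below is an illustration only; the function names are hypothetical, and `pow(x, -1, m)` assumes Python 3.8 or later.

```python
from math import prod


def s_values(residues, moduli):
    """Return [S_1(X), ..., S_k(X)] using the floor form of Eqs. (12)-(13)."""
    sums = []
    for l in range(1, len(moduli) + 1):
        M_l = prod(moduli[:l])                      # M_l = m_1 * ... * m_l
        m_l = moduli[l - 1]
        total = 0
        for chi, m_i in zip(residues[:l], moduli[:l]):
            M_il = M_l // m_i                       # M_{i,l}
            chi_il = (pow(M_il, -1, m_i) * chi) % m_i   # chi_{i,l}
            total += (m_l * chi_il) // m_i          # R_{i,l}(chi_i), Eq. (13)
        sums.append(total)
    return sums


def crt_number_from_s(residues, moduli):
    """Evaluate the right-hand side of Eq. (16)."""
    sums = s_values(residues, moduli)
    return sum(prod(moduli[:l]) * s for l, s in enumerate(sums))


if __name__ == "__main__":
    print(s_values((3, 6, 4, 2), (5, 7, 9, 11)))           # [3, 9, 8, 21]
    print(crt_number_from_s((3, 6, 4, 2), (5, 7, 9, 11)))  # 6943
```

For the residue code (3, 6, 4, 2) with moduli (5, 7, 9, 11), the weighted sum 1·3 + 5·9 + 35·8 + 315·21 equals 6943, which is congruent to 13 modulo 3465.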
At the same time, according to Euclid's Division Lemma, we have
$$S_l(X) = R_l(X) + m_l\, Q_l(X), \tag{17}$$
where $R_l(X) = \left| S_l(X) \right|_{m_l}$ and $Q_l(X) = \left\lfloor S_l(X) / m_l \right\rfloor$ are the remainder and quotient of the division of $S_l(X)$ by the modulus $m_l$, respectively.
Therefore, taking into account (12) with the index $k$ replaced by $l$, the integers $R_l(X)$ and $Q_l(X)$ can be computed as
$$R_l(X) = \left| \sum_{i=1}^{l} R_{i,l}(\chi_i) \right|_{m_l}, \tag{18}$$
$$Q_l(X) = \left\lfloor \frac{1}{m_l} \sum_{i=1}^{l} R_{i,l}(\chi_i) \right\rfloor. \tag{19}$$
From (19), it follows that $Q_l(X)$ equals the number of overflows that occur when calculating the sum $R_l(X)$ of the residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$ $(l = 2, 3, \ldots, k)$.
Note that $R_1(X) = \chi_1$ and $Q_1(X) = 0$ since $S_1(X) = \chi_1$.
Substituting (17) into (16), we obtain
$$X_k = X_k^R + X_{k-1}^Q + M_k\, Q_k(X), \tag{20}$$
where
$$X_k^R = \sum_{l=1}^{k} M_{l-1}\, R_l(X), \tag{21}$$
$$X_{k-1}^Q = \sum_{l=1}^{k-1} M_l\, Q_l(X). \tag{22}$$
Let us draw attention to Equations (21) and (22). It is evident that the number $X_k^R$ is represented by the $k$-tuple $(x_k^R, x_{k-1}^R, \ldots, x_1^R)$ of mixed-radix digits, where $x_l^R = R_l(X)$, $l = 1, 2, \ldots, k$ (see Equation (5)). At the same time, $x_l^R \in \mathbb{Z}_{m_l}$ and $X_k^R \leq M_k - 1$.
Bearing in mind that $Q_1(X) = 0$, the number $X_{k-1}^Q$ can be written as
$$X_{k-1}^Q = \sum_{l=1}^{k} M_{l-1}\, \hat{Q}_l(X), \tag{23}$$
where $\hat{Q}_1(X) = 0$, $\hat{Q}_2(X) = Q_1(X) = 0$, and $\hat{Q}_l(X) = Q_{l-1}(X)$ for $l \geq 3$. Therefore, taking into account (19), the integer $\hat{Q}_l(X)$ can be calculated as
$$\hat{Q}_l(X) = \left\lfloor \frac{1}{m_{l-1}} \sum_{i=1}^{l-1} R_{i,l-1}(\chi_i) \right\rfloor \quad (l = 3, 4, \ldots, k). \tag{24}$$
Hence, $\hat{Q}_l(X) < l - 1$ since $R_{i,l-1}(\chi_i) \leq m_{l-1} - 1$.
Thus, the integer $X_{k-1}^Q$ (see Equations (23) and (5)) can be represented by a $k$-tuple $(x_k^Q, x_{k-1}^Q, \ldots, x_1^Q)$ of mixed-radix digits under the condition that $x_l^Q \in \mathbb{Z}_{m_l}$ $(l = 1, 2, \ldots, k)$, where $x_1^Q = x_2^Q = 0$ and $x_l^Q = \hat{Q}_l(X)$ for $l > 2$. Consequently, this entails the fulfillment of the condition $\mathbb{Z}_{l-1} \subseteq \mathbb{Z}_{m_l}$, which leads to the inequality
$$m_l \geq l - 1 \quad (l = 1, 2, \ldots, k). \tag{25}$$
Thus, when the moduli set $\{m_1, m_2, \ldots, m_k\}$ meets the conditions (25), we have that $X_{k-1}^Q < M_k$.
Note that the integer $X_{k-1}^Q$ is a multiple of the number $M_2 = m_1 m_2$ because $x_1^Q = x_2^Q = 0$ (see Equation (5)).
Now, let us return to Equation (20). According to Euclid's Division Lemma, the sum of two mixed-radix numbers $X_k^R$ and $X_{k-1}^Q$ results in
$$X_k^R + X_{k-1}^Q = \left| X_k^R + X_{k-1}^Q \right|_{M_k} + M_k \left\lfloor \frac{X_k^R + X_{k-1}^Q}{M_k} \right\rfloor. \tag{26}$$
Hence, substituting (26) into (20), we obtain
$$X_k = \left| X_k^R + X_{k-1}^Q \right|_{M_k} + M_k \left( Q_k(X) + \left\lfloor \frac{X_k^R + X_{k-1}^Q}{M_k} \right\rfloor \right). \tag{27}$$
Taking into account the rank form of the number $X$ (4), from (27) we have
$$X = \left| X_k^R + X_{k-1}^Q \right|_{M_k}. \tag{28}$$
From (28), it follows that the mixed-radix representation of the number $X$, i.e., the $k$-tuple $(x_k, x_{k-1}, \ldots, x_1)$, can be calculated as a result of the addition of two mixed-radix numbers $X_k^R = (x_k^R, x_{k-1}^R, \ldots, x_1^R)$ and $X_{k-1}^Q = (x_k^Q, x_{k-1}^Q, \ldots, x_1^Q)$ (see (21) and (23)) in the basis $m_1, m_2, \ldots, m_k$. Note that $x_1^R = \chi_1$ and $x_1^Q = x_2^Q = 0$. At the same time, the digits $x_2^R, x_3^R, \ldots, x_k^R$ and $x_3^Q, x_4^Q, \ldots, x_k^Q$ are calculated as the sum of the residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$ along with the counting of the overflows that occur, according to (18) and (24) $(l = 2, 3, \ldots, k)$.
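Therefore, the mixed-radix digits $x_l^R$ and $x_l^Q$ are computed as
$$x_1^R = \chi_1, \quad x_l^R = \left| \sum_{i=1}^{l} R_{i,l}(\chi_i) \right|_{m_l} \quad (l = 2, 3, \ldots, k), \tag{29}$$
$$x_1^Q = x_2^Q = 0, \quad x_l^Q = \left\lfloor \frac{1}{m_{l-1}} \sum_{i=1}^{l-1} R_{i,l-1}(\chi_i) \right\rfloor \quad (l = 3, 4, \ldots, k), \tag{30}$$
where
$$R_{i,l}(\chi_i) = \left| -\frac{\left| M_{i,l-1}^{-1} \chi_i \right|_{m_i}}{m_i} \right|_{m_l} \quad (i \neq l), \tag{31}$$
$$R_{l,l}(\chi_l) = \left| M_{l-1}^{-1} \chi_l \right|_{m_l} \quad (l = 2, 3, \ldots, k). \tag{32}$$
Furthermore, in the MRS with the bases $m_1, m_2, \ldots, m_k$, we calculate the sum of the two numbers $X_k^R$ and $X_{k-1}^Q$. As a result, we obtain the mixed-radix representation $(x_k, x_{k-1}, \ldots, x_1)$ of the number $X$.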
Table 1 given below presents the pre-calculation components (see Equations (31) and (32)). It should be recalled that $R_{1,1}(\chi_1) = \chi_1$. The abbreviation LUT means lookup table. The bit-length of residues is $b_l = \lceil \log_2 m_l \rceil$ $(l = 1, 2, \ldots, k)$. Here, and further, $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$.
Table 2 presents the results of calculations in the modular channels according to Equations (29) and (30). It should be recalled that in the first modular channel corresponding to the modulus $m_1$, the calculations are not carried out, so $x_1^R = \chi_1$ and $x_2^Q = 0$.
The above allows us to formulate the following theorem.
Theorem 1.
(About RNS-to-MRS reverse conversion).
Let an arbitrary RNS be defined by an ascending-ordered set of $k$ pairwise relatively prime moduli $\{m_1, m_2, \ldots, m_k\}$ ($m_l \geq l - 1$, $l = 1, 2, \ldots, k$, $k \geq 2$), and let the residue code $(\chi_1, \chi_2, \ldots, \chi_k)$ of the number $X \in \mathbb{Z}_{M_k}$ be given. Then, the mixed-radix representation $(x_k, x_{k-1}, \ldots, x_1)$ of the number $X$ can be computed as a result of the summation of two mixed-radix numbers, namely, the appropriate number $X_k^R = (x_k^R, x_{k-1}^R, \ldots, x_1^R)$ and the correction number $X_{k-1}^Q = (x_k^Q, x_{k-1}^Q, \ldots, x_1^Q)$, where the digits $x_l^R$ and $x_l^Q$ $(l = 1, 2, \ldots, k)$ are calculated according to (29) and (30), respectively, taking into account (31) and (32).
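The statement of Theorem 1 can be modelled in software as follows. This is a minimal sequential sketch, assuming that the residues $R_{i,l}(\chi_i)$ of (31) and (32) are computed on the fly rather than read from lookup tables and that the channels are processed one after another instead of in parallel. The function name `rns_to_mrs` is hypothetical, and `pow(x, -1, m)` assumes Python 3.8 or later.

```python
from math import prod


def rns_to_mrs(residues, moduli):
    """Return mixed-radix digits [x_1, ..., x_k] via Eqs. (29)-(32)."""
    k = len(moduli)
    x_r = [residues[0]]            # x_1^R = chi_1
    x_q = [0, 0]                   # x_1^Q = x_2^Q = 0
    for l in range(2, k + 1):      # modular channel l
        m_l = moduli[l - 1]
        M_prev = prod(moduli[:l - 1])              # M_{l-1}
        terms = []
        for i in range(1, l):                      # Eq. (31), i < l
            m_i = moduli[i - 1]
            M_i_prev = M_prev // m_i               # M_{i,l-1}
            chi_prev = (pow(M_i_prev, -1, m_i) * residues[i - 1]) % m_i
            terms.append((-chi_prev * pow(m_i, -1, m_l)) % m_l)
        terms.append((pow(M_prev, -1, m_l) * residues[l - 1]) % m_l)  # Eq. (32)
        overflows, digit = divmod(sum(terms), m_l)  # Q_l(X), R_l(X)
        x_r.append(digit)
        if l < k:
            x_q.append(overflows)  # Q_l(X) becomes digit x_{l+1}^Q; Q_k is dropped
    # Add the two mixed-radix numbers digit by digit, Eq. (28);
    # the carry out of the last digit is discarded (reduction mod M_k).
    digits, carry = [], 0
    for r, q, m in zip(x_r, x_q, moduli):
        carry, d = divmod(r + q + carry, m)
        digits.append(d)
    return digits


if __name__ == "__main__":
    print(rns_to_mrs((3, 6, 4, 2), (5, 7, 9, 11)))  # [3, 2, 0, 0]
```

In a hardware realization each value appended to `terms` would come from a one-input lookup table indexed by a single residue, so only modular additions remain at run time.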

5. A Numerical Example of the Proposed Conversion Method

The main idea of the proposed approach to reverse conversion is illustrated below by a simple numerical example. For convenience, we consider a four-moduli RNS.
Example 1.
Let the RNS moduli set be $\{m_1, m_2, m_3, m_4\} = \{5, 7, 9, 11\}$. Suppose that we wish to calculate the digits of the mixed-radix representation $(x_4, x_3, x_2, x_1)$ of the given number $X$ by its residue code $(\chi_1, \chi_2, \chi_3, \chi_4) = (3, 6, 4, 2)$.
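Step 1. The calculation of the primitive constants in a given RNS.
$M_4 = 3465$, $M_3 = 315$, $M_2 = 35$, $M_1 = 5$, $M_0 = 1$,
$M_{1,4} = 693$, $M_{2,4} = 495$, $M_{3,4} = 385$, $M_{4,4} = 315$,
$\left| M_{1,4}^{-1} \right|_{m_1} = 2$, $\left| M_{2,4}^{-1} \right|_{m_2} = 3$, $\left| M_{3,4}^{-1} \right|_{m_3} = 4$, $\left| M_{4,4}^{-1} \right|_{m_4} = 8$,
$\left| m_1^{-1} \right|_{m_4} = 9$, $\left| m_2^{-1} \right|_{m_4} = 8$, $\left| m_3^{-1} \right|_{m_4} = 5$, $\left| M_3^{-1} \right|_{m_4} = 8$,
$M_{1,3} = 63$, $M_{2,3} = 45$, $M_{3,3} = 35$,
$\left| M_{1,3}^{-1} \right|_{m_1} = 2$, $\left| M_{2,3}^{-1} \right|_{m_2} = 5$, $\left| M_{3,3}^{-1} \right|_{m_3} = 8$,
$\left| m_1^{-1} \right|_{m_3} = 2$, $\left| m_2^{-1} \right|_{m_3} = 4$, $\left| M_2^{-1} \right|_{m_3} = 8$,
$M_{1,2} = 7$, $M_{2,2} = 5$,
$\left| M_{1,2}^{-1} \right|_{m_1} = 3$, $\left| M_{2,2}^{-1} \right|_{m_2} = 3$,
$\left| m_1^{-1} \right|_{m_2} = 3$, $\left| M_1^{-1} \right|_{m_2} = 3$.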
Step 2. The calculation of the residue sets $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ according to (31) and (32) $(l = 1, 2, 3, 4)$.
We obtain
$R_{1,1}(\chi_1) = \chi_1 = 3$,
$R_{1,2}(\chi_1) = \left| -|1 \cdot 3|_5 \cdot 3 \right|_7 = 5$,
$R_{2,2}(\chi_2) = |3 \cdot 6|_7 = 4$,
$R_{1,3}(\chi_1) = \left| -|3 \cdot 3|_5 \cdot 2 \right|_9 = 1$,
$R_{2,3}(\chi_2) = \left| -|3 \cdot 6|_7 \cdot 4 \right|_9 = 2$,
$R_{3,3}(\chi_3) = |8 \cdot 4|_9 = 5$,
$R_{1,4}(\chi_1) = \left| -|2 \cdot 3|_5 \cdot 9 \right|_{11} = 2$,
$R_{2,4}(\chi_2) = \left| -|5 \cdot 6|_7 \cdot 8 \right|_{11} = 6$,
$R_{3,4}(\chi_3) = \left| -|8 \cdot 4|_9 \cdot 5 \right|_{11} = 8$,
$R_{4,4}(\chi_4) = |8 \cdot 2|_{11} = 5$.
As a result, the following sets of residues occur:
$\{R_{1,1}(\chi_1)\} = \{3\}$,
$\{R_{1,2}(\chi_1), R_{2,2}(\chi_2)\} = \{5, 4\}$,
$\{R_{1,3}(\chi_1), R_{2,3}(\chi_2), R_{3,3}(\chi_3)\} = \{1, 2, 5\}$,
$\{R_{1,4}(\chi_1), R_{2,4}(\chi_2), R_{3,4}(\chi_3), R_{4,4}(\chi_4)\} = \{2, 6, 8, 5\}$.
Step 3. The summation of the residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$ along with the counting of occurring overflows according to (18) and (19), respectively $(l = 2, 3, 4)$.
Recall that $R_1(X) = R_{1,1}(\chi_1) = 3$ and $Q_1(X) = 0$. We have
$R_2(X) = |5 + 4|_7 = |9|_7 = 2$,
$R_3(X) = |1 + 2 + 5|_9 = |8|_9 = 8$,
$R_4(X) = |2 + 6 + 8 + 5|_{11} = |21|_{11} = 10$,
$Q_2(X) = \lfloor (5 + 4)/7 \rfloor = \lfloor 9/7 \rfloor = 1$,
$Q_3(X) = \lfloor (1 + 2 + 5)/9 \rfloor = \lfloor 8/9 \rfloor = 0$,
$Q_4(X) = \lfloor (2 + 6 + 8 + 5)/11 \rfloor = \lfloor 21/11 \rfloor = 1$.
Therefore, the mixed-radix representations of the numbers $X_4^R$ and $X_3^Q$ (see (21) and (23)) are computed:
$(x_4^R, x_3^R, x_2^R, x_1^R) = (R_4(X), R_3(X), R_2(X), R_1(X)) = (10, 8, 2, 3)$,
$(x_4^Q, x_3^Q, x_2^Q, x_1^Q) = (Q_3(X), Q_2(X), 0, 0) = (0, 1, 0, 0)$.
Step 4. The calculation of the mixed-radix digits $(x_4, x_3, x_2, x_1)$.
The addition of the two numbers $X_4^R = (10, 8, 2, 3)$ and $X_3^Q = (0, 1, 0, 0)$ according to (28) gives the mixed-radix representation $(0, 0, 2, 3)$ of the number $X$.
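The digit-wise addition in Step 4 can be written out with its inter-digit carries. The carry symbols $c_l$ are introduced here only for illustration and are not part of the original notation:

```latex
\begin{aligned}
x_1 &= |3 + 0|_{5} = 3, & c_1 &= 0, \\
x_2 &= |2 + 0 + c_1|_{7} = 2, & c_2 &= 0, \\
x_3 &= |8 + 1 + c_2|_{9} = 0, & c_3 &= 1, \\
x_4 &= |10 + 0 + c_3|_{11} = 0, & c_4 &= 1 \ \text{(discarded by the reduction modulo } M_4\text{)}.
\end{aligned}
```

In weighted form, $X_4^R = 3 + 2 \cdot 5 + 8 \cdot 35 + 10 \cdot 315 = 3443$ and $X_3^Q = 1 \cdot 35 = 35$, so $|3443 + 35|_{3465} = |3478|_{3465} = 13$, in agreement with (28).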
Let us now verify the obtained result. According to (5), we have
$X = (0, 0, 2, 3) = 0 \cdot 315 + 0 \cdot 35 + 2 \cdot 5 + 3 = 13$.
This result holds because the residue code of the integer number $X = 13$ is $(3, 6, 4, 2)$, since $|13|_5 = 3$, $|13|_7 = 6$, $|13|_9 = 4$, $|13|_{11} = 2$. Thus, this result coincides with the condition of the example.

6. The Computational Cost of the Reverse Conversion Method

As follows from the results above, the calculation of the mixed-radix digits $x_1, x_2, \ldots, x_k$ reduces to the independent and parallel summation of the small residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$ in the $l$th modular channel $(l = 1, 2, \ldots, k)$, taking into account the number of overflows occurring during the modular addition operations (see (29)–(32)).
Let us evaluate the time required to perform the parallel reverse conversion.
First, we consider the calculation of the mixed-radix digits of the numbers $X_k^R = (x_k^R, x_{k-1}^R, \ldots, x_1^R)$ and $X_{k-1}^Q = (x_k^Q, x_{k-1}^Q, \ldots, x_1^Q)$ (see (29) and (30)). As can be seen, there are no modular addition operations in the first modular channel corresponding to the modulus $m_1$. In the second channel, we have only one addition operation modulo $m_2$. Furthermore, two additions modulo $m_3$ are performed in the third channel, and so on. Thus, in the $l$th modular channel, we have $l - 1$ additions modulo $m_l$ $(l = 2, 3, \ldots, k)$. These calculations are easily parallelized and pipelined. Therefore, the required computation time for calculating the digits $x_l^R$ and $x_l^Q$ is $T_l = \lceil \log_2 l \rceil$ modular clock cycles.
Thus, the time for obtaining the mixed-radix representations of the numbers $X_k^R$ and $X_{k-1}^Q$ is determined by the time in the $k$th modular channel and equals $T_k = \lceil \log_2 k \rceil$ modular clock cycles.
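One way to see where the logarithmic depth comes from is to sum the residues of a channel pairwise in a tree. The sketch below is an assumption about how such a summation could be organized, not a description of the authors' hardware; the function name `tree_sum_mod` is hypothetical. Each node adds two values below $m$, so at most one overflow can occur per addition, and the overflow counts are accumulated alongside the partial sums.

```python
def tree_sum_mod(values, m):
    """Pairwise modular summation.

    Returns (sum mod m, number of overflows, tree depth in addition levels).
    """
    level = [(v % m, 0) for v in values]   # (partial residue, overflow count)
    depth = 0
    while len(level) > 1:
        nxt = []
        for j in range(0, len(level) - 1, 2):
            (a, qa), (b, qb) = level[j], level[j + 1]
            s = a + b                      # a, b < m, hence s < 2m
            if s >= m:
                nxt.append((s - m, qa + qb + 1))   # one overflow occurred
            else:
                nxt.append((s, qa + qb))
        if len(level) % 2:                 # odd element passes to the next level
            nxt.append(level[-1])
        level = nxt
        depth += 1
    residue, overflows = level[0]
    return residue, overflows, depth


if __name__ == "__main__":
    # Fourth channel of Example 1: residues {2, 6, 8, 5} modulo 11.
    print(tree_sum_mod([2, 6, 8, 5], 11))  # (10, 1, 2)
```

For $l$ inputs the loop runs $\lceil \log_2 l \rceil$ times, which matches $T_l$, while the total number of additions is still $l - 1$.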
The summation of $X_k^R$ and $X_{k-1}^Q$ on the bases $m_1, m_2, \ldots, m_k$ involves two additional modular clock cycles, taking into account the inter-digit carries. Therefore, the execution time of the reverse conversion equals $T_{conv} = T_k + 2$ modular clock cycles. Thus, the overall time is $t_{conv} = T_{conv} \cdot t_{mod}$, where $t_{mod}$ denotes the modular clock cycle time. At the same time, when pipelined, the throughput rate of the proposed conversion method is one conversion in one modular clock cycle.
Consider now the evaluation of the required computational cost. Due to the small word-length of the residues in the $k$-tuple $(\chi_1, \chi_2, \ldots, \chi_k)$, the pre-computation and lookup table techniques are suitable for the implementation of the reverse conversion. So, we can use one-input lookup tables depending on the word-length of the residues in each modular channel.
At the beginning stage of the reverse conversion, in the $l$th channel corresponding to the modulus $m_l$, the number of lookup tables required to store the residue set $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ equals $N_{lut}^{(l)} = l$. At the same time, the word length of the recorded residues is $b_l = \lceil \log_2 m_l \rceil$ bits $(l = 2, 3, \ldots, k)$. In the first modular channel, $N_{lut}^{(1)} = 0$ since $S_1(X) = \chi_1$.
Then, the overall number of one-input lookup tables in all modular channels is equal to
$$N_{lut} = \sum_{l=2}^{k} N_{lut}^{(l)} = \frac{k^2 + k - 2}{2}.$$
The summation of the residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$ requires $N_{add}^{(l)} = l - 1$ modular addition operations $(l = 2, 3, \ldots, k)$. At the same time, all independent calculations are realized in parallel in the corresponding modular channels.
Taking into account that $x_1^Q = x_2^Q = 0$, the summation of the two numbers $X_k^R = (x_k^R, x_{k-1}^R, \ldots, x_1^R)$ and $X_{k-1}^Q = (x_k^Q, x_{k-1}^Q, \ldots, x_1^Q)$ at the final stage of the reverse conversion requires $2(k - 2)$ modular addition operations.
Hence, the overall number of modular addition operations in all modular channels is equal to
$$N_{add} = \sum_{l=2}^{k} N_{add}^{(l)} + 2(k - 2) = \frac{k^2 + 3k - 8}{2}.$$
When pipelined, the throughput rate of the proposed method is one reverse conversion in one modular clock cycle.
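The closed-form counts above can be cross-checked by summing the per-channel contributions directly. The helper names below are hypothetical and serve only this illustration.

```python
def lut_count(k):
    """Lookup tables: l tables in channel l, for l = 2..k."""
    return sum(l for l in range(2, k + 1))


def add_count(k):
    """Modular additions: l - 1 per channel l = 2..k, plus 2(k - 2) for the final sum."""
    return sum(l - 1 for l in range(2, k + 1)) + 2 * (k - 2)


def conversion_cycles(k):
    """T_conv = ceil(log2(k)) + 2 modular clock cycles (integer-exact for k >= 2)."""
    return (k - 1).bit_length() + 2


if __name__ == "__main__":
    for k in (2, 4, 8, 16, 32):
        assert lut_count(k) == (k * k + k - 2) // 2
        assert add_count(k) == (k * k + 3 * k - 8) // 2
        print(k, lut_count(k), add_count(k), conversion_cycles(k))
```

For the four-moduli RNS of Example 1 this gives 9 lookup tables, 10 modular additions, and a latency of 4 modular clock cycles.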

7. Discussion

As follows from [1], the calculation of the mixed-radix digits $x_1, x_2, \ldots, x_k$ (see (6) and (7)) requires $k - 1$ of both addition and multiplication operations; in this case, the overall conversion time is $k(k-1)/2 \cdot (t_{add} + t_{mul})$, where $t_{add}$ and $t_{mul}$ denote the execution time of addition/subtraction and multiplication, respectively. The computational cost of the pipelined implementation of this algorithm is $k(k-1)/2$ of both multiplication and addition operations, while the conversion time is $(k-1)(t_{add} + t_{mul})$. The main drawback of this method is its strictly sequential nature.
The parallel conversion method described in [16] uses additional lookup tables. At the same time, $k(k+1)/2$ lookup tables and $k(k+1)/2$ adders are required. The conversion time is $t_{lut} + (k-1)\, t_{add}$ due to the need to generate the inter-digit carries when performing addition operations. As noted in [34], the method proposed in [16] does not allow obtaining the claimed depth of $O(\log_2 k)$ in terms of RNS processing elements. In this regard, an improved method was proposed by adding an extra $k(k+1)/2$ multipliers to the hardware resources used in [16]. The implementation time is $t_{lut} + t_{mul} + (2\lceil \log_2 k \rceil + 1)\, t_{add}$. Hence, the time complexity of this conversion algorithm is $O(\log_2 k)$.
In [15], the mixed-radix conversion is realized by a cascaded scheme of lookup tables and adders. The computational cost for the sequential implementation is $k(k-2)/4$ double-size lookup tables and $k(k-2)/4$ adders, while the conversion time equals $k/2 \cdot (t_{lut} + t_{add})$. When pipelined, the throughput rate is determined by the time $t_{lut} + t_{add}$. This method works well when the moduli used do not have a very large word-length, since the size of the lookup tables increases significantly as the word-length grows.
The paper [17] presents a parallel reverse conversion method that uses the lookup table technique and requires no arithmetic or logical units. As reported, this algorithm is better than the ones presented in [15,16]. It is based on solving $k(k-1)/2$ linear Diophantine equations and requires $k(k-1)/2$ lookup tables of size $m_i \times m_j$, while the conversion time is $(k-1)\, t_{lut}$. When pipelined, its effective conversion rate is one conversion per $t_{lut}$. So, this method is attractive for DSP implementation. However, it is not suitable for implementing cryptographic applications because of the enormous size of the required lookup tables, especially when processing large numbers.
In [9], the reverse conversion method is based on modular reduction by a modified canonical CRT algorithm, which minimizes the bit-width of intermediate data. The lookup tables translate the $b_i$-bit input residues ($i = 1, 2, \ldots, k$) into $b_{out}$-bit output integers, where $b_i = \lceil \log_2 m_i \rceil$, $b_{out} = \lceil \frac{1}{2} \log_2 \sum_{i=1}^{k} b_i \rceil$, and $k$ is the number of RNS moduli. As a result, the modular reduction of the modified $k$-tuple of $b_{out}$-bit integers is carried out over a ring of size $2^{b_{out}}$, so that only the $b_{out}$ least significant bits of the binary representation are maintained. In this case, all the $b_{out}$-bit outputs in the modified $k$-tuple are added together by an adder tree without regard to overflow, propagating the $b_{out}$ least significant bits to the output. The reverse conversion requires $k$ lookup tables and $k-1$ adders. The scope of the lookup tables is $2^{b} \times 2b_{out}$, $b \in \{b_1, b_2, \ldots, b_k\}$. The overall conversion time is $t_{lut} + \lceil \log_2 k \rceil\, t_{add}$.
Some reverse conversion methods use special moduli sets with a limited number of moduli of the form $m = 2^n + d$ ($d \in \{-1, 0, 1\}$) [2,8,35,36,37,38,39,40]. Their main drawback is the small number of moduli, typically from three to five. These moduli sets are suitable for efficient implementations of DSP algorithms but are not applicable to the processing of large numbers widely used in cryptography. For example, to represent 1024-bit cryptographic numbers using four RNS moduli, each modular channel must have residues of 256-bit length, which is unsuitable for high-performance computing.
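The arithmetic behind this example can be checked with a back-of-the-envelope sketch. It assumes, as a simplification not stated in the text, that every modulus uses nearly the full width of its channel, so that the dynamic range in bits is roughly the number of moduli times the channel width. The helper names `channel_bits` and `moduli_needed` are hypothetical.

```python
import math


def channel_bits(range_bits, num_moduli):
    """Approximate residue width (bits) per channel for a given dynamic range.

    Simplifying assumption: all moduli have about the same bit length.
    """
    return math.ceil(range_bits / num_moduli)


def moduli_needed(range_bits, bits_per_channel):
    """Approximate number of moduli needed when each channel is limited
    to bits_per_channel bits (same simplifying assumption as above)."""
    return math.ceil(range_bits / bits_per_channel)
```

Under this assumption, `channel_bits(1024, 4)` gives the 256-bit channels quoted above, whereas keeping the channels at 16 bits would call for roughly `moduli_needed(1024, 16)`, that is, 64 moduli, far more than a three- to five-moduli set provides.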
Table 3 compares the reverse conversion techniques. Here, we use the following abbreviations: LUT for lookup table, ADD for adder, and MUL for multiplier. The bit length is $b \in \{b_1, b_2, \ldots, b_k\}$, where $b_l = \lceil \log_2 m_l \rceil$ ($l = 1, 2, \ldots, k$).
As seen from the above, the proposed parallel reverse conversion method has a time complexity of order $O(\log_2 k)$. In pipelined mode, it achieves a throughput of one reverse conversion per modular clock cycle. At the same time, the computational complexity is of order $O(k^2/2)$ in terms of both the number of required arithmetic operations and the number of one-input lookup tables.

8. Conclusions

In this paper, a novel approach to the parallel reverse conversion of the residue code $(\chi_1, \chi_2, \ldots, \chi_k)$ of a number $X$ to the mixed-radix representation $(x_k, x_{k-1}, \ldots, x_1)$ is described.
The calculation of the mixed-radix digits $x_k, x_{k-1}, \ldots, x_1$ is reduced to a parallel summation of the small word-length residues $R_{1,l}(\chi_1), R_{2,l}(\chi_2), \ldots, R_{l,l}(\chi_l)$ modulo $m_l$ in the $l$th modular channel ($l = 1, 2, \ldots, k$), taking into account the number of overflows occurring during the modular addition operations. These modular operations are performed quickly and independently in each modular channel and are easily pipelined.
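The summation with overflow counting in one modular channel can be sketched as a binary adder tree. The sketch below is only an illustration of that idea: it treats the table outputs $R_{i,l}(\chi_i)$ as given integers in the range $[0, m_l)$ and does not reproduce how the authors combine the results into mixed-radix digits. The names `mod_add` and `channel_sum` are hypothetical.

```python
def mod_add(a, b, m):
    """Add two residues modulo m; also report whether an overflow (wrap) occurred."""
    s = a + b
    if s >= m:
        return s - m, 1
    return s, 0


def channel_sum(residues, m):
    """Sum residues modulo m with a pairwise adder tree (illustrative sketch).

    Each residue must lie in [0, m) and the list must be non-empty.
    Returns (sum mod m, number of overflows, tree depth).
    The overflow count equals the integer quotient of the plain sum by m.
    """
    level = [(r, 0) for r in residues]  # pairs (value, overflow count so far)
    depth = 0
    while len(level) > 1:
        nxt = []
        # Pairs on the same level are independent and could be added in parallel.
        for i in range(0, len(level) - 1, 2):
            (a, qa), (b, qb) = level[i], level[i + 1]
            s, carry = mod_add(a, b, m)
            nxt.append((s, qa + qb + carry))
        if len(level) % 2:
            nxt.append(level[-1])  # odd element passes through to the next level
        level = nxt
        depth += 1
    value, overflows = level[0]
    return value, overflows, depth
```

For $l$ inputs the tree has depth $\lceil \log_2 l \rceil$, which is where the logarithmic time bound of such a summation comes from; the returned pair satisfies `value + overflows * m == sum(residues)`.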
The computational cost of the proposed reverse conversion method is presented. Over all modular channels, the total number of modular addition operations is $N_{add} = (k^2 + 3k - 8)/2$. At the same time, the total number of required one-input lookup tables is $N_{lut} = (k^2 + k - 2)/2$.
The execution time of the reverse conversion equals $T_{conv} = \lceil \log_2 k \rceil + 2$ modular clock cycles. At the same time, when pipelined, the throughput rate of the proposed conversion method is one conversion per modular clock cycle.
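The cost figures quoted in these conclusions can be evaluated for a concrete number of moduli with a short sketch. It assumes that the logarithm in the execution time is rounded up to a whole number of clock cycles; the function names `proposed_cost` and `lut_count_from_table1` are hypothetical. The second function recounts the lookup tables row by row as listed in Table 1 ($k-1$ tables for $\chi_1$, and $k-i+1$ tables for $\chi_i$, $i = 2, \ldots, k$) as a consistency check.

```python
def proposed_cost(k):
    """Evaluate the stated cost formulas for k moduli (illustrative sketch).

    Returns (N_add, N_lut, T_conv), with T_conv in modular clock cycles.
    """
    n_add = (k * k + 3 * k - 8) // 2   # modular additions over all channels
    n_lut = (k * k + k - 2) // 2       # one-input lookup tables
    # (k - 1).bit_length() equals ceil(log2(k)) for k >= 1, without floating point.
    t_conv = (k - 1).bit_length() + 2
    return n_add, n_lut, t_conv


def lut_count_from_table1(k):
    """Count lookup tables row by row as listed in Table 1."""
    return (k - 1) + sum(k - i + 1 for i in range(2, k + 1))
```

For example, `proposed_cost(8)` evaluates to 40 additions, 35 lookup tables and 5 clock cycles, and the row-by-row count from Table 1 gives the same 35 tables.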
The proposed parallel reverse conversion method is in line with the current direction of development of high-performance computing based on residue arithmetic. It can be applied to a broad class of tasks in various areas of science and technology, first of all in digital signal processing and cryptography.

Author Contributions

Conceptualization, M.S.; investigation, Y.P.; methodology, M.S.; writing—original draft preparation, M.S.; writing—review and editing, Y.P. All authors have read and improved the final version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Szabo, N.S.; Tanaka, R.I. Residue Arithmetic and Its Application to Computer Technology; McGraw-Hill: New York, NY, USA, 1967.
  2. Molahosseini, A.S.; de Sousa, L.S.; Chang, C.H. (Eds.) Embedded Systems Design with Special Arithmetic and Number Systems; Springer: Cham, Switzerland, 2017.
  3. Akushskii, I.Y.; Juditskii, D.I. Machine Arithmetic in Residue Classes; Soviet Radio: Moscow, Russia, 1968. (In Russian)
  4. Amerbayev, V.M. Theoretical Foundations of Machine Arithmetic; Nauka: Alma-Ata, Kazakhstan, 1976. (In Russian)
  5. Omondi, A.R.; Premkumar, B. Residue Number Systems: Theory and Implementation; Imperial College Press: London, UK, 2007.
  6. Soderstrand, M.A.; Jenkins, W.K.; Jullien, G.A.; Taylor, F.J. (Eds.) Residue Number System Arithmetic: Modern Applications in Digital Signal Processing; IEEE Press: New York, NY, USA, 1986.
  7. Chernyavsky, A.F.; Danilevich, V.V.; Kolyada, A.A.; Selyaninov, M.Y. High-Speed Methods, and Systems of Digital Information Processing; Belarusian State University: Minsk, Belarus, 1996. (In Russian)
  8. Ananda Mohan, P.V. Residue Number Systems. Theory and Applications; Springer: Cham, Switzerland, 2016.
  9. Michaels, A.J. A maximal entropy digital chaotic circuit. In Proceedings of the 2011 IEEE International Symposium of Circuits and Systems (ISCAS), Rio de Janeiro, Brazil, 15–18 May 2011; pp. 717–720.
  10. Ding, C.; Pei, D.; Salomaa, A. Chinese Remainder Theorem: Applications in Computing, Coding, Cryptography; World Scientific: Singapore, 1996.
  11. Omondi, A.R. Cryptography Arithmetic: Algorithms and Hardware Architectures; Springer: Cham, Switzerland, 2020.
  12. Burton, D.M. Elementary Number Theory, 7th ed.; McGraw-Hill: New York, NY, USA, 2011.
  13. Hardy, G.H.; Wright, E.M. An Introduction to the Theory of Numbers, 6th ed.; Oxford University Press: London, UK, 2008.
  14. Akkal, M.; Siy, P. A new mixed radix conversion algorithm MRC-II. J. Syst. Archit. 2007, 53, 577–586.
  15. Chakraborti, N.B.; Soundararajan, J.S.; Reddy, A.L.N. An implementation of mixed-radix conversion for residue number applications. IEEE Trans. Comput. 1986, 35, 762–764.
  16. Huang, C.H. Fully parallel mixed-radix conversion algorithm for residue number applications. IEEE Trans. Comput. 1983, 32, 398–402.
  17. Miller, D.F.; McCormick, W.S. An arithmetic free parallel mixed-radix conversion algorithm. IEEE Trans. Circuits Syst. II 1998, 45, 158–162.
  18. Yassine, H.M.; Moore, W.R. Improved mixed-radix conversion for residue number architectures. IEE Proc. G - Circuits Devices Syst. 1991, 138, 120–124.
  19. Knuth, D.E. The Art of Computer Programming, Volume 2: Seminumerical Algorithms, 3rd ed.; Addison-Wesley: Boston, MA, USA, 1998.
  20. Shoup, V. A Computational Introduction to Number Theory and Algebra, 2nd ed.; Cambridge University Press: Cambridge, UK, 2005.
  21. Phatak, D.S.; Houston, S.D. New distributed algorithms for fast sign detection in residue number systems (RNS). J. Parallel Distrib. Comput. 2016, 97, 78–95.
  22. Shenoy, M.A.P.; Kumaresan, R. A fast and accurate RNS scaling technique for high speed signal processing. IEEE Trans. Acoust. Speech Signal Process. 1989, 37, 929–937.
  23. Vu, T.V. Efficient implementations of the Chinese Remainder Theorem for sign detection and residue decoding. IEEE Trans. Comput. 1985, 34, 646–651.
  24. Miller, D.D.; Altschul, R.E.; King, J.R.; Polky, J.N. Analysis of the residue class core function of Akushskii, Burcev, and Pak. In Residue Number System Arithmetic: Modern Applications in Digital Signal Processing; IEEE Press: Piscataway, NJ, USA, 1986; pp. 390–401.
  25. Gonnella, J. The application of core functions to residue number system. IEEE Trans. Signal Process. 1991, 39, 69–75.
  26. Abtahi, M. Core function of an RNS number with no ambiguity. Comput. Math. Appl. 2005, 50, 459–470.
  27. Kong, Y.; Asif, S.; Khan, M.A.U. Modular multiplication using the core function in the residue number system. Appl. Algebra Eng. Commun. Comput. 2016, 27, 1–16.
  28. Kolyada, A.A.; Selyaninov, M.Y. Generation of integral characteristics of symmetric-range residue codes. Cybern. Syst. Anal. 1986, 22, 431–437.
  29. Selianinau, M. An efficient implementation of the CRT algorithm based on an interval-index characteristic and minimum-redundancy residue code. Int. J. Comput. Meth. 2020, 17, 2050004.
  30. Lu, M.; Chiang, J.-S. A novel division algorithm for the residue number system. IEEE Trans. Comput. 1992, 41, 1026–1032.
  31. Dimauro, G.; Impedovo, S.; Modugno, R.; Pirlo, G.; Stefanelli, R. Residue-to-binary conversion by the “quotient function”. IEEE Trans. Circuits Syst. II Analog Digital Signal Process. 2003, 50, 488–493.
  32. Dimauro, G.; Impedovo, S.; Pirlo, G.; Salzo, A. RNS architectures for the implementation of the “diagonal function”. Inf. Process. Lett. 2000, 73, 189–198.
  33. Pirlo, G.; Impedovo, D. A new class of monotone functions of the residue number system. Int. J. Math. Models Meth. Appl. Sci. 2013, 7, 802–809.
  34. Hitz, M.A.; Kaltofen, E. Integer division in residue number systems. IEEE Trans. Comput. 1995, 44, 983–989.
  35. Bergerman, M.V.; Lyakhov, P.A.; Voznesensky, A.S.; Bogaevskiy, D.V.; Kaplun, D.I. Designing reverse converter for data transmission systems from two-level RNS to BNS. J. Phys. Conf. Ser. 2020, 1658, 012005.
  36. Daphni, S.; Vijula Grace, K.S. A review analysis of reverse converter based on RNS in signal processing. Int. J. Sci. Technol. Res. 2020, 9, 1686–1689.
  37. Sousa, L.; Paludo, R.; Martins, P.; Pettenghi, H. Towards the integration of reverse converters into the RNS channels. IEEE Trans. Comput. 2020, 69, 342–348.
  38. Mojahed, M.; Molahosseini, A.S.; Zarandi, A.A.E. A multifunctional unit for reverse conversion and sign detection based on the 5-moduli set. Comp. Sci. 2021, 22, 101–121.
  39. Salifu, A. New reverse conversion for four-moduli set and five-moduli set. J. Comp. Commun. 2021, 9, 57–66.
  40. Taghizadeghankalantari, M.; TaghipourEivazi, S. Design of efficient reverse converters for Residue Number System. J. Circuits Syst. Comp. 2021, 30, 2150141.
Table 1. The pre-calculation components.

| Input Residue | Number and Scope of LUTs | Output Residue Set |
|---|---|---|
| $\chi_1$ | $k-1$; $2^{b_1} \times b_l$ ($l = 2, 3, \ldots, k$) | $R_{1,2}(\chi_1), R_{1,3}(\chi_1), \ldots, R_{1,k}(\chi_1)$ |
| $\chi_2$ | $k-1$; $2^{b_2} \times b_l$ ($l = 2, 3, \ldots, k$) | $R_{2,2}(\chi_2), R_{2,3}(\chi_2), \ldots, R_{2,k}(\chi_2)$ |
| ⋮ | ⋮ | ⋮ |
| $\chi_{k-1}$ | $2$; $2^{b_{k-1}} \times b_l$ ($l = k-1, k$) | $R_{k-1,k-1}(\chi_{k-1}), R_{k-1,k}(\chi_{k-1})$ |
| $\chi_k$ | $1$; $2^{b_k} \times b_k$ | $R_{k,k}(\chi_k)$ |
Table 2. The results of calculations in the modular channels.

| Modular Channel | Input Data | Output Data |
|---|---|---|
| $m_2$ | $R_{1,2}(\chi_1), R_{2,2}(\chi_2)$ | $x_2^{R}$, $x_3^{Q}$ |
| $m_3$ | $R_{1,3}(\chi_1), R_{2,3}(\chi_2), R_{3,3}(\chi_3)$ | $x_3^{R}$, $x_4^{Q}$ |
| ⋮ | ⋮ | ⋮ |
| $m_{k-1}$ | $R_{1,k-1}(\chi_1), R_{2,k-1}(\chi_2), \ldots, R_{k-1,k-1}(\chi_{k-1})$ | $x_{k-1}^{R}$, $x_k^{Q}$ |
| $m_k$ | $R_{1,k}(\chi_1), R_{2,k}(\chi_2), \ldots, R_{k,k}(\chi_k)$ | $x_k^{R}$ |
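Table 3. RNS-to-MRS reverse conversion methods.

| Method | Number and Scope of LUTs | ADD | MUL | Conversion Time |
|---|---|---|---|---|
| [1], sequential | | $k-1$ | $k-1$ | $\frac{k(k-1)}{2}\,(t_{mul} + t_{add})$ |
| [1], sequential, pipelined | $\frac{k(k-1)}{2}$; $2^{b+1} \times b$ | $\frac{k(k-1)}{2}$ | | $(k-1)(t_{lut} + t_{add})$ |
| [16], parallel | $\frac{k(k+1)}{2}$; $2^{b} \times b$ | $\frac{k(k+1)}{2}$ | | $t_{lut} + (k-1)\,t_{add}$ |
| [34], parallel | $\frac{k(k+1)}{2}$; $2^{b} \times b$ | $\frac{k(k+1)}{2}$ | $\frac{k(k+1)}{2}$ | $t_{lut} + t_{mul} + (2\lceil \log_2 k \rceil + 1)\,t_{add}$ |
| [15], sequential | $\frac{k(k-2)}{4}$; $2^{2b} \times 2b$ | $\frac{k(k-1)}{4}$ | | $\frac{k}{2}\,(t_{lut} + t_{add})$ |
| [15], parallel | $\frac{k(k-2)}{4} + \frac{k-1}{2}$; $2^{2b} \times 2b$ | $\frac{k(k+2)}{4}$ | | $3\,t_{lut} + \frac{k}{2}\,t_{add}$ |
| [17], parallel | $\frac{k(k-1)}{2}$; $2^{2b} \times b$ | | | $(k-1)\,t_{lut}$ |
| [9] | $k$; $2^{b} \times 2\lceil \frac{1}{2} \log_2 kb \rceil$ | $k-1$ | | $t_{lut} + \lceil \log_2 k \rceil\, t_{add}$ |
| Our method, parallel | $\frac{k^2 + k - 2}{2}$; $2^{b} \times b$ | $\frac{k^2 + 3k - 8}{2}$ | | $(\lceil \log_2 k \rceil + 2)\,t_{mod}$ |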
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
