A VLSI architecture for RNS with MI adders


Ferruccio Barsi and Enrico Martinelli

Istituto di Elaborazione dell'Informazione del CNR, Pisa, Italy

Received 18 July, 1990

Abstract. Over several years, RNS applications were limited to addition, subtraction and multiplication with results expected within a predetermined range because of the absence of explicit information on number magnitude in the residue representation. Hybrid notations have been proposed to overcome this obstacle. In this paper, an architecture for adding and overflow checking is presented which is based upon Residue Number Systems with Magnitude Index (RNS with MI) and its area-time complexity is evaluated. It is shown that considerable execution time reduction may result for a wide class of applications at the cost of a slight increase of area occupancy as compared with binary realizations.

Keywords. Addition, hybrid number systems, overflow detection, residue number systems with magnitude index, VLSI architecture, VLSI complexity

1. Introduction

Since their appearance, Residue Number Systems (RNS's) have shown limitations stemming mainly from the absence of explicit information on number magnitude [1-4]. As a consequence, operations such as magnitude comparison, sign and overflow detection, and division were difficult and time consuming, since they required intermodular operations, i.e., operations involving all residue digits, whose complexity is equivalent to that of a residue-to-weighted system conversion. The renewed interest in RNS for Digital Signal Processing arithmetic and the advent of VLSI technology led several authors to propose architectural solutions attempting to speed up intermodular operations [5-12]. However, it was immediately apparent that the best solution would be to define a number system exhibiting at the same time the modular properties of RNS and the explicit knowledge of number magnitude of weighted systems. The first attempts in this direction were by Sasaki [13,14], and Rao and Trehan [15]; unfortunately, the proposed solutions altered the modular nature of residue arithmetic.

In recent years, hybrid notations have been proposed starting from the observation that weighted and residue systems can be considered as extreme solutions to the general problem of representing numbers [16-18]. In particular, it has been shown that hybrid system adders with overflow detecting capabilities and able to support repeated operations are conceivable [17], provided that an appropriate redundancy is added to the representation.

In this paper a VLSI architecture is proposed for adders constructed in Residue Number Systems with Magnitude Index (RNS's with MI) and its complexity is evaluated in terms of area occupancy and execution time, regardless of the particular application. In this architecture, intermodular operations are performed by means of a mixed radix conversion procedure [2] and the modularity of the whole structure is preserved, whereas the use of a single row of processing elements (PE's) keeps area occupancy contained. More detailed considerations are presented for a usual application which consists in adding undefined sequences of integers. In this case, it is shown that the time required to perform, when necessary, intermodular operations does not affect the overall execution time, which coincides with the time required to perform a single mod $m$ addition, where $m$ represents the generic modulus of the residue system. As far as execution time is concerned, this result is equivalent to saying that the modularity of the residue systems is maintained in the proposed structure. This is obtained at the expense of a slight increase of area occupancy as compared with conventional, area-time optimal, binary realizations of adders [19].

Elsevier, INTEGRATION, the VLSI journal 11 (1991) 67-83. 0167-9260/91/$03.50 © 1991 - Elsevier Science Publishers B.V.

Ferruccio Barsi was born in Lucca, Italy, in 1942. He received the Dr. Eng. degree in Electronic Engineering from the University of Pisa, Italy. In 1969 he joined the Consiglio Nazionale delle Ricerche of Italy, where he is presently a Research Manager. His research interests are computer architecture, arithmetic codes, fault tolerant computing and computer graphics. Currently, he is working on VLSI architectures and nonconventional arithmetic implementations.

Enrico Martinelli was born in Lucca, Italy, in 1944. In 1970 he received the Dr. Eng. degree (cum laude) in Electronic Engineering from the University of Pisa, Italy. Since 1971 he has been a researcher of the Consiglio Nazionale delle Ricerche of Italy at the Istituto di Elaborazione dell'Informazione in Pisa, where he has carried out research on operating systems, performance evaluation, computer aided design, computer graphics and circuit switching networks. Presently he is working on the design of VLSI algorithms and structures and RNS-based arithmetic units for high speed applications.

2. Residue number systems with magnitude index

The idea of representing integers in a hybrid notation with the aim of taking advantage of favourable features of both positional and residue number systems has been considered in the past by several authors [13-18]. We refer here to the notation proposed in [17], which is briefly reviewed for the sake of clarity. RNS's with MI can be defined by representing any integer $X$ as

$$X \equiv \{R_X, I_X\} \tag{1}$$

with

$$R_X = |X|_\mu \tag{2}$$

$$I_X = \left\lfloor X/\mu \right\rfloor \tag{3}$$

where $|X|_\mu$ represents the least non-negative remainder of dividing $X$ by $\mu$ and $\lfloor X/\mu \rfloor$ is the greatest integer not exceeding $X/\mu$. From the fundamental theorem of remainder

$$X = \mu \left\lfloor \frac{X}{\mu} \right\rfloor + |X|_\mu = \mu I_X + R_X \tag{4}$$

it is seen that the representation of $X$ consists of two separate parts:
- a magnitude index part $I_X$, which locates $X$ in one of a set of intervals of width $\mu$;
- a residue part $R_X$, which precisely specifies $X$ in the range $[I_X \cdot \mu, (I_X + 1) \cdot \mu)$.

It will be assumed that integers $R_X$ and $I_X$ are given different representations to enhance arithmetic properties of such systems: as a natural choice, the magnitude index is represented in a weighted notation to speed up comparisons and $R_X$ is given a residue notation to speed up arithmetic operations. In particular, magnitude comparison is much easier than in standard residue systems, since the residue parts of the representations are involved only if magnitude indexes coincide.

Unfortunately, magnitude index notation is not closed under addition. In fact, suppose that two integers $X \equiv \{R_X, I_X\}$ and $Y \equiv \{R_Y, I_Y\}$ are added. Their sum

$$S = X + Y = R_X + R_Y + \mu (I_X + I_Y) \tag{5}$$

has the following residue with MI representation

$$S \equiv \{R_S, I_S\} \tag{6}$$

where

$$R_S = |S|_\mu = |R_X + R_Y|_\mu \tag{7}$$

$$I_S = \left\lfloor \frac{S}{\mu} \right\rfloor = I_X + I_Y + \left\lfloor \frac{R_X + R_Y}{\mu} \right\rfloor = I_X + I_Y + \frac{R_X + R_Y - |R_X + R_Y|_\mu}{\mu} \tag{8}$$

i.e., the residue part of the representation is the mod $\mu$ sum of the residue parts of the operands, whereas the magnitude index of $S$ may be greater than the sum of the magnitude indexes of $X$ and $Y$. More generally, when $t$ integers $X_1, X_2, \ldots, X_t$ are added, the representation of their sum $S \equiv \{R_S, I_S\}$ has the following residue and magnitude index parts

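$$R_S = \left| \sum_{i=1}^{t} R_{X_i} \right|_\mu \tag{9}$$

$$I_S = \sum_{i=1}^{t} I_{X_i} + \frac{1}{\mu} \left( \sum_{i=1}^{t} |X_i|_\mu - \left| \sum_{i=1}^{t} X_i \right|_\mu \right) = \sum_{i=1}^{t} I_{X_i} + \delta_S \tag{10}$$

where

$$\delta_S \le \left\lfloor \frac{t(\mu - 1)}{\mu} \right\rfloor \tag{11}$$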
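The definitions of this section can be checked numerically. The following is a minimal Python sketch, not part of the original paper: the function names `to_rns_mi`, `add_rns_mi` and `delta` are hypothetical, and the residue part is kept as a plain integer mod $\mu$ rather than as a vector of residue digits.

```python
from typing import Iterable, Tuple


def to_rns_mi(x: int, mu: int) -> Tuple[int, int]:
    """Return (R_X, I_X) such that X = mu * I_X + R_X with 0 <= R_X < mu."""
    # Python's divmod floors the quotient, so the remainder is non-negative
    # for mu > 0, which is exactly the convention of eqs. (2)-(4).
    i_x, r_x = divmod(x, mu)
    return r_x, i_x


def add_rns_mi(a: Tuple[int, int], b: Tuple[int, int], mu: int) -> Tuple[int, int]:
    """Add two (R, I) pairs following eqs. (7) and (8)."""
    (r_x, i_x), (r_y, i_y) = a, b
    carry, r_s = divmod(r_x + r_y, mu)  # carry is floor((R_X + R_Y) / mu)
    return r_s, i_x + i_y + carry


def delta(xs: Iterable[int], mu: int) -> int:
    """Correction term delta_S of eq. (10) for a list of operands."""
    xs = list(xs)
    return (sum(x % mu for x in xs) - sum(xs) % mu) // mu
```

With $\mu = 35$, the operands 34 and 69 have magnitude indexes 0 and 1, while their sum 103 has magnitude index 2: the index of the sum exceeds the sum of the indexes, as noted after eq. (8).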

3. Addition and overflow detection in RNS with MI

The choice of the number system is essential to determine the complexity of the addition procedure and, consequently, of the corresponding computing structure. In weighted systems, the presence of carries contributes to slowing down the execution speed, while overflow detection can be considered as a by-product of the adding procedure. On the contrary, residue addition is very fast because of the independence of residue digits, but detecting overflow requires the knowledge of the magnitude of results and thus time consuming intermodular operations are necessary. The hybrid notation proposed here may drastically reduce the need for intermodular operations in checking additive overflow, provided that some redundancy is added and the two parts of the representation are given a slightly different meaning.

In this section the concept of RNS's with MI, as defined above by eqns. (1)-(4), will be generalized as follows. An integer $X$ is said to be represented in an extended RNS with MI if

$$X \equiv \{\mathcal{R}_X, \mathcal{J}_X\}, \qquad \mathcal{R}_X \ge 0, \qquad X = \mathcal{R}_X + \mu \cdot \mathcal{J}_X$$

It is worth noting that $\mathcal{R}_X$ and $\mathcal{J}_X$ will be expressed in a residue and in a binary number system, respectively. Moreover, a representation $X \equiv \{\mathcal{R}_X, \mathcal{J}_X\}$ will be said to be normalized if and only if

$$\mathcal{J}_X = I_X \quad \text{and, consequently,} \quad \mathcal{R}_X = R_X$$

It is important to observe that an extended RNS with MI does not guarantee the uniqueness of the representation. In fact, the same integer $X$ may be expressed as

$$X \equiv \{\mathcal{R}_X, \mathcal{J}_X\}, \quad \text{with } X = \mathcal{R}_X + \mu \cdot \mathcal{J}_X$$

or

$$X \equiv \{\mathcal{R}'_X, \mathcal{J}'_X\}, \quad \text{with } X = \mathcal{R}'_X + \mu \cdot \mathcal{J}'_X$$

from which the following relation is derived

$$\mathcal{R}_X - \mathcal{R}'_X = \mu \cdot (\mathcal{J}'_X - \mathcal{J}_X)$$

Performing addition in extended RNS's with MI is very simple. In fact, suppose that two integers $X \equiv \{\mathcal{R}_X, \mathcal{J}_X\}$ and $Y \equiv \{\mathcal{R}_Y, \mathcal{J}_Y\}$ are given. A representation of their sum will be

$$S \equiv \{\mathcal{R}_S, \mathcal{J}_S\} \tag{12}$$

$$S = \mathcal{R}_S + \mu \cdot \mathcal{J}_S \tag{13}$$

where

$$\mathcal{R}_S = \mathcal{R}_X + \mathcal{R}_Y, \qquad \mathcal{J}_S = \mathcal{J}_X + \mathcal{J}_Y \tag{14}$$

i.e., the representation of the sum is obtained simply by adding the residue and MI parts of the operand representations.

Now consider integers $X$ in the range $[-P, P)$ represented in an extended RNS with MI in the normalized form. As $0 \le \mathcal{R}_X < \mu$, the MI part of the $X$'s will be in $[I^-, I^+]$, where

$$I^- = \left\lceil \frac{-P - (\mu - 1)}{\mu} \right\rceil = \left\lceil \frac{-P + 1}{\mu} \right\rceil - 1 \tag{15}$$

$$I^+ = \left\lfloor \frac{P - 1}{\mu} \right\rfloor \tag{16}$$
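As a quick sanity check of the bounds (15) and (16), the sketch below computes $I^-$ and $I^+$ in closed form and compares them with a brute-force scan of the range $[-P, P)$. The function names are hypothetical, and the ceiling in (15) is rewritten with integer floor division only.

```python
from typing import Tuple


def mi_bounds(p: int, mu: int) -> Tuple[int, int]:
    """Return (I_minus, I_plus) for normalized integers in [-P, P)."""
    # ceil((-P + 1) / mu) equals -floor((P - 1) / mu), so eq. (15) becomes:
    i_minus = -((p - 1) // mu) - 1
    # eq. (16): floor((P - 1) / mu)
    i_plus = (p - 1) // mu
    return i_minus, i_plus


def mi_bounds_brute_force(p: int, mu: int) -> Tuple[int, int]:
    """Scan every X in [-P, P) and collect the extreme magnitude indexes."""
    indexes = [x // mu for x in range(-p, p)]
    return min(indexes), max(indexes)
```

For example, `mi_bounds(1000, 35)` gives `(-29, 28)`: every integer in $[-1000, 1000)$ has a magnitude index between these two values.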

As a consequence, detecting overflow from the range $[-P, P)$ reduces to checking the MI parts of numbers. Assuming that $t$ normalized integers $X_1, X_2, \ldots, X_t$ in $[-P, P)$ are added according to (12)-(14), it is obtained

$$S \equiv \{\mathcal{R}_S, \mathcal{J}_S\}$$

with

$$\mathcal{R}_S = \sum_{i=1}^{t} \mathcal{R}_i, \qquad \mathcal{J}_S = \sum_{i=1}^{t} \mathcal{J}_i \tag{14'}$$

and, observing that

$$0 \le \mathcal{R}_S \le t(\mu - 1)$$

$$t I^- \le \mathcal{J}_S \le t I^+$$

it is concluded that the original residue and MI ranges must be extended to avoid loss of information. In particular, this requires that one or more pairwise prime moduli are added to extend the residue range $\mu$: let $m_R$ be the product of such redundant moduli with $(\mu, m_R) = 1$ and

$$\mu m_R \ge t(\mu - 1) + 1$$

The above inequality shows that the redundancy $m_R$ introduced in the system depends on the number of operands $t$ which are considered. In other words, for given $m_R$,

$$t^* = m_R + \left\lfloor \frac{m_R - 1}{\mu - 1} \right\rfloor \tag{17}$$

represents the greatest number of normalized integers which can be added without overflow from the residue redundant range.

Checking a sum $S$ for overflow from the range $[-P, P)$ is performed by observing the magnitude index $I_S$ of the sum, which is related to $\mathcal{J}_S$ by eqn. (10), i.e.

$$I_S = \mathcal{J}_S + \delta_S$$

An overflow will be detected if and only if

$$\mathcal{J}_S < I^- - \delta_S \quad \text{or} \quad \mathcal{J}_S > I^+ - \delta_S \tag{18}$$
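The operand bound (17) and the exact overflow test (18) can be sketched in a few lines of Python. This is an illustration only; the function names are hypothetical, $\mu \ge 2$ is assumed, and `overflow_exact` presupposes that $\delta_S$ is available, which the hardware described later deliberately avoids.

```python
def max_operands(m_r: int, mu: int) -> int:
    """t* of eq. (17): the largest t with mu * m_r >= t * (mu - 1) + 1."""
    return m_r + (m_r - 1) // (mu - 1)


def overflow_exact(j_s: int, delta_s: int, i_minus: int, i_plus: int) -> bool:
    """Condition (18): true iff I_S = J_S + delta_S lies outside [I-, I+]."""
    return j_s < i_minus - delta_s or j_s > i_plus - delta_s
```

With $\mu = 35$ and $m_R = 11$ the bound gives $t^* = 11$: eleven normalized operands fit in the redundant residue range $\mu m_R = 385$, whereas twelve may not.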

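However, the actual value of $\delta_S$ is unknown. This implies restating inequalities (18) as a sufficient condition as follows

$$\mathcal{J}_S < I^- - \left\lfloor \frac{t(\mu - 1)}{\mu} \right\rfloor \quad \text{or} \quad \mathcal{J}_S > I^+ \tag{19}$$

In fact, from (19) and (11)

$$\mathcal{J}_S < I^- - \left\lfloor \frac{t(\mu - 1)}{\mu} \right\rfloor \le I^- - \delta_S$$

and, obviously

$$\mathcal{J}_S > I^+ \ge I^+ - \delta_S$$

The simplest way to take $\delta_S$ into account consists in extending the representation by means of a control part

$$X \equiv \{\mathcal{R}_X, \mathcal{J}_X, \mathcal{C}_X\}$$

where each normalized integer will be given a control part $\mathcal{C}_X = 1$. The above addition procedure (12)-(14) will be completed by adding the control parts of the operands. In this hypothesis inequalities (19) will take the form

$$\mathcal{J}_S < I^- - \left\lfloor \frac{\mathcal{C}_S (\mu - 1)}{\mu} \right\rfloor \quad \text{or} \quad \mathcal{J}_S > I^+ \tag{20}$$

where

$$\mathcal{C}_S = \sum_{i=1}^{t} \mathcal{C}_i \tag{21}$$

Moreover, from eqn. (17), it is realized that any $\mathcal{C}_S$ exceeding

$$\mathcal{C}_{max} = m_R + \left\lfloor \frac{m_R - 1}{\mu - 1} \right\rfloor \tag{17'}$$

indicates the risk of residue range overflow. Relations (20) and (17') can be further simplified assuming that

$$m_R < \mu \tag{22}$$

i.e., the redundancy does not exceed the duplication of the residue range. In this hypothesis

$$\mathcal{J}_S < I^- - \left( \mathcal{C}_S + \left\lfloor -\frac{\mathcal{C}_S}{\mu} \right\rfloor \right) = I^- - \mathcal{C}_S + 1 \quad \text{or} \quad \mathcal{J}_S > I^+ \tag{20'}$$

and

$$\mathcal{C}_{max} = m_R \tag{17''}$$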
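The step from (20) to (20') relies on the control part staying below $\mu$. The short Python check below, with hypothetical function names, compares the two lower thresholds for every control part $1 \le \mathcal{C}_S \le m_R$ under the assumption $m_R < \mu$, and confirms that the floor term of (17') vanishes in the same hypothesis.

```python
def lower_threshold_general(i_minus: int, c_s: int, mu: int) -> int:
    """Right-hand side of the first inequality in (20)."""
    return i_minus - (c_s * (mu - 1)) // mu


def lower_threshold_simplified(i_minus: int, c_s: int) -> int:
    """Right-hand side of the first inequality in (20')."""
    return i_minus - c_s + 1


def c_max(m_r: int, mu: int) -> int:
    """Largest legitimate control part, eq. (17')."""
    return m_r + (m_r - 1) // (mu - 1)
```

Since $0 < \mathcal{C}_S/\mu < 1$ in this range, the floor of $\mathcal{C}_S - \mathcal{C}_S/\mu$ is $\mathcal{C}_S - 1$, which is all the simplification amounts to.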

To conclude, from inspection of MI and control parts of the sum, the following procedure is suggested to perform addition and detect overflow as soon as possible.

Procedure. The addition of any two integers, whether normalized or resulting from the sum of normalized numbers, is obtained by:
1. Adding residue, MI and control parts of operands to obtain $S \equiv \{\mathcal{R}_S, \mathcal{J}_S, \mathcal{C}_S\}$;
2. If condition (20) is satisfied, an OVERFLOW is detected and the result is discarded or recovered, else
3. If $\mathcal{C}_S \le \mathcal{C}_{max}$, then $S \equiv \{\mathcal{R}_S, \mathcal{J}_S, \mathcal{C}_S\}$ is assumed as the LEGITIMATE RESULT, else the result is not consistent with the capabilities of the residue range and must be recovered by preliminary operand normalizations.
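The three steps of the procedure can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the authors' implementation: the names `System`, `encode` and `add` are hypothetical, the simplified conditions (20') and (17'') are used (so $m_R < \mu$ is assumed), and recovery by normalization is left to the caller.

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple

Operand = Tuple[Tuple[int, ...], int, int]  # (residue digits, MI part, control part)


@dataclass(frozen=True)
class System:
    moduli: Tuple[int, ...]  # m_1..m_n, pairwise coprime; mu is their product
    m_r: int                 # redundant modulus, coprime with mu, m_r < mu
    p: int                   # legitimate range is [-P, P)

    @property
    def mu(self) -> int:
        return prod(self.moduli)


def encode(x: int, s: System) -> Operand:
    """Normalized representation: residues of R_X, MI part I_X, control part 1."""
    i_x, r_x = divmod(x, s.mu)
    return tuple(r_x % m for m in s.moduli + (s.m_r,)), i_x, 1


def add(a: Operand, b: Operand, s: System) -> Tuple[str, Operand]:
    (ra, ja, ca), (rb, jb, cb) = a, b
    # Step 1: digit-wise residue addition, plain addition of MI and control parts.
    r_s = tuple((x + y) % m for x, y, m in zip(ra, rb, s.moduli + (s.m_r,)))
    j_s, c_s = ja + jb, ca + cb
    result = (r_s, j_s, c_s)
    # Step 2: overflow test in the simplified form (20').
    i_plus = (s.p - 1) // s.mu
    i_minus = -i_plus - 1
    if j_s < i_minus - c_s + 1 or j_s > i_plus:
        return "OVERFLOW", result
    # Step 3: legitimacy test against C_max = m_R, eq. (17'').
    if c_s <= s.m_r:
        return "LEGITIMATE", result
    return "NORMALIZE", result
```

For instance, with moduli $(5, 7)$, $m_R = 11$ and $P = 1000$, adding the encodings of 100 and 69 returns a legitimate result with MI part 3 and control part 2, whose residue digits are those of $30 + 34 = 64$, so that $64 + 35 \cdot 3 = 169$.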

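Normalizing an operand is equivalent to restoring the correct value $R_X$ within the range $\mu$ and to updating the magnitude index by an increment $\lfloor \mathcal{R}_X / \mu \rfloor$. More precisely, observing that $0 \le \mathcal{R}_X < \mu m_R$, it is

$$\left\lfloor \frac{\mathcal{R}_X}{\mu} \right\rfloor = \left| \left\lfloor \frac{\mathcal{R}_X}{\mu} \right\rfloor \right|_{m_R} = \left| \left| \frac{1}{\mu} \right|_{m_R} \cdot \left( |\mathcal{R}_X|_{m_R} - \big| |\mathcal{R}_X|_\mu \big|_{m_R} \right) \right|_{m_R} \tag{23}$$

where $|1/\mu|_{m_R}$ is the multiplicative inverse of $\mu$ mod $m_R$, whose existence is guaranteed recalling that $(\mu, m_R) = 1$.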
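The normalization step can be illustrated with the following Python sketch. It is a functional model only: the names `crt` and `normalize` are hypothetical, $|\mathcal{R}_X|_\mu$ is recovered here with a plain Chinese Remainder computation instead of the mixed radix hardware described in the next section, and resetting the control part to 1 after normalization is an assumption of this sketch. The modular inverse uses the three-argument `pow` with exponent -1, available from Python 3.8.

```python
from math import prod
from typing import Sequence, Tuple


def crt(residues: Sequence[int], moduli: Sequence[int]) -> int:
    """Integer in [0, prod(moduli)) having the given residues."""
    mu = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        q = mu // m
        x += r * q * pow(q, -1, m)  # pow(q, -1, m) is the inverse of q mod m
    return x % mu


def normalize(residues: Sequence[int], j: int, moduli: Sequence[int],
              m_r: int) -> Tuple[Tuple[int, ...], int, int]:
    """Normalize an operand whose last residue digit is taken mod m_r."""
    mu = prod(moduli)
    r_mu = crt(residues[:-1], moduli)  # |R_X|_mu
    # MI increment floor(R_X / mu), computed entirely mod m_r.
    inc = ((residues[-1] - r_mu % m_r) * pow(mu, -1, m_r)) % m_r
    new_residues = tuple(r_mu % m for m in moduli) + (r_mu % m_r,)
    return new_residues, j + inc, 1  # control part reset to 1 (assumption)
```

With moduli $(5, 7)$ and $m_R = 11$, an operand with $\mathcal{R}_X = 200$ and MI part 3 normalizes to residue part 25 and MI part 8, both representing the integer 305.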


4. An architecture for multiple additions in extended RNS with MI

In this Section we propose a VLSI layout for the implementation of the addition procedure in an extended RNS with MI. The structure will be designed and evaluated according to the VLSI asymptotic complexity model first introduced by Thompson [21], and will be compared with a conventional binary adder. According to the procedure, two main parts can be distinguished in the layout: an operating part, which provides for adding residue, magnitude index and control digits; and a control part dedicated both to detecting overflow and, when required, to normalizing the result produced by the operating part. The overall organization is summarized in Fig. 1.

Designing the operating part is fairly plain, since only an array of adders is required; namely, one binary adder for MI parts, one for control parts and $n + 1$ adders for residue digits. Overflow detection from the range $[-P, P)$ is accomplished by means of the structure of Fig. 2. In this scheme ADDER2 and ADDER3 perform the additions $\mathcal{J}_S - (I^- - \mathcal{C}_S + 1)$ and $I^+ - \mathcal{J}_S$ respectively, and, depending on the sign bit of the results they produce, an overflow condition bit is set at the output of the OR gate. System constants $I^+$ and $I^- + 1$ (see condition (20')) are stored in registers R1 and R2.
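The behaviour of the overflow detector can be mimicked at the bit level with the short Python sketch below. The function name `overflow_bit` is hypothetical, and signed Python integers stand in for the two's complement outputs of the adders, whose sign bits feed the OR gate.

```python
def overflow_bit(j_s: int, c_s: int, i_minus: int, i_plus: int) -> int:
    """Return 1 if condition (20') signals an overflow, else 0."""
    lower = j_s - (i_minus - c_s + 1)  # negative iff J_S < I- - C_S + 1
    upper = i_plus - j_s               # negative iff J_S > I+
    sign_lower = 1 if lower < 0 else 0
    sign_upper = 1 if upper < 0 else 0
    return sign_lower | sign_upper     # OR gate on the two sign bits
```

With $I^- = -29$ and $I^+ = 28$, a sum having MI part 3 and control part 2 yields 0, while MI part 56 yields 1.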

Fig. 1. The overall RNS with MI adder structure (blocks: ADDERS, OVERFLOW DETECTOR, normalization request generator, NORMALIZER; outputs: overflow, normalization request).

Fig. 2. The overflow detector.

Fig. 3. The normalization request generator.

A normalization is started under the conditions that there is no overflow and the result $S$ produced by the operating part is no longer significant, as $\mathcal{C}_S > m_R$ (see eqn. (17'')). In order to detect the latter condition, we can use the circuit of Fig. 3. Register R stores the system constant $m_R$, which is added to $\mathcal{C}_S$: the sign bit of the sum is 1 as long as numbers are legitimate, so that the NOR gate output switches to 1 (normalization request) if neither an overflow condition has been set nor the result produced by the operating part is legitimate.

As shown in Section 3, the normalization requires a base extension of both $|\mathcal{R}_X|_\mu$ and $|\mathcal{R}_Y|_\mu$ to the redundant modulus $m_R$, that is, an intermodular operation [2]. The structure we suggest to implement the base extension of each operand is shown in Fig. 4.

Fig. 4. The normalizer structure.

There are $n$ processing elements (PE's) which operate in parallel. The $j$th PE ($j = 2, \ldots, n + 1$; $m_{n+1} = m_R$), whose structure is reported in Fig. 5, performs $j - 1$ computations, each consisting of a mod $m_j$ addition and a mod $m_j$ multiplication, according to the MRC algorithm [2].

Fig. 5. The $j$th PE of the normalizer ($j = 2, \ldots, n + 1$; $i = 1, \ldots, j - 1$; $X_{0,k} = |\mathcal{R}_W|_{m_k}$, $k = 1, \ldots, n$).

The first of such computations accepts the residue digits of $\mathcal{R}_W$, $W = X, Y$, as input data $X_{i-1,i}$ and $X_{i-1,j}$, respectively; it is worth noting that a set of memory elements to store $j - 1$ multiplicative inverses is required by the $j$th PE. The whole procedure is iterated $n$ times by assuming intermediate results at a computation step as input data for the next step.

Referring again to Fig. 4, it is observed that, besides processing elements, there exists an additional block NORM, consisting of a modular adder and two modular multipliers, which are necessary, from (23), to evaluate the normalized residue part and the corresponding MI increment of operands; the structure of block NORM is shown in Fig. 6.

Fig. 6. The structure of block NORM.

Once operands have been normalized, they are added again and checked, according to the scheme of Fig. 2.
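The mixed radix conversion underlying the normalizer can be sketched in software as follows. This is a sequential Python model of the textbook MRC recurrence, in which step $i$ subtracts the digit just found and multiplies by the inverse of $m_i$ modulo each remaining modulus; the function names are hypothetical, and evaluating the extended residue by Horner's rule is a choice made for this sketch, not a description of the PE row of Fig. 4.

```python
from typing import List, Sequence


def mixed_radix_digits(residues: Sequence[int], moduli: Sequence[int]) -> List[int]:
    """Digits a_1..a_n with V = a_1 + a_2*m_1 + a_3*m_1*m_2 + ..."""
    x = list(residues)
    n = len(moduli)
    digits = []
    for i in range(n):
        a_i = x[i]  # after i steps, x[i] is the i-th mixed radix digit
        digits.append(a_i)
        for j in range(i + 1, n):
            # One PE computation: a mod m_j subtraction followed by a
            # mod m_j multiplication by the stored inverse of m_i.
            x[j] = ((x[j] - a_i) * pow(moduli[i], -1, moduli[j])) % moduli[j]
    return digits


def base_extend(residues: Sequence[int], moduli: Sequence[int], m_r: int) -> int:
    """Residue mod m_r of the integer in [0, mu) having the given residues."""
    digits = mixed_radix_digits(residues, moduli)
    acc = 0
    for a, m in zip(reversed(digits), reversed(moduli)):
        acc = (acc * m + a) % m_r  # Horner evaluation of the mixed radix form
    return acc
```

For moduli $(3, 5, 7)$ the integer 52 has residues $(1, 2, 3)$ and mixed radix digits $(1, 2, 3)$, since $52 = 1 + 2 \cdot 3 + 3 \cdot 15$; its extension to modulus 11 is 8.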

5. Complexity evaluation of the architecture

5.1. MOD m adders and multipliers

Before evaluating the complexity of the architecture, some preliminary considerations are necessary about the adders and multipliers which are used. As for adders, we refer to the structure proposed by Brent and Kung [19] for binary addition, whose complexity figures, for binary ranges representing mod $m$ numbers, are

$$A_A = O\left( \frac{\log m}{T} \cdot \log \frac{\log m}{T} \right)$$

$$T_A = O\left( T + \log \frac{\log m}{T} \right)$$

where $T$ is the number of strings of input data. Mod $m$ addition is accomplished provided that binary addition is followed by a control step checking for overflow from range $m$. Figure 7 shows the mod $m$ adder structure, whose area and addition time are

$$A_A^* = O\left( \frac{\log m}{T} \cdot \left( \log \frac{\log m}{T} + T \right) \right)$$

$$T_A^* = O\left( \log \frac{\log m}{T} + T \right)$$
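The control step that turns a binary adder into a mod $m$ adder can be modelled as below. This is a behavioural Python sketch with a hypothetical function name; it assumes both operands are already reduced, i.e. $0 \le x, y < m$.

```python
def mod_add(x: int, y: int, m: int) -> int:
    """Mod m addition as a binary addition followed by a range check."""
    s = x + y  # first binary adder
    d = s - m  # second binary adder: adds the constant -m
    # The sign bit of d drives the final selection: if d is negative the
    # plain sum was already in range, otherwise the corrected value is taken.
    return s if d < 0 else d
```

Since $x + y < 2m$, a single conditional subtraction is always enough to bring the sum back into $[0, m)$.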

It is worth noting that, in the hypothesis $T = \log \log m$, which corresponds to Brent and Kung's assumption leading to an optimal design, the mod $m$ adder depicted in Fig. 7 exhibits the same area and time complexity as a binary adder with $\log m$ bit inputs.

Fig. 7. The residue adder.

A solution to the problem of mod $m$ multiplication has been proposed by Alia and Martinelli [22], who suggested the following scheme (Fig. 8), based on a binary adder and on the binary multiplier by Mehlhorn and Preparata [20], exhibiting the following complexity figures

$$A_M = O\left( \frac{\log m}{T} \cdot \frac{\log m}{T} \right)$$

$$T_M = O\left( T + \log \frac{\log m}{T} \right)$$

where $T$ is the number of strings of input data. The scheme presented in [22] stems from the observation that, for any pair of integers $X$ and $Y$, it is

$$|XY|_m = XY - k \cdot m, \qquad k = \lfloor XY/m \rfloor$$

The major obstacle in performing modular multiplication lies in computing $k$ with a maximum error not exceeding 1; it has been shown that a sufficient condition consists in multiplying $XY$ by an approximation of $1/m$ obtained by considering the $2n$ leftmost digits. This is equivalent to saying that a $2 \log m \times 2 \log m$ bit multiplication is required.
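The idea of estimating $k$ through a truncated reciprocal can be illustrated with a Barrett-style reduction. The sketch below is an assumption-laden illustration, not the scheme of [22]: the name `make_mod_mul` is hypothetical, the reciprocal is truncated to $2b$ fractional bits with $b$ the bit length of $m$, and operands are assumed to satisfy $0 \le x, y < m$.

```python
from typing import Callable


def make_mod_mul(m: int) -> Callable[[int, int], int]:
    """Build a mod m multiplier using a precomputed approximation of 1/m."""
    b = m.bit_length()           # roughly log m bits
    recip = (1 << (2 * b)) // m  # floor(2^(2b) / m), the stored constant

    def mod_mul(x: int, y: int) -> int:
        z = x * y                   # log m by log m bit product, z < 2^(2b)
        k = (z * recip) >> (2 * b)  # estimate of floor(z / m), low by at most 1
        r = z - k * m               # hence 0 <= r < 2m
        if r >= m:                  # single correction step
            r -= m
        return r

    return mod_mul
```

Because the truncated reciprocal never exceeds $2^{2b}/m$ and $z < 2^{2b}$, the estimate of $k$ is either exact or one too small, which is why one conditional subtraction suffices.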

Fig. 8. The residue multiplier.

It is easy to realize that area and time complexity of mod $m$ multiplication are

$$A_M^* = O\left( \frac{\log m}{T} \cdot \left( \frac{\log m}{T} + T \right) \right)$$

$$T_M^* = O\left( \log \frac{\log m}{T} + T \right)$$

5.2. RNS with MI adder

In the RNS with MI which will be considered here, it is assumed that the residue part has $n + 1$ moduli $m_i$, $i = 1, \ldots, n + 1$, where

$$\mu = \prod_{i=1}^{n} m_i, \qquad m_R = m_{n+1}$$

Moreover, the following assumptions will be adopted:

(i) all moduli and the MI range will be assumed as

$$m_i = \Theta(m), \quad i = 1, \ldots, n + 1; \qquad \frac{P}{\mu} = \Theta(m) \tag{24}$$

(ii) the number of moduli is an upper bound to their bit length:

$$\log m_i = O(n), \quad i = 1, \ldots, n + 1 \tag{25}$$

To evaluate the complexity of the overall architecture layout, let us refer again to Fig. 1, where the operating part consists of $n + 3$ adders arranged in a single row whose dimensions are

$$W_0 = O\left( (n + 3) \frac{\log m}{T} \right) = O\left( n \frac{\log m}{T} \right), \qquad L_0 = O\left( \log \frac{\log m}{T} \right)$$

Both overflow and number legitimacy detection procedures require only a constant number of binary adders and less complex devices (registers and sparse gates). The structure implementing the base extension algorithm has the following dimensions, referring to Figs. 4 and 5:

$$W_n = O\left( n \frac{\log m}{T} \right), \qquad L_n = O\left( \frac{\log m}{T} + nT \right)$$

As a consequence, the total area $A_{res}$ of the RNS with MI adder is given by

$$A_{res} = O\left( \max(W_0, W_n) \cdot (L_0 + L_n) \right) = O\left( n \frac{\log m}{T} \cdot \left( \log \frac{\log m}{T} + \frac{\log m}{T} + nT \right) \right) = O\left( n \frac{\log m}{T} \cdot \left( \frac{\log m}{T} + nT \right) \right)$$

and, recalling assumption (25),

$$A_{res} = O\left( n \frac{\log m}{T} \cdot nT \right) = O(n^2 \log m) \tag{26}$$

The addition time can be evaluated according to the following considerations. Both residue and binary adders exhibit the same response time and, consequently, the operating part produces its results within a time interval

$$T_0 = O\left( T + \log \frac{\log m}{T} \right)$$

In the control part, the normalization structure is the slowest one, as $O(n)$ steps must be carried out, each consisting of a modular addition and multiplication. Then we have

$$T_n = O\left( n \left( T + \log \frac{\log m}{T} \right) \right)$$

Denoting by $f$ the probability of a normalization request, the control part response time affects the single addition by the quantity $f T_n$, and the following figure is obtained for the total addition time

$$T_{res} = T_0 + f T_n = O\left( (1 + f n) \left( T + \log \frac{\log m}{T} \right) \right) \tag{27}$$

Relation (27) shows that the total addition time $T_{res}$ is strongly dependent on the function $f$, which is determined by the current application. In this work a very usual application will be considered, i.e., the case when undefined sequences of normalized integers are added. Recalling assumption (24), eqn. (17) takes the form

$$t^* = m_R$$

and then

$$f \le \frac{1}{m_R - 1}$$

as at least $m_R - 1$ additions are allowed without normalization request. This implies

$$f = O(1/m)$$

Substituting for $f$ in (27), it is obtained

$$T_{res} = O\left( \left( 1 + \frac{n}{m} \right) \left( T + \log \frac{\log m}{T} \right) \right)$$

and, finally, from assumption (25)

$$T_{res} = O\left( T + \log \frac{\log m}{T} \right) \tag{27'}$$

which shows that the time complexity of the overall RNS with MI adder coincides with the time complexity of a single mod $m$ adder.

A reasonable evaluation of the results in relations (26) and (27') can be carried out by considering a conventional binary adder processing data in the same range $[-P, P)$, with $P = O(\mu m)$, for which [19] it is

$$A_{bin} = O\left( \frac{\log(m\mu)}{T'} \cdot \log \frac{\log(m\mu)}{T'} \right), \qquad T_{bin} = O\left( T' + \log \frac{\log(m\mu)}{T'} \right)$$

where $T'$ denotes the number of strings of input data. From assumption (24) we can write $\mu = \Theta(m^n)$ and then, after some algebra:

$$A_{bin} = O\left( \frac{n \log m}{T'} \cdot \left( \log n + \log \frac{\log m}{T'} \right) \right), \qquad T_{bin} = O\left( T' + \log n + \log \frac{\log m}{T'} \right) \tag{28}$$

Recalling that Brent and Kung's adder is time optimal for

$$1 \le T' \le \log \log(m\mu) = \log(n + 1) + \log \log m$$

it is obtained

$$T_{bin} = O(\log \log(m\mu)) = O(\log n + \log \log m)$$

On the other hand, assumption (25) can be given the equivalent formulation $n = \Omega(\log m)$ and, necessarily, $\log n = \Omega(\log \log m)$, so that $T_{bin}$ becomes

$$T_{bin} = O(\log n)$$

Comparing this relation with (27') where, for $1 \le T \le$ …

$$A_{bin} = O\left( \frac{n \log m}{T'} \cdot \log n \right) \tag{28'}$$

which is minimized, according to time optimization requirements, when $T' = O(\log \log(m\mu)) = O(\log n + \log \log m) = O(\log n)$, i.e.

$$A_{bin} = O(n \log m)$$

From a comparison of $A_{bin}$ and $T_{bin}$ with the corresponding expressions of $A_{res}$ and $T_{res}$ it can be concluded that our solution enhances the speed at the cost of a larger area with respect to the binary adder.

Acknowledgement

This research has been supported by the National Program on Solid-State Electronics and Devices of the Italian National Research Council.

References

[1] Garner, H.L., The residue number system, IRE Trans. Electron. Comput. EC-8 (June 1959) 140-147.
[2] Szabo, N.S. and R.I. Tanaka, Residue Arithmetic and its Applications to Computer Technology (McGraw-Hill, New York, 1967).
[3] Szabo, N., Sign detection in non redundant residue systems, IRE Trans. Electron. Comput. EC-11 (August 1962) 495-500.
[4] Banerji, D.K. and J.A. Brzozowski, Sign detection in residue number systems, IEEE Trans. Comput. C-18 (April 1969) 313-320.
[5] Taylor, F.J. and C.H. Huang, An autoscaler residue multiplier, IEEE Trans. Comput. C-31 (April 1982) 321-325.
[6] Taylor, F.J., A VLSI residue arithmetic multiplier, IEEE Trans. Comput. C-31 (June 1982) 540-546.
[7] Taylor, F.J., An overflow free residue multiplier, IEEE Trans. Comput. C-32 (May 1983) 501-504.
[8] O'Keefe, K.H., A note on fast base extension for residue number systems with three moduli, IEEE Trans. Comput. C-24 (November 1975) 1132-1133.
[9] Huang, C.H., A fully parallel mixed-radix conversion algorithm for residue number applications, IEEE Trans. Comput. C-32 (April 1983) 398-402.
[10] Ulman, Z.D., Sign detection and implicit-explicit conversion of numbers in residue arithmetic, IEEE Trans. Comput. C-32 (June 1983) 590-594.
[11] Alia, G. and E. Martinelli, A VLSI algorithm for direct and reverse conversion from weighted binary number system and residue number system, IEEE Trans. Circ. Syst. CAS-31 (December 1984) 1033-1039.
[12] Alia, G., F. Barsi and E. Martinelli, A fast VLSI conversion between binary and residue systems, Inf. Process. Lett. 18 (3) (March 1984) 141-145.
[13] Sasaki, A., Addition and subtraction in the residue number system, IEEE Trans. Electron. Comput. EC-16 (April 1967) 157-164.
[14] Sasaki, A., The basis for implementation of additive operations in residue number system, IEEE Trans. Comput. C-17 (November 1968) 1066-1073.
[15] Rao, T.R.N. and A.K. Trehan, Binary logic for residue arithmetic using magnitude index, IEEE Trans. Comput. C-19 (August 1970) 752-757.
[16] Barsi, F. and P. Maestrini, Arithmetic codes in residue number systems with magnitude index, IEEE Trans. Comput. C-27 (December 1978) 1185-1188.
[17] Alia, G., F. Barsi and E. Martinelli, Addition and overflow handling in a class of redundant RNS with Magnitude Index, IEI Internal Report B4-33, Pisa, Italy, Dec. 1987.
[18] Barsi, F., Arithmetic properties of a class of hybrid number systems, IEI Internal Report B4-01, Pisa, Italy, Feb. 1989.
[19] Brent, R.P. and H.T. Kung, A regular layout for parallel adders, IEEE Trans. Comput. C-31 (2) (March 1982) 260-264.
[20] Mehlhorn, K. and F.P. Preparata, Area-time optimal VLSI integer multiplier with minimum computation time, Inf. Contr. 58 (1983) 137-156.
[21] Thompson, C.D., A Complexity Theory for VLSI, Ph.D. Thesis, Carnegie-Mellon University, Computer Science Dept., Aug. 1980.
[22] Alia, G. and E. Martinelli, The VLSI residue multiplication and its implication in the direct and reverse positional-to-residue conversion, to appear in IEEE Trans. Comput.