1 September 1996, Optics Communications, Elsevier
Optics Communications 129 (1996) 323-330
Optical negabinary addition with higher-order substitution rules
Guoqiang Li, Liren Liu *, Lan Shao, Yaozu Yin
Information Optics Laboratory, Shanghai Institute of Optics and Fine Mechanics, Academia Sinica, P.O. Box 800-211, Shanghai 201800, China
Received 28 September 1995; accepted 27 February 1996
Abstract
Optical higher-order negabinary addition is investigated for fast operation. A new method for recognizing multiple reference patterns in a channel is presented, and thus the substitution rules are logically minimized, resulting in fewer channels for processing. A content-addressable-memory based incoherent optoelectronic symbolic substitution architecture is proposed. The system is simple and flexible, and the multi-channel operations for all the modules are performed in parallel. An experiment is demonstrated.
1. Introduction
Addition is the most fundamental arithmetic operation. Optical systems are suitable for this operation because of their capability for parallel information processing and their large space-bandwidth product. The speed of ripple-carry addition is limited by the serial carry propagation. One popular technique for implementing this algorithm is symbolic substitution [1-3] (SS), which performs 2-D parallel bitwise pattern transformation logic. To decrease the number of processing steps, Kozaitis [4] suggested the use of higher-order symbolic substitution, by which two or more pairs of bits are recognized, so that the rules are applied fewer times to complete a given operation and the system throughput increases. Another method for parallel addition is based on the carry-look-ahead [5,6] (CLA) technique. In this method, the operands are divided into modules of equal width. In each module, the smaller number of bits are added in parallel, and the modules are cascaded to perform the addition of longer digital words. On this basis, Datta and Seth [7] have incorporated the higher-order SS into the modified CLA adder in an iterative array architecture. However, in their scheme the binary encoding can only process positive numbers and the optical system requires an array of modules including cylindrical lenses.
Other carry-free operations employ the residue [8] and the modified signed-digit [9-11] (MSD) representations. The residue number system has difficulties in scaling, sign detection, and implementing large prime modulo logic elements when large numbers are processed. The MSD is a redundant number system, and bipolar-valued digits need to be handled. An approach similar to SS for fast computation is content-addressable memory [12] (CAM). In a CAM, all the possible inputs, known as reference patterns, and their corresponding outputs are prestored. During processing, the inputs are compared to all the reference patterns and the corresponding outputs are obtained at the same time. Therefore CAM-based optical SS implementation schemes have been proposed, including coherent angularly multiplexed holograms [9-15], joint transform correlators [9,10], and incoherent optical systems [5,16-18]. The first method is difficult to realize because of its alignment problem and low diffraction efficiency [5,16,19]. For this reason, no experimental results of such holographic CAM-based SS systems have been presented. Most MSD optical computing systems fall into this category.
Since negabinary encoding can uniquely represent any positive or negative number without a sign bit, the arithmetic can be performed at the digit level rather than the word level [20,21]. In this paper, the negabinary CLA addition is realized with the higher-order SS operation. As a proof of concept, 2-bit-wide modules are cascaded and the corresponding 16 rules are derived. Each bit is encoded spatially and separately by four subpixels, and the two bits to be added are overlapped as the input. To realize the CAM pattern recognition, the AND operation is performed between the input and the complemented spatial reference pattern, and the low intensity pixel represents the occurrence of the reference pattern. With this method, multiple reference patterns having the same output pattern can be recognized simultaneously in a channel by AND-ing these reference patterns to form a combinational reference pattern. Thus the 16 rules are classified into 7 cases in terms of the 7 different output patterns. After logical minimization, these rules are reduced to 10 combinational ones. For implementation, an optoelectronic CAM-based SS architecture for multi-channel operation is suggested. In the system, the CAM recognition is accomplished by an incoherent correlator, and the output patterns are substituted and fed back by a personal computer as the input for the next iteration. The operations of all the modules are performed in parallel, and the system is very simple and flexible. For demonstration, a numerical experiment is shown.
Fig. 1. Negabinary CLA addition with higher-order substitution rules. (a) Basic addition module. (b) Substitution rules. (c) Iterative modular array architecture. (d) Numerical example.
* Corresponding author. Fax: +86 21 59528885, E-mail: [email protected].
0030-4018/96/$12.00 Copyright © 1996 Elsevier Science B.V. All rights reserved. PII S0030-4018(96)00165-4
2. Higher-order negabinary addition
Symbolic substitution performs logic or arithmetic operations according to predetermined substitution rules. These rules correspond to a kind of spatial truth table. Before presenting the higher-order negabinary addition rules, we briefly discuss the rules for 1-bit addition.
2.1. Negabinary addition
Negabinary is a positional number system based on the integer powers of -2. It is evident that any positive or negative number has a unique representation in this system. Consider two N-bit operands a and b given in the form
$$a = \sum_{i=0}^{N-1} a_i (-2)^i, \quad b = \sum_{i=0}^{N-1} b_i (-2)^i, \quad a_i, b_i \in \{0, 1\}.$$
There are four combinations for the digits $a_i$ and $b_i$, namely 00, 01, 10, and 11. As in the positive binary case, the negabinary addition of the first three combinations does not generate any carry. However, addition of the fourth combination generates twin carries to the next two higher bit positions. This is due to the following identity:
$$1 \cdot (-2)^i + 1 \cdot (-2)^i = 0 \cdot (-2)^i + 1 \cdot (-2)^{i+1} + 1 \cdot (-2)^{i+2}.$$
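The negabinary representation and the twin-carry identity can be checked numerically. The following is a minimal sketch; the helper names `to_negabinary`, `from_negabinary`, and `twin_carry_holds` are illustrative and do not come from the paper.

```python
def to_negabinary(n: int) -> str:
    """Return the negabinary (base -2) digit string of the integer n."""
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        n, r = divmod(n, -2)      # Python gives r in {0, -1} for divisor -2
        if r < 0:                 # move the remainder into {0, 1}
            n, r = n + 1, r + 2
        digits.append(str(r))
    return "".join(reversed(digits))


def from_negabinary(bits: str) -> int:
    """Evaluate a negabinary digit string, most significant digit first."""
    return sum(int(d) * (-2) ** i for i, d in enumerate(reversed(bits)))


def twin_carry_holds(max_position: int) -> bool:
    """Check 1*(-2)^i + 1*(-2)^i == 1*(-2)^(i+1) + 1*(-2)^(i+2) for i < max_position."""
    return all(2 * (-2) ** i == (-2) ** (i + 1) + (-2) ** (i + 2)
               for i in range(max_position))


if __name__ == "__main__":
    print(to_negabinary(-9), to_negabinary(2), twin_carry_holds(16))
```

The operands used in the numerical example of the paper, -9 and 2, come out as 1011 and 110 (0110 when padded to four digits).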
The nature of the twin-carry generation in negabinary is different from the algorithms of other number systems and matches well the representation of higher-order symbolic addition.
2.2. Higher-order addition
If the above addition rules are used for N-bit numbers, at most (N + 1) iterations are required, since only one pair of digits is processed at each position. With higher-order SS rules, longer words are handled in each step, as in the CLA adder, so the number of iterations is reduced and a higher computational speed can be expected. For demonstration, we apply this method to the addition of two N-bit negabinary numbers using 2-bit-wide modules. Both numbers are separated into such modules. If N is odd, a zero is padded at the most significant bit. Hence, without loss of generality, N is assumed to be even in the following. First we consider the basic operation on each module with the inputs $a_{i+1}a_i$ and $b_{i+1}b_i$ (Fig. 1(a)). The result of the addition contains a 2-bit sum $s_{i+1}s_i$ and a 2-bit carry $c_{i+3}c_{i+2}$. The input-output relation can be represented by a higher-order SS rule, shown in the right part of the figure, instead of a Boolean logic expression. The left-hand side of the rule is the reference pattern, while the right-hand side is the output pattern. In the reference pattern, the operand $a_{i+1}a_i$ is placed over the other operand $b_{i+1}b_i$, and in the output pattern, the lower row corresponds to the sum $s_{i+1}s_i$ and the upper row corresponds to the carry $c_{i+3}c_{i+2}$. There are 16 combinations in total for the four input bits. Correspondingly, the 16 higher-order substitution rules are listed in Fig. 1(b). The values of the digits rather than their spatial encoding are shown. To perform the N-bit addition, N/2 + 1 modules are cascaded and operated iteratively as shown in Fig. 1(c). After the jth iteration, the sum $s_{2j-1}s_{2j-2}$ from module j is final and need not be processed further, but for every other module (numbered k, k > j), the sum $s_{2k-1}s_{2k-2}$ from itself and the carry $c_{2k-1}c_{2k-2}$ from module k - 1 are taken as the addend and the augend, respectively, for the (j + 1)th iteration.
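Because the 16 rules follow directly from 2-bit negabinary arithmetic, they can be regenerated rather than read from the figure. The sketch below does this under the convention used in the text (carry $c_{i+3}c_{i+2}$, sum $s_{i+1}s_i$); the names `neg_value`, `module_rule`, and `RULES` are illustrative assumptions, not the authors' notation.

```python
def neg_value(bits: str) -> int:
    """Integer value of a negabinary digit string, most significant digit first."""
    return sum(int(d) * (-2) ** i for i, d in enumerate(reversed(bits)))


# Every 4-digit negabinary string, indexed by its (unique) integer value.
FOUR_DIGIT = {neg_value(format(n, "04b")): format(n, "04b") for n in range(16)}


def module_rule(a: str, b: str) -> tuple:
    """Return (carry, sum) as 2-digit strings for the 2-digit inputs a and b."""
    out = FOUR_DIGIT[neg_value(a) + neg_value(b)]   # a + b lies in [-4, 2]
    return out[:2], out[2:]


PAIRS = ("00", "01", "10", "11")
RULES = {(a, b): module_rule(a, b) for a in PAIRS for b in PAIRS}

if __name__ == "__main__":
    for (a, b), (carry, total) in RULES.items():
        print(f"a={a} b={b} -> carry={carry} sum={total}")
```

For instance, the inputs 11 and 10 (values -1 and -2) give carry 11 and sum 01, since 1101 is the negabinary form of -3.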
Fig. 2. (a) Spatial encoding of the operands. (b) Example for the input 10 and 01.
Attention should be paid to the leftmost module (numbered N/2 + 1). In the 3rd iteration, the twin carries $c_{N+1}c_N$ generated by module N/2 in the two preceding iterations are added, generating the sum $s_{N+1}s_N$. No carries are produced by this module. From the 4th iteration on, the sum $s_{N+1}s_N$ and the carries $c_{N+1}c_N$ are added in this module iteratively. To complete the whole operation, the maximum number of iterations is N/2 + 1. For illustration, the addition of -9 (1011) and 2 (0110) with N = 4 is demonstrated in Fig. 1(d).
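The iterative modular scheme can be simulated in software as a rough check. The sketch below is an assumption-laden model, not the authors' implementation: the function names are illustrative, the rule table is generated from arithmetic, and the leftmost module simply adds the incoming carry to its running sum in every iteration, which is a simplified schedule that gives the same final result as the one described above.

```python
def neg_value(bits: str) -> int:
    """Integer value of a negabinary digit string, most significant digit first."""
    return sum(int(d) * (-2) ** i for i, d in enumerate(reversed(bits)))


FOUR_DIGIT = {neg_value(format(n, "04b")): format(n, "04b") for n in range(16)}


def module_rule(a: str, b: str) -> tuple:
    """(carry, sum) of one 2-bit module."""
    out = FOUR_DIGIT[neg_value(a) + neg_value(b)]
    return out[:2], out[2:]


def add_negabinary(a: str, b: str) -> tuple:
    """Add two N-digit negabinary strings (N even).

    Returns the (N + 2)-digit sum and the number of iterations used.
    """
    n = len(a)
    assert n == len(b) and n % 2 == 0
    modules = n // 2 + 1
    # Module 1 holds the least significant digit pair; the leftmost module starts empty.
    x = [a[n - 2 * k: n - 2 * k + 2] for k in range(1, modules)] + ["00"]
    y = [b[n - 2 * k: n - 2 * k + 2] for k in range(1, modules)] + ["00"]
    iterations = 0
    while True:
        iterations += 1
        results = [module_rule(p, q) for p, q in zip(x, y)]  # all modules at once
        carries = [c for c, _ in results]
        sums = [s for _, s in results]
        assert carries[-1] == "00"      # the leftmost module produces no carry
        x = sums                         # each module keeps its own sum
        y = ["00"] + carries[:-1]        # carry of module k - 1 feeds module k
        if all(c == "00" for c in y):
            return "".join(reversed(x)), iterations


if __name__ == "__main__":
    print(add_negabinary("1011", "0110"))
```

For the example of the text, 1011 (-9) plus 0110 (2), the simulation ends after two iterations with 001001, i.e. 1001 = -7, which is within the bound of N/2 + 1 = 3 iterations.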
3. Optical implementation
3.1. Spatial encoding
Usually, the binary values 0 and 1 are spatially represented by a bright-dark column pattern with two or more subpixels, and the encoded digits are put one over the other. Here the encoding method of the conventional shadow-casting system [22] is used, and the encoded digits are superimposed as the input, where each pixel representing a pair of digits is composed of four subpixels (shown in Fig. 2(a)). In each module, the two encoded pixels are put side by side. The encoding patterns of the operands 10 and 01 are illustrated in Fig. 2(b).
3.2. Multiple-pattern recognition and substitution-rule reduction
In an optical CAM-based SS system, the input to each module is compared with all the reference patterns in parallel. If a match happens at a position, the input is substituted by the corresponding output. To recognize the given input, two techniques can be adopted: the additive and the multiplicative logic [1,18,23]. With the multiplicative technique, the AND operation is performed between the inputs and all the bright subpixels of the reference pattern, and the high intensity position indicates the occurrence of the reference pattern, whereas with the additive technique, the OR operation is performed between the input and all the dark subpixels of the reference pattern, and the low intensity position indicates the occurrence of the reference pattern. Besides these, Eichmann et al. [18] proposed a technique in which the AND operation is performed between the input and the complemented reference pattern, and the low intensity position indicates a match with the reference pattern.
One of the problems facing SS and CAM is the large number of reference patterns to be recognized. Since each rule requires a recognition-substitution channel, too many rules require bulky hardware. For easier implementation, the reference patterns must be logically minimized using methods such as the Karnaugh map and the Quine-McCluskey technique [24]. Based on the spatial encoding, we present here a technique for recognizing multiple reference patterns in a channel with a combinational reference pattern. From Fig. 1(b), it is seen that there are 7 different output patterns, i.e., 0000 (corresponding to the carry 00 and the sum 00), 0001, 0010, 0011, 0110, 1100, and 1101. If we group the reference patterns having the same output pattern into one class (see Table 1) and construct a single combinational reference pattern for each class, the substitution rules can be reduced and fewer channels are needed.
Table 1. Classification of the output-input patterns. Output patterns are written as carry followed by sum; input patterns as $a_{i+1}a_i$ followed by $b_{i+1}b_i$.
Class   Output pattern   Input patterns
1       00 00            00 00, 01 11, 11 01
2       00 01            00 01, 01 00
3       00 10            00 10, 10 00, 11 11
4       00 11            00 11, 01 10, 10 01, 11 00
5       01 10            01 01
6       11 00            10 10
7       11 01            10 11, 11 10
Fig. 3. Recognition of single and multiple reference patterns. (a) Input pattern. (b) CAM pattern for 1001 (10 + 01). (c) The result of the AND operation between the input and CAM patterns. (d), (e), (f), and (g) The CAM patterns for 1001, 1100, 0110, and 0011, respectively. (h) The combinational reference pattern. (i) Periodic CAM pattern. (j) The result showing the occurrences of multiple reference patterns.
As an example, consider the two operands 1110110100 and 0001011011. Assume we wish to recognize the input pattern 1001 (10 + 01). First the inputs are divided into five 2-bit-wide modules and spatially encoded as in Fig. 3(a). The coded reference pattern is complemented and replicated periodically five times to form the corresponding CAM reference pattern (Fig. 3(b)). The input pattern and the CAM pattern are then overlapped to carry out the AND operation (Fig. 3(c)). The occurrence of the reference pattern is indicated by a dark 4 × 2 subpixel area (position 4 in the example). Optical elements can be used to integrate the light intensity of the 8 subpixels, and a low threshold value must be chosen. Since the input combinations 1100, 1001, 0110, and 0011 have the same output pattern (see class 4 in Table 1), it is better to recognize the four reference patterns with a single mask and substitute the same output in parallel. To do this, one can complement the four coded reference patterns as in Figs. 3(d), (e), (f), and (g) for 1001, 1100, 0110, and 0011, respectively, and superpose them, i.e., execute the AND operation, to get a combinational reference pattern (Fig. 3(h)). With a periodically replicated version of this pattern (Fig. 3(i)), all the occurrences of the four symbols can be recognized simultaneously (see positions 1, 2, 4, 5 in Fig. 3(j)) and the output is obtainable in parallel.
Similarly, the combinational reference patterns for the other 6 classes in Table 1 can be designed. However, the prime implicants must be determined first. Because the original reference pattern comprises two separate digit pairs, the combinational reference pattern obtained in this way for 0000, 1101, and 0111 of class 1 also recognizes 1000 and 0010 of class 3 and 0101 of class 5, which have different output patterns. Similar errors exist for the combinational pattern of class 3. Therefore, the 3 reference patterns in class 1 and the case 1111 of class 3 should be encoded separately. This results in a total of 10 logically minimized combinational reference patterns, as shown in Table 2.
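The recognition and reduction steps can be sketched in software. The exact subpixel layout of Fig. 2 is not reproduced here, so the sketch assumes a one-hot encoding in which each digit pair ($a_k$, $b_k$) lights exactly one of its four subpixels; all function names are illustrative. Under this assumption the model reproduces the behaviour described in the text: the class 4 mask is exact, the class 1 mask also accepts 1000, 0010, and 0101, and the ten channels of the reduced rule set are exact.

```python
ALL_INPUTS = [format(n, "04b") for n in range(16)]   # a_{i+1} a_i b_{i+1} b_i


def encode(pattern: str) -> tuple:
    """Assumed encoding: one bright subpixel (1) out of four per digit pair."""
    a, b = pattern[:2], pattern[2:]
    subpixels = []
    for a_digit, b_digit in zip(a, b):          # pair (a_{i+1}, b_{i+1}) first
        cell = [0, 0, 0, 0]
        cell[2 * int(a_digit) + int(b_digit)] = 1
        subpixels.extend(cell)
    return tuple(subpixels)                      # 8 subpixels per module


def cam_mask(patterns) -> tuple:
    """AND together the complemented encodings of the reference patterns."""
    mask = [1] * 8
    for p in patterns:
        mask = [m & (1 - e) for m, e in zip(mask, encode(p))]
    return tuple(mask)


def is_match(mask, pattern: str) -> bool:
    """A match is a fully dark area: zero integrated intensity after the AND."""
    return sum(m & e for m, e in zip(mask, encode(pattern))) == 0


def recognized(patterns) -> list:
    """All 4-bit inputs accepted by the combinational mask of `patterns`."""
    mask = cam_mask(patterns)
    return [p for p in ALL_INPUTS if is_match(mask, p)]


def match_positions(a: str, b: str, patterns) -> list:
    """Module positions (1 = least significant pair) where the mask matches."""
    mask = cam_mask(patterns)
    n = len(a)
    return [k for k in range(1, n // 2 + 1)
            if is_match(mask, a[n - 2 * k: n - 2 * k + 2] + b[n - 2 * k: n - 2 * k + 2])]


CLASS_1 = ["0000", "0111", "1101"]
CLASS_4 = ["0011", "0110", "1001", "1100"]
TEN_CHANNELS = [["0000"], ["1101"], ["0111"], ["0001", "0100"], ["0010", "1000"],
                ["1111"], CLASS_4, ["0101"], ["1010"], ["1011", "1110"]]

if __name__ == "__main__":
    print(recognized(CLASS_1))
    print(match_positions("1110110100", "0001011011", CLASS_4))
```

Counting positions from the least significant module is also an assumption; it is chosen because it agrees with the positions quoted in the text (4 for the single pattern 1001, and 1, 2, 4, 5 for the class 4 mask).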
Table 2. Logically minimized CAM masks. Output patterns are written as carry followed by sum; input patterns as $a_{i+1}a_i$ followed by $b_{i+1}b_i$. The CAM mask column of the table is graphical and is not reproduced.
Channel   Output pattern   Input patterns
1         00 00            00 00
2         00 00            11 01
3         00 00            01 11
4         00 01            00 01, 01 00
5         00 10            00 10, 10 00
6         00 10            11 11
7         00 11            00 11, 01 10, 10 01, 11 00
8         01 10            01 01
9         11 00            10 10
10        11 01            10 11, 11 10
3.3. Experiment
For implementation, the optoelectronic CAM-based higher-order SS system depicted in Fig. 4 is used.
Fig. 4. Incoherent optoelectronic CAM-based SS system.
The optical part is an incoherent correlator for recognition. Three spatial light modulators (SLMs) are employed. Two of them, SLM1 and SLM2, are for the input of the operands a and b, respectively, and they are placed close together. The third one, SLM3, encodes the CAM combinational reference patterns. In the input elements, the digit pairs are spatially encoded in a row as in Fig. 5(a), where the input pattern of the previous numerical example (1011 + 0110) is shown. In SLM3, the 10 combinational reference patterns are arranged consecutively in a column as in Fig. 5(b). In this architecture, the input to each module is correlated with the CAM mask in parallel, and the intensity over the corresponding 4 × 2 subpixel area is integrated by the spherical lens. Fig. 5(c) is the intensity pattern in the correlation plane (i.e., the focal plane). Only the points marked by squares correspond to the exact superposition of the input and the CAM pattern and need to be detected, so a decoding mask (Fig. 5(d)) that is transparent at these points is used. Note that the module order is reversed in this plane in comparison with that of the input, due to the inverse imaging property of the lens. The intensity is detected by a CCD camera, which is connected to a personal computer. For each module, the result of the correlation contains 10 channels, and only the dark one represents a match between the input and the combinational reference pattern. The corresponding intermediate output pattern is then substituted and treated as the input for the next iteration by the computer. This process repeats at most N/2 + 1 times. In the experiment, the detector threshold is set between the accumulated off-state level, i.e., the total fan-in times the off-state leakage light, and the level of a single logic one.
Fig. 5. (a) Encoding of the input. (b) CAM mask with 10 channels. (c) Correlation pattern in the focal plane. (d) Decoding mask.
Fig. 6. Experimental results. (a) The correlation intensity pattern in the first iteration. (b) The result after decoding. (c) The input for the 2nd iteration. (d) The result after decoding.
The experimental parameters are as follows. The size of each subpixel is 2 × 2 mm, the distance D between the input and the CAM mask is 400 mm, and the focal length f of the lens is 240 mm. The experimental result is shown in Fig. 6. For the first iteration, the intensity distribution in the correlation plane and that after decoding are shown in Figs. 6(a) and (b), respectively. In Fig. 6(b), it is seen that for module 1 the 10th channel is dark, corresponding to the output 11 (carry) 01 (sum); for module 2 the 7th channel is dark, corresponding to the output 00 (carry) 11 (sum). In the second iteration, the operands to be added in module 2 are 11 and 11, spatially encoded as in Fig. 6(c). The intensity pattern after decoding is shown in Fig. 6(d), where the 6th channel is dark, corresponding to the output 00 (carry) 10 (sum). Therefore, the final sum is 1001, i.e., -7 = -9 + 2.
4. Discussion
The higher-order negabinary addition is faster than the original ripple-carry addition, but its speed still depends on the operand length and it is thus slower than the carry-free MSD addition. However, the proposed technique is superior to those MSD schemes in terms of hardware complexity and the amount of memory. For example, in the two-step CAM-based MSD system [15], 12 holograms and 12-channel holographic correlators must be aligned for each of the output bits, and in the one-step CAM-based system, 56 holograms and 56-channel holographic correlators must be recorded and arranged for a single output bit. As described in the Introduction, the implementation of such coherent systems with a large number of channels is difficult, and they suffer from limited diffraction efficiency and space-bandwidth product. The method proposed in this paper requires only 10 channels to obtain all the output bits. Moreover, the suggested incoherent CAM system is much easier to implement, and the detector only needs to tell whether there is light or no light. Generally speaking, this technique is more practicable.
Multiplication can be performed with the higher-order negabinary addition technique. First all the partial products are generated and shifted, and then they are added in an iterative tree structure [25] with multiple CAM systems. Similarly, with the negabinary encoding, the higher-order substitution rules for subtraction can also be designed. Division is implemented on the basis of subtraction.
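The multiplication scheme mentioned above can be outlined in a few lines. This is only a sketch under stated assumptions: the function names are illustrative, and the `add` helper uses ordinary integer arithmetic as a stand-in for the optical higher-order adder, so only the partial-product generation and the pairwise tree reduction are modeled.

```python
def neg_value(bits: str) -> int:
    """Integer value of a negabinary digit string, most significant digit first."""
    return sum(int(d) * (-2) ** i for i, d in enumerate(reversed(bits)))


def to_negabinary(n: int) -> str:
    """Negabinary digit string of the integer n."""
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        n, r = divmod(n, -2)
        if r < 0:
            n, r = n + 1, r + 2
        digits.append(str(r))
    return "".join(reversed(digits))


def add(x: str, y: str) -> str:
    """Stand-in for the optical adder (assumption: plain integer arithmetic)."""
    return to_negabinary(neg_value(x) + neg_value(y))


def multiply_negabinary(a: str, b: str) -> str:
    """Multiply via shifted partial products summed in a pairwise tree."""
    # For each digit b_i = 1 the partial product is a shifted left by i digits,
    # because multiplying by (-2)^i appends i zeros.
    partials = [a + "0" * i for i, d in enumerate(reversed(b)) if d == "1"]
    if not partials:
        return "0"
    while len(partials) > 1:                      # one tree level per pass
        paired = [add(partials[k], partials[k + 1])
                  for k in range(0, len(partials) - 1, 2)]
        if len(partials) % 2:
            paired.append(partials[-1])           # odd one out moves up unchanged
        partials = paired
    return to_negabinary(neg_value(partials[0]))  # strip leading zeros


if __name__ == "__main__":
    print(multiply_negabinary("1011", "0110"))
```

With the operands of the earlier example, 1011 (-9) times 0110 (2) gives 110010, the negabinary form of -18.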
5. Conclusion
In this paper, the modified CLA negabinary addition has been performed with the higher-order SS logic. In negabinary bitwise addition, twin carries may be generated. Based on the spatial encoding, we proposed a novel method for the recognition of multiple reference patterns in a channel with a combinational pattern, and the substitution rules are logically minimized. Thus fewer channels are used. A CAM-based incoherent optoelectronic SS system is also suggested. The system exploits the advantages of optics in parallel processing. To further increase the speed of operation, one can use wider addition modules in which more than 2 pairs of digits are handled in one step.
Acknowledgements
This work was supported by the National Natural Science Foundation of China.
References
[1] K.-H. Brenner, A. Huang and N. Streibl, Appl. Optics 25 (1986) 3054.
[2] D.P. Casasent and E.C. Botha, Opt. Eng. 28 (1989) 425.
[3] H.-H. Jeon, M.A.G. Abushagur, A.A. Sawchuk and B.K. Jenkins, Appl. Optics 29 (1990) 2113.
[4] S.P. Kozaitis, Optics Comm. 65 (1988) 339.
[5] A. Kostrzewski, D.H. Kim, Y. Li and G. Eichmann, Optics Lett. 15 (1990) 915.
[6] A. Datta and M. Seth, Appl. Optics 33 (1994) 8146.
[7] A. Datta and M. Seth, Optics Comm. 115 (1995) 245.
[8] A. Huang, Y. Tsunoda, J.W. Goodman and S. Ishihara, Appl. Optics 18 (1979) 149.
[9] M.S. Alam, A.A.S. Awwal and M.A. Karim, Appl. Optics 31 (1992) 2419.
[10] M.S. Alam, M.A. Karim, A.A.S. Awwal and J.J. Westerkamp, Appl. Optics 31 (1992) 5614.
[11] M.S. Alam, K. Jemili and M.A. Karim, Opt. Eng. 33 (1994) 3419.
[12] M.M. Mirsalehi and T.K. Gaylord, Appl. Optics 25 (1986) 2277.
[13] M.S. Alam, Optics Lett. 19 (1994) 353.
[14] A.K. Cherri and M.A. Karim, Appl. Optics 27 (1988) 3824.
[15] Y. Li and G. Eichmann, Appl. Optics 26 (1987) 2328.
[16] Y. Li, D.H. Kim, A. Kostrzewski and G. Eichmann, Optics Lett. 14 (1989) 1254.
[17] B. Ha and Y. Li, Appl. Optics 33 (1994) 3647.
[18] G. Eichmann, A. Kostrzewski, D.H. Kim and Y. Li, Appl. Optics 29 (1990) 2135.
[19] S. Zhou, S. Campbell, P. Yeh and H.K. Liu, Appl. Optics 34 (1995) 793.
[20] G. Li, L. Liu, L. Shao and Y. Yin, Optics Lett. 19 (1994) 1337.
[21] G. Li and L. Liu, Optics Comm. 113 (1994) 15.
[22] J. Tanida and Y. Ichioka, J. Opt. Soc. Am. 73 (1983) 800.
[23] G. Li and L. Liu, Optics Comm. 101 (1993) 170.
[24] M.M. Mirsalehi and T.K. Gaylord, Appl. Optics 25 (1986) 3078.
[25] K. Hwang and A. Louri, Opt. Eng. 28 (1989) 364.