© EUROMICRO, EUROMICRO Journal 6 (1980) 406-409

GD80 - A Multi-Microprocessor Architecture for Computer Graphics

P. Verebély
Computer and Automation Institute of the Hungarian Academy of Sciences, H-1502 Budapest, Kende u. 13-17, Hungary
GD80 is a modular family of refresh, vector-type graphic displays. Five microprocessors for different tasks communicate on two separate buses using interprocessor interrupts, memory windows and shared memory with hardware-supported mutual exclusion. Two identical 8 bit microprocessors are dedicated to peripheral handling and communication, two 16 bit microprogrammable bit-slice processors to picture generation and display list processing, and a 48 bit one to fast floating point arithmetic calculations and matrix-vector transformations. This paper describes the main building blocks, the multiprocessing methods and some typical configurations built using the modules.
1. DESIGN PRINCIPLES

Recent advances in semiconductor memory and microprocessor technologies permitted us to develop a general purpose, multi-microprocessor architecture. We started by analysing the functions of graphic systems. In previous systems a minicomputer performed all the functions of an interactive workplace: handling the interactive input devices, creating a display list from a high level description or mathematical model, picture transformations, application programming and communication. These were software programs communicating with each other. Picture generation from the display list stored in the mini's main memory was the task of the Display Console.

In the GD80 system the tasks listed above are distributed between processors. Some configurations may not contain all the processors needed to perform all the functions of the ideal display system. In this case either the function is missing or it is taken over, at a lower level, by one of the other processors.

In distributing the functions into modules we used the top-down approach. Our intention was to provide a set of modules, both in hardware and in software, from which fairly different configurations and application systems can be built. These configurations include five basic graphic systems and some non-graphic systems as well. In building these systems from the modules we used the bottom-up approach.

2. SYSTEM COMPONENTS

The general architecture of the GD80 system is shown in Figure 1. The system is built around two buses, U1 and U2. Both allow an addressing space of 256 kbytes (18 address lines) and byte or 16 bit word data transfers. The data transfers are performed with priority and master-slave protocols similar to those of the DEC PDP-11 UNIBUS. The traffic on each bus is controlled by a simple bus arbitrator which we call a Bus Controller (BC1 and BC2 respectively). There are no CPUs above the Bus Controllers: only in the case of "traffic jams" (bus errors) is an error message sent directly (not through the bus) to the supervisor processor (GPC), which can initialize both buses if necessary.

Fig. 1. GD80 General Architecture.

The U2 bus serves only for high speed burst data transfers between the Common Memory and the Display Control Unit. U1 is a general purpose bus for data transfers between the processors and the Common Memory. U1 has an additional feature: the interprocessor interrupt facility.

The Common Memory is a dual-port semiconductor memory with a capacity of 16-256 kbytes. Its tasks are to store the display list (the vector format description of the picture on the screen of the CRT) and to serve as a "post box" for all the processors (that is, an intermediate buffer for processor-to-processor communication). Some processors may have programs here, too.

There are two I/O processors in the system: the Host Interface (HIF, the communication processor) and the Graphic Peripheral Controller (GPC). The latter physically handles the operator's interactive input devices (keyboards, control dials, tracking ball, joystick, tablet, etc.), any conventional peripheral devices (paper tape I/O, matrix printer, plotter, etc.) and slow background stores (magnetic tape, floppy disk). The Host Interface performs the low level communication protocols to a host machine, including terminal emulators. (The HIF can be replaced by a direct channel adapter in specific configurations.) Both processors are industry standard 8 bit microprocessors. They have a window to the U1 bus: accesses into the upper 32 kbytes of their addressing space are converted into data transfers on U1.

The Display Control Unit (DCU) generates pictures on the screen of the CRT using its internal peripherals, the graphic (character, vector, function and intensity) generators. It also handles the light pen. The Display Control Unit is a 16 bit microprogrammable processor, microprogrammed specially for picture generation in an infinite loop; this is called refresh (if the CRT is not a storage tube).

The Display Processing Unit (DPU) has the same architecture as the Display Control Unit but has a different microprogram and different internal peripherals. The microprogram implements a powerful minicomputer instruction set, which allows the application program to run on this processor.
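The U1 bus window of the HIF and GPC described earlier in this section can be illustrated with a short sketch. It assumes a 16 bit (64 kbyte) local address space, which is common for 8 bit microprocessors but is not stated in the text, and a hypothetical page register `window_page` that selects which 32 kbyte slice of the 256 kbyte U1 space is visible. The paper does not say how the window is positioned, so this is only one possible arrangement.

```python
# Sketch of the bus window address translation (assumptions marked below).

U1_ADDRESS_LINES = 18                     # from the text: 18 address lines
U1_SPACE = 1 << U1_ADDRESS_LINES          # 262144 bytes = 256 kbytes
LOCAL_SPACE = 1 << 16                     # ASSUMPTION: 64 kbyte local space
WINDOW_SIZE = 32 * 1024                   # from the text: upper 32 kbytes
WINDOW_BASE = LOCAL_SPACE - WINDOW_SIZE   # first local address in the window


def translate(local_addr, window_page=0):
    """Map a local CPU address to ("local", addr) or ("U1", addr).

    window_page is a HYPOTHETICAL register choosing one of the
    32 kbyte slices of the U1 address space.
    """
    if not 0 <= local_addr < LOCAL_SPACE:
        raise ValueError("address outside the local address space")
    if local_addr < WINDOW_BASE:
        # Lower half: private memory and I/O of the 8 bit processor.
        return ("local", local_addr)
    pages = U1_SPACE // WINDOW_SIZE
    if not 0 <= window_page < pages:
        raise ValueError("window page outside the U1 address space")
    # Upper half: the access becomes a data transfer on the U1 bus.
    return ("U1", window_page * WINDOW_SIZE + (local_addr - WINDOW_BASE))
```

With these assumptions, eight window positions cover the whole U1 space, and an access below the window never reaches the bus.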
The internal peripherals of the DPU are high speed background stores (cartridge disks) and a 48 bit microprogrammable "coprocessor", the Transformation Processing Unit (TPU), which acts basically as a fast floating point arithmetic extension of the DPU. These "processor twins" have another task: converting a high level geometric description or mathematical model into a display list which can later be interpreted by the Display Control Unit. In this case the Display Processing Unit converts the data structures of the input high level description into the display list format (it acts as a "list processor") and passes the arithmetic calculations, which are basically geometric transformations (coordinate system transformation, rotation, scaling, translation, clipping and perspective transformations), to the Transformation Processing Unit. To perform fast matrix calculations, the latter is equipped with a parallel multiplier module and a matrix store. The communication between the "twins" is implemented using FIFOs and dual-port memories.

All the microprogrammable processors can be equipped with a writable control store for hardware and firmware diagnostics and to provide user-alterable instruction sets (user microprogramming).

As one can see, the system has two kinds of processors. The Host Interface and the Graphic Peripheral Controller are 8 bit industry standard microprocessors; the Display Control Unit and the Display Processing Unit are horizontally microprogrammed special processors.

3. MULTIPROCESSING METHODS TO PROVIDE INTERPROCESSOR COMMUNICATION
We have used several different methods in the GD80 system. These are the following:

3.1. Common Bus for data transfer and interprocessor interrupt

The U1 Bus is used for data transfers between the processors and the Common Memory. After putting the message data block into the Common Memory, the "source" processor sends an interrupt to the "destination" processor to signal that the data block is ready to be read. Later the "destination" processor sends back an acknowledge interrupt to the "source" processor to signal that it has used the data block or received the message (this is only one example of the use of the interprocessor interrupt).

3.2. Memory sharing with mutual exclusion

If multiple processors work on the same memory (they share memory), the problem of mutual exclusion always comes up. We solved this problem with a small hardware flag memory instead of sophisticated software algorithms or hardware locks.

3.3. Dual-port memories, FIFOs

A dual-port memory is used as the Common Memory, with a second bus (the U2 Bus), to allow simultaneous accesses to different memory locations on the two buses. The DCU cannot wait, otherwise the picture on the screen will start flickering. Dual-port memories and FIFOs are also used between the DPU and the TPU. The FIFOs are used for buffering because the speeds of the two processors may vary. The dual-port memories are used to send immediate messages which cannot wait until they come through the FIFO (control, status and error flags).

3.4. Bus Window

The MIPROBUS WINDOW is used to convert HIF and GPC MIPROBUS cycles into U1 data transfers. The Common Memory can be handled on the MIPROBUS as the processor's own memory (except for access time).

3.5. Microprogrammed multiprocessing

This method is used in the TPU to handle three parallel 16 bit subprocessors.
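The interrupt handshake of Section 3.1 and the flag memory of Section 3.2 can be combined in a small simulation. The paper gives no details of the flag memory, so the sketch assumes that reading a flag returns its old value and sets it in one indivisible bus cycle; the class and function names (`FlagMemory`, `Processor`, `send`, `receive`) and the slot-based "post box" layout are hypothetical.

```python
import threading


class FlagMemory:
    """HYPOTHETICAL model of the hardware flag memory.

    Assumption: one bus cycle reads a flag and sets it, so two
    processors can never both see the flag as free.
    """

    def __init__(self, n_flags):
        self._flags = [False] * n_flags
        self._cycle = threading.Lock()   # stands in for the indivisible cycle

    def test_and_set(self, index):
        with self._cycle:
            old = self._flags[index]
            self._flags[index] = True
            return old                   # True means "already taken"

    def clear(self, index):
        with self._cycle:
            self._flags[index] = False


class Processor:
    """A processor that only records the interrupts it receives."""

    def __init__(self, name):
        self.name = name
        self.pending = []                # list of (sender name, kind)

    def interrupt(self, sender, kind):
        self.pending.append((sender.name, kind))


def send(flags, memory, slot, source, destination, data):
    """Source side: claim a post box slot, fill it, interrupt the reader."""
    if flags.test_and_set(slot):
        return False                     # slot still in use, try again later
    memory[slot] = bytes(data)
    destination.interrupt(source, "data ready")
    return True


def receive(flags, memory, slot, destination, source):
    """Destination side: read the block, free the slot, acknowledge."""
    data = memory[slot]
    memory[slot] = None
    flags.clear(slot)
    source.interrupt(destination, "acknowledge")
    return data
```

A second `send` to the same slot fails until the destination has called `receive`, which is the mutual exclusion property the flag memory is meant to provide.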
The architecture of the TPU can be microprogrammed and changed at run time.

4. CONFIGURATIONS

4.1. Graphic systems

As we said previously, one of our goals was to offer configurations with widely differing intelligence and graphics performance (for the sake of the latter we have two sets of graphic generators with different performance and four screen types available). We call these configurations typical or standard configurations. There are five of them, but because of the modular architecture of the system, not all the parameters (e.g. memory size, peripherals set, etc.) are defined just by specifying the configuration type and graphics performance:

- GD80 BT : Basic Terminal
- GD80 GC : Graphic Computer
- GD80 AGS : Autonomous Graphic System
- GD80 SGS : Satellite Graphics System
- GD80 IT : Intelligent Terminal

Figure 2 shows a Basic Terminal configuration. The Common Memory is 16 kbytes; the only processor is the GPC (besides the DCU, of course), whose task is to handle the alphanumeric keyboard and the communication interface. All the display files are generated in the host machine and sent through the communication link to the GD80 terminal. The GPC puts the received data unchanged into the Common Memory, starts and stops the DCU, and sends input information (from the alphanumeric keyboard and the light pen) back to the host.

Fig. 2. GD80 Basic Terminal (GD80 BT).

The GD80 Intelligent Terminal is a more sophisticated graphic system. Higher level, complex commands are executed in the terminal by the DPU. The connection to the big host can be a direct channel adapter, as shown in Figure 3, or a serial synchronous/asynchronous link via modems.

Fig. 3. GD80 IT.

The Graphic Computer (GD80 GC in Figure 4) is a very simple stand-alone system using the GPC as processing power. The background store is a dual floppy disk with a mini operating system on it. The system can be programmed in BASIC and assembler, with limited graphic functions available at the language level.

Fig. 4. Graphic Computer (GD80 GC).

Fig. 5. GD80 AGS, SGS.
An Autonomous Graphic System (GD80 AGS) is shown in Figure 5. There are four processors in the system (DCU, DPU, TPU, GPC) and 64 kbytes of Common Memory. The peripherals include a 10 Mbyte cartridge disk drive, a 9 track magnetic tape, a matrix printer, a plotter, a joystick, alphanumeric and functional keyboards, a tablet and a light pen. The stand-alone system can be used as an interactive workstation for printed circuit board layout design or as an NC machine tool programming station, both with high performance graphics and real-time manipulation capability.

The GD80 AGS, if extended with the HIF communication processor (dashed in Fig. 5), can be converted into a satellite graphic system (SGS). This system does most of the work locally, but it can also send data and command blocks to a big host for further processing and afterwards bring back the result.
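The modular build-up of these configurations can be written down as a small data structure. The sketch below records only the configurations whose processor sets are spelled out in the text (BT, AGS and SGS); the `Configuration` class and the `extend` helper are hypothetical, and carrying the 64 kbyte Common Memory over from the AGS to the SGS is an assumption, since the text gives no memory size for the SGS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Configuration:
    """HYPOTHETICAL record of one GD80 standard configuration."""
    name: str
    processors: frozenset
    common_memory_kbytes: int


# Values taken from the text of Section 4.1.
BT = Configuration("GD80 BT", frozenset({"DCU", "GPC"}), 16)
AGS = Configuration("GD80 AGS", frozenset({"DCU", "DPU", "TPU", "GPC"}), 64)


def extend(base, name, extra_processors):
    """Derive a configuration by plugging extra modules into a base one.

    ASSUMPTION: the Common Memory size is kept unchanged.
    """
    return Configuration(
        name,
        base.processors | frozenset(extra_processors),
        base.common_memory_kbytes,
    )


# The SGS is described as the AGS extended with the HIF.
SGS = extend(AGS, "GD80 SGS", {"HIF"})
```

Deriving the SGS from the AGS mirrors the bottom-up way the systems are said to be assembled from the same modules.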
4.2. Non-graphic systems

The GD80 KC (Figure 6) is an alphanumeric display concentrator. The GPC can handle up to eight displays (five of them are shown). The HIF provides high level communication in a network. The Common Memory serves as an interprocessor buffer.

Fig. 6. GD80 KC.

The GD80 TS is a time sharing minicomputer with two peripheral processors (Figure 7). The GPC handles the alphanumeric consoles, the HIF the floppy disks and possibly a communication link. The GPC and HIF act as a time sharing editor. The DPU in the background can assemble, compile or run programs (including disk handling).

Fig. 7. GD80 TS.

The GD80 DP is a general purpose high performance data processing system (Figure 8). Two peripheral processors are dedicated to communication and conventional peripheral handling. CPU2 is a disk management processor: it handles the disk not only physically; high level commands (data base management, etc.) can be executed here as well. CPU1 is the general purpose processor (it includes the DPU and the TPU, i.e. a 16 bit and a 48 bit processor).

Fig. 8. GD80 DP.

CONCLUSIONS

The multi-microprocessor design of the GD80 system led to a clear, modular design both in hardware and software. A large range of CAD applications can be covered using GD80 graphic configurations, but general purpose, non-graphic systems are also very easy to build.