Fifth generation fixed-point DSP


Product focus

Digital signal processors

96-bit floating-point DSP

The DSP96002 is the first member of Motorola's family of single-chip HCMOS, 96-bit general-purpose floating-point DSPs. It was designed for numerically intensive applications requiring fast IEEE floating-point arithmetic and access to large memory subsystems, such as graphics and numeric processing. Motorola claims a peak performance of 40.5 MFLOPS with the 27 MHz device, which will be sampled in the fourth quarter of this year.

The device has a 'dual-natured' architecture: there are two independent data memory spaces, two address arithmetic units, two on-chip DMA controllers, and two buses (see block diagram). This duality makes it easier to write software for numerically intensive applications because, for example, data are naturally partitioned into X and Y coordinates for graphics and image-processing applications, and into real and imaginary spaces for performing complex arithmetic. The architecture exhibits a high degree of parallelism: up to three floating-point operations, two data moves and two address pointer updates can be executed in a single instruction cycle.

The CPU consists of three 32-bit execution units operating in parallel. First, the data ALU performs all arithmetic (fixed and floating-point) and logical operations. It consists of five blocks: a format conversion unit, a general-purpose register file for storing ALU results, a floating-point multiplier, a floating-point add/subtract unit, including a 32-bit barrel shifter, and a special function unit. Second, an address generation unit (AGU) performs all address storage and address calculations to address data operands in memory. Third, the program controller permits data transfers between any two locations in any combination of memory spaces without intervention of the DSP96002 core. It consists of a program address generator, a program decode controller and the program interrupt controller. The DSP96002 features 1024 words of data RAM equally divided

Fifth generation fixed-point DSP

Texas Instruments has disclosed details of its fifth generation of fixed-point DSP chips, the TMS320C5x, which is source-code compatible

[Figure residue, DSP96002 block diagram: external address switches; bus control and arbitration; internal data switch; program RAM (1024 x 32); X memory RAM (512 x 32); Y memory RAM (512 x 32); bootstrap ROM (64 x 32); cosine ROM (512 x 32); sine ROM (512 x 32); dual-channel DMA controller; program address generator; address generation unit; port A/host interface]

into X data and Y data memory, 1024 words of full-speed on-chip program RAM and two preprogrammed data ROMs. On-chip bootstrap ROM allows convenient loading of user programs into the program RAM. Two independent expansion bus ports facilitate interfacing to SRAMs, DRAMs and VRAMs. A package of software development tools, DSP96000CLASx, is available now to allow the user to develop assembly-language code for DSP96002 applications, to link it and then to fully test it. The package consists of a simulator, an assembler, a linker and callable modules. (Motorola Ltd, 3501 Ed Bluestein Blvd, Austin, TX 7872•, USA. Colvilles Road, Kelvin Estate, East Kilbride, Glasgow G75 0TG, UK. Tel: (03552) 39101)
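The partitioning of complex data into real and imaginary spaces, described for the DSP96002's dual data memories, can be illustrated with a short sketch. This is plain Python, not DSP96002 code: the function name `complex_mac` and the lists standing in for the X and Y memories are hypothetical, and no claim is made about how the device schedules these operations.

```python
def complex_mac(x_mem, y_mem, coeff_re, coeff_im):
    """Complex multiply-accumulate over split real/imaginary arrays.

    x_mem holds the real parts and y_mem the imaginary parts of the input
    samples, mimicking a split into two independent data spaces.
    coeff_re and coeff_im hold the coefficients in the same split form.
    Returns the accumulated (real, imaginary) pair.
    """
    acc_re = 0.0
    acc_im = 0.0
    for xr, yi, cr, ci in zip(x_mem, y_mem, coeff_re, coeff_im):
        # (xr + j*yi) * (cr + j*ci) = (xr*cr - yi*ci) + j*(xr*ci + yi*cr)
        acc_re += xr * cr - yi * ci
        acc_im += xr * ci + yi * cr
    return acc_re, acc_im


if __name__ == "__main__":
    # Illustrative values only: samples (1 + 3j) and (2 - 1j).
    x_mem = [1.0, 2.0]     # real parts ("X" space)
    y_mem = [3.0, -1.0]    # imaginary parts ("Y" space)
    coeff_re = [0.5, 1.0]
    coeff_im = [0.0, 1.0]
    print(complex_mac(x_mem, y_mem, coeff_re, coeff_im))
```

Each loop iteration reads one operand from each array, which is the access pattern that two independent data memories and two address units are suited to.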

[Figure: Block diagram of the DSP96002. Legible labels: program decode controller; program interrupt controller; port B/host interface (BHI); arbitration control; data unit with IEEE floating-point and integer ALU; program control unit; CLK; 32-bit buses; serial debug port]

Vol 13 No 7 September 1989


[Figure residue, TMS320C50 block diagram: external memory interface with D(15-0) and A(15-0); program/data RAM (8k x 16); data RAM (544 x 16); boot ROM (2k x 16); JTAG test/emulation control; program and data buses; CPU with 32-bit ALU, 16 x 16-bit multiplier, 32-bit accumulator, accumulator/product shifter, pre/post shifters, context switch registers, PLU, status registers, program control registers and instruction registers; timer; software wait states]

Block diagram of the TMS320C50

with all TMS320C1x and TMS320C2x DSPs. The latest generation is designed to perform an instruction in 35 ns, giving a performance of 28.6 MIPS, claims TI. It is targeted at telecommunication, automotive, military and computer peripheral applications. The first chip in this family is the TMS320C50, samples of which will be made available in a 50 ns version capable of 20 MIPS in the first quarter of 1990. It is said to outperform existing fixed-point DSPs by a factor of two to four.

The 0.8 µm CMOS device has an advanced Harvard architecture and a high degree of parallelism. There is single-cycle address generation and program execution, with a majority of instructions operating in a single cycle. The architecture (see block diagram) is optimized for efficient bit manipulation, zero-overhead context switching (through the use of stack registers), and block-repeat execution of code. Large on-chip memories and many peripherals are integrated to maximize system performance and minimize cost. The DSP incorporates the JTAG IEEE P1149.1 standard for improved testability and ease of emulation.

The central processor unit consists of a multiplier, which performs a multiply in a single machine cycle; an arithmetic logic unit (ALU), which performs single-cycle 16- or 32-bit logical or arithmetic operations, such as add and subtract; five shifters, three of which are barrel shifters, which give flexibility to extended-precision arithmetic and to the handling of


overflow associated with large numbers of multiply/accumulates; and a parallel logic unit (PLU), which operates independently of the ALU, for bit manipulations. There are also 19 program control registers for servicing of interrupts.

There are three on-chip memory blocks: a 544 x 16-bit RAM (for storage and recovery of CPU calculations), an 8192 x 16-bit RAM (for program execution at full speed) and a 2048 x 16-bit ROM (used as a boot loader). The TMS320C50 can also address 128 k x 16 bits of external memory.

Peripherals are linked through a common bus structure, the TI Bus; this will facilitate the development of spin-off devices. The peripherals are: a full-duplex serial port, operating at rates of 10 MIPS, which provides direct communication with serial devices or may be used in a multiprocessor configuration; an internal timer; 32 software wait-state generators, which allow the device to be used with slower off-chip memory and I/O devices; and a 16-bit-wide parallel I/O port. (Texas Instruments, 12501 Research Blvd, Austin, TX 78759, USA. Tel: (512) 250-7655. Manton Lane, Bedford MK41 7PA, UK. Tel: (0234) 270111)
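The overflow risk that arises when many 16 x 16-bit products are summed in a 32-bit accumulator can be shown with a generic fixed-point sketch. This is plain Python, not TMS320C50 code: the names `q15_mac` and `product_shift`, the saturating accumulator and the per-product right shift are illustrative assumptions, not a description of the device's instruction set or of how its shifters are actually used.

```python
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1


def saturate32(value):
    """Clamp an integer to the signed 32-bit range."""
    return max(INT32_MIN, min(INT32_MAX, value))


def q15_mac(samples, coeffs, product_shift=0):
    """Sum 16 x 16-bit products in a saturating 32-bit accumulator.

    product_shift is a hypothetical parameter: each product is shifted
    right by this many bits before accumulation, trading low-order
    precision for headroom against overflow.
    """
    acc = 0
    for s, c in zip(samples, coeffs):
        if not (INT16_MIN <= s <= INT16_MAX and INT16_MIN <= c <= INT16_MAX):
            raise ValueError("operands must fit in signed 16 bits")
        product = s * c  # fits in 32 bits for any pair of 16-bit operands
        acc = saturate32(acc + (product >> product_shift))
    return acc


if __name__ == "__main__":
    samples = [INT16_MAX] * 4
    coeffs = [INT16_MAX] * 4
    print(q15_mac(samples, coeffs))                   # clamps at INT32_MAX
    print(q15_mac(samples, coeffs, product_shift=2))  # stays in range
```

With four full-scale products the unshifted sum exceeds the 32-bit range and clamps, whereas shifting each product right by two bits keeps the running sum representable at the cost of the two least significant bits of each product.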

ST18 family upgrade

The SGS-Thomson ST18 family has been upgraded with the addition of two 32-bit DSPs: the ST18940 (a microcontroller; contains ROM) and the ST18941 (a microprocessor; ROMless).

The ST18940/1 features an advanced Harvard architecture and a high degree of parallelism: in a single operation cycle the device can read two independent operands, perform a multiplication and an ALU operation, write a result back to memory, modify three address pointers and perform an I/O operation. With a cycle time of 100 ns, SGS-Thomson claims a throughput of 10 MIPS. The device is upwardly source compatible with the earlier 16-bit members of the family. It is aimed at advanced DSP applications in telecommunications, speech and image processing, spectrum analysis, high-speed control systems and digital filtering.

This announcement quickly follows that of the ST18930/1, a CMOS version of the initial member of the ST18 family, the TS68930, but with a faster instruction cycle time (80 ns) and additional hardware and software facilities. In comparison with the earlier members of the family, the ST18940/1 provides enhanced arithmetic capabilities (and is particularly suited for fast Fourier transform, convolution and echo cancelling), addressing modes and additional I/O functions.

The architecture (see block diagram) is based on four independent address calculation units (ACUs), three internal 16-bit data buses and three internal data memories, with a separate 32-bit program bus. The ST18940 has a 3 k x 32-bit program ROM and a 512 x 16-bit coefficient ROM. The ST18941 microprocessor version can address up to 64 k of program memory on a dedicated bus, thus providing true real-time emulation of the ST18940 ROM version. In addition it has two internal RAMs (X and Y); a 128 x 16-bit coefficient RAM is included for coefficient memory emulation.

The two external buses, the system bus and the local bus, allow the device to be connected to a host processor or to other DSPs without additional glue logic. With the 16-bit local bus, either peripherals, such as analogue interfaces, can be controlled or up to 64 k x 16-bit of external memory can be accessed in

Microprocessors and Microsystems