Journal of Engineering and Technology Management 13 (1996) 1-28

ATM networks and their applications at the NASA Lewis Research Center: A case study

Catherine Murphy Bakes a,*, Fredric N. Goldberg b

a Department of Administrative Sciences, Graduate School of Management, Kent State University, Kent, OH 44242, USA
b MS 142-1, Telecommunications and Networking Branch, NASA Lewis Research Center, 21000 Brookpark Road, Cleveland, OH 44135, USA
Abstract
This paper provides an overview of ATM networks and a discussion of their role in supporting high-level technological research at the NASA Lewis Research Center in the U.S. NASA Lewis operates a variety of local and wide area networks and is introducing ATM to meet specialized requirements for transporting multimedia traffic, including text, voice, video, real time visualization data, and data collected from scientific experiments. ATM employs fast packet switching and statistical multiplexing to allow many sources to flexibly share network bandwidth and it provides an integrated transport, multiplexing, and switching technology for BISDN. SONET/SDH may be used as a physical layer protocol for optical transmission of ATM cells. ATM, together with SONET, provides standardized high-speed transmission for a wide variety of traffic types and performance requirements. NASA Lewis researchers are experimenting with cluster computing for solving large problems and are testing ATM for use as the transport fabric to support this activity. Video experiments have been conducted over a private ATM network which has links to the Ames and Langley Research Centers. NASA Lewis is exploring the use of ATM to facilitate collaboration between experimental and analytical researchers at geographically dispersed locations and to provide personnel at remote sites with access to NASA Lewis facilities. ATM service classes, the BISDN protocol reference model, and congestion control issues are also addressed in this paper.

Keywords: ATM; Broadband ISDN; SONET; Congestion control; Cluster computing; Cooperative computing
* Corresponding author. Tel.: (330) 672-2750; Fax: (330) 672-2448; E-mail: [email protected].
1. Introduction
This paper provides an overview of asynchronous transfer mode (ATM)¹ networks and a discussion of their role in supporting high-level technological research at the NASA Lewis Research Center. NASA Lewis' mission includes conducting research, development, and implementation in the fields of aeronautics, structures, and space experimentation, as well as supporting the national effort to develop a space station. NASA Lewis has evolved from a center whose primary research activities were experimental in nature to one where, while experimentation is still an important element, more of the research is computationally driven. It occupies a main campus of approximately 300 acres, where some 100 buildings house roughly 5,000 research and support personnel. A secondary campus, which acts as an extension of the main campus, is located at an adjacent industrial park.

NASA Lewis researchers collaborate extensively with various government, industrial, and academic centers within and outside the U.S. For solving research problems, they have access to local and remote computing facilities, and personnel at remote centers have access to Lewis facilities. NASA Lewis has been challenged to develop a communications network that provides the local and wide area support needed to accommodate these activities.

The computing resources at NASA Lewis include a Cray YMP, a T3D/64, and other smaller, but very powerful, compute servers, primarily concentrated in one central building. In addition, NASA Lewis has developed a central resource of parallel processors using a cluster of high powered workstations that act as a single unit to aid in the solution of scientific research problems. Researchers, scattered throughout the campus, must be able to address these compute servers from their own work areas. A researcher's primary support tool is an advanced workstation that is highly window oriented and capable of providing high speed number crunching and high resolution video displays.

Optical fiber is the transmission medium for the campus backbone infrastructure which carries local traffic between buildings at NASA Lewis. This network includes many strands of single mode and multimode fiber and allows any attached station to access a potential bandwidth ranging from 100s of Mbps to multiple Gbps. The backbone can interface to various fiber, coaxial cable, and copper local area networks (LANs) within buildings and has been extended to selected servers and researchers' workstations that require high capacity connectivity. To further exploit the benefits of fiber, the public wide area network (WAN) providers have installed fiber throughout the entire geographic area of interest to the Lewis community. This enables Lewis researchers to access remote sites from their desktop workstations and enables users at remote sites to access Lewis facilities via a seamless fiber network at bandwidths up to and beyond 155 Mbps. Future network initiatives will include Gbps speeds, especially for backbones.

The NASA Lewis Research Center has active programs in fluid dynamics and solid dynamics. These require a communications network capable of transporting multimedia
¹ A glossary of terms appears at the end of this article.
traffic, including text, voice, interactive and noninteractive video, real time (RT) visualization, and data collected from scientific experiments. Network requirements are also shaped by the diverse tasks supported by powerful desktop workstations, which operate as standalone devices, work cooperatively in local clusters, operate in client-server mode, access central compute servers, and address remote sites. NASA Lewis operates a variety of LANs and WANs and is introducing ATM to meet its specialized requirements.

Computational fluid dynamics (CFD) researchers, who analyze air flows in search of improved aircraft and spacecraft designs, are among the most active users of NASA Lewis computing and networking resources. When conducting an analytical study, a CFD researcher usually begins by entering a problem definition statement on a local workstation. A typical problem statement, including a set of initial conditions, would be represented by up to one megabyte of data. In a common scenario, the local network transports the problem statement to a local compute server or to a gateway for forwarding to a remote compute server. The initial conditions are used to generate a grid and solutions are obtained for points on the grid. The actual calculations may require several hours and are performed on a supercomputer or on a network of smaller high speed processors operating in a parallel or clustered configuration. Unless more supercomputers become available to support this type of research, the use of logically clustered processors is likely to grow. Once a problem has been computed, a solution or partial solution, typically represented by 50 to 100s of megabytes of numerical or graphical values, is returned to the local workstation through the same network. The researcher then uses interactive visualization techniques to examine selected cuts or surfaces of the fluid.

The campus backbone serves not only the fluid dynamics researchers, but also the solid dynamics researchers and all other members of the NASA Lewis research community. Solving some heat transfer problems requires expertise in both solid dynamics and fluid dynamics. For example, a CFD researcher might model a passage and transfer the results to a structures researcher. The two researchers would then iterate back and forth in search of better solutions. Future research is likely to include interdisciplinary engine simulations that model the combustor, fluids, heat, and structures and involve multiple disciplines, computer codes, and computers. Research into complex geometries, high temperature behavior, and narrow passages will increase and will involve looking at data at time intervals. This requires high speed imaging and video animation that is fast enough to observe behavior in real time.

The NASA Lewis research community is developing the next generation of computing applications and exploring their network implications. A major thrust in this endeavor is the concept of information extraction and transfer. At the heart of this research will be an ATM network that supports multimedia transmission of information. High resolution animated images, depicting a time dependent reactive flow, will be carried in Gbps streams. It will become possible for widely dispersed researchers, from the fields of fluid dynamics, solid dynamics, and heat transfer, to integrate their expertise and simultaneously, from their workstations, attack a problem such as an interdisciplinary engine simulation.
While narrowband Integrated Services Digital Networks (ISDN) have integrated voice, data, and low grade video, they are not suited to offer high speed services such as RT visualization or LAN interconnection (De Prycker et al., 1993). Broadband ISDN (BISDN) networks are intended to have the ability to transport a richer mixture of multimedia services and a broader spectrum of traffic types, with bit rates ranging from a few kbps to 100s of Mbps and holding times that vary from sub-second to hours (Eckberg et al., 1991). They are being developed for the transmission of delay tolerant and RT traffic, broadband and narrowband rates, bursty and continuous traffic streams, connection oriented (CON) and connectionless (CLS) calls, and point-to-point and multicast communications (Crutcher and Waters, 1992; Hong and Suda, 1991). In addition to conventional services such as telephone, electronic mail, file transfer, telemetry, telecontrol, telealarm, and facsimile, potential BISDN applications also include high quality video distribution, video libraries, video telephony, multimedia conferencing, distance education, voice and video messaging, information retrieval from multimedia databases, and existing quality and high definition television distribution (Hac and Mutlu, 1989).

To support this wide range of applications, the ITU-T (formerly the CCITT) has proposed interactive and distribution service classifications for BISDN (Crutcher and Waters, 1992; Hac and Mutlu, 1989). Interactive services are for two-way information exchange between a pair of subscribers and are divided into conversational, messaging, and retrieval services. Distribution services are for one-way distribution of information from a central source to any number of authorized receivers and include services with and without user presentation control.

National and international standards organizations have recommended that ATM be used to provide a standardized transport, multiplexing, and switching technology for BISDN and that SONET, known internationally as the SDH, be used as the physical layer transmission standard (Cheung, 1992; Lyles and Swinehart, 1992). ATM combines the advantages of packet switching and circuit switching and allows all services to be transported and switched in a common digital format over a single network, rather than over several overlay networks with one for each service (Newman, 1994). ATM, together with SONET, is expected to offer reliable high-speed transport, bandwidth flexibility, and integrated transmission and switching for BISDN CON and CLS services and constant bit rate (CBR) and variable bit rate (VBR) traffic. While ATM was originally developed as an international standard to support public telephony networks, its domain also extends to private LANs and WANs and ATM interfaces are available for end user workstations (Biagioni et al., 1993; Eckberg, 1992; Le Boudec et al., 1993; Leslie et al., 1993; Lyles and Swinehart, 1992; Newman, 1994).

ATM employs fast packet switching with statistical multiplexing to allow many sources to share the available bandwidth (De Prycker et al., 1993). An ATM network uses point-to-point links for user network interfaces (UNIs) interconnecting end-user devices and ATM switches and for network node interfaces (NNIs) interconnecting pairs of ATM switches. ATM is CON at the lowest level and transfers information using a virtual channel assigned for the duration of the connection.
It switches fixed size packets, called "cells," which have 5 byte headers and 48 byte information payloads (Fig. 1). (Throughout this paper, the term "byte" is assumed to be synonymous with an 8 bit "octet.") With current high quality transmission media and switching systems, ATM performs no error control on the data field and implements only core functions such as frame delimiting and bit transparency on a link-by-link basis. ATM switching nodes therefore need only minimal functionality for flow and error control. This, combined with short fixed size cells and the use of SONET for physical layer transmission, enables ATM networks to operate efficiently at very high speeds and to transport traffic with tight delay and delay jitter constraints.

Fig. 1. ATM cell structure: a 5 byte header followed by a 48 byte information field.

At the NASA Lewis Research Center, experimentation is often used to determine parameters for simulations. ATM will enable close coupling and dynamic interaction between the experiments and simulations. It will also benefit researchers who access remote facilities by enabling traffic that originates locally to be transported thousands of miles without any interruption or degradation in integrity. It will be possible for ATM traffic to be carried over a LAN from a local workstation or server to a gateway, embedded in a SONET signal, transported to a remote gateway, and finally carried over a remote LAN to a compute server.
2. Multiplexing and fast packet switching

In traditional synchronous time division multiplexing (STM), bandwidth is organized in a periodic frame which has multiple time slots and a single framing slot to indicate the start of the frame (Bae and Suda, 1991). Each time slot in the STM frame is assigned to a particular call and the call is identified by the position of its slot within the frame. Slots are assigned based on peak rate so that required service quality can be guaranteed even at the peak rate. While this is suitable for fixed rate services, bandwidth is wasted if traffic falls below the peak rate.

By using statistical multiplexing, ATM eliminates the inflexibility and inefficiency of STM and allows a greater number of sources to share the available bandwidth (Bae and Suda, 1991). It organizes call information into fixed size packets called "cells," each with a 5 byte header and a 48 byte information payload (Fig. 1). ATM allocates cells to calls in an asynchronous manner based on demand, with no bandwidth consumed unless information is actually transported. It supports CBR services by allocating bandwidth based on peak rate and accommodates VBR and bursty sources, which generate traffic at high rates for short time periods and at lower rates at other times, by assigning some bandwidth lower than the peak rate. However, network congestion can result if many sources become active simultaneously.

Unlike STM, which uses positioned-channel multiplexing, ATM uses labeled-channel multiplexing to associate cells with calls (Cheung, 1992; Hac and Mutlu, 1989). A label
in each cell header, called the connection identifier, explicitly associates the cell with a given virtual channel on a physical link to identify the component of the aggregate packet stream to which the cell belongs. Because it multiplexes calls on a cell level and without using a rigid channel structure, ATM can readily add new services or drop old ones and can allocate bandwidth based on demand (Hong and Suda, 1991). Thus ATM is more bandwidth efficient than STM and allows more calls to share network resources.

In traditional synchronous communication, there is a fixed physical connection between both ends of a circuit (Cavanagh, 1992). To assure the continued presence of the connection, information is exchanged periodically between the endpoints. While this overhead function is required on less reliable analog networks, it is not necessary for reliable, high throughput, fast packet switching, fiber networks.

In general, error control schemes may be implemented on a link-by-link or an edge-to-edge basis (Bae and Suda, 1991; Lea, 1992). With a link-by-link scheme, cells are processed at each node in the network and retransmission of lost or errored cells takes place between adjacent switching nodes. In existing networks, where packet transmission time is frequently the performance bottleneck, the overhead of this protocol processing is considered negligible. With an edge-to-edge scheme, retransmission takes place only between source and destination nodes. When throughput and error rates are low, there is no significant difference between the two schemes. If the error rate increases, link-by-link schemes perform marginally better, while edge-to-edge schemes "waste" successful transmissions over earlier links if an error occurs later on a path. However, on high speed networks with low error rates, the edge-to-edge method performs better than the link-by-link method and also requires fewer network resources such as buffers and computation time.

In high speed ATM networks, with short cell transmission times, the performance bottlenecks become the processing delay at the switching nodes and the channel propagation delay (Bae and Suda, 1991). The high ratio of propagation delay to cell transmission time makes it possible to have many cells simultaneously in transit between two nodes. This creates a conflict when using the sliding window method of flow control: high throughput requires a large window, but a large window exerts little control over the traffic already in flight. ATM networks therefore implement flow and error control independently and use simplified protocols that perform most functions on an edge-to-edge basis.
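To illustrate the bandwidth argument, the following sketch compares how many sources a single link could carry under peak rate allocation versus an allocation closer to the mean rate; the link and source rates are assumed values chosen only for illustration.

LINK_MBPS = 155.0          # assumed access link rate
PEAK_MBPS = 10.0           # assumed per-source peak rate
MEAN_MBPS = 1.0            # assumed per-source long term average rate

# STM: every call is given its peak rate, whether or not it is transmitting.
stm_calls = int(LINK_MBPS // PEAK_MBPS)

# Statistical multiplexing: allocate closer to the mean rate, accepting that
# simultaneous bursts must be absorbed by buffering or, in the worst case, lost.
OVERBOOK_MARGIN = 1.5      # assumed safety factor above the mean rate
atm_calls = int(LINK_MBPS // (MEAN_MBPS * OVERBOOK_MARGIN))

print(f"Peak rate (STM-style) allocation: {stm_calls} calls")
print(f"Statistical (ATM-style) allocation: {atm_calls} calls")
# With these assumed figures, 15 calls versus about 103 calls share the same link;
# the price is the possibility of congestion when many sources burst at once.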
3. Virtual channels and virtual paths

There are two multiplexing alternatives in ATM networks, virtual channels (VCs) and virtual paths (VPs) (Cheung, 1992; Cox et al., 1993). User connections usually employ VCs. Typically, when a user application is initiated, a VC is established and a route is selected. Subsequently, all cells transmitted on that VC are forwarded along the selected route. A VP is a bundle of VCs which have the same end-points and share a common path through the network (Fig. 2). It is switched as a unit and may be used by two hosts to multiplex many individual application streams together.
Fig. 2. Relationship between VCs (virtual channels), VPs (virtual paths), and the transmission path in ATM networks.

Each cell header contains a connection identifier which is used in multiplexing, demultiplexing, and switching the
cell through the network. The connection identifier consists of a VC identifier (VCI) and a VP identifier (VPI) that identify the VC and the VP to which the cell belongs. The VCI and VPI are assigned when the connection is established and they remain for the duration of the connection.

A VC link is a means of unidirectional transport of ATM cells between a point where a VCI value is assigned and a point where that value is translated or terminated. All the VC links in a VP have the same end points. Each VC link is identified by a VCI which has only local significance per link between nodes in the VC. When a connection is released, the VCI values on the involved links are released and may be reused by other connections. A VP link is a group of VC links terminated by the points where a VPI value is assigned and where that value is translated or removed. One or both endpoints connect to a routing function for cells passing through the connection.

A VC connection (VCC) is a concatenation of VC links. It has end-to-end significance between two end-users and preserves cell sequence integrity to facilitate reconstruction of cells at the receiving node. A VP connection (VPC) is a concatenation of one or more VP links and supports one or a group of VC links. VPCs/VCCs can be employed from user or network to user or network and all cells associated with an individual VPC/VCC are transported along the same route. ATM supports semipermanent VPCs and on demand VCCs (De Prycker, 1992). VPCs are statically allocatable and do not require end-user signaling. VCCs are dynamically allocatable based on user demand and require broadband end-user signaling.

Unlike 32 bit IP addresses, a VPI/VCI connection identifier is too short to be used as an explicit address. It is a label which is used by an ATM switch to select an entry from a routing table and to relay traffic to the next node (Biagioni et al., 1993). When a cell arrives on an incoming link, the switch locates the incoming port number and the incoming connection identifier in the routing table to find the new connection identifier and the number of the output port to which the cell is to be forwarded. The new connection identifier, which will be used by the next switch for subsequent routing operations, is placed in the header of the outgoing cell and the cell is transmitted over the specified outgoing link. ATM's use of short connection identifiers has the advantage of minimizing cell processing delays.

Two types of switching are used in an ATM network for traffic multiplexing and routing (Handel and Huber, 1991). VP switching occurs at the end-points of VP links, translates incoming VPIs to outgoing VPIs according to the VPC destination, and leaves
VCI values unchanged. VC switching occurs at the end-points of VC links and, consequently, of VP links. It translates both VPI and VCI values and, therefore, implies VP switching.

Certain network applications involve communication between multiple participants, multiple traffic streams per source, or asymmetric arrangements (Crutcher and Waters, 1992). These include multimedia conferencing with voice, video, and graphical content, request/response from video library retrieval systems, multiparty communication such as conferencing and distribution services, and group management procedures. In the case of a multicast connection, which transports information from one source to many destinations, the connection identifier maps to multiple connection identifiers and outgoing links (Cox et al., 1993). Each switch involved in supporting the multicast connection replicates incoming cells, assigns the new connection identifiers, selects the set of outgoing links, and forwards cells. For a multicast VC, as for a unicast VC, the data rate is flexible. For each VC, new endpoints may be dynamically added or old ones removed and, for each endpoint, transmission and reception may be independently enabled.

Due to the simplicity of the algorithm for switching and routing fixed size cells, ATM switches implement cell routing in hardware (Biagioni et al., 1993). The routing information used by the hardware is updated when a connection is established. In the case of a multiple switch ATM network, the switches must cooperate to route the connection through the network, but only at the time when the connection is established. ATM switching fabrics use cell-based backplanes and matrix switches that can operate internally at Gbps rates and allow VPI/VCI mapping and switching at input speeds of many 100s of Mbps.

An ATM LAN is based on a network of switches and dedicated point-to-point links to each host (Biagioni et al., 1993; Le Boudec et al., 1993; Lyles and Swinehart, 1992; Newman, 1994). It uses local ATM switches as the nodes of the network and host ATM interfaces to connect hosts to the network. While LANs based on a shared medium like FDDI offer a fixed bandwidth among connected hosts and may become saturated by a small number of hosts, an ATM LAN can offer each host a dedicated access rate and transfer data from any number of hosts in parallel. Consequently, the aggregate bandwidth on an ATM LAN increases as hosts are added.
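The label swapping described above can be sketched as a routing table lookup; the port numbers and VPI/VCI values below are invented for illustration, and in practice the table entries would be installed when connections are established.

# Minimal sketch of label swapping at an ATM switch: the (input port, VPI, VCI)
# triple selects a routing table entry that supplies the output port and the
# new VPI/VCI to write into the outgoing cell header. All values are illustrative.
ROUTING_TABLE = {
    # (in_port, in_vpi, in_vci): (out_port, out_vpi, out_vci)
    (1, 0, 42): (3, 5, 17),
    (2, 7, 99): (3, 5, 18),
}

def relay_cell(in_port, vpi, vci, payload):
    """Return (out_port, out_vpi, out_vci, payload) or raise if no connection exists."""
    try:
        out_port, out_vpi, out_vci = ROUTING_TABLE[(in_port, vpi, vci)]
    except KeyError:
        raise ValueError("cell does not belong to any established connection")
    # The payload is relayed untouched; only the header label is rewritten.
    return out_port, out_vpi, out_vci, payload

print(relay_cell(1, 0, 42, b"\x00" * 48))   # -> (3, 5, 17, ...)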
4. Traffic descriptors, performance parameters, and service classes

ATM applications have a wide variety of traffic characteristics (Irvin, 1993). Some services, such as large file transfers, generate a CBR stream. Others, such as compressed full motion video, are continuously varying in bit rate. More information is transmitted at times when there is a great deal of motion than during periods when there is less motion. Still picture video and packetized voice services are bursty (Lyles and Swinehart, 1992).

ATM applications also have a wide range of network performance requirements. Traffic such as RT image and document retrieval, RT video and videoconferencing, LAN interconnection, and data for RT control requires a fixed high bandwidth for the
call duration, RT service, and low delay jitter (Bae and Suda, 1991; Hac and Mutlu, 1989; Hong and Suda, 1991). Some of these services have such strict delay requirements that they are considered lost if late. Bulk information transport applications often require a high bandwidth service with strict error control, but do not need RT delivery. Examples include the transfer of large data files and delay tolerant document, image, and video delivery services. Low bandwidth, delay sensitive services, such as packetized voice, interactive data, and enquiry-response messages, may have a wide range of end-to-end delay requirements. For voice services, the loss of small amounts of information may be tolerable, but not delay.

During establishment of each ATM connection, a service contract should be negotiated between the user and the network to specify a set of traffic descriptors (TDs) and performance requirements for that connection (De Prycker, 1993; Eckberg, 1992; Eckberg et al., 1991). During the call, renegotiation may take place on an exception basis or as a routine element of call management. The service contract should be used to reach agreement to limit the traffic the network is expected to carry and to reserve the network resources needed to support the connection. During cell transport, the network monitors resource usage to ensure that the service agreement is observed.

In order to ensure adequate and consistent performance to users in the presence of unpredictable traffic, ATM selects VC paths and allocates network resources based on anticipated traffic (Biagioni et al., 1993; Cox et al., 1993). When opening a connection, an originating host may specify its desired resources, minimum acceptable resources, maximum resource requirements, or even no specification at all. The network will accommodate resource requests to the best of its ability and try to allocate resources so that almost all information bursts can be delivered intact. It may grant less than the desired resources or refuse the connection request if the minimum required resources are not available. It may also adjust resources allocated to established VCs in order to avoid blocking new VC requests.

TD parameters describing a connection should be simple, unambiguous, understandable, useful for specifying traffic related aspects of user-network service contracts, and related to those that can be monitored and enforced (De Prycker, 1993; Eckberg, 1992; Roberts, 1991). They should facilitate connection admission decisions by providing a quantitative and accurate prediction of the impact of a given connection on the performance of shared network resources. They should also enable the network to monitor admitted connections and to discriminate between excessive and non-excessive traffic. Common TDs include peak rate (i.e., the maximum cell generation rate over a short interval), long term average rate (i.e., the cell generation rate over a longer interval), and burstiness (Bae and Suda, 1991; Yazid and Mouftah, 1992). Burst intensity (i.e., the ratio of the burst cell generation rate to the call's average rate), mean burst duration, and mean burst interarrival time are critical TDs for bursty on/off sources (Burgin and Dorman, 1991). A VBR source may need TDs to describe the peak rate, average rate, and anticipated holding time. An adequate description of a CBR source requires an indication of the cell generation rate and anticipated holding time.
Other candidate TDs include the ratio of the burst cell generation rate to the link rate, burst factor (i.e., the product of the mean burst duration and the difference between the peak and average rates), cell jitter ratio (i.e., the ratio of the variance to the mean of cell interarrival times), and squared coefficient of variation of interarrival times.

In draft Recommendation I.35B, "Broadband ISDN Performance," the ITU-T defines cell loss ratio, cell insertion rate, errored cell ratio, mean cell transfer delay, cell delay variation, severely errored cell ratio, and cell transfer capacity as a preliminary set of performance parameters for ATM (Anderson and Nguyen, 1991).

An ATM service class is a set of services that requires the same quality of service (QOS) (Bae and Suda, 1991). To reflect the spectrum of intended applications, ITU-T has defined four service classes based on the presence or absence of a timing relation between the source and destination, CBR or VBR, and connection mode (Fig. 3) (De Prycker et al., 1993). For Class A, there is a timing relationship between the source and destination, the bit rate is constant, and the service is CON. Examples include pulse code modulation (PCM) encoded 64 kbps voice, transport of T1 or E1 signals, and uncompressed video. Class A is sometimes called circuit emulation. For Class B, there is a timing relationship between the source and destination, the bit rate is variable, and the service is CON. Examples include compressed voice and video services. For Class C, there is no timing relationship between the source and destination, the bit rate is variable, and the service is CON. Examples include CON data transfer, signaling, X.25, and frame relay service. For Class D, there is no timing relationship between the source and destination, the bit rate is variable, and the service is CLS. Examples include CLS packet data, LAN interconnection traffic, SMDS, and electronic mail.

Fig. 3. ATM service classes: Class A (timing relation between source and destination required, constant bit rate, connection oriented, AAL 1); Class B (timing relation required, variable bit rate, connection oriented, AAL 2); Class C (timing relation not required, variable bit rate, connection oriented, AAL 3/4 or 5); Class D (timing relation not required, variable bit rate, connectionless, AAL 3/4 or 5).
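The descriptors discussed above can be estimated from an observed cell arrival trace. The sketch below is a simplified illustration with made-up timestamps; the one second measurement window used for the peak rate is an assumption, not a standardized definition.

from statistics import mean, variance

def traffic_descriptors(arrival_times_s, window_s=1.0):
    """Estimate simple ATM traffic descriptors from cell arrival timestamps (seconds).

    peak rate    : maximum cells observed in any window of length window_s
    average rate : total cells divided by the trace duration
    burstiness   : peak rate divided by average rate
    The windowed peak is a simplification of the "maximum cell generation rate
    over a short interval" wording in the text.
    """
    duration = arrival_times_s[-1] - arrival_times_s[0]
    avg_rate = (len(arrival_times_s) - 1) / duration
    peak = max(
        sum(1 for a in arrival_times_s if t <= a < t + window_s)
        for t in arrival_times_s
    ) / window_s
    interarrivals = [b - a for a, b in zip(arrival_times_s, arrival_times_s[1:])]
    jitter_ratio = variance(interarrivals) / mean(interarrivals)  # cell jitter ratio
    return {"peak": peak, "average": avg_rate, "burstiness": peak / avg_rate,
            "cell_jitter_ratio": jitter_ratio}

# A toy on/off source: a burst of 10 closely spaced cells, silence, another burst.
trace = [i * 0.01 for i in range(10)] + [5 + i * 0.01 for i in range(10)]
print(traffic_descriptors(trace))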
5. Protocol reference models

Most networks are implemented as a set of independent layers in which each layer transparently provides a set of services to the layer above (Hac and Mutlu, 1989). The set of layers, along with its associated set of communications protocols, is called a protocol reference model (PRM).

The BISDN and ISO OSI PRMs have commonalities and differences (Bae and Suda, 1991; Cheung, 1992). The BISDN model applies to a broad range of communications services while the OSI model was established for data communications services only (De Prycker et al., 1993).
Fig. 4. BISDN protocol reference model: user, control, and management planes over the higher layer protocols, ATM adaptation layer, ATM layer, and physical layer.

Because of the major role played by signaling in managing
connections for multiservice networks, the BISDN PRM extends the OSI model to include user and control types of information flow (Crutcher and Waters, 1992). It consists of three separate planes, analogous to three protocol suites, to segregate user, control, and management functions (Fig. 4). The user (U) plane provides for the end-to-end transport of user information along with associated flow and error controls. The control (C) plane deals with signaling information to control call and connection setup, supervision, and release.

Like the OSI PRM, the U and C planes have layered structures. However, the seven layers of the OSI architecture involve too much processing to apply to reliable fiber optic BISDN networks. In the BISDN PRM, the layers of the U and C planes are physical, ATM, ATM adaptation layer (AAL), and higher service layers (Fig. 5). The lower layers provide functions common to both planes and the higher layers provide separate functions for each.
Fig. 5. BISDN layers, sublayers, and functions. The higher layers sit above the AAL, which comprises convergence and segmentation and reassembly sublayers; the ATM layer performs generic flow control, cell header generation and extraction, cell VPI/VCI translation, and cell multiplexing and demultiplexing; the physical layer comprises a transmission convergence (TC) sublayer (cell rate decoupling, HEC sequence generation/verification, cell delineation, transmission frame adaptation, transmission frame generation/recovery) and a physical medium dependent (PMD) sublayer (bit timing, physical medium).

The management (M) plane provides plane management and layer management functions. It is responsible for maintaining the network and
performing operational functions such as performance monitoring, failure detection, and fault localization.

The BISDN physical layer (PL) is subdivided into physical medium dependent (PMD) and transmission convergence (TC) sublayers (De Prycker et al., 1993). PMD is responsible for the correct transmission and reception of bits on the physical medium (Bae and Suda, 1991). It is medium dependent and performs electro-optical conversion when the medium is fiber. TC maps cells passed from the ATM layer to a synchronous, plesiochronous, or cell-based bit stream for transmission by PMD. At the receive side, it passes cells to the ATM layer after extracting them from the PMD bit stream.

The ATM layer implements all the functionality of ATM and its role is common to all services (Bae and Suda, 1991; Cheung, 1992). The ATM layer, without performing any error control or other processing on the information payload, passes the payload to or from the AAL in an end system and relays cells in an ATM switch. VCs and VPs comprise two networking levels of the ATM layer (Breuer, 1991; De Prycker et al., 1993). A VCC extends between two points where the ATM layer accesses the AAL. A VP link extends between two points where the PL accesses the ATM layer. The ATM layer multiplexes, onto a single cell stream, cells belonging to connections with different connection identifiers. It translates connection identifiers at ATM switches and crossconnects and routes cells based on the VPI and possibly the VCI. Other ATM layer functions include delineating cells, adding the cell header after the cell is received from the AAL, extracting the cell header before the cell is delivered to the AAL, and detecting errors in the header. The ATM layer also provides a flow control mechanism and policing function at the UNI.

Each ATM cell consists of a 5 byte header and a 48 byte payload (Fig. 1) (Cheung, 1992). The header contains ATM layer protocol information and the payload carries user data plus any headers or trailers required by higher level protocols. Different cell header formats (Figs. 6 and 7) have been adopted for the UNI at the network edge and the NNI at network nodes (Bae and Suda, 1991). The difference is that the first 4 bits are used as a generic flow control (GFC) field in the UNI header and as 4 additional VPI bits in the NNI header. The GFC field, which is defined only across the UNI, is used to alleviate short term overload problems by controlling the flow of traffic from the user across the UNI. The VPI/VCI pair constitutes the cell connection identifier and indicates the cell's route through the network.
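As an illustration of the UNI header layout shown in Fig. 6 below, the following sketch packs and unpacks the fields using the standard bit widths (GFC 4 bits, VPI 8 bits, VCI 16 bits, PT 3 bits, CLP 1 bit, HEC 8 bits); the HEC value is taken as given rather than computed, and the field values are invented.

def pack_uni_header(gfc, vpi, vci, pt, clp, hec):
    """Pack a UNI cell header: GFC(4) VPI(8) VCI(16) PT(3) CLP(1) HEC(8) = 40 bits.

    The HEC is accepted as a precomputed byte here; in a real implementation it
    is an error control code calculated over the first four header bytes.
    """
    word = (gfc << 36) | (vpi << 28) | (vci << 12) | (pt << 9) | (clp << 8) | hec
    return word.to_bytes(5, "big")

def unpack_uni_header(header):
    word = int.from_bytes(header, "big")
    return {
        "gfc": (word >> 36) & 0xF,
        "vpi": (word >> 28) & 0xFF,
        "vci": (word >> 12) & 0xFFFF,
        "pt":  (word >> 9) & 0x7,
        "clp": (word >> 8) & 0x1,
        "hec": word & 0xFF,
    }

hdr = pack_uni_header(gfc=0, vpi=5, vci=17, pt=0, clp=0, hec=0)  # illustrative values
print(unpack_uni_header(hdr))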
Fig. 6. ATM cell header format at the BISDN UNI: GFC, VPI, VCI, PT, CLP, and HEC fields.

Fig. 7. ATM cell header format at the BISDN NNI: VPI, VCI, PT, CLP, and HEC fields.

The longer VPI in the NNI header reflects the wider use of
VPs across an NNI. The payload type (PT) indicator identifies the type of information in the cell payload, such as user or connection management information, and may also provide network congestion information. Cell loss priority (CLP) assists in minimizing service degradation if the network must discard cells due to congestion. It is set to 0 in a cell that is to receive priority treatment and to 1 in a cell that is subject to being discarded. Header error control (HEC) is used to correct or detect errors in the header in order to prevent errors in the VPI/VCI from causing cells to be misdelivered. It provides one bit error correction or multiple bit error detection capabilities for the cell header.

The functions provided by the AAL and higher layers of the BISDN PRM are service dependent (Bae and Suda, 1991). AAL is mainly responsible for mapping user, control, and management information, passed down from higher layers, into the information field of an ATM cell, and vice versa (De Prycker et al., 1993). The boundary between the ATM layer and AAL corresponds to differences between functions applied to the cell header and to the information field. AAL specific information, such as information field length and segment numbers, is contained in the cell payload. AAL functions include time stamping, source clock frequency recovery, preserving data frame boundaries, sequence number processing, forward error correction, and detecting and handling lost or misdelivered cells.

The AAL is subdivided into segmentation and reassembly (SAR) and convergence sublayers (CS). SAR segments higher layer data packets into cells before they enter ATM switching facilities at the transmitting end and performs the inverse operation at the receiving end (Cheung, 1992; De Prycker et al., 1993). CS facilitates switching all four ATM service classes through the same switching fabric and, according to the specific service class, performs functions such as message identification, error checking, and clock recovery.

Four AAL types were initially proposed, with each optimized to carry one of the ATM service classes (Fig. 3). AAL 1 supports Class A CBR services. AAL 2 supports Class B services and its protocol data unit (PDU) has a header, a trailer, and a field which contains an indication of beginning of message (BOM), continuation of message (COM), or end of message (EOM). AALs 3 and 4 were originally established for Class C and D services, respectively. They have since been merged into AAL 3/4 whose procedures may be applied in a CON or CLS manner (De Prycker et al., 1993). According to user specifications, data transport may be reliable or on a "best effort"
basis (Cheung, 1992). If the user requests a reliable data transfer, AAL retransmits lost or corrupted packets and performs flow control. Otherwise, no retransmission or flow control functions are provided. To allow a common AAL 3/4, the AAL 3/4 CS is divided into common part and service specific sublayers. AAL 3/4 is predominantly used in support of SMDS.

AAL 3/4 operations support two types of data transfer requirements, message and streaming modes (Bae and Suda, 1991). Message mode service is used for framed data and allows a single service data unit (SDU), the information unit passed from the service layer above, to be segmented into smaller pieces for transmission. The CS accepts the SDU, prepends a 4 byte header, pads the SDU to make it an integral multiple of 32 bits, and appends a 4 byte trailer. Header and trailer functions include service indication and cell loss detection. The header, SDU, pad, and trailer form a CS PDU, which is passed to the SAR sublayer for segmentation into 44 byte SAR SDU payloads, of which the last one may have an unused portion. To the SAR SDU payload, the SAR sublayer prepends a 2 byte header and appends a 2 byte trailer. Header and trailer functions include segmentation and reassembly, indication of segment type (BOM, COM, EOM, or single segment message), identification of a message, indication of a partially filled segment, and bit error detection for the entire contents of the SAR SDU. The SAR SDU is passed to the ATM layer.

Streaming mode is used for low speed continuous data with low delay requirements (Bae and Suda, 1991). It transports, as one AAL PDU, one or more fixed size SDUs which can be as small as one byte. While a CS PDU consists of one SDU for message mode service, it may consist of several SDUs for streaming mode. A header, pad, and trailer are added to complete a CS PDU. This is segmented into 44 byte SAR SDU payloads by the SAR sublayer, with each SDU contained in a separate SAR SDU. Any segment, not just the last one, may have an unused portion. A 2 byte header and a 2 byte trailer are also added to the SAR SDU payload.

AAL 5, sometimes called SEAL, has been added to the original set of AAL types to meet simplicity requirements (De Prycker et al., 1993). It supports CLS and CON VBR data services on a best effort basis (Cheung, 1992). It performs error detection but does not retransmit lost or corrupted packets. AAL 5 has a limited set of functions compared with AALs 3 and 4, but has lower bandwidth overhead, simpler processing requirements, and less implementation complexity.

ATM is expected to support existing network management frameworks and communications protocols transparently (Biagioni et al., 1993). It is a layer 2 entity, on top of which higher layers such as TCP/IP and SMDS can be added to build multilayer communications protocols. To support TCP/IP over ATM, IP packets must be encapsulated in ATM PDUs and the AAL must be used to segment, reassemble, and frame the IP packet. Internet addresses must be mapped to ATM layer addresses for connection establishment and the IP layer, with its CLS service, must interface to a CON ATM data link layer. Either AAL 3/4 or AAL 5 may be used to encapsulate IP packets over an ATM connection.

A BISDN/ATM network can economically support a wide variety of traffic requirements and interface protocols (Cheung, 1992). It can accept a user's preferred interface at one end and transport it to the other end. For example, frame relay traffic may be
transported over an ATM network by mapping the frame relay DLCI (data link connection identifier) and DE (discard eligibility) to an ATM VPI/VCI and CLP, respectively. The conversion between the frame relay and ATM interface formats may take place within the network or at the edges of the network.
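A rough sketch of the AAL 3/4 message mode segmentation described above follows; the header and trailer bytes are zero placeholders because their exact field encodings are not reproduced here.

CS_HEADER = b"\x00" * 4     # placeholder 4 byte CS-PDU header
CS_TRAILER = b"\x00" * 4    # placeholder 4 byte CS-PDU trailer
SAR_HEADER = b"\x00" * 2    # placeholder 2 byte SAR header (segment type, etc.)
SAR_TRAILER = b"\x00" * 2   # placeholder 2 byte SAR trailer (length, error check)
SAR_PAYLOAD = 44            # 44 byte SAR SDU payload + 4 bytes overhead = 48 byte cell payload

def aal34_message_mode(sdu):
    """Build a CS-PDU (header + SDU + pad to a 32 bit multiple + trailer) and
    segment it into 44 byte SAR payloads, each wrapped with a 2 byte header
    and 2 byte trailer to form 48 byte cell payloads."""
    pad_len = (-len(sdu)) % 4                      # pad the SDU to a multiple of 32 bits
    cs_pdu = CS_HEADER + sdu + b"\x00" * pad_len + CS_TRAILER
    payloads = []
    for offset in range(0, len(cs_pdu), SAR_PAYLOAD):
        chunk = cs_pdu[offset:offset + SAR_PAYLOAD].ljust(SAR_PAYLOAD, b"\x00")
        payloads.append(SAR_HEADER + chunk + SAR_TRAILER)   # 48 bytes each
    return payloads

cells = aal34_message_mode(b"A" * 100)
print(len(cells), "cell payloads of", len(cells[0]), "bytes")   # 3 payloads of 48 bytes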
6. Signaling

Signaling is the exchange of information between the user and the network in the establishment, control, and management of a connection. While OSI and X.25 use in-band signaling, ATM employs out-of-band signaling in which the control plane transports control packets for VCC setup and release and the user plane transfers data packets (De Prycker et al., 1993). Before user data may be transmitted, control traffic flows on a VCC in the control plane to establish a VCC with a reserved bandwidth in the user plane.

Existing signaling techniques do not adequately meet the complex requirements of the multimedia, multiservice, and multiparty applications envisioned for ATM networks (Cheung, 1992; Crutcher and Waters, 1992). New capabilities are required for simultaneously establishing or releasing connections associated with a call, for adding or removing connections to or from an existing call, for adding an existing call to a multiparty call, and for splitting a multiparty call into multiple calls. Some of these requirements can be provided through extending Q.931 for access signaling at the UNI and using Signaling System 7 for network signaling at the NNI. In the long term, ATM connection management may require major restructuring of current signaling protocols and consume significant network bandwidth and processing resources.
7. Traffic control

BISDN/ATM networks support a variety of voice, data, image, video, and multimedia services using network resources that include transmission bandwidth, switching functions, buffer capacity, and control intelligence (Burgin and Dorman, 1991). Controlling access to these resources is important for achieving both grade (relating to call blocking probabilities for offered traffic) and quality (relating to cell loss and cell delay for carried traffic) of service objectives for all service classes.

Theoretically, the rate of transfer is under the control of the sender (Cavanagh, 1992). If the sender has a large burst of data to be transmitted, bandwidth is made available and the data are sent. If the sender has little or no information to send, little or no bandwidth is allocated. This network capability to absorb data as it becomes ready for transmission is called "bandwidth on demand." To implement bandwidth on demand, with satisfactory service for all users, requires sophisticated congestion control to handle bandwidth contention at the periphery of the network, handle network congestion when the offered load exceeds the network's capacity, and enforce a system of equitable sharing between users.
Congestion control for ATM should be simple, flexible, robust, and controllable (De
Prycker, 1993; Eckberg, 1992; Eckberg et al., 1991; Wernik et al., 1992). Simple algorithms are more likely to be fast, cheap, implementable, understandable, and able to achieve high resource efficiency. Flexibility facilitates adapting to new services, applications, traffic characteristics, and performance objectives. Robustness reduces sensitivity to imperfect assumptions and changing requirements. Adequate congestion control contributes to efficient network resource utilization without a performance penalty.

An overall system of congestion, flow, and error controls, with functions in internal elements of the network, at network access points, and in end terminals, is envisioned for ATM (Lea, 1992). The objectives include protecting network resources, protecting ATM connections from other connections competing for resources, and increasing the useful throughput and performance of ATM connections. The service contract ties these controls together.

Most conventional congestion control techniques do not adapt well to the high speeds, edge-to-edge protocols, and diverse traffic characteristics and service requirements of BISDN/ATM networks (Bae and Suda, 1991; Burgin and Dorman, 1991). In general, link-by-link congestion control schemes react to congestion after it happens and then try to reduce it to an acceptable level. At the onset of congestion, destinations send choke packets to instruct sources transmitting information through the point of congestion that they should stop or slow transmission (Hong and Suda, 1991). Choke packets may be sent to all sources to also throttle sources that will begin transmitting in the future. For ATM networks, ITU-T has recommended interface rates of 155 Mbps and 622 Mbps, and internal links may operate in the Gbps range. However, these broadband speeds over wide area distances result in large values of the bandwidth-propagation delay product and increase the amount of traffic that can be in transit during the propagation time of the throttling message. By the time feedback reaches the source nodes and control is initiated, it may be too late to react effectively. For example, on a 1 Gbps 3,000 km link, the propagation delay for a choke packet would be 10 ms, during which time the sender would have transmitted another 10 Mbits. This can cause severe buffer management problems and reduce the effectiveness of reactive feedback-based congestion control schemes on ATM networks.

Preventive control schemes, which try to maintain traffic load at a manageable level and to prevent congestion before it happens, are required for ATM networks (Bae and Suda, 1991). These include call admission controls, priority schemes, and usage parameter control (UPC). They act at the connection level and at the cell level and are supported by cell header functionality. The most common and effective approach is to control traffic flow at the network entry points by regulating new calls admitted to the network. At the time of call setup, the network examines the call's expected traffic characteristics (e.g., peak rate, average rate), the call's service requirements (e.g., acceptable cell delay and cell loss probability), and the current network load (De Prycker, 1993; Roberts, 1991; Wernik et al., 1992; Yazid and Mouftah, 1992). Based on this information, it decides whether to accept or reject the new connection. If the call is admitted, the specified TD parameters and QOS requirements become a service contract between the end-terminal and the network (Handel and Huber, 1991).
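A highly simplified connection admission decision along these lines is sketched below. The effective bandwidth formula, a weighted blend of the declared peak and average rates, is an assumption made for illustration rather than a scheme prescribed by the ITU-T or used at NASA Lewis.

LINK_CAPACITY_MBPS = 155.0   # assumed access link capacity
admitted = []                # effective bandwidths (Mbps) of calls already accepted

def effective_bandwidth(peak_mbps, avg_mbps, weight=0.6):
    # Illustrative blend between average and peak rate; tighter QOS -> larger weight.
    return avg_mbps + weight * (peak_mbps - avg_mbps)

def admit(peak_mbps, avg_mbps):
    """Accept the call only if the link can carry it alongside the existing calls."""
    demand = effective_bandwidth(peak_mbps, avg_mbps)
    if sum(admitted) + demand <= LINK_CAPACITY_MBPS:
        admitted.append(demand)
        return True
    return False

print(admit(peak_mbps=34.0, avg_mbps=10.0))   # True: 24.4 Mbps reserved
print(admit(peak_mbps=140.0, avg_mbps=90.0))  # True: 120.0 Mbps reserved (total 144.4)
print(admit(peak_mbps=34.0, avg_mbps=10.0))   # False: would exceed 155 Mbps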
Congestion control capabilities within network elements (NEs) include resource
allocation and service scheduling (De Prycker, 1993; Eckberg, 1992; Eckberg et al., 1991; Ramakrishnan and Newman, 1995). This involves allocation of buffers and link bandwidth between service classes to prevent one service class from "locking out" another from transmission. A system of relative priorities between service classes is thus established. The capabilities intended to be continuously active for RT network congestion control during ATM cell transport are traffic monitoring, tagging excess traffic at network access points, selective cell discard at individual congested NEs, providing congestion status information to end terminals, and NE internal controls for scheduling and allocating NE resources. Prior to transmitting a traffic burst, a source may also use a fast reservation protocol to request a specified bandwidth reservation from the network (Cavanagh, 1992; Newman, 1994; Roberts, 1991).

Recommendation I.371 proposes that conformity of every ATM connection to its service contract should be enforced at the UNI and NNI network entry points (Boyer et al., 1992). The network enforces the service contract by employing UPC, which may be combined with leaky bucket or virtual scheduling algorithms, to control traffic generated by individual connections (De Prycker, 1993; Hong and Suda, 1991; Roberts, 1991; Wernik et al., 1992). UPC acts like a throughput burstiness filter which partitions a stream of cells into excessive and non-excessive traffic categories (Eckberg, 1992; Eckberg et al., 1991). It includes a means for shedding load selectively, with only minimal impact on end services and applications. A cell deemed in excess of the declared parameters may be discarded before entry to the network or may be permitted to enter the network with a violation "tag" (Burgin and Dorman, 1991). This is implemented via the CLP indicator in the cell header which indicates whether the cell may be discarded if it reaches a congested node. Setting CLP to 1 indicates a cell may be discarded in any NE along a VP/VC path if congestion above a threshold is encountered. CLP may be set by the sending terminal to indicate that the cell carries nonessential information or at the network access point if the cell violates the traffic limits negotiated in the service contract. Treatment of the cell is independent of the reason for setting CLP. Through this ability to quickly shed load at congestion points, the network becomes more resilient to load surges.

Large cell delay variation (CDV) may induce severe degradation in the accuracy of UPC mechanisms, but highly constraining CDV bounds can be impractical (Boyer et al., 1992). A spacing function may be added at the network access point to prevent low network link utilization when moderately constraining bounds on CDV are used (Brochin, 1992; De Prycker, 1993; Roberts, 1991). If the negotiated peak cell rate is momentarily exceeded, the spacing function can space out cells arriving too closely together. Conformity to the original peak cell rate may then be achieved at the access point and resource allocation can be optimal. To control edge-to-edge CDV, cell spacing may also be used at the output side of terminal equipment, by end user equipment, or even just before an NNI within the network. It may also be combined with fast resource management schemes to perform a statistical allocation of resources to bursty ATM connections (Cavanagh, 1992; Newman, 1994; Roberts, 1991).
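The leaky bucket style of UPC mentioned above can be sketched as follows. The bucket parameters are illustrative, and excess cells are tagged by setting CLP to 1 rather than being discarded at the entry point.

class LeakyBucketPolicer:
    """Illustrative leaky bucket UPC: the bucket drains at the contracted cell rate
    and fills by one unit per arriving cell; arrivals that would overflow the
    bucket are treated as excess traffic and tagged (CLP = 1)."""

    def __init__(self, contracted_rate_cps, bucket_depth_cells):
        self.rate = contracted_rate_cps      # contracted cell rate (cells/s)
        self.depth = bucket_depth_cells      # tolerance for short bursts
        self.level = 0.0
        self.last_time = 0.0

    def police(self, arrival_time_s):
        """Return the CLP value to carry in this cell: 0 conforming, 1 tagged."""
        drained = (arrival_time_s - self.last_time) * self.rate
        self.level = max(0.0, self.level - drained)
        self.last_time = arrival_time_s
        if self.level + 1 <= self.depth:
            self.level += 1
            return 0
        return 1   # excess cell: admit with a violation tag (or discard at entry)

policer = LeakyBucketPolicer(contracted_rate_cps=1000.0, bucket_depth_cells=5.0)
# A burst of 10 back-to-back cells against a contract of 1000 cells/s:
print([policer.police(i * 0.0001) for i in range(10)])
# -> the first few cells conform, the rest of the burst is tagged with CLP = 1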
Some reactive control schemes are also under consideration for use on ATM networks (Newman, 1994). A capability for forward notification of encountered congestion conditions to the destination terminal may be implemented by setting an explicit
forward congestion indicator (EFCI) to 1 within the cell header (Eckberg, 1992; Eckberg et al., 1991; Yazid and Mouftah, 1992). Any NE may set EFCI in a passing cell when congestion exceeds defined thresholds. The destination terminal could then notify the source to trigger rate controls, error controls, or other appropriate actions (Kung et al., 1994). A capability for backward notification of congestion conditions is also provided using NE originated cells. NEs can monitor network load and, if they detect an onset of congestion, they can notify the network access points or end terminals to trigger appropriate actions.

End terminal controls also employ rate controls and other reactive control schemes to dynamically control input parameters based on available information regarding network congestion (Bonomi and Fendick, 1995; Boyer et al., 1992; De Prycker, 1993; Iwata et al., 1995; Lea, 1992; Ramakrishnan and Newman, 1995; Wernik et al., 1992; Yazid and Mouftah, 1992). Feedback from the network concerning the available bit rate may be used to control the rate at which sources transmit cells into the network (Brochin, 1992; Kung et al., 1994; Newman, 1994). Schemes such as traffic smoothing policies and adaptive windowing mechanisms prevent short term congestion by ensuring that sources do not exceed allocated parameters. Traffic classes that are not very delay sensitive can be shaped by buffering.

Applying different control and priority mechanisms to cells belonging to different traffic classes is a common method for satisfying numerous QOSs (Bae and Suda, 1991). In scheduling and congestion control, for example, ATM may assign a higher priority to some cells based on their sensitivity to delay or loss (Hac and Mutlu, 1989).

Congestion control may be implemented at a number of levels in an ATM network, including VPCs, VCCs, and individual cells (Burgin and Dorman, 1991; Wernik et al., 1992). Since each VPC may support many VCCs, VPs are a natural target for network management actions. Grouping connections that share common network paths into one VP also reduces the network control cost, since network management can be applied to a few groups of connections instead of many individual connections. Managing VPCs may involve allocating capacity based on anticipated demand, changing capacity allocations to cater for changing demand, and rerouting traffic in times of congestion. The process of setting up a VPC is decoupled from the process of setting up an individual VCC. By reserving capacity on a VPC, new VCCs can be quickly established by executing simple control functions at the endpoints of the VPC, with no call processing required at the transit nodes. For an individual VCC, network node processors check to see if a VPC to the required destination node is available with sufficient capacity and appropriate QOS to support the connection and to store the required state information (e.g., VPI, VCI mappings).
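Returning to the EFCI based feedback described earlier in this section, the reactive loop can be sketched as a simple source rate adjustment; the increase and decrease factors below are illustrative assumptions, not a standardized algorithm.

def adjust_source_rate(rate_mbps, efci_seen, min_mbps=1.0, max_mbps=155.0):
    """Reactive source rate control driven by congestion feedback.

    If the destination reports that recent cells arrived with EFCI set, the
    source backs off multiplicatively; otherwise it probes upward additively.
    The factors used here are illustrative, not taken from the paper or a standard.
    """
    if efci_seen:
        rate_mbps *= 0.5          # congestion indicated: halve the sending rate
    else:
        rate_mbps += 5.0          # no congestion indicated: increase gently
    return max(min_mbps, min(max_mbps, rate_mbps))

rate = 40.0
for congested in [False, False, True, False, True]:   # feedback reported per interval
    rate = adjust_source_rate(rate, congested)
    print(f"{rate:.1f} Mbps")
# 45.0, 50.0, 25.0, 30.0, 15.0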
8. Infrastructure
Many organizations, including telephone carriers, NASA Centers, and universities, are upgrading their transmission equipment and moving toward optical fiber based infrastructures (Lyles and Swinehart, 1992). Optical fiber media offer the advantages of high capacity, low propagation delay, and reliable transmission that are required to
support current and emerging BISDN services (Hac and Mutlu, 1989).

Fig. 8. STS-1 frame: 9 rows by 90 one-byte columns, comprising 3 columns of transport overhead (section overhead, 3 rows; line overhead, 6 rows) and an 87-column STS synchronous payload envelope containing path overhead and payload.
SONET is specified as a PL protocol for BISDN in the U.S. and SDH is specified in other countries. SONET defines a standard set of interfaces which may be used for optical transmission across the UNI and NNI (Hac and Mutlu, 1989). It is a hierarchy of optical signals, called OC-Ns, that are multiples of a basic 51.84 Mbps OC-1 signal (Cheung, 1992). The 155.52 Mbps OC-3 and 622.08 Mbps OC-12 signals have been designated as the user access rates for BISDN networks. Other important rates are the 2.488 Gbps OC-48 and the 9.953 Gbps OC-192.

The electrical counterpart of an OC-N signal, called an STS-N signal, has a standardized frame format and a frame duration of 125 microseconds. An STS-1 frame (Fig. 8) consists of 9 rows and 90 columns, where the width of each column is 1 byte. It contains 3 columns of transport overhead and 87 columns of payload and path overhead. The transport overhead is divided into 9 bytes of section overhead and 18 bytes of line overhead. There are 9 bytes of path overhead and 774 bytes of payload. Frames are transmitted sequentially by row, in a pattern of 3 bytes of transport overhead followed by 87 bytes of payload and path overhead. An STS-N signal is formed by synchronously byte interleaving N STS-1 signals (Cheung, 1992). In addition to payload, each STS-N frame contains N copies of the section, line, and path overheads (Hac and Mutlu, 1989). The concatenated STS-Nc signal, which is formed by combining N STS-1s together as one entity, provides a contiguous high speed channel (Fig. 9). An STS-Nc signal sends N copies of the section and line overheads and a single copy of the path overhead.

The public BISDN network is expected to provide a powerful and ubiquitous infrastructure for transporting multimedia applications (Cheung, 1992). For BISDN, the SONET payload is divided into a number of ATM cells.
Fig. 9. ATM cells in an STS-3c frame: 9 columns of transport overhead, 1 column of path overhead, and a 260-column STS-3c payload carrying ATM cells.
Pointers in the SONET overhead are used to indicate where the cells are placed, which allows the SONET payload to be flexibly allocated to a wide range of applications with varying bandwidth needs. For example, SONET/ATM based BISDN may be used to interconnect supercomputers and high performance workstations across public metropolitan and wide area networks by transporting bursty Gbps data streams over multiple STS-3c channels or over a single high capacity channel such as an STS-12c, STS-24c, or STS-48c.

SONET simplifies multiplexing and reduces the quantity of network equipment required by each node (Cheung, 1992). Because SONET conforms to international standards, it allows different vendors' products to be interconnected at the optical level. It facilitates a mid-span meet capability so that equipment from different vendors can be interconnected in a point-to-point system. SONET standards are expected to cover interoffice transmission, cross-connects, switching, local distribution, and private LANs. When complete, they will encompass standards for transmission rates, frame formats, optical and electrical parameters, operations, administration, and maintenance (OAM) support, and protocols for digital cross connection and automatic protection switching.

Architectural principles and functions for ATM layer OAM, as well as some general OAM rules for ATM networks, are primarily described in ITU-T Recommendation I.610, "OAM Principles of the BISDN Access" (Anderson and Nguyen, 1991; Breuer, 1991). A specific OAM flow is responsible for maintenance of the transport mechanism at each of five networking levels in the ATM and physical layers. These flows are designated F1 through F5 and are transmitted in ATM cells specifically dedicated to OAM functions. Like user cells, OAM cells are routed by cell headers, but they use separate PT identifiers and access the management plane rather than the AAL. The F4 and F5 flows correspond to OAM related communications at the VPC and VCC levels of the ATM layer, respectively. Their functions include monitoring and reporting of VPC availability, VPC/VCC error performance, and VPC/VCC cell loss or insertion. PL "alarm surveillance" and VPC "keep-alive" are two complementary approaches to VPC failure detection. The PL supports ATM layer transport using flows F1, F2, and F3. These provide OAM functions for the regenerator section, line, and transmission path levels of the SDH PL, respectively.
9. ATM applications at the NASA Lewis Research Center

Today's scientific and engineering computing environments are moving away from centralized supercomputer systems toward distributed architectures that contain large scale computing platforms, high performance graphics workstations, and other diverse elements (Catlett, 1992). Aggregations of small, state of the art, parallel or clustered computers, supported by gigabit networks, may work together to solve a single problem and appear to users as a single processing system. In this type of cooperative computing environment, a network is used as a "backplane" in much the same way as a backplane or bus is used to interconnect components of a single computer.

NASA Lewis researchers are experimenting with cluster computing as an alternative to vector processing and fixed hardware parallel processing for solving large problems. They are testing ATM for use as the transport fabric to support this activity. Approximately 70 high performance workstations, including Hewlett-Packard, IBM RS-6000, Silicon Graphics, and Sun workstations, have ATM adapter cards installed. Two alternative methods are currently under investigation. In the first, workstations are placed in close proximity and networked together using ATM as the transport mechanism. This enables closely coupled processing in a similar manner to parallel processing. In the second alternative, ATM is used to support a "compute engine" for cluster computing in which available cycles from any or all of the workstations distributed throughout the Center can participate in solving a particular problem. Cycles can be donated by a workstation when its normal user is using less than the full capacity of the workstation's computing power or when the normal user is inactive (e.g., at night). By taking advantage of factors such as different time zones, an extension to this research is the wide area use of clusters of workstation clusters. Time differences can be exploited to build these clusters of clusters from geographically dispersed workstations that are idle at different times. In this manner, individual cluster participation in the solution of a problem can vary as the time of day changes across the country.

NASA Lewis currently has six to seven ATM switches installed in selected buildings. Most of these switches are concentrated in the Advanced Computing and Communications Lab (ACCL), primarily in support of the Lewis Advanced Computational Environment (LACE) cluster and some experimentation. One ATM LAN interconnects workstations within the ACCL using point-to-point fiber distributed throughout the building and connects to various Lewis buildings via the fiber optic backbone. Another switch is being installed in the building that houses the fluid dynamics researchers. Two switches reside in a gateway building that connects the local ATM environment with wide area networks. One is part of a private point-to-point ATM network over which video experiments have been conducted between the NASA Lewis, Ames, and Langley Research Centers. The other connects to public carriers' networks which are used to
provide communication between NASA Centers. NASA Lewis is presently formulating plans to reach more researchers' workstations throughout the entire Center and to increase its local and wide area use of ATM. For example, additional ATM switches and network interfaces are likely to be installed to support the current trend of moving away from private point-to-point networks and toward taking advantage of emerging public network facilities, such as the proposed NREN.

The ATM switches currently installed at NASA Lewis have maximum internal switching speeds of 2.5 Gbps. These are supporting DS-3, TAXI, and OC-3 traffic and will eventually encompass OC-12 rates. The majority of NASA Lewis' ATM applications use SEAL to carry mixed media traffic such as that transmitted in support of simulation studies. There is also some uncompressed video traffic running under AAL 1. While semipermanent virtual circuit implementations have been used for initial experiments, switched virtual circuit implementations are replacing them due to the limitations associated with setting up permanent circuits.

NASA Lewis is an active participant in some major NASA-wide WAN initiatives which use ATM for sharing information. The ATM network between the NASA Lewis, Ames, and Langley Research Centers was installed as an initial "proof-of-concept" demonstration. To develop an understanding of the effects of latency, three simultaneous video tasks were initiated over point-to-point links between Lewis and Ames and between Ames and Langley. Each video path was approximately 10,000 miles long, based on a round trip from Lewis, to Ames, to Langley, and back. The observed latency due to all switching and other equipment was on the same order of magnitude as the propagation delay caused by the speed of light. These initial results have demonstrated that ATM has the capability to allow all NASA Centers to have instantaneous access to a variety of audio, video, numeric, and textual information that resides at any other Center.

NASA Lewis is exploring the use of ATM for providing personnel at other government sites, industrial organizations, and academic research institutions with remote access to local facilities for collaboration on experimental research. Such a collaborative environment would require a gigabit network to allow multiple users to work together in real time to steer an experiment and remotely visualize the data (Lyles and Swinehart, 1992). ATM could transport data between local and remote sites and give all personnel the ability to observe and participate in the experiment. Large experimental facilities such as wind tunnels are critical to improving the current U.S. aircraft fleet and to developing the next generation. These are limited in number and, in recent years, have come to be viewed as national resources. Industrial partners need a way to utilize these facilities without encountering massive disruptions in their work flow or their lives. This is especially true when studying actual flight conditions, with high Reynolds numbers and widely varying temperatures. Frequent travel to scarce wind tunnels and other test facilities capable of approaching these conditions can be very disruptive for experimental researchers in industry. However, ATM's ability to provide very high bandwidths and to transport multimedia can have a major impact in enabling these researchers to stay at their home sites with their familiar support facilities.
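As a rough check on the latency experiment reported above (an illustrative estimate rather than a figure from the experiment; the fiber signal speed of about two thirds of c is an assumption), the propagation delay over the approximately 10,000 mile video path works out to roughly 80 ms:

```python
# Back-of-the-envelope propagation delay for the ~10,000 mile Lewis-Ames-
# Langley video path described above. The signal speed in silica fiber
# (~2e8 m/s) is an assumption; switching and other equipment delays are
# not included here.

PATH_MILES = 10_000
METERS_PER_MILE = 1609.344
SPEED_IN_FIBER_M_PER_S = 2.0e8

path_m = PATH_MILES * METERS_PER_MILE
delay_s = path_m / SPEED_IN_FIBER_M_PER_S
print(f"propagation delay over {PATH_MILES} miles: {delay_s * 1e3:.0f} ms")
# ~80 ms, the same order of magnitude as the equipment latency reported above.
```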
ATM can offer full visual and audio interaction between researchers' home sites and the site of an actual test facility. Also, from their
home sites, researchers can drive the experiment at the test facility and receive instantaneous responses about the data being collected.

Another interesting application of ATM at NASA Lewis is in the remote steering of fluid flow simulations that are performed on a large vector processor such as a Cray supercomputer. This type of data can only be analyzed via visualization. After processing, the visual data is transported over an internal ATM LAN to a high speed terminal which forwards the data to the ACTS satellite. The ACTS satellite, also using ATM, beams the data to a high speed terminal in Seattle, Washington, and through a mirrored ATM LAN to workstations used by researchers at Boeing. Based on their analysis of the data, the Boeing researchers have the ability to use a non-ATM terrestrial path to remotely steer the direction of the simulation, in a similar manner to an experimental researcher being able to drive an experiment at a remote test facility.

Advanced visualization and data representation techniques, including high definition television, virtual reality, and holographic animations, can be embedded within full model simulations that incorporate elements from both solid and fluid dynamics disciplines. While a simulation is running, ATM can transport a combination of text, audio, visual, and animation data to a user at a remote site. The remote user can then use this data to run an application program that displays a visualization of, for example, a fluid flow, a solid structure, or the whole combined model. The remote user can also participate in the simulation through an interactive exchange of data in both directions.

ATM's ability to handle multiple traffic types in one continuous stream can also be used to increase cooperation between experimental and analytical researchers. While researchers at one site are running an experiment, researchers at remote sites can run simultaneous RT simulations of various components of the model under investigation. ATM is a candidate for multicasting parameters, extracted from the experimentally collected data, to the remote sites where they are required for the simulations. ATM may also be capable of distributing the visualization data resulting from the simulations. Thus, iterating between the experiment and simulations, all participants would have the ability to provide settings that guide the experiment. Without ATM, multicasting visualization data to an audience of up to 100 users in a full duplex interactive mode would require more network capacity than the highest level of service available from most communications carriers today. Any synchronous protocol capable of providing such high transmission rates would also be prohibitively expensive. At present, ATM is the only network technology with the potential to transport this type of load.

An obvious extension of the above is the emerging MetaCenter concept. MetaCenters house massively parallel processors, advanced visualization devices, virtual reality technology, and other scarce, powerful computing resources. Like wind tunnels and other scarce experimental facilities, MetaCenters are extremely expensive and very small in number. Consequently, analytical researchers throughout the U.S. need remote access to these facilities. ATM is a candidate as the enabling technology to support this concept.

ATM was originally implemented at NASA Lewis for the purpose of transferring numerical information needed to support simulations and experiments.
However, nonnumerical information transfer has also emerged as an application for ATM and may surpass numerical information transfer as a motivator for further ATM introduction at
NASA Lewis. This is largely due to ATM's potential to play a major role in accessing the World Wide Web and other information superhighway applications. The Mosaic browser, for example, provides users with access to a wide variety of data types, not only text and numeric data, but also image, visual, audio, and animation data. ATM is the leading candidate for providing the networking support to accommodate this multimedia traffic, especially animation, which generates large quantities of very bursty traffic.

NASA Centers are looking very seriously at how they can form partnerships with existing ATM endeavors and merge their interests with those of other ATM networks installed in support of various consortia of industry, academic, and government organizations. A primary candidate is for NASA Lewis to interface with the Washington, DC, Advanced Technology Demonstration (ATD) network, which includes the NASA Goddard Space Flight Center. A second candidate is for NASA Lewis to interface with emerging statewide networks, such as the one proposed to support distance education in the state of Ohio.

NASA Lewis and the entire industry face many challenges before the use of ATM becomes more widespread. Seamlessly interconnecting ATM LANs and WANs and guaranteeing security and privacy, through firewalls that are capable of handling large amounts of data, are major problems. Another unresolved issue is that, while the WAN infrastructure and the local backbone use fiber optics, less expensive alternatives such as category 5 twisted pair may be more appropriate for carrying information to the faceplate. Also, the large bandwidth-propagation delay products on a high speed, cross-country ATM network make traditional flow control schemes impractical and require new algorithms for admission control. In addition, ATM is the only communications technology capable of handling the next generation of distance learning, in which information will be shared among participants at multiple locations in seminar fashion, as well as disseminated in the classical lecture form of classroom instruction. While the feasibility of multicast ATM technology has been demonstrated, the nature of multicast traffic still poses challenges for this type of application (Cox et al., 1993; Grovenstein et al., 1994).
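To illustrate the bandwidth-propagation delay concern (the 60 ms round-trip time is an assumed example value, not a measurement from the paper), the amount of data already in flight on a cross-country OC-3 path before any end-to-end feedback can act is on the order of a megabyte:

```python
# Illustrative bandwidth-propagation delay product for a cross-country ATM
# path. The 60 ms round-trip time is an assumed example; the OC-3 line rate
# follows the SONET discussion earlier in the paper.

LINK_RATE_BPS = 155.52e6      # OC-3 line rate
RTT_S = 0.060                 # assumed coast-to-coast round-trip time
CELL_BYTES = 53

bdp_bytes = LINK_RATE_BPS * RTT_S / 8
cells_in_flight = bdp_bytes / CELL_BYTES

print(f"bandwidth-delay product: {bdp_bytes / 1e6:.2f} MB "
      f"(~{cells_in_flight:,.0f} cells in flight)")
# ~1.17 MB, i.e., roughly 22,000 cells already in the network before any
# end-to-end feedback can take effect, which is why purely reactive control
# is inadequate at these speeds and distances.
```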
10. Conclusion

ATM offers many advantages over more traditional networking technologies. These include inherent high bandwidth and inherent scalability in port data rates and network throughput (Cox et al., 1993). Through its use of asynchronous cell multiplexing, ATM has the flexibility to allocate bandwidth on demand and to support many different kinds of services irrespective of their bit rates, burstiness characteristics, or quality requirements (Cheung, 1992; De Prycker et al., 1993; Irvin, 1993). Another benefit of ATM is that its standards are well advanced. Consequently, a single ATM network with standard interfaces has the ability to support a variety of different services more cheaply and simply than using separate networks to support each service. Also, ATM has the flexibility to accommodate, without major modifications, technological advances such as faster data rates and future services with as yet unknown characteristics.
ATM incorporates many features that have been designed to improve network performance (Bae and Suda, 1991). It takes advantage of the error free properties of fiber optic transmission media by using light weight protocols. Also, once a VP has been established and assigned a predefined route, an ATM host can set up new VCs on that VP without having to request them from the network and without having to rewrite the routing table. This reduces call setup delay to a minimum. ATM's use of small cells reduces packetization delay at sources and, compared with traditional variable length packets, its use of fixed length cells simplifies buffer and queue management (Cheung, 1992). Further, because cells belonging to the same call follow the same route through an ATM network, switches do not need to interpret or modify the VCI fields in cells on VPCs. This minimizes the processing required for routing decisions and, together with the use of small, fixed length cells and light weight protocols, reduces the functionality required in switching nodes. ATM thus facilitates the high speed switching, low delay, and small delay jitter that are important for RT and CBR applications. Also, through statistical sharing, ATM makes efficient use of network bandwidth, buffers, and processing resources and, when combined with SONET, it provides a bandwidth efficient transport mechanism for a wide variety of applications (Eckberg et al., 1991).
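As a simple illustration of the packetization delay point (a textbook calculation using an assumed 64 kbps voice source and an assumed 1,000-byte comparison packet, not figures from the paper), filling a 48-byte cell payload takes only 6 ms, whereas filling a large packet from the same source would take 125 ms:

```python
# Illustrative packetization delay: the time a constant bit rate source needs
# to fill one payload before it can be sent. The 64 kbps voice rate and the
# 1000-byte comparison packet are assumed example values.

def packetization_delay_ms(payload_bytes: int, source_rate_bps: float) -> float:
    """Time (ms) to accumulate payload_bytes at source_rate_bps."""
    return payload_bytes * 8 / source_rate_bps * 1e3

VOICE_RATE_BPS = 64_000        # standard PCM voice rate
print(packetization_delay_ms(48, VOICE_RATE_BPS))     # 6.0 ms per ATM cell payload
print(packetization_delay_ms(1000, VOICE_RATE_BPS))   # 125.0 ms for a large packet
```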
11. Glossary

AAL: ATM adaptation layer
ACCL: Advanced Computing and Communications Lab
ACTS: Advanced Communications Technology Satellite
ATM: Asynchronous transfer mode
BISDN: Broadband ISDN
BOM: Beginning of message
CBR: Constant bit rate
CCITT: Consultative Committee for International Telephone and Telegraph
CDV: Cell delay variation
CFD: Computational fluid dynamics
CLP: Cell loss priority
CLS: Connectionless
COM: Continuation of message
CON: Connection oriented
CS: Convergence sublayer
EFCI: Explicit forward congestion indicator
EOM: End of message
FDDI: Fiber distributed data interface
GFC: Generic flow control
HEC: Header error control
IP: Internet protocol
ISDN: Integrated Services Digital Network
ISO: International Standards Organization
ITU-T: International Telecommunications Union - Telecommunication Standardization Sector
LAN: Local area network
NASA: National Aeronautics and Space Administration
NE: Network element
NNI: Network node interface
NREN: National Research and Education Network
OAM: Operations, administration, and maintenance
OC: Optical carrier
OSI: Open Systems Interconnection
PDU: Protocol data unit
PL: Physical layer
PMD: Physical medium dependent
PRM: Protocol reference model
PT: Payload type
QOS: Quality of service
RT: Real time
SAR: Segmentation and reassembly
SDH: Synchronous digital hierarchy
SDU: Service data unit
SEAL: Simple and efficient adaptation layer
SMDS: Switched multimegabit data service
SONET: Synchronous Optical Network
STM: Synchronous time division multiplexing
STS: Synchronous transport signal
TC: Transmission convergence
TCP: Transmission control protocol
TD: Traffic descriptor
UNI: User network interface
UPC: Usage parameter control
VBR: Variable bit rate
VC: Virtual channel
VCC: Virtual channel connection
VCI: Virtual channel identifier
VP: Virtual path
VPC: Virtual path connection
VPI: Virtual path identifier
WAN: Wide area network
Acknowledgements

This research was funded by NASA Cooperative Agreement NCC3-235.
References

Anderson, J. and Nguyen, M.D., 1991. ATM-layer OAM implementation issues. IEEE Communications Magazine, 29 (9), September: 79-81.
Bae, J.J. and Suda, T., 1991. Survey of traffic control schemes and protocols in ATM networks. Proceedings of the IEEE, 79 (2), February: 170-189.
Biagioni, E., Cooper, E. and Sansom, R., 1993. Designing a practical ATM LAN. IEEE Network, 7 (2), March: 32-39.
Bonomi, F. and Fendick, K.W., 1995. The rate-based flow control framework for the available bit rate ATM service. IEEE Network, 9 (2), March.
Boyer, P.E., Guillemin, F.M., Servel, M.J. and Coudreuse, J.P., 1992. Spacing cells protects and enhances utilization of ATM network links. IEEE Network, 6 (5), September: 38-49.
Breuer, H.J., 1991. ATM-layer OAM: Principles and open issues. IEEE Communications, 29 (9), September: 75-78.
Brochin, F.M., 1992. A cell spacing device for congestion control in ATM networks. Performance Evaluation, 16: 107-127.
Burgin, J. and Dorman, D., 1991. B-ISDN resource management: The role of virtual paths. IEEE Communications, 29 (9), September: 44-48.
Catlett, C.E., 1992. In search of gigabit applications. IEEE Communications, 30 (4), April: 42-51.
Cavanagh, J.P., 1992. Applying the frame relay interface to private networks. IEEE Communications, 30 (3), March: 48-64.
Cheung, N.K., 1992. The infrastructure for gigabit computer networks. IEEE Communications, 30 (4), April: 60-68.
Cox, Jr., J.R., Gaddis, M.E. and Turner, J.S., 1993. Project Zeus. IEEE Network, 7 (2), March: 20-30.
Crutcher, L.A. and Waters, A.G., 1992. Connection management for an ATM network. IEEE Network, 6 (6), November: 42-55.
De Prycker, M., 1992. ATM switching on demand. IEEE Network, 6 (2), March: 25-29.
De Prycker, M., 1993. Asynchronous Transfer Mode: Solution for Broadband ISDN, 2nd ed. Ellis Horwood.
De Prycker, M., Peschi, R. and Van Landegem, T., 1993. B-ISDN and the OSI protocol reference model. IEEE Network, 7 (2), March: 10-18.
Eckberg, A.E., 1992. B-ISDN/ATM traffic and congestion control. IEEE Network, 6 (5), September: 28-37.
Eckberg, A.E., Doshi, B.T. and Zoccolillo, R., 1991. Controlling congestion in B-ISDN/ATM: Issues and strategies. IEEE Communications, 29 (9), September: 64-70.
Grovenstein, L.W., Pittman, C., Simpson, J.H. and Spears, D.R., 1994. NCIH services, architecture, and implementation. IEEE Network, 8 (6), November: 18-22.
Hac, A. and Mutlu, H.B., 1989. Synchronous optical network and broadband ISDN protocols. COMPUTER, November: 26-34.
Handel, R. and Huber, M.N., 1991. Integrated Broadband Networks: An Introduction to ATM-Based Networks. Addison-Wesley.
Hong, D. and Suda, T., 1991. Congestion control and prevention in ATM networks. IEEE Network, 5 (4), July: 10-16.
Irvin, D.R., 1993. Making broadband-ISDN successful. IEEE Network, 7 (1), January: 40-45.
Iwata, A., Mori, N., Ikeda, C., Suzuki, H. and Ott, M., 1995. ATM connection and traffic management schemes for multimedia internetworking. Communications of the ACM, 38 (2): 72-89.
Kung, H.T., Blackwell, T. and Chapman, A., 1994. Credit-based flow control for ATM networks: Credit update protocol, adaptive credit allocation, and statistical multiplexing. Computer Communication Review, 24 (4): 101-114.
Lea, C.T., 1992. What should be the goal for ATM. IEEE Network, 6 (5), September: 60-66.
Le Boudec, J.Y., Port, E. and Truong, H.L., 1993. Flight of the FALCON. IEEE Communications, 31 (2), February: 50-56.
Leslie, I.M., McAuley, D.R. and Tennenhouse, D.L., 1993. ATM everywhere? IEEE Network, 7 (2), March: 40-46.
Lyles, J.B. and Swinehart, D.C., 1992. The emerging gigabit environment and the role of local ATM. IEEE Communications, 30 (4), April: 52-58.
Newman, P., 1994. Traffic management for ATM local area networks. IEEE Communications, 32 (8), August: 44-50.
Ramakrishnan, K.K. and Newman, P., 1995. Integration of rate and credit schemes for ATM flow control. IEEE Network, 9 (2), March.
Roberts, J.W., 1991. Variable-bit-rate traffic control in B-ISDN. IEEE Communications, 29 (9), September: 50-56.
Wernik, M., Aboul-Magd, O. and Gilbert, H., 1992. Traffic management for B-ISDN services. IEEE Network, 6 (5), September: 10-19.
Yazid, S. and Mouftah, H.T., 1992. Congestion control methods for BISDN. IEEE Communications, 30 (7), July: 42-47.