Computer Communications 19 (1996) 152-159

Case study

Partial ASN.1/BER decoding scheme in a multiprocessor environment

Sunwan Choi (a), Kilnam Chon (b)

(a) Protocol Engineering Center, Electronics and Telecommunications Research Institute, 161 Kajong-Dong, Yusong-Gu, Taejon, 305-350, Korea
(b) Department of Computer Science, Korea Advanced Institute of Science and Technology, 373-1 Kusong-Dong, Yusong-Gu, Taejon, 305-751, Korea

Received 1 April 1994; revised 12 December 1994

Abstract

In many cases, the most time-consuming part of the Open Systems Interconnection (OSI) protocol stack is the encoding and decoding of transfer syntax done in the presentation layer. The conventional ASN.1/BER decoding scheme runs on a whole Presentation Protocol Data Unit (PPDU). To overcome this problem, we present a new ASN.1/BER decoding scheme called Partial Decoding. The Partial Decoding scheme allows us to start decoding each segment of a PPDU before the whole PPDU has been received. The measurements show that the new scheme yields a performance improvement as high as 40% over the conventional decoding scheme at large message sizes. Parallel processing for the network layer through the presentation layer is also used for a high performance protocol implementation on the basis of the proposed scheme.

Keywords: ASN.1; BER; OSI; Partial decoding scheme; Parallel implementation

1. Introduction

The primary goal of communication systems is to exchange application data between application processes. In the Open Systems Interconnection (OSI) protocol stack, the application data in local representation is transformed into an external data representation at the sending side. Inversely, the external data representation is transformed into the local representation at the receiving side. The data conversion done in the presentation layer¹ is the most time-consuming task in the OSI protocol stack [1]. Thus, the presentation layer has been a performance bottleneck. Two approaches have been developed to improve the performance of the presentation layer. One approach is to reduce the number of processing operations required for a single datum. An example is the Light Weight Encoding Rules (LWER) [2-4]. The LWER approach can be applied to applications where the type and format of the information are predictable. The other approach is to increase communication system speed by performing more processing operations per unit time. Parallel processing techniques [1,5,6] and efficient implementation techniques [7-11] fall into this second approach.

¹ Transformation is described in the presentation layer in the abstract model. It can be part of the application layer. In this paper, the transformation is performed in the presentation layer.

In this paper, we focus our attention on the Abstract Syntax Notation One (ASN.1) [12]/Basic Encoding Rules (BER) [13] decoding scheme done in the presentation layer at the receiving side. A new approach is proposed to remove the performance bottleneck in the presentation layer. In addition, a parallel processing technique is adopted to show the robustness of the proposed scheme.

1.1. Problem definition

In many cases, the repeated header-user data structure of the OSI data units involves sequential processing of data. This causes serious problems in terms of communication performance. In particular, the following problems arise at the presentation layer:

- The presentation layer does not perform BER decoding until a whole Presentation Protocol Data Unit (PPDU) has been received and reassembled at the lower layers. The delay incurred at the lower layers is significant.
- The presentation layer will abort the presentation connection if errors occur in the middle of BER decoding. The errors are detected on the basis of the whole PPDU, so the protocol processing time spent assembling the whole PPDU at the lower layers is wasted.

Fig. 1. Relationship between two decoding schemes. (a) Conventional decoding scheme; (b) Partial Decoding scheme. PPDU: Presentation Protocol Data Unit.

To solve the above problems, a new BER decoding scheme, called Partial Decoding, is proposed; it applies BER decoding to one segment of a complete PPDU. Fig. 1 shows the relationship between the conventional decoding scheme and the Partial Decoding scheme. Each segment is carried by an arriving Transport Protocol Data Unit (TPDU). However, there are some technical difficulties in implementing the proposed scheme:

- Extracting presentation data from a TPDU. As a TPDU arrives, the transport header and the session header should be stripped from the TPDU. The remaining data corresponds to one segment of a complete PPDU.
- Performing reassembly and reordering. Because the presentation layer assumes that all the data arrives in order, reassembly and reordering are performed at the transport layer as well as the network layer.
- Using the same TPDU from the transport layer up to the presentation layer. Because the presentation layer performs BER decoding on a TPDU, the session layer should also perform its functions on the same TPDU.
- Performing BER decoding on one segment of a complete PPDU. The type-length-value scheme of BER may be broken; the BER decoding has to be performed partially on a segment that breaks up the type-length-value sequence.

1.2. Related work

Parallel processing can be used for high performance protocol implementation. Surveys are given by Feldmeier [14] and Zitterbart [15]; however, most parallel protocol implementations do not include the presentation layer. A parallel approach to OSI processing for the data link layer through the presentation layer, using a packet as the unit of parallelism, has been explored [6]. Each processor runs the code for all the layers, and its allocation is scheduled in a round-robin fashion. The presentation layer performs BER decoding on a whole PPDU. Haas [5] suggests a new architecture, called the Horizontally Oriented Protocol Structure (HOPS), to solve the sequential processing of the layered architecture. It increases throughput by concurrently performing all the layers on the same packet, called the HOPS packet. However, Haas did not describe how BER decoding runs on the HOPS packet. Clark and Tennenhouse [1] proposed the Application Level Framing (ALF) concept, which uses a single frame for all functions that are based on the Application Protocol Data Unit (APDU); the single frame is called an Application Data Unit (ADU), i.e. the ADU is defined as the smallest unit which the application (or BER decoding) can deal with out of order.

2. System description

2.1. Model

We make two observations on the previous work. First, parallel processing significantly improves the performance, but the presentation layer still works on a whole PPDU. Second, HOPS and ALF are not directly applicable to the current OSI concept because they need a new architecture and a new packet frame.

The Partial Decoding scheme proposed here is a new implementation technique rather than a new architecture. Fig. 2 shows the layered model for the scheme. The transport layer delivers a partial SPDU to the session layer before the whole SPDU has been received. The session layer performs its protocol functions on the partial SPDU. It also delivers a partial PPDU to the presentation layer.

Fig. 2. Layered model for Partial Decoding. PPDU: Presentation Protocol Data Unit; SPDU: Session Protocol Data Unit; TPDU: Transport Protocol Data Unit; NPDU: Network Protocol Data Unit.

2.2. Extraction of headers and data

Fig. 3 shows the relationship between types of data units consisting of a header field and a data field. In the Partial Decoding scheme, a session header and a presentation header should be extracted from an arrived TPDU. The appearance of the headers is determined as follows:

- When a TPDU arrives in order after a whole SPDU has been processed, a session header is carried by the TPDU.
- When a TPDU arrives in order after a session connection has been established or a whole PPDU has been decoded, a presentation header is carried by the TPDU.

Fig. 3. Relationship between types of data units. TH: Transport Header; SH: Session Header; PH: Presentation Header; TPDU: Transport Protocol Data Unit; SPDU: Session Protocol Data Unit; PPDU: Presentation Protocol Data Unit.
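As an illustration of this header-extraction step, the following C sketch strips the transport header (and, when present, the session header) from an in-order TPDU and records whether a presentation header follows. It is only a sketch under stated assumptions: the helper functions (tp_header_len, sess_header_len) and the connection-state flags are hypothetical names, not taken from the authors' implementation.

    #include <stddef.h>

    /* Hypothetical connection state; the flags mirror the two conditions above. */
    struct conn_state {
        int spdu_just_completed;   /* a whole SPDU has just been processed         */
        int ppdu_just_completed;   /* a whole PPDU has just been decoded, or the
                                      session connection has just been established */
    };

    /* Assumed helpers (not from the paper): return the length of the header
       found at the start of the given buffer. */
    extern size_t tp_header_len(const unsigned char *buf, size_t len);
    extern size_t sess_header_len(const unsigned char *buf, size_t len);

    /* Strip the transport header (and a session header, when one is carried)
       from an in-order TPDU; the remaining data is one segment of a PPDU. */
    const unsigned char *extract_segment(const unsigned char *tpdu, size_t len,
                                         const struct conn_state *cs,
                                         size_t *seg_len, int *has_pres_header)
    {
        size_t off = tp_header_len(tpdu, len);

        if (cs->spdu_just_completed)                 /* session header present */
            off += sess_header_len(tpdu + off, len - off);

        *has_pres_header = cs->ppdu_just_completed;  /* presentation header?   */
        *seg_len = len - off;
        return tpdu + off;                           /* segment handed upward  */
    }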

2.3. Reassembly and reordering

Because reassembly and reordering are very complex operations, they may cause unacceptable performance. Reassembly occurs at the network layer, the transport layer and the session layer. Reordering occurs at the network layer and the transport layer. In particular, reassembly as well as reordering requires buffering. The buffer management scheme for reassembly is designed to reduce data copying between layers; it is similar to the buffer-cut-through scheme proposed by Poo and Ang [16]. Data delivery between adjacent layers is done by passing the buffer address where a header and data are stored. In the middle of Partial Decoding, the session layer does not know that a whole SPDU has been received at the transport layer. Therefore, when the transport layer receives a TPDU with End Of Transport (EOT) set to true, which indicates the last segment in an SPDU, the transport layer must notify the session layer that the whole SPDU has been received. The presentation layer likewise does not know that a whole PPDU has been received at the session layer. When the session layer has previously received (or has just received) a session header with the Enclosure Item set to end and it then receives a TPDU with EOT set to true, that TPDU also indicates the last segment in a PPDU.

2.4. Protocol processing

As an NPDU arrives from the network, OSI processing from the network layer up to the presentation layer proceeds as follows. The network entity performs a header checksum function, a time-to-live function, a header format analysis function and a route function. When a whole TPDU has been received, the network entity passes the buffer address for the TPDU to the transport entity. For every TPDU, the transport entity performs a multiplexing function and an error control function; we assume that a checksum function is implemented in hardware because it is very time-consuming in software. If an error occurs, an acknowledgement function is performed. Otherwise, the transport entity performs a flow control function (the window size is 8) and an acknowledgement function. If the arrived TPDU is in order, the transport entity also passes the buffer address for a partial SPDU to the session entity. The session entity performs a concatenation function and a token management function if a session header exists in the arrived partial SPDU. The result does not have an effect on the current session connection until the whole SPDU has been processed. The session entity passes the buffer address for a partial PPDU to the presentation entity. The presentation entity decodes the presentation header by applying ASN.1/BER if a presentation header exists in the received partial PPDU. In the middle of Partial Decoding, the presentation entity may suspend BER decoding until the next segment (a partial PPDU) in the PPDU arrives, and it resumes immediately as the segment arrives.

2.5. ASN.1 tool

To correctly perform BER decoding and to easily support suspension and resumption, we use an ASN.1 tool. The ASN.1 tool consists of a parser, encoders and decoders. The ASN.1 parser accepts an ASN.1 description (see Fig. 4a for a sample ASN.1 description). It then generates C language data structures to store and retrieve the local representation, and C language encode/decode routines (or encoders/decoders) to translate the local representation to/from the transfer syntax. In particular, the ASN.1 parser must produce type trees (or type structure trees; see Fig. 4b) corresponding to the ASN.1 data types in the ASN.1 description. At run-time, the ASN.1 decoders examine the type-length-value scheme by searching all the nodes in the type tree. When the BER decoding is suspended, the current node address in the type tree is saved. In hand-coded ASN.1 decoders, all the variables of the decoding process would have to be saved. Fig. 4c shows the C language data structure for constructing a node in the type tree. The function_number field in the node structure indicates a function for translating transfer syntax to local representation, such as a primitive translation function.

Fig. 4. ASN.1 example. (a) PersonnelRecord ASN.1 type; (b) type tree; (c) C data structure for the type tree.

(a) PersonnelRecord ASN.1 type:

    PersonnelRecord ::= [APPLICATION 0] IMPLICIT SET {
        Name,
        title        [0] IA5String,
        EmployeeNumber,
        dateOfHire   [1] Date,
        nameOfSpouse [2] Name,
        [3] IMPLICIT SEQUENCE OF ChildInformation DEFAULT {} }

    ChildInformation ::= SET { Name, dateOfBirth [0] Date }

    Name ::= [APPLICATION 1] IMPLICIT SEQUENCE {
        givenName  IA5String,
        initial    IA5String,
        familyName IA5String }

    EmployeeNumber ::= [APPLICATION 2] IMPLICIT INTEGER

    Date ::= [APPLICATION 3] IMPLICIT IA5String

(c) C data structure for the type tree:

    struct Type_Tree {
        unsigned char TYPE;            /* identifier */
        long length;
        unsigned char *primitive;      /* primitive value */
        int function_number;           /* primitive translation function */
        struct Type_Tree *constructor; /* child */
        struct Type_Tree *cousin;      /* same level node */
    };
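To make the type-tree traversal concrete, the sketch below shows one way a generated decoder might walk such a tree against a TLV-encoded buffer, recursing into constructor nodes and translating primitive values. The return codes and helper functions (read_tlv_header, decode_primitive) are assumptions for illustration, not the actual output of the authors' ASN.1 tool.

    #include <stddef.h>

    struct Type_Tree {                     /* node layout of Fig. 4c */
        unsigned char TYPE;
        long length;
        unsigned char *primitive;
        int function_number;
        struct Type_Tree *constructor;     /* child node      */
        struct Type_Tree *cousin;          /* same level node */
    };

    #define DEC_OK       0
    #define DEC_SUSPEND  1                 /* segment exhausted: save state */
    #define DEC_ERROR   (-1)

    /* Assumed helpers: parse one type/length header, and translate one
       primitive value through the node's function_number. */
    extern int read_tlv_header(const unsigned char *buf, size_t len,
                               struct Type_Tree *node, size_t *used);
    extern int decode_primitive(const unsigned char *buf, size_t len,
                                struct Type_Tree *node, size_t *used);

    /* Recursively examine the type-length-value scheme by searching the
       nodes of the type tree. */
    int decode_with_tree(const unsigned char *buf, size_t len,
                         struct Type_Tree *node, size_t *used)
    {
        size_t off = 0, n;
        int rc;

        for (; node != NULL; node = node->cousin) {
            rc = read_tlv_header(buf + off, len - off, node, &n);
            if (rc != DEC_OK) return rc;   /* e.g. TYPE-BROKEN, LENGTH-BROKEN */
            off += n;

            if (node->constructor != NULL) /* constructed type: descend       */
                rc = decode_with_tree(buf + off, len - off,
                                      node->constructor, &n);
            else                           /* primitive type: translate value */
                rc = decode_primitive(buf + off, len - off, node, &n);

            if (rc != DEC_OK) return rc;   /* suspend (save node) or error    */
            off += n;
        }
        *used = off;
        return DEC_OK;
    }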

Fig. 5. State transition diagram for Partial Decoding.

2.6. Partial decoding procedure

During Partial Decoding, the ASN.1 decoders use a presentation connection management table (maintaining the presentation connection information), a type tree, and the buffer address where an arrived segment is stored. Fig. 5 shows the state transition diagram for Partial Decoding. In the initial-state, the presentation entity awaits the first segment, and the state transits to the conversion-state as the segment arrives. In the conversion-state, the ASN.1 decoders recursively examine the type-length-value scheme by searching the type tree and by retrieving the buffer where the segment is stored.

The Partial Decoding cannot proceed (1) when an error occurs, (2) when the whole PPDU has been decoded, or (3) when the next segment has not yet arrived. In the first case, the partial PPDU needs to be abandoned; however, the time wasted in Partial Decoding is less than in the conventional decoding scheme, where the delay for the transport layer through the session layer is significant. In the second case, the presentation entity enters the initial-state, then awaits the first segment of a new PPDU. In the third case, the presentation entity is suspended and the following variables are saved to be reused later: (1) a suspension reason (see Table 1); (2) the buffer address of the remaining data; and (3) the current node address in the type tree. In particular, if the suspension reason is VALUE-BROKEN and the ASN.1 data type is a primitive type, such as OCTET STRING or BIT STRING, the remaining data can be transformed to local representation even though the amount of data is less than the number of octets to be decoded. This enables BER decoding to be streamlined. The presentation entity then enters the suspending-state and awaits the next segment of the PPDU. For example, if an application is bulk data transfer with the same data type, such as file contents and image data, the BER decoding can be performed without receiving all the data.

Table 1. Suspension reasons for the presentation entity in the middle of Partial Decoding

Suspension reason   Description
COMPLETE            The ASN.1 decoders have exactly decoded the type-length-value scheme for the segment
TYPE-ONLY           The remaining data contains only a type field
TYPE-BROKEN         The remaining data contains a part of a multi-octet type field
LENGTH-BROKEN       The remaining data contains a type field and a part of a length field
NO-VALUE            The remaining data does not contain a value field
VALUE-BROKEN        The remaining data contains a type field, a length field and a part of a value field
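The states of Fig. 5 and the context saved on suspension can be summarised in a few declarations. This is a minimal sketch: the identifiers are illustrative C names, not those of the authors' implementation, and struct Type_Tree is the node type of Fig. 4c.

    #include <stddef.h>

    struct Type_Tree;                     /* node type of Fig. 4c */

    enum pd_state {                       /* states of Fig. 5 */
        PD_INITIAL_STATE,                 /* awaiting the first segment   */
        PD_CONVERSION_STATE,              /* decoding the current segment */
        PD_SUSPENDING_STATE               /* awaiting the next segment    */
    };

    enum pd_suspend_reason {              /* suspension reasons of Table 1 */
        PD_COMPLETE, PD_TYPE_ONLY, PD_TYPE_BROKEN,
        PD_LENGTH_BROKEN, PD_NO_VALUE, PD_VALUE_BROKEN
    };

    struct pd_context {                   /* one per presentation connection */
        enum pd_state state;
        enum pd_suspend_reason reason;    /* (1) why decoding stopped              */
        unsigned char *remaining;         /* (2) buffer address of remaining data  */
        size_t remaining_len;
        struct Type_Tree *current_node;   /* (3) current node in the type tree     */
    };

On suspension the three items above are saved; when the next segment arrives the entity re-enters the conversion-state, restores them, and continues from current_node, prepending the remaining data to the new segment (unless the reason is VALUE-BROKEN on a primitive type, in which case the partial value may already have been translated).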

2.7. Parallel processing

Parallel processing is a well-known approach to enhance the performance of communication subsystems. A multiprocessor implementation may use several forms of parallelism embedded in communication protocols. We exploit a layer pipeline formed by the sequence of different layers. The throughput of a pipeline, however, is limited by its slowest stage, such as the presentation layer. To overcome this problem, a processor is additionally allocated to each presentation connection by a round-robin scheme.
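A round-robin allocation of this kind takes only a few lines; the processor count and the function below are illustrative assumptions, not the authors' code.

    #define NUM_PROCESSORS 4      /* assumed pool size */

    static int next_processor = 0;

    /* Assign the next processor, in round-robin order, to a newly opened
       presentation connection. */
    int allocate_processor(void)
    {
        int p = next_processor;
        next_processor = (next_processor + 1) % NUM_PROCESSORS;
        return p;
    }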

3. Implementation

The following environments are used to implement our approach: a parallel programming language, efficient context switching and process creation, and a flexible network topology.

3.1. Implementation environment

We use the transputer [17] as a processing unit. The transputer can be programmed in several parallel languages, and both context switching and process creation take less than 1 μs. It can also support a network of arbitrary topology. The implementations have been performed on the Parsytec SuperCluster. The programming language used is the Par.C System [18], which supports a parallel language and a network configuration facility, and runs under the parallel operating system Helios [19]. The Par.C System extends C to control the parallel execution of processes. Channel provides communication links between processes or transputers. SendLink/RecvLink sends/receives data between transputers. Data transfer between processes uses -In/-Out. Both select and guard/guardlink support guarded commands. Multiple processes are created concurrently by using par.

3.2. Implementation structure

We implemented the presentation kernel functional unit, the session kernel functional unit, transport protocol class 4, and the connectionless network protocol type 1 function. Implementations have been performed to compare our approach with the conventional approach. Ten transputers are used for the implementations, as depicted in Fig. 6, where Tn (n stands for an integer) represents a node on the transputer network. The implementation structure is similar to the model proposed by Zitterbart [15] as she applied it to the network layer. In Fig. 6, the multiple instances of the presentation protocol and the session protocol are performed by different processes on a single transputer (T10). The processes can be replaced by processors.

Fig. 6. Implementation structure on the transputer network.
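The process structure of Fig. 6 can be pictured as one process per layer connected by channels that carry only buffer addresses. The portable C sketch below uses POSIX threads instead of Par.C (whose exact syntax is not reproduced here); the channel type and the session_process stage are illustrative assumptions, not the authors' code.

    #include <pthread.h>

    /* One-slot channel between adjacent layer processes; only the buffer
       address is passed, as in the buffer-cut-through scheme. */
    struct channel {
        pthread_mutex_t lock;
        pthread_cond_t  cond;
        void           *buf;              /* NULL means empty */
    };

    #define CHANNEL_INIT { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, NULL }

    static void channel_put(struct channel *c, void *buf)
    {
        pthread_mutex_lock(&c->lock);
        while (c->buf != NULL)
            pthread_cond_wait(&c->cond, &c->lock);
        c->buf = buf;
        pthread_cond_signal(&c->cond);
        pthread_mutex_unlock(&c->lock);
    }

    static void *channel_get(struct channel *c)
    {
        void *buf;
        pthread_mutex_lock(&c->lock);
        while (c->buf == NULL)
            pthread_cond_wait(&c->cond, &c->lock);
        buf = c->buf;
        c->buf = NULL;
        pthread_cond_signal(&c->cond);
        pthread_mutex_unlock(&c->lock);
        return buf;
    }

    static struct channel tp_to_sess   = CHANNEL_INIT;
    static struct channel sess_to_pres = CHANNEL_INIT;

    /* Example stage: the session process works on a partial SPDU and hands
       the contained partial PPDU (same buffer) to the presentation process.
       It would be started with pthread_create(&tid, NULL, session_process, NULL). */
    static void *session_process(void *arg)
    {
        (void)arg;
        for (;;) {
            void *partial_spdu = channel_get(&tp_to_sess);
            /* ... session functions on the same buffer ... */
            channel_put(&sess_to_pres, partial_spdu);
        }
        return NULL;
    }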

4. Performance measurement

Performance measurements have been performed with a simple system library function, clock(), which can be used to determine the processor time used by the running program up to the moment the function is called. That is, clock() returns the value of the internal transputer clock. To obtain a time measurement in seconds, one divides the return value of clock() by CLOCKS_PER_SEC, which is replaced by a priority-dependent value at run-time (15625 for low priority, 1000000 for high priority).
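For reference, a minimal measurement using the standard clock() interface looks as follows; on the transputer, CLOCKS_PER_SEC is the priority-dependent value mentioned above, while a hosted C library supplies its own constant.

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        clock_t start = clock();          /* internal clock value before the test */

        /* ... protocol function under test ... */

        clock_t end = clock();
        double seconds = (double)(end - start) / CLOCKS_PER_SEC;
        printf("processing delay: %f s\n", seconds);
        return 0;
    }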


Table 2. Processing delay for protocol functions measured on a transputer. PER: conventional decoding scheme, hand-coded ASN.1 decoders, Packed Encoding Rules; BER(1): conventional decoding scheme, hand-coded ASN.1 decoders, Basic Encoding Rules; BER(2): conventional decoding scheme, tool-based ASN.1 decoders, Basic Encoding Rules; BER(3): Partial Decoding scheme, tool-based ASN.1 decoders, Basic Encoding Rules.


Layer                Function                                 Processing delay (μs)

Network layer        route                                    189
                     header format analysis                   299
                     header checksum                          418
                     time-to-live                             188
                     reassembly                               192

Transport layer      multiplexing & in-order check            162
                     checksum by software                     4480
                     flow control                             14
                     acknowledgement                          261
                     reassembly                               128

Session layer        header decoding & event processing       232
                     separation                               38
                     reassembly & token management            9

Presentation layer   header decoding (simply-encoded-data)    109
                     PER                                      128
                     BER(1)                                   320
                     BER(2)                                   1408
                     BER(3)                                   2112

Timer library        set                                      580
                     cancel                                   392

The performance of BER decoding depends upon the complexity of the ASN.1 data types in the transfer syntax. The ASN.1 processing delay may be very large for a transfer syntax that is composed of constructor types, such as SEQUENCE OF SEQUENCE.


Fig. 7. Performance measures (1): throughput versus data size, Partial Decoding scheme versus conventional decoding scheme.

Fig. 8. Time delay before starting BER decoding: time delay versus data size, Partial Decoding scheme versus conventional decoding scheme.

In evaluating the performance, the test data is composed of the PersonnelRecord and OCTET STRING types. For the test data, we use a Session Service Data Unit (SSDU) of unlimited size, a Transport Service Data Unit (TSDU) of 16384 bytes, a Network Service Data Unit (NSDU) of 8192 bytes, and a Network Protocol Data Unit (NPDU) of 4500 bytes.

Table 2 shows the processing delay of each protocol function measured on the transputer. The decoding delay for a single PersonnelRecord type on a transputer is about 1.408 ms for the conventional approach and 2.112 ms for our approach. Our approach spends more processing time because the ASN.1 decoders are implemented with stack operations: for suspending, the ASN.1 decoders perform a push operation to save the current node address in the type tree and the buffer address of the remaining data, and for resuming they perform a pop operation to retrieve them.

Fig. 7 shows the observed relationship between throughput and data size measured for the Partial Decoding and conventional decoding schemes; both are implemented with the ASN.1 tool. At large data sizes, the Partial Decoding scheme shows a performance improvement of about 40% compared with the conventional decoding scheme; at small data sizes, the performance measures show almost the same throughput. The throughput of the Partial Decoding scheme increases linearly with data size. However, the increase in throughput is low when the presentation entity frequently awaits the next segment after a segment has been decoded.


Fig. 9. Performance measures (2): throughput versus data size, Partial Decoding scheme versus conventional decoding scheme.


Table 3. Overhead incurred by breaking up the type-length-value sequence

Data size   5 K     10 K    15 K    20 K    25 K    30 K    35 K    40 K    45 K    50 K
Overhead    1.56%   1.23%   1.16%   3.15%   3.84%   3.53%   4.38%   4.10%   4.32%   4.94%

Fig. 8 shows the elapsed time between receiving the beginning of the PPDU and starting the BER decoding for the Partial Decoding and conventional decoding schemes. The Partial Decoding scheme has an advantage where segmentation occurs frequently. Because, in the conventional decoding scheme, the transport and session layers must receive all segments and reassemble them before passing them up to the adjacent upper layer, this delay is a significant part of the processing time. Although the Partial Decoding scheme has a high throughput, it incurs another level of complexity by breaking up the type-length-value sequence. This introduces an additional overhead, as shown in Table 3.

4.1. Comparison with hand-coded ASN.1 decoders

In general, hand-coded ASN.1 decoders are faster than ASN.1 decoders generated by an ASN.1 tool [8,11]. Fig. 9 shows our experiments. At large data sizes, the Partial Decoding scheme shows an improvement of about 30% in performance, but not at small data sizes. We also measured the conventional decoding scheme with the hand-coded ASN.1 decoders, applying the Packed Encoding Rules (PER) [4]. The Partial Decoding scheme obtains a performance improvement of about 28%.

5. Concluding remarks

This paper presents a new BER decoding technique, called the Partial Decoding scheme, to improve the performance of the communication subsystem for the OSI protocol stack. In particular, parallel processing was used for a high performance protocol implementation. At large data sizes, the performance measures show that our approach is faster than the conventional approach, but not at small data sizes. Our approach will obtain better performance when bulk data transfer with the same attributes (for example, file contents and image data) occurs frequently. Our approach also has an advantage in cases where segmentation occurs frequently. We can further improve our approach by utilizing more efficient implementation techniques, such as buffer management. In addition, our approach needs to be applied to the Packed Encoding Rules (PER).

Acknowledgements

The authors would like to thank Peter Furniss at Peter Furniss Consultants and Kwang-Soo Chung at KwangWoon University for their fruitful comments. Also, many thanks to the reviewers for their patience and for their many helpful suggestions. We would also like to thank Jim Richardson for his proofreading.

References

[1] D. Clark and D.L. Tennenhouse, 'Architectural considerations for a new generation of protocols', SIGCOMM (September 1990) pp 200-208.
[2] M. Bever and U. Schüffer, 'Coding rules for high speed networks', IFIP Trans. Upper Layer Protocols, Architectures and Applic. (1992) pp 119-132.
[3] C. Huitema and A. Segall, 'A high speed approach for the OSI presentation protocol', IFIP Workshop on Protocols for HSN (May 1989) pp 277-287.
[4] ISO CD 8825-2 Information Technology - Specification of ASN.1 Encoding Rules - Part 2: Packed Encoding Rules, ISO, Geneva (1991).
[5] Z. Haas, 'A protocol structure for high-speed communication over broadband ISDN', IEEE Network Mag. (January 1991) pp 67-70.
[6] M.R. Ito, T.Y. Takeuchi and G.W. Neufeld, 'A multiprocessor approach for meeting the processing requirements for OSI', IEEE J. Selected Areas in Comm., Vol 11 No 2 (February 1993) pp 220-227.
[7] M.B. Abbott and L.L. Peterson, 'Increasing network throughput by integrating protocol layers', IEEE/ACM Trans. Networking, Vol 1 No 5 (October 1993) pp 4-19.
[8] C. Huitema and G. Chave, 'Measuring the performance of an ASN.1 compiler', IFIP Trans. Upper Layer Protocols, Architectures and Applic. (1992) pp 105-118.
[9] H. Lin, 'Estimation of the optimal performance of ASN.1/BER transfer syntax', ACM Comput. Comm. Rev., Vol 23 No 3 (July 1993) pp 45-58.
[10] G. Neufeld and S. Vuong, 'An overview of ASN.1', Comput. Networks & ISDN Syst., Vol 23 (1992) pp 393-415.
[11] M. Sample and G. Neufeld, 'High-performance ASN.1 compiler', Comput. Comm., Vol 17 No 3 (March 1994) pp 156-171.
[12] ISO 8824 Information Technology - Abstract Syntax Notation One (ASN.1) - Part 1: Basic Notation, ISO, Geneva (1992).
[13] ISO 8825 Information Technology - Specification of ASN.1 Encoding Rules - Part 1: Basic Encoding Rules, ISO, Geneva (1992).
[14] D.C. Feldmeier, 'A framework of architectural concepts for high-speed communication systems', IEEE J. Selected Areas in Comm., Vol 11 No 4 (May 1993) pp 480-488.
[15] M. Zitterbart, 'High-speed transport components', IEEE Network Mag. (January 1991) pp 54-63.
[16] G. Poo and W. Ang, 'Cut-through buffer management technique for OSI protocol stack', Comput. Comm., Vol 14 No 3 (April 1991) pp 166-177.
[17] INMOS Ltd., The Transputer Family, INMOS Ltd., UK (1987).
[18] Parsec Developments, Par.C System: User's Manual and Reference, Parsec Developments, Netherlands (1990).
[19] Perihelion Software Ltd., The Helios Parallel Operating System, Prentice-Hall, Hemel Hempstead, UK (1991).

Kilnam Chon received his MS in computer science and PhD in systems engineering from UCLA in 1967 and 1974, respectively. He has special interests in system architecture, including computer networking, distributed processing and information systems. He has worked as a principal investigator of a national project on workstation development, intelligent processing computer development and the information highway. He serves as a Professor in the Computer Science Department at KAIST. He is on leave from KAIST and currently working as a visiting professor at Stanford University.

Sunwan Choi received the BS in computer science from Hong-Ik University in 1984 and the MS in computer science from the Korea Advanced Institute of Science and Technology in 1986. Since 1990 he has been a PhD student at KAIST. He currently works at ETRI as a technical member of research staff. His research interests include high performance communication architecture, high-speed protocol implementation and protocol engineering.