Comput. & Indus. Engng Vol. 7, No. 3, pp. 209-216, 1983. Printed in Great Britain.
0360-8352/83/030209-08$03.00/0 Pergamon Press Ltd.
NETWORK PROTOCOL DESIGN: MODEL RELATIONSHIPS, HEURISTIC FEATURE SPECIFICATION AND ANALYTICAL EXTENSIONS

ROBERT P. DAVIS
Department of Industrial Engineering and Operations Research, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, U.S.A.

(Received in revised form July 1982)
Abstract--This paper presents a brief description of the network protocol design problem, and a mathematical model which has been developed to assist in the specification of protocol features. Heuristic solutions to the feature specification problem are described in the context of a design process, and computational results from such heuristics are presented. Finally, analytical issues appropriate to the feature specification model, and their utility, are identified.

INTRODUCTION
For a number of years researchers have been interested in problems associated with the design and operation of communication networks. With respect to both voice and data transmission networks, this research has been directed primarily at resolving problems in transmission routing and switching, line capacity planning and concentrator location analysis [18]. The application of operations research techniques to the modelling and analysis of both local and distributed communications networks has not only led to improvements in systems design and operation but, more importantly, has led to a greater understanding of the interactions of network components: their capacities, limitations and resource requirements.

Recently, researchers at the National Bureau of Standards have been engaged in an effort to provide a definitive basis for the design and specification of protocols for host-to-host computer network communications [14], these protocols being common to both distributed and local networks. Specifically, the emphasis of this research centers on distributed networks of independent (host) computers which interact through commonly accessed communication links (e.g. telephone or satellite circuits). Unlike local networks (i.e. networks in close geographic proximity in which complete control is exercised over both computing hardware and transmission media), the diversity of host characteristics (hardware as well as software) in a distributed network requires both an awareness of and a subscription to standard procedures for initiating and maintaining intercommunication between host computers (i.e. protocols).

The development of computer network protocols is of central importance to both the effectiveness and efficiency of local and global communications systems. The need for greater standardization of not only protocols but also the procedures employed in their development is internationally accepted [8]. This same rationale applies to both global (e.g.
ARPA Net) as well as local (e.g. Computer Integrated Manufacturing System) computer networks. In particular, protocol standardization enhances the compatibility of interfacing computing hardware and software to facilitate information interchange. Further, standard communication protocols permit a greater transportability of both operating and control system software and algorithmic processes.

This last point is of particular significance to the continued development of automated manufacturing systems. With a tractable, normative basis for the evolutionary (and interactive) design of optimal local network protocols, an enhanced reliability and transportability of control systems and control software will exist. This is an essential step toward the realization of reproducible computer integrated manufacturing systems. The majority of the cost incurred in the development and operation of these systems is software related, and it has been estimated that each system currently in operation has its own unique operating and control system design foundation. Unifying a basis for protocol design is the first step toward providing hardware and software compatibility among these systems and their related components.
The network protocol feature specification problem is essentially one of defining a set of communication protocols within each layer of a general host computer architecture. These protocols would be standardized; consequently, each host computer in the distributed network would conform to a consistent set of peer-layer-to-peer-layer communication procedures. A seven layer architecture is proposed (see Ref. [15]) with each layer performing a definable range of services. Briefly, these services can be identified as follows:
• Application--support of application processes (e.g. file transfer, common command language).
• Presentation--translation, transformation and structuring of data.
• Session--interconnection of presentations and control of data exchange (synchronization, delimiting).
• Transport--transfer of data between session entities (selection and reliability).
• Network--functional and procedural exchange of data units between transport entities (routing and switching).
• Data Link--functional and procedural means for establishing, maintaining and releasing data links between network entities.
• Physical--functional and procedural characteristics to establish, maintain and release physical connections (mechanical, electrical) between link entities.
For a more detailed and thorough discussion of communication protocols, service features and network architectures see Refs. [1, 16, 19].

FEATURE SPECIFICATION MODEL
Based on an original study by the Institute for Computer Sciences and Technology (NBS), a zero-one integer linear program has been proposed for use in specifying the features inherent in a given protocol. The basic structure of this model is given in Fig. 1 and represents a fundamental resource allocation decision scenario. The nature of the features is described by the column headings and the nature of the limited resources by the row headings. It should be noted that all resources can be converted to an equivalent base (e.g. dollars). For example, such
[Fig. 1 (tableau, omitted): the columns are enhancement features x_1, x_2, ..., x_N with feature values c_1, c_2, ..., c_N; the rows are resources with designer/user specified resource allotments B_1, B_2, ..., B_M; each entry a_ij gives the cost of feature j in terms of resource i. The resource rows comprise Development Effort (design time, design cost, implementation time, implementation cost, checkout time, checkout cost), Operational Cost (documentation, training, maintenance, installation), Space Complexity (resident code, swappable code, resident table space, buffer/queue space, temporary work storage, control block size per connection), Time Complexity (execution time, wait time, recovery) and Network Burden.]

Fig. 1. Linear programming model for protocol development (from Ref. [13]).
resources as development time and execution time have a value measurable in monetary terms; and further, each feature, if implemented, can be said to consume a specific proportion of each resource. These issues will be discussed later in this paper.

In essence, it is assumed that technological and operational requirements exist to define a set of essential features which a particular protocol must exhibit. The model is intended to aid the protocol design community (i.e. network subscribers) in establishing a set of enhancement features for a given protocol. Sets of these enhancement features, along with the kernel (essential) protocol, form a family of protocols associated with host-to-host communication in a given layer of the network architecture. Although the intent of the model is to define these subsets of feature enhancements, the reader should recognize that such a model can also be of value (as a design evaluation tool) in defining the specific set of features which constitute the protocol kernel. In general, the higher the layer in the architecture, the less well defined the technological requirements are. Consequently, the model framework would have more utility in defining features at the higher levels (i.e. layers 4-7).

Mathematically, the feature specification problem is defined by:

Max:  F(x) = \sum_{j=1}^{N} c_j x_j                                  (1)

subject to:  \sum_{j=1}^{N} a_{ij} x_j \le b_i   (i = 1, ..., M)     (2)

x_j = 1, if the feature is included; 0, if not included              (3)

and further,

a_{ij}, b_i, c_j \ge 0   (i = 1, ..., M; j = 1, ..., N).             (4)
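To make the model (1)-(4) concrete, the following sketch enumerates all feature subsets for a tiny instance and reports the optimum. The data are purely hypothetical (two resources, four candidate features), not drawn from Ref. [14]; the point is only to show the zero-one structure of the problem:

```python
from itertools import product

# Hypothetical tiny instance of the feature specification model (1)-(4).
c = [8, 11, 6, 4]                  # c_j: value of including feature j
A = [[5, 7, 4, 3],                 # a_ij: units of resource i consumed by feature j
     [1, 6, 3, 2]]
b = [14, 7]                        # b_i: resource availabilities

best_F, best_x = -1, None
for x in product((0, 1), repeat=len(c)):            # all 2^N zero-one vectors
    feasible = all(sum(A[i][j] * x[j] for j in range(len(c))) <= b[i]
                   for i in range(len(b)))
    F = sum(c[j] * x[j] for j in range(len(c)))     # objective (1)
    if feasible and F > best_F:
        best_F, best_x = F, x

print(best_x, best_F)               # optimum feature subset and its total value
```

Complete enumeration is only viable for such toy instances (2^N candidate subsets); the heuristics described below are what make realistic problem sizes tractable.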
In the above model,

b_i denotes the maximum quantity of resource i available,
a_{ij} denotes the per unit quantity of resource i consumed by feature j if implemented,
c_j denotes the value associated with having feature j present in the protocol.

The effectiveness coefficients (c_j) are assessments of the utility of having feature j present in the protocol being designed (see Ref. [14]). As such, their valuation may not be consistent among individuals involved in protocol specification. Similarly, variability may exist in the resource consumption characteristics (a_{ij}) and resource availabilities (b_i) among network subscribers involved in protocol design and specification. In this context, the above mathematical model is a useful tool in answering "what if" questions associated with the implementation of prescribed subsets of protocol features.

Given the iterative nature of design and specification processes, it is also useful to obtain heuristic solutions to the above problem for given parameter values. Such solutions can provide benchmark information regarding the total utility of a set of features given parametric values for individual valuations. Further, a heuristic result can be used to provide an initial solution and bound value to an optimization algorithm, once a consistent set of parameter values is obtained, thereby expediting the determination of an optimum set of features. In the following section, two specific forms of heuristic solution technique will be described. Their use in resolving protocol feature specification problems lies in their ability to produce good approximate solutions to the above described mathematical program in a small amount of computing time, thereby facilitating the iterative design and specification process.

Heuristic procedures for feature specification

The preceding mathematical model exhibits two fundamental characteristics which form the basis for the heuristics to be described. First, since every c_j (effectiveness coefficient) in the
objective function (1) is non-negative, the most desirable situation would be one in which it is possible to have every decision variable "active" in the final solution (i.e. x_j = 1, j = 1, ..., N). Consequently, one should attempt to find a solution in which only the least desirable decision variables have been made "inactive" (i.e. x_j = 0). One interpretation of what is meant by "desirable" will be explained shortly.

Second, in many planning and design problems, it is often the case that individual resource limitations (2) can be related to a common base. As such, these individual restrictions can be aggregated into a single restriction which can be used in place of the original set. Such an aggregate, replacement constraint is called a "surrogate". An example of such an aggregation can be made in a production system context. Given that a new product is being planned for a manufacturing system, it is initially conceived that there are two limiting resources: raw material and machine time. However, the raw material constraint can be converted to an equivalent monetary limitation given raw material costs; similarly, machine time can be converted to a monetary limitation given the cost of machine operation and fixed charges. Once these two constraints are converted to the same monetary base, they can be added together to form an aggregate budgetary constraint which can replace the two original constraints.

One should not conclude that such a surrogate constraint is equivalent to its original constraint set. Rather, it represents a "broad brush" approach to generalizing the overall resource consumption characteristics of the decision variables in a planning and design model.
Particularly in the context of an iterative design process, such an aggregation of resources (with a common base) leads directly to the notion of a potential re-distribution of these resources based on the results of an initial assessment of their individually limiting effect on the decision process (i.e. a solution to the mathematical program when the original constraints are replaced by the single surrogate constraint). Mathematically, the form of surrogate constraint employed by the heuristics described herein is as follows. Let

\hat{a}_j = \sum_{i=1}^{M} a_{ij}   (j = 1, ..., N)

and

\hat{b} = \sum_{i=1}^{M} b_i.

The basic form of surrogate constraint employed to replace the system of inequalities given in (2) is

\sum_{j=1}^{N} \hat{a}_j x_j \le \hat{b}.                            (5)
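As a sketch (reusing the hypothetical two-resource, four-feature data from above, with both resources assumed to already be on a common monetary base), the aggregation in (5) reduces to column and row sums of the constraint data:

```python
# Hypothetical constraint data: two resources (rows), four features (columns),
# assumed to already be expressed on a common (e.g. monetary) base.
A = [[5, 7, 4, 3],
     [1, 6, 3, 2]]
b = [14, 7]

# a_hat_j = sum over i of a_ij : surrogate consumption of feature j
a_hat = [sum(row[j] for row in A) for j in range(len(A[0]))]
# b_hat  = sum over i of b_i   : surrogate resource availability
b_hat = sum(b)

print(a_hat, b_hat)
```

Note that a vector x satisfying all original constraints (2) necessarily satisfies (5), but not conversely, which is exactly the "broad brush" caveat noted above.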
It is now possible to return to the idea of "desirability" with respect to the decision variables. A weighting scheme is employed to rate these variables according to their effectiveness relative to the quantity of surrogate resource which they consume. Let

r_j = c_j / \hat{a}_j   (j = 1, ..., N).                             (6)

The decision variables can now be ordered according to this desirability (or "relative effectiveness") measure. In effect, if a decision variable must be made inactive (x_k = 0), then it should be the variable whose relative utility is the lowest (i.e. it is the least desirable decision; r_k \le r_m, m = 1, ..., N, m \ne k).

This criterion measure will form the basis for the procedures described below. Conceptually, these heuristics may be thought of as "serial partitioning algorithms". In the first, all decision variables begin with a value of "1"; then a sequence of these variables is set to "0", one at a time in order of lowest relative effectiveness, until a solution is found which satisfies the original constraints (i.e. an exclusion algorithm). In the second, a rationale which is just the
opposite of this is employed. That is, all variables begin with a value of "0"; then, they are made active in sequence according to their relative effectiveness until no feasible additions to the active list can be made (i.e. an inclusion algorithm). General information flow charts for an inclusion (add) and an exclusion (drop) heuristic are given in Figs. 2 and 3 respectively. From a computational perspective, both of these procedures should provide an approximate optimum in a very small amount of execution time as compared to a direct optimum seeking method. To illustrate the computational tractability of these procedures, a number of test problems, both standard [21] and randomly generated, were solved and compared.

[Fig. 2 (flow chart, omitted): the inclusion heuristic initializes F = 0, forms the surrogate constraint, computes the relative effectiveness ratios, sorts the variable indices in descending order of relative effectiveness, and activates variables one at a time (updating F = F + c_j) until the constraint is violated; the original constraints are then checked and the solution x* and F* = F are reported.]

Fig. 2. Information flow chart for inclusion heuristic.
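The inclusion heuristic of Fig. 2 can be sketched as follows. This is a minimal Python reconstruction (not the author's FORTRAN implementation): variables are tried in decreasing order of r_j = c_j / \hat{a}_j and, as a simplification, each candidate addition is checked directly against the original constraints; ties are broken by the sort order:

```python
def inclusion_heuristic(c, A, b):
    """Add variables in decreasing order of relative effectiveness (6)
    until no feasible additions to the active list can be made."""
    N, M = len(c), len(b)
    # surrogate consumption a_hat_j, as in (5)
    a_hat = [sum(A[i][j] for i in range(M)) for j in range(N)]
    order = sorted(range(N), key=lambda j: c[j] / a_hat[j], reverse=True)
    x = [0] * N
    use = [0] * M                 # resource consumed by the active set so far
    F = 0
    for j in order:
        # activate x_j only if every original constraint (2) remains satisfied
        if all(use[i] + A[i][j] <= b[i] for i in range(M)):
            x[j] = 1
            F += c[j]
            for i in range(M):
                use[i] += A[i][j]
    return x, F
```

On the hypothetical instance used earlier (c = [8, 11, 6, 4], b = [14, 7]) this yields F = 18 against a true optimum of 19, illustrating the approximate nature of the procedure.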
Computational experience

The preceding algorithms were implemented as FORTRAN programs executed on an IBM 370/158. Initially, their robustness was investigated by employing them to solve nine
[Fig. 3 (flow chart, omitted): the exclusion heuristic sets x_j = 1 (j = 1, ..., N) and F(x) = \sum_{j=1}^{N} c_j, then forms the surrogate constraint. While the surrogate constraint is not satisfied, it computes the r_j values for the variables with x_j \ne 0, finds the minimum r_k and its associated index k, sets x_k = 0 and updates F(x) = F(x) - c_k. Original constraints which become satisfied are omitted from further consideration, and the procedure stops when all original constraints are satisfied, reporting the solution x_k (k = 1, ..., N) and F(x).]

Fig. 3. Information flow chart for exclusion heuristic.
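A corresponding sketch of the exclusion heuristic of Fig. 3 follows (again an illustrative Python reconstruction, not the original FORTRAN). The "adaptive" aspect is that the surrogate is rebuilt at each step from the still-violated constraints only, so the ratios r_j reflect only the resources that remain binding:

```python
def exclusion_heuristic(c, A, b):
    """Drop the least effective active variable (minimum r_k) until every
    original constraint is satisfied; the surrogate is recomputed over
    the still-violated constraints only (the adaptive form of Fig. 3)."""
    N, M = len(c), len(b)
    x = [1] * N                   # all variables begin active
    F = sum(c)

    def load(i):                  # resource i consumed by the active set
        return sum(A[i][j] * x[j] for j in range(N))

    while True:
        violated = [i for i in range(M) if load(i) > b[i]]
        if not violated:
            return x, F           # all original constraints satisfied
        # surrogate built from the violated constraints only
        a_hat = [sum(A[i][j] for i in violated) for j in range(N)]
        active = [j for j in range(N) if x[j] == 1 and a_hat[j] > 0]
        k = min(active, key=lambda j: c[j] / a_hat[j])
        x[k] = 0                  # deactivate least desirable variable
        F -= c[k]
```

Since a_ij >= 0 and some active variable must contribute to any violated constraint, the minimum in the loop is always well defined.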
Table 1. Initial test problem [21] results: percent (%) error from the optimum

Problem No.     Inclusion Algorithm     Exclusion Algorithm
1               39                      39
2               0                       0
3               9                       9
4               16                      16
5               0                       0
6               1                       1
7               4                       4
8               11                      11
9               0                       0
Average Error   9%                      9%

Note: All execution times (independent of input/output) were less than 0.01 seconds for both procedures.
standard resource allocation test problems. Table 1 contains the results of this testing and demonstrates that, in general, these procedures perform well in approximating an optimum solution. However, these test problems are very small (i.e. 10 variables and 1 constraint) and give no real basis for comparing the computational efficiency of the inclusion and exclusion algorithms when applied to large problems. To determine this, six additional large scale problems were randomly generated using a variant of the procedure employed by Hammer and
Table 2. Additional test problem results

Number of     Number of       Execution Time (seconds)*
Variables     Constraints     Inclusion Algorithm    Exclusion Algorithm
50            20              0.01                   0.03(1)
100           20              0.05                   0.09
150           20              0.11                   0.19
50            50              0.03                   0.05
100           50              0.08                   0.13
150           50              0.15                   0.25

Notes:
*: Execution time is independent of input/output time.
1: Solution by a generalized, optimum seeking implicit enumeration algorithm (with backtracking) required 2038.36 seconds of execution time. The inclusion heuristic gave a solution which was 53% of the optimum while the exclusion heuristic produced a solution which was 70% of the optimum. In all cases the exclusion algorithm produced a substantially better bound for the objective function.
Peled [10]. The results for these problems are found in Table 2. The most striking aspect of these results is the extremely low execution time required for both algorithms. Comparing the two heuristics, it should be noted that, in each case, the adaptive nature of the surrogate constraint form used by the exclusion algorithm enabled a superior bound value to be determined (improvements in bound value varied between 8.2 and 92.7%). Consequently, the exclusion heuristic is judged to be the more desirable approximation algorithm.

The preceding heuristics can be used to initialize an optimum seeking, enumeration (tree search) algorithm in one of two ways: (1) by providing an "initial bound value" for the algorithm to employ in "pruning" branches which cannot contain the optimum; (2) by providing both an initial bound value and an initial solution vector from which to begin a search for the optimum. The latter is usually associated with "backtracking" algorithms, so called because the search procedure backtracks up the tree of solution nodes, beginning with the initial solution, and either prunes off branches or "fathoms" them to obtain an improved solution (continuing in this manner until the optimum is obtained). A number of strategies can be employed in a tree-search optimization framework; the general references that follow contain a number of these. However, when a good initial solution is available from which to start a search, backtracking algorithms have been found to be quite efficient. Further, should the search be aborted before completion, a good solution is always available.
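The first initialization strategy can be sketched as follows (an illustrative depth-first enumeration, not the generalized backtracking code referenced in Table 2): the enumeration starts with a heuristic objective value as its incumbent and prunes any node whose optimistic bound cannot beat it.

```python
def enumerate_best(c, A, b, incumbent=-1):
    """Depth-first implicit enumeration of the zero-one program (1)-(4).
    `incumbent` is an initial lower bound on the objective, e.g. the
    value F returned by the exclusion heuristic."""
    N, M = len(c), len(b)
    best = incumbent

    def search(j, use, F):
        nonlocal best
        if F + sum(c[j:]) <= best:         # optimistic bound: add all remaining c_j
            return                         # pruned: branch cannot beat incumbent
        if j == N:
            best = F                       # improved complete solution found
            return
        new_use = [use[i] + A[i][j] for i in range(M)]
        if all(new_use[i] <= b[i] for i in range(M)):
            search(j + 1, new_use, F + c[j])   # branch with x_j = 1
        search(j + 1, use, F)                  # branch with x_j = 0

    search(0, [0] * M, 0)
    return best
```

On the small hypothetical instance used throughout, seeding the search with the exclusion heuristic's value (14) lets the enumeration discard dominated branches immediately while still recovering the optimum (19); the effect is far more pronounced at the problem sizes of Table 2.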
Analytical extensions

The preceding problem context and associated mathematical model are typical of many decision processes in which strictly limited resources must be allocated among competing alternative activities. In particular, the above problem presents a scenario in which information on the sensitivity of solution decisions to variations in model parameters is particularly important. In addition, the nature of the protocol design process is such that lower layer protocols may be virtually prescribed (by communications technology) while higher layer protocols may have a rather vague basis (since the spectrum of applications processes is virtually uncatalogued). This leads to a "de facto" decomposition in addressing protocol design issues among layers, even though the protocols across all layers are competing for the same resources. These characteristics, coupled with the fact that the resources to be allocated among protocol features are brought to a common base, lead directly to a consideration of specific approaches to the development of analytical bases for:
• decomposition,
• sensitivity analysis, and
• model relaxation and restriction,
all of which must be both consistent with the zero-one decision framework and lead to algorithmic procedures which are computationally tractable. There exists very little research upon which to base strong criteria for a meaningful decomposition of large scale zero-one programs. Further, only a limited research basis exists for the development of sensitivity information from integer programs in any form. Finally, even though a rationale may exist for specific approaches to problem relaxation or restriction (such as the surrogate constraint used above), there is no definitive basis for evaluating, a priori, the practical utility of these approaches. These are rather challenging issues in integer programming which have an even greater significance when one realizes that obtaining an initial solution to most practical pseudo-Boolean programs is a computationally significant task. Expediting this task and, subsequently, evaluating the sensitivity of a solution once obtained are of significant practical, as well as theoretical, value.

SUMMARY

This paper has presented a description of a mathematical model which has been developed to assist in the specification of network protocol features. This is a significant problem area, both with respect to its utility in providing enhanced communications in distributed networks and as a definitive mathematical programming structure which presents fertile ground for the development of solution algorithms and post-optimality analysis techniques.

Acknowledgement--This work was supported in part through funds from the National Bureau of Standards, Computer and Network Architecture Division, under Project Number 651-0072.
REFERENCES
1. D. W. Davies et al., Computer Networks and Their Protocols. Wiley-Interscience, New York (1979).
2. R. P. Davis & J. E. Shamblin, A time-shared ILP algorithm for strictly limited resource allocation problems. CoED Trans. VI(2), 23-32 (1974).
3. R. P. Davis & M. P. Terrell, An approximation technique for pseudo-Boolean maximization problems. AIIE Trans. 8(3), 365-368 (1976).
4. D. R. Doll, Data Communications--Facilities, Networks and Systems Design. Wiley-Interscience, New York (1978).
5. R. S. Garfinkel & G. L. Nemhauser, Integer Programming. Wiley-Interscience, New York (1972).
6. A. M. Geoffrion, An improved implicit enumeration approach for integer programming. Operns. Res. 17, 437-454 (1969).
7. A. M. Geoffrion & R. Nauss, Parametric and postoptimality analysis in integer linear programming. Management Science 23(5), 453-466 (1977).
8. F. Glover, Surrogate constraints. Operns. Res. 16, 741-749 (1968).
9. F. Granot & P. L. Hammer, On the use of Boolean functions in 0-1 programming. Technion Mimeograph Series on Operations Research, Statistics and Economics, No. 70, Haifa, Israel (1970).
10. P. L. Hammer & U. N. Peled, On the maximization of a pseudo-Boolean function. J. ACM 19, 265-282 (1972).
11. P. L. Hammer & S. Rudeanu, Boolean Methods in Operations Research and Related Areas. Springer-Verlag, Berlin (1968).
12. P. L. Hammer & E. Shlifer, Applications of pseudo-Boolean methods to economic problems. Technion Mimeograph Series on Operations Research, Statistics and Economics, No. 31, Haifa, Israel (1969).
13. J. F. Heafner & F. H. Nielsen, A linear programming model for optimal computer network protocol design. Proc. NCC, Vol. 49, pp. 855-861 (May 1980).
14. J. F. Heafner, F. H. Nielsen & M. W. Shiveley, Toward the extraction of service features from definitive documents on high-level network protocols. Proc. NCC, Vol. 49, pp. 863-870 (May 1980).
15. International Organization for Standardization, Reference Model of Open Systems Interconnection, ISO/TC97/SC16 Working Document (1979).
16. National Computing Centre, Introducing Communications Protocols. NCC Publications, Manchester, England (1978).
17. T. L. Saaty, Optimization in Integers and Related Extremal Problems. McGraw-Hill, New York (1970).
18. M. Schwartz, Computer Communication Network Design and Analysis. Prentice-Hall, Englewood Cliffs, New Jersey (1977).
19. A. J. Swan, Data Communications Protocols. NCC Publications, Manchester, England (1979).
20. H. A. Taha, Integer Programming. Academic Press, London (1975).
21. C. A. Trauth & R. E. Woolsey, Integer linear programming: a study in computational efficiency. Management Science 15, 481-493 (1969).
22. S. Zionts, Linear and Integer Programming. Prentice-Hall, Englewood Cliffs, New Jersey (1974).