Data communication channel capacity for multiple-buffered concentrators

Gerald Herskowitz and Stephen Shapiro present a method of minimizing the channel capacity needed to transmit data from multiple-buffered communications-network concentrators.
An analytical procedure is presented to evaluate the capacity required for channels used to transmit data from multiple-buffered concentrators in data communications networks. Three message assembly procedures are compared. The parameters considered in the analysis are queuing discipline, buffer size, and computational complexity. The influence of these factors on channel capacity is examined. It is shown that the required channel capacity C for a first-come, first-served message assembly procedure satisfies $C \le 2C^*$, where C* is the lowest possible channel capacity, achieved when the incoming messages are completely packed. Other, more efficient, procedures are examined and compared.
Department of Electrical Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA. The above work was supported in part by the USA Electronics Command, Contract DAAB077CA120. This work was supported in part by the US National Science Foundation under Grant MCS 76-08176.

INTRODUCTION

In data communications networks, such as those used in conjunction with time-sharing or business-oriented systems, messages are combined before transmission to use data channels more efficiently [1,2]. Combining procedures vary from multiplexing and line switching to more sophisticated statistical message- and packet-switching concentration techniques [3]. In the latter method, which requires a communications processor [4], messages are queued in a buffer until enough data is available to make efficient use of the outgoing channel capacity [5]. The costs of communications processors have decreased greatly with the introduction of microprocessors and storage elements using large-scale integration technology [6,7], resulting in the availability of relatively inexpensive 'intelligent' network processors [8] to perform distributed processing. This development makes it feasible to study the use of network processors to improve the operation of data communications networks through additional storage and processing in concentrators close to the source of the data.

The procedures described in this paper require multiple buffering [6] to allow almost continuous use of the data communications channel. Polling techniques [1,6] are used for collecting submessages from input terminals or devices. These are collected in primary buffers at the data source, and a message assembly buffer is used to implement the procedures described in the paper before transmission to the data channel. In this way, efficient use is made of the channel, which allows transmission from the message assembly buffer without significant delay. A tradeoff is thus provided between the size of the buffer and the output channel capacity. The block-size requirements to accommodate expected traffic for specified overflow-rate probabilities have been studied extensively [9-12].

The procedures described in this paper were originally developed for dynamic memory storage allocation [9] and are based on bin-packing theory [16,21]. These procedures are applied here to the assembly of data messages into blocks, using similar strategies. The results are expressed as comparisons with the lowest possible channel capacity, achieved when incoming messages are completely packed.
DATA TRANSMISSION BLOCK PROPERTIES

Data blocks having the structure shown in Figure 1 are assembled as described by Herskowitz [6] and Herskowitz and Schoen [7]. Submessages are packed into the region of size M bits. H and E are the number of bits required for the header and for error checking. In this data block transmission method, L and M are of fixed size. The submessages are all smaller than M and are of varying size. After the submessages are placed in the data block, a wasted space W often results.
[Figure 1. Typical data block (DB) of length L. Legend: submessages; wasted space.]
0140-3664/78/0104-0196 $01.00 © IPC Business Press
computer communications
Let C denote the channel capacity of the transmission system. Assuming that the channel is fully loaded and that there are no gaps between the data blocks (DBs), one sees that a portion of the channel capacity is not effectively used, as it is required for sending W.
CHANNEL CAPACITIES AND RELATIONSHIP TO BIN PACKING

The procedure used to select the submessages that are to be placed in each DB determines the amount of waste W. As shown above, W influences C. As one wants to make the channel capacity as small as possible, W should be minimized over the sequence of data blocks. Another viewpoint is to start with a set of submessages to be packed into the DBs; the smallest number of DBs, which yields the lowest channel capacity, is then required. Let C* be the optimum channel capacity. As previously outlined, packing procedures that minimize C are sought. Let W* be the average waste associated with the best possible assignment of submessages to DBs, so that W* is associated with C*. It may easily be shown that

$$ C = kC^* \tag{1} $$

where

$$ k = \frac{1 - W^*/M}{1 - W/M} $$
As W increases, C increases. As $W^* \le W$ by definition, $k \ge 1$. The smaller k is, the more closely the optimum channel capacity is approached and the better the performance. Packing submessages into DBs so as to minimize the waste, and hence the number of DBs used, is equivalent to the bin-packing problem.
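The factor k of equation (1) can be evaluated directly once the average wastes are known. The sketch below is only an illustration: the function name and the numerical values of M, W and W* are hypothetical and are not taken from the paper. The reasoning behind the formula is that, on a fully loaded channel, each DB carries M - W useful bits, so for a fixed payload rate the required capacity varies as 1/(M - W).

```python
def capacity_factor(avg_waste: float, opt_waste: float, payload_bits: float) -> float:
    """Return k = (1 - W*/M) / (1 - W/M), the factor in C = k C* (equation (1)).

    avg_waste    -- W, average waste per DB for the procedure in use (bits)
    opt_waste    -- W*, average waste per DB for the best possible packing (bits)
    payload_bits -- M, size of the submessage region of a DB (bits)
    """
    if not 0 <= opt_waste <= avg_waste < payload_bits:
        raise ValueError("expected 0 <= W* <= W < M")
    return (1 - opt_waste / payload_bits) / (1 - avg_waste / payload_bits)


# Hypothetical values: M = 1000 bits, W = 200 bits, W* = 50 bits.
k = capacity_factor(avg_waste=200, opt_waste=50, payload_bits=1000)
print(round(k, 4))  # prints 1.1875
```

With these assumed numbers the procedure would need about 19% more channel capacity than the optimum packing; when W = W* the factor is exactly 1.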
Relationship to bin packing

The bin-packing problem consists of the following: given a set of bins of height h and segments of length $a_i \in (0, h]$, $i = 1, 2, \ldots, n$, find the assignment of segments to bins that minimizes the number of bins used, subject to the condition that the sum of the lengths of the segments in each bin is at most h. The problem is related to the knapsack problem. As the latter is NP-complete [14,15], implying essentially enumerative solutions, heuristic procedures are usually applied. These heuristic procedures are intended to be fast and simple. Many problems can be modelled by bin packing, including the assignment of segments to tracks on discs to minimize the number of tracks used, prepaging, packing of variable-length strings into fixed-length words, and scheduling theory [16,17]. There are also applications in operations research, such as the cutting stock problem [18,19].

In terms of the data blocks, M = h, and the segments correspond to the submessages. Let b* equal the number of DBs used for an optimum packing, and b equal the number of DBs used by a heuristic assignment of submessages to DBs. Then

$$ b^* \le b \le f(b^*) \tag{2} $$
in analogy with (1). The smaller f(b*) is, the better the heuristic procedure. Three packing procedures are as follows:

• The submessages are loaded into the DBs in order of arrival, maintaining a first-come, first-served priority discipline.
• A moderate-size primary buffer is used to load each DB, and first-come, first-served is not maintained.
• A very large primary buffer is used for loading the DB, and first-come, first-served is again not maintained.

It will be shown below that each of these methods yields a successively smaller required channel capacity. Each method has a successively greater computational complexity. Thus it will be seen that required channel capacity is lowered at the expense of primary buffer size, computational complexity and priority ordering.
Heuristic packing procedures

The procedures used are called next fit (NF), first fit decreasing hopper (FFDH), and first fit decreasing (FFD) respectively in the bin-packing literature [16,20,21]. NF and FFDH are online procedures.

In NF, submessages are taken from the primary buffer in order of arrival and placed in the current DB. When a submessage does not fit into the DB being filled, the DB is called complete and is transmitted, and the submessage is placed in a new DB.

In FFDH, one assumes that the primary buffer holds submessages and, by means of handshaking with the inputs, is always full. The submessages are ordered by size, and each DB is filled as in NF, starting with the largest submessage. When the DB is full, or after a complete pass through the primary memory inserting all submessages that fit, the DB is transmitted. After the transmission, the primary memory is refilled with additional submessages from the sources.

In FFD, in this context, a large number of messages is taken into the primary store. These are ordered as above. A set of DBs is filled concurrently as follows: suppose the DBs are numbered 1, 2, .... Beginning with the longest segment, each segment is inserted in the bin with the lowest index in which it fits. When the primary store is emptied, the DBs are sent.
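The three procedures just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `hopper_size` parameter (the number of submessages held in the primary buffer) and the sample submessage sizes are all hypothetical, and every submessage is assumed to be no larger than M.

```python
from typing import List, Sequence


def next_fit(sizes: Sequence[float], M: float) -> List[List[float]]:
    """NF: fill the current DB in arrival order; open a new DB when one does not fit."""
    blocks, current, used = [], [], 0
    for s in sizes:
        if s > M:
            raise ValueError("submessage larger than M")
        if used + s > M:            # current DB is complete and is transmitted
            blocks.append(current)
            current, used = [], 0
        current.append(s)
        used += s
    if current:
        blocks.append(current)
    return blocks


def first_fit_decreasing_hopper(sizes: Sequence[float], M: float,
                                hopper_size: int) -> List[List[float]]:
    """FFDH: keep a primary buffer topped up, sort it, fill one DB per pass."""
    arrivals = iter(sizes)
    hopper, blocks, exhausted = [], [], False
    while True:
        while len(hopper) < hopper_size and not exhausted:   # refill from the sources
            try:
                hopper.append(next(arrivals))
            except StopIteration:
                exhausted = True
        if not hopper:
            return blocks
        hopper.sort(reverse=True)                            # largest submessage first
        block, used, left_over = [], 0, []
        for s in hopper:                                     # one complete pass
            if used + s <= M:
                block.append(s)
                used += s
            else:
                left_over.append(s)
        if not block:
            raise ValueError("submessage larger than M")
        blocks.append(block)                                 # DB is transmitted
        hopper = left_over


def first_fit_decreasing(sizes: Sequence[float], M: float) -> List[List[float]]:
    """FFD: sort everything, put each segment in the lowest-indexed DB where it fits."""
    blocks, free = [], []
    for s in sorted(sizes, reverse=True):
        if s > M:
            raise ValueError("submessage larger than M")
        for i, room in enumerate(free):
            if s <= room:
                blocks[i].append(s)
                free[i] -= s
                break
        else:                                                # fits nowhere: open a new DB
            blocks.append([s])
            free.append(M - s)
    return blocks


sizes = [6, 3, 5, 4, 2, 7, 3]          # hypothetical submessage sizes, M = 10
print(len(next_fit(sizes, 10)),
      len(first_fit_decreasing_hopper(sizes, 10, hopper_size=4)),
      len(first_fit_decreasing(sizes, 10)))  # prints 4 3 3
```

For this small example the submessages total 30 units, so three DBs is the optimum; NF uses four because it never returns to an earlier DB.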
Channel capacities associated with packing procedures

Procedure A

No statistics being assumed for submessage sizes, and assuming a steady state,

$$ C \le k(x)\,C^* $$

where

$$ k(x) = \begin{cases} 2, & x \in [1, 2) \\ 1 + (x - 1)^{-1}, & x \ge 2 \end{cases} \tag{3} $$

and $x = (\text{maximum submessage size}/M)^{-1}$. That is, the channel capacity required for this procedure is at most twice the optimum. This result follows from Johnson's work [20] on bin packing relating b* to b. Consider a time T. For an optimum packing,

$$ \frac{C^* T}{L} = b^* \tag{4} $$

where b* is the number of DBs used. Similarly,

$$ \frac{C T}{L} = b \tag{5} $$

for the same set of submessages. Johnson's results relate b and b*.

Procedure B

For exponentially distributed submessage lengths and a steady state, with E(·) denoting the expectation,

$$ C^* \le C \simeq k(\rho)\,E(C^*), \qquad \rho \ge 5 \tag{6} $$

where

$$ k(\rho) = \rho(\rho - 1)^{-1}, \qquad \rho = M/r $$

and r is the average submessage size. The statistical performance is illustrated in Table 1. Thus, for example, at ρ = 10, the required C is at most 1.11C*. This result follows from Shapiro [21], where it is shown that, for NF,

$$ E(b^* \mid b)/b = 1 - \rho^{-1}, \qquad \rho \ge 5 $$

Table 1  Statistical performance

ρ         k(ρ)
5         1.25
10        1.11
15        1.07
20        1.05
25        1.04
50        1.02

Procedure C

For exponentially distributed submessage lengths and a steady state,

$$ C = k(\rho, s)\,E(C^*) \tag{7} $$

where s is the number of submessages held in the primary buffer. k(ρ, s) may only be found numerically. Values are given in Table 2. These results are based on Shapiro [21].

Table 2  Values for k(ρ, s) for procedure C

ρ \ s     5       10      15      20      25      30      35
3         1.11    1.02    1.0     1.0     1.0     1.0     1.0
5         1.20  | 1.03    1.0     1.0     1.0     1.0     1.0
10        1.92    1.16  | 1.06    1.03    1.0     1.0     1.0
15        -       1.52    1.23    1.14  | 1.06    1.03    1.0

Entries to the right of | lie in the region of improvement over procedure A.

Procedure D

If no statistics are assumed for the submessage sizes, and the primary memory is assumed to be very large,

$$ C \le kC^* \tag{8} $$

where

$$ k \simeq 11/9 $$

Procedure D may be viewed as superior to procedure A since k < 2, the value of k(x) in equation (3) that must be assumed, regardless of the maximum submessage size, when nothing is known about x. For exponentially distributed segment lengths, it may be shown, following Shapiro [21], that this procedure yields $k \simeq 1$ when it is applied to at least 100 submessages with ρ ≥ 3. The bound in equation (8) follows from Johnson et al. [16]. Note the improvement in the worst-case performance over procedure A. Simulations have shown that FFD yields almost optimal packings for typically distributed lengths.

Computations required at terminals
There is a cost, in terms of the number of computations required to pack each submessage in the assignment to the DBs, associated with each of these procedures. The first procedure is the simplest, requiring one step per submessage: if the submessage does not fit in the current DB, a new DB is initiated. The second procedure increases in complexity linearly with the number of submessages. The third procedure has a complexity of O(n ln n), where n is the number of submessages considered simultaneously. Using simulations [21], it may be shown that, for exponentially distributed submessages with ρ ≤ 15, a complexity of 11n assures that C = 1.01C* when procedure C is used for n submessages. For procedure D with ρ = 5, exponentially distributed submessages and n = 500, about 66 comparisons per submessage are required when a simple O(n²) implementation is used rather than the faster O(n ln n). Here, the resulting $C \simeq C^*$. On the other hand, procedure C for this case yields C = 1.02C* using 6.6 comparisons per segment. This shows that only a very small reduction in channel capacity is achieved by an order of magnitude increase in the number of computations. Thus procedure C is generally superior to procedure D.
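The kind of experiment behind these comparison counts can be sketched as follows. This is only an assumed setup, not the simulation used in the paper: the function names are hypothetical, a "comparison" is taken to mean one test of a segment against a DB's remaining space in a simple first-fit scan (sorting is not counted), and exponential lengths larger than M are redrawn so that every submessage fits in a DB. The counts it produces therefore need not match the figures quoted above.

```python
import random
from typing import List, Tuple


def draw_lengths(n: int, rho: float, M: float, seed: int = 0) -> List[float]:
    """Draw n exponential lengths with mean M/rho, redrawing any value above M."""
    rng = random.Random(seed)
    mean = M / rho
    lengths = []
    while len(lengths) < n:
        x = rng.expovariate(1.0 / mean)
        if 0 < x <= M:
            lengths.append(x)
    return lengths


def ffd_with_probe_count(lengths: List[float], M: float) -> Tuple[int, int]:
    """Simple O(n^2) FFD; returns (number of DBs used, number of fit tests made)."""
    free = []      # remaining space in each open DB, in index order
    probes = 0
    for s in sorted(lengths, reverse=True):
        for i in range(len(free)):
            probes += 1                  # one comparison of s with a DB's free space
            if s <= free[i]:
                free[i] -= s
                break
        else:
            free.append(M - s)           # no open DB fits: start a new one
    return len(free), probes


M = 1000.0
lengths = draw_lengths(n=500, rho=5, M=M, seed=1)
blocks, probes = ffd_with_probe_count(lengths, M)
print("DBs used:", blocks)
print("lower bound on b*:", sum(lengths) / M)
print("comparisons per submessage:", probes / len(lengths))
```

Comparing the number of DBs used with the lower bound (total length divided by M) gives a rough estimate of how close C is to C* for a given run.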
REFERENCES

1 Schwartz, M, Boorstyn, R R and Pickholtz, R L 'Terminal-oriented computer-communication networks' Proc. IEEE Vol 60 No 11 (November 1972) pp 1408-1423
2 Martin, J Telecommunications and the computer Prentice-Hall, USA (1976)
3 Doll, D R 'Multiplexing and concentration' Proc. IEEE Vol 60 No 11 (November 1972) pp 1313-1321
4 Newport, C B and Ryzlak, J 'Communications processors' Proc. IEEE Vol 60 No 11 (November 1972) pp 1321-1332
5 Shapiro, S D 'Random store and forward communication networks' in Generalized networks Polytechnic Institute of Brooklyn Press (1966) pp 721-733
6 Herskowitz, G J 'Application of microprocessing techniques in communications systems' Seminar Notes, USA Electronics Command (Summer 1977)
7 Herskowitz, G J and Schoen, A 'Multiple access techniques for communications systems' submitted to Computer Communications
8 'Codex 6000 intelligent network processors' Codex Corporation Application Notes, USA (1976)
9 Shapiro, S D 'Dynamic memory storage allocation' Proc. Tenth IEEE Computer Society International Conf. (February 1975) pp 105-108
10 Gaver, D P Jr. and Lewis, P A 'Probability models for buffer storage allocation problems' J. ACM Vol 18 No 2 (April 1971) pp 186-198
11 Schultz, G D 'A stochastic model for message assembly buffering with a comparison of block assignment strategies' J. ACM Vol 19 No 3 (July 1972) pp 483-492
12 Chang, J H 'An analysis of buffering techniques in teleprocessing systems' IEEE Trans. on Comm. COM-20 No 3 part II (June 1972) pp 619-29
13 Pederson, R D and Shah, J C 'Multiserver queue storage requirements with unpacked messages' IEEE Trans. on Comm. COM-20 Pt. I (June 1972)
14 Aho, A V, Hopcroft, J E and Ullman, J D The design and analysis of computer algorithms Addison-Wesley, Reading, Ma., USA (1974)
15 Sahni, S 'Approximate algorithms for the 0/1 knapsack problem' J. ACM Vol 22 (1975) pp 115-124
16 Johnson, D S, Demers, A, Garey, M R, Ullman, J D and Graham, R L 'Worst case performance bounds for simple one-dimensional packing algorithms' SIAM J. Comput. Vol 3 (1974) pp 299-332
17 Coffman, Jr., E G (Ed) Computer and job-shop scheduling theory Wiley, New York (1976)
18 Brown, A R Optimal packing and depletion: the computer in space- and resource-usage problems MacDonald, London (1970)
19 Gilmore, P C and Gomory, R E 'The theory and computation of knapsack functions' Operations Res. Vol 14 (1966) pp 1045-1074
20 Johnson, D S 'Fast allocation algorithms' Proceedings of the 13th Annual IEEE Symposium on Switching and Automata Theory pp 144-154
21 Shapiro, S D 'Performance of heuristic bin packing algorithms with segments of random length' Information and Control Vol 35 No 2 (October 1977) pp 148-156
vol 1 no 4 august 1978