Journal of the Franklin Institute 339 (2002) 265–275
How packet switching works Paul Baran* 83 James Avenue, Atherton, CA 94027-2009, USA Accepted 27 November 2001
1. What is packet switching?

According to the Data Communications Glossary (footnote 1), packet switching is:

... a data transmission technique whereby user information is segmented and routed in discrete data envelopes called packets, each with its own appended control information for routing, sequencing and error checking; allows a communication channel to be shared by many users, each using the circuit only for the time required to transmit a single packet; describing a network that operates in this manner.

When first proposed 40 years ago, this scheme was regarded as implausible, unworkable and, at best, an unnecessarily complex way of building communications networks. Now essentially all new communications networks, including the Internet, incorporate the concepts of packet switching. The once wild idea has become so commonplace that we take it for granted, like the engine under the hood of a car. It is there, and it works. One does not have to know how or why it works to drive a car.

2. Motivation

For the few curious about the origins of packet switching, its motivation was to fill a Cold War defense need: to build a robust communications network that could withstand destruction of many nodes by an enemy and allow the physically surviving nodes to intercommunicate. My early work was done for the US Air Force at the not-for-profit RAND Corporation in the 1959+ time period.

*Tel.: +1-650-323-4053; fax: +1-650-323-2056. E-mail address: [email protected] (P. Baran).
Footnote 1: Data Communications, March 1998, p. 270; glossary compiled by staff of Data Communications.
0016-0032/02/$22.00 © 2002 Published by Elsevier Science Ltd. on behalf of The Franklin Institute. PII: S0016-0032(01)00042-4
H-bomb testing in the Pacific in the 1950s revealed that long-distance, short-wave (high-frequency) sky-wave transmission would be disrupted for several hours by a high-altitude nuclear blast. RAND computer simulations showed that USSR weapons targeted at the US retaliatory forces would render long-distance telephone communications service inoperative by collateral damage alone. While almost all the telephone facilities would survive, the paucity of switching centers created a dangerous Achilles heel.

To cool tensions of the Cold War, a retaliatory force capability was needed that could withstand a surprise attack and survive sufficiently to return the favor in kind, in a controlled manner. A survivable command and control communications infrastructure was needed to get away from the guns-loaded, hair-trigger doctrine of the time.

The initial focus of my work was on distributed networks, that is, networks without a hierarchical structure. This structure appeared to be more robust than the alternative network structures.
3. Types of networks

Fig. 1 shows three different types of communications networks: centralized (a), decentralized (b), and distributed (c). The first is called a centralized network because all nodes are connected to a central switching point. Communication from any node to any other node is via the central switching node. This arrangement allows simple
Fig. 1.
switching. But this network has a single point of high vulnerability. Take out the center node and there can be no communications among the nodes. In the decentralized network (b), instead of a single common node, the network comprises small, centralized clusters. Most traffic tends to go to the nearby neighbors with the longer distance traffic sent out on the longer links. This decentralized topology is the common configuration for the telephone plant. The distributed network (c) is the most robust of all. There is no single node point of vulnerability that can bring down much of the network. However, switching traffic around such a network was an unsolved problem at that time. The necessity of switching signals in such a network led to the development of packet switching.
4. Building a survivable network: redundancy of connection

A distributed network with the minimum number of links needed to connect all the nodes of a large network together is defined to be a network of Redundancy Level 1. A network in which each node is connected with twice that number of links (by having all the horizontal nodes connected together and all the vertical nodes connected together, in fishnet fashion) is defined as a Redundancy Level 2 network, and so forth.

I found that when we reached about Redundancy Level 3 or so, an interesting phenomenon took place: the network became extremely robust. That is, if any node survived physical damage, it would likely be connected to all other surviving nodes. This means that it is theoretically possible to design an extremely robust communication network out of unreliable links (footnote 2). In other words, if a redundantly connected node survived the physical attack, there is a high probability that the node, at least on paper, would somehow be circuitously connected to all the other surviving nodes. ‘‘Somehow’’ was the issue, and this created the need for packet switching, because conventional circuit switching was not feasible.
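The survivability claim above can be explored with a small Monte Carlo experiment. The following is only a sketch under stated assumptions, not the RAND simulation: the network is a square grid, Redundancy Level 2 is modeled as the four-neighbor fishnet, and Levels 3 and 4 are modeled by adding one and then both diagonal directions. The function name, the `OFFSETS` table, and the independent per-node destruction probability `p_destroy` are all illustrative choices.

```python
import random
from collections import deque

# Assumed neighbor patterns for each redundancy level (illustrative only).
OFFSETS = {
    2: [(0, 1), (1, 0), (0, -1), (-1, 0)],
    3: [(0, 1), (1, 0), (0, -1), (-1, 0), (1, 1), (-1, -1)],
    4: [(0, 1), (1, 0), (0, -1), (-1, 0),
        (1, 1), (-1, -1), (1, -1), (-1, 1)],
}


def largest_surviving_fraction(size, level, p_destroy, seed=0):
    """Fraction of surviving nodes that belong to the largest connected group."""
    rng = random.Random(seed)
    # Each node is destroyed independently with probability p_destroy.
    survivors = {(r, c) for r in range(size) for c in range(size)
                 if rng.random() >= p_destroy}
    if not survivors:
        return 0.0
    unvisited = set(survivors)
    best = 0
    while unvisited:
        # Breadth-first search over one connected group of survivors.
        queue = deque([unvisited.pop()])
        count = 1
        while queue:
            r, c = queue.popleft()
            for dr, dc in OFFSETS[level]:
                nxt = (r + dr, c + dc)
                if nxt in unvisited:
                    unvisited.remove(nxt)
                    queue.append(nxt)
                    count += 1
        best = max(best, count)
    return best / len(survivors)
```

Calling the function with the same `seed` and different `level` values compares the same damage pattern under different amounts of redundancy; because extra links can only merge groups, the fraction never decreases as the level rises.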
5. The routing problem

At that time we did not know how to build communications switches where signals were able to traverse many serially connected nodes and operate reliably in the face of damage. We needed a way to get signals through a large number of nodes, without errors, via highly circuitous paths that could not be determined in advance. I considered several analog transmission approaches, but kept hitting a brick wall. The only way I could think of to overcome this restriction was to transmit all signals digitally, to avoid the buildup of distortion, and to attach the routing information to the signals to be sent.

Footnote 2: The optimum degree of redundancy is a function of the sum of the loss of capacity by component reliability failures plus the damage anticipated. It does not make much difference whether the damage is caused by an attack or by component reliability. Thus it would become possible to build highly reliable networks out of unreliable links and nodes, able to withstand high levels of damage, at a Redundancy Level of about 3. Even a Redundancy Level of 1.5 would more than suffice to build a reliable network where no enemy attack is anticipated, just component failures.

6. Message blocks

Having reached the decision that all signals would have to be digital, I had to design the switching node. These switching nodes were to be small special-purpose computers having a small amount of memory and processing capability. This computer at each node was called a Switching Node. Each link connecting to its neighbors would be bidirectional and considered as an independent subsystem. Each Switching Node kept a copy of each message block it sent until it received an acknowledgement that this message block had been correctly received. If no acknowledgement was received within a short maximum expected time, a duplicate message would be sent out over a different path. Each Switching Node had the responsibility of sending each message block in a direction toward its designated end destination.

By sending all data in formatted blocks, now called packets, it became possible to use a cyclical redundancy checking (CRC) approach. This is the method still used today to detect defective packets. By treating the data stream as a series of independent blocks of information that were reglued together at the receiving end, a number of non-obvious features appeared.

7. Synchronization

The packets in the network had to traverse many tandem nodes. It would be very difficult to have all individual links in tandem operate at exactly the same data rate, as required for a circuit-switched connection. Instead, it would now be possible to let each interconnecting link operate at its own ‘‘natural’’ data rate.
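As a brief aside on the message blocks of Section 6, the two mechanisms described there (a CRC to detect defective blocks, and a retained copy held until acknowledgement) can be sketched in a few lines. This is a minimal illustration, not the original design: it assumes Python's standard `zlib.crc32` as a stand-in for whatever check code a real node would use, and the names `make_block`, `block_is_intact`, and `SwitchingNode` are hypothetical. Timeouts and the choice of an alternate path are left out.

```python
import zlib


def make_block(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 (assumed check code) to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def block_is_intact(block: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the trailer."""
    payload, trailer = block[:-4], block[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer


class SwitchingNode:
    """Keeps a copy of every block sent until that block is acknowledged."""

    def __init__(self):
        self.retained = {}  # block id -> retained copy of the block

    def send(self, block_id, payload: bytes) -> bytes:
        block = make_block(payload)
        self.retained[block_id] = block  # hold the copy until acknowledged
        return block

    def acknowledge(self, block_id) -> None:
        self.retained.pop(block_id, None)  # safe to forget the copy now

    def awaiting_ack(self):
        """Ids of blocks that would be re-sent if their timers expired."""
        return sorted(self.retained)
```

A receiving node would call `block_is_intact` on each arrival and acknowledge only the blocks that pass; anything left in `awaiting_ack()` after the expected time is a candidate for retransmission.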
A small amount of buffering at each Switching Node removed the need for tight overall network timing. This simple choice meant that a ‘‘real-time’’ connection between the transmitting and receiving end users would not exist, as it would be necessary to retransmit failed packets along the way. But if the data rate were fast enough, the illusion could be created that a real-time connection existed. The benefit of packet switching increases as the data rate of its links increases, allowing more users to share a common facility. This increased data rate also reduces the delay through the system.

8. Mix and match

Breaking the fundamental lock-step nature of circuit switching meant that it would be theoretically possible to build the network from a hybrid collection of
totally different types of links, each operating at markedly different data rates. See Fig. 2.

But what about the reliability of building a network of such links, some of which may even be low-altitude satellites that may or may not be present at any moment? This question takes us to one of the most important features of packet switching, one that is not well appreciated. In circuit switching, a single failed tandem-connected element brings down the entire communications path. But in the case of the distributed network, as will be shown later, traffic moves around the failure, so the network can be built of inexpensive, low-reliability parts.

An oversimplified way of viewing this difference is to consider a circuit-switched system as comprising a large number of elements in series. The reliability of a series of connected elements is no better than that of the weakest link, and it decreases as the number of elements in tandem increases. The distributed network model is quite different. It approaches parallel reliability, where the reliability is that of the most reliable parallel element, and where adding elements increases system reliability.

Another sometimes unappreciated difference is that it makes little difference whether a link fails through enemy attack or through the poor reliability of cheap components. Even though more communications links are required, their lower cost and shorter distance requirement allow such systems to be built at a far lower overall cost than conventional analog circuit-switched systems.
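The series-versus-parallel contrast can be made concrete with the standard textbook formulas, which assume that elements fail independently (an assumption added here; the text above gives only the qualitative bounds). For elements with reliabilities r_i, a series chain works only if every element works, so its reliability is the product of the r_i; a parallel group fails only if every element fails, so its reliability is one minus the product of the (1 - r_i). The function names below are illustrative.

```python
import math


def series_reliability(reliabilities):
    """All elements must work: multiply the individual reliabilities."""
    return math.prod(reliabilities)


def parallel_reliability(reliabilities):
    """Only one element must work: one minus the chance that all fail."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)
```

With two elements of reliability 0.9, the series arrangement gives 0.9 × 0.9 = 0.81, while the parallel arrangement gives 1 - 0.1 × 0.1 = 0.99. Adding elements lowers the first figure and raises the second, which is the point made in the paragraph above.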
Fig. 2.
This may seem to be obvious and trivial today. But at the time, such points encountered a mental brick wall in the minds of many competent communication engineers. There was a conceptual barrier to comprehending the scheme, particularly among analog signal transmission engineers unfamiliar with the digital computer art. The idea that it would be possible to switch a voice call over different routes while the call was in progress was viewed, at best, as madness.
9. Adaptive routing

It is necessary for each packet to find its way through a network of changing topology, whether the change is caused by instantaneous transmission error or by longer-term physical damage. The initial routing protocol I proposed was simple. Each packet had a ‘‘to’’ and a ‘‘from’’ address field, together with a ‘‘handover counter’’ field that was incremented every time the packet was sent from node to node. The value of the handover number was an estimator of the length of the path taken by each packet. Each switching node regarded recently observed handover numbers as a better estimator than older measurements. The network not only had to learn, but it also had to forget, to be able to respond to changes in link and node availability.
10. Post office analogy

John Bowers, a RAND colleague, suggested that it was easier for him to visualize the concept by imagining an observant postman at each node (or post office). The postman can infer that the direction (link) FROM which letters (packets) arrive with the lowest cancellation date (handover number) is the best direction in which TO send future traffic to that address. By observing traffic passing through the node and by recording the handover numbers of the FROM station, together with the link number, the imaginary postman could determine the best TO link, the second best TO link, the third, etc. When the shortest-path link is busy, or is out of action, the next best path will be taken.
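The postman's bookkeeping can be sketched as a small table keyed by (address, link). This is an illustrative reading of the scheme, not the original specification: the class name `HandoverTable` and the `recency_weight` parameter are hypothetical, and blending old and new observations with a fixed weight is just one assumed way to let the node both learn and forget.

```python
class HandoverTable:
    """Per-node record of handover numbers seen FROM each address on each link."""

    def __init__(self, recency_weight=0.5):
        # Weight given to the newest observation (assumed forgetting rule).
        self.recency_weight = recency_weight
        self.estimates = {}  # (address, link) -> estimated path length

    def observe(self, from_address, link, handover_number):
        """Update the estimate using a packet that just arrived on `link`."""
        key = (from_address, link)
        old = self.estimates.get(key)
        if old is None:
            self.estimates[key] = float(handover_number)
        else:
            w = self.recency_weight
            self.estimates[key] = (1 - w) * old + w * handover_number

    def ranked_links(self, to_address):
        """Links ordered from best (lowest estimate) to worst for this address."""
        candidates = [(estimate, link)
                      for (address, link), estimate in self.estimates.items()
                      if address == to_address]
        return [link for _, link in sorted(candidates)]
```

Traffic arriving FROM an address updates the estimate used for sending TO that address, so a link whose recent packets show growing handover numbers gradually drops down the ranking.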
11. Hot potato routing

To dramatize the need for speed in the switching nodes, I described the switching process by saying that each packet should be regarded as a hot potato, tossed from person to person, without gloves. The objective was to get rid of the hot potato as quickly as you can, but preferably in the direction of its marked destination. If your first-choice recipient is busy, then toss it to your second-choice recipient, etc. If you have no better choice, you are even allowed to throw the hot potato back to the previous thrower. Everything had to be essentially instantaneous if voice was to be transmitted, as voice is intolerant of delay.
I called this early routing scheme a ‘‘hot potato routing’’ algorithm, which has been reinvented the usual number of times since then, and now is most often called ‘‘deflection routing’’.
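The decision each node makes under hot potato routing can be sketched as a single function. This is a simplified illustration with hypothetical names; it assumes the node already holds a list of links ranked from best to worst for the packet's destination, and that the link the packet arrived on can always accept it back.

```python
def choose_outgoing_link(ranked_links, busy_links, arrival_link):
    """Pick the best free link; if none is free, throw the packet back."""
    for link in ranked_links:
        # Prefer any free link other than the one the packet came in on.
        if link != arrival_link and link not in busy_links:
            return link
    # No better choice: return the hot potato to the previous thrower.
    return arrival_link
```

The function never waits for a busy link to clear, which reflects the emphasis on getting rid of each packet immediately.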
12. Flow control

Every network has a maximum allowable load. Network performance is limited by the number of packets that can be sent and received per unit of time. It was important that the packet network be able to overload gracefully and never crash. The situation is similar to the road system. If too many cars enter a freeway, the traffic comes to a gridlocked stall. Metering lights are now used to limit the number of new cars entering a freeway until enough traffic has left it, which prevents a jam from forming. This is called flow control, and it was an integral part of the routing doctrine proposed.
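The metering-light analogy can be expressed as a simple admission check at the network entrance. The text above does not describe the actual mechanism, so this is only an assumed sketch: the class name `AdmissionMeter` and the fixed cap `max_in_network` are hypothetical, and a real network would derive its limit from measured load.

```python
class AdmissionMeter:
    """Admits new packets only while the count inside the network is below a cap."""

    def __init__(self, max_in_network):
        self.max_in_network = max_in_network  # assumed fixed capacity
        self.in_network = 0

    def try_admit(self):
        """Return True and count the packet if there is room, else False."""
        if self.in_network < self.max_in_network:
            self.in_network += 1
            return True
        return False  # the light is red: the sender must hold the packet

    def packet_left(self):
        """Record that one packet has been delivered and left the network."""
        if self.in_network > 0:
            self.in_network -= 1
```

Refusing new traffic at the entrance, rather than dropping it inside, is what lets the load level off instead of collapsing.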
13. Switching nodes and multiplexing stations

There are two parts to the system. To this point we have been considering the Switching Nodes that get packets from one point in the network to another. A second subsystem is needed to terminate conventional circuits from many users. This second unit, called a Multiplexing Station, provides functions such as filling in missing (blank) packets during silence periods and the end-to-end functionality. It concentrates a large number of lower-data-rate users onto a few shared high-speed channels. Each sent packet must carry the local address of the end user as well as the address of the Multiplexing Station.

On the transmitting end, the functions include chopping the data stream into packets and adding housekeeping information and end-to-end error control information to the outgoing packets. On the receiving end, each Multiplexing Station uses terminating buffers, temporarily assigned to each end addressee, to unscramble the order of the arrived packets and to buffer them so that they come out as an error-free stream, delayed only slightly and not noticeably in human time.
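The transmit-side chopping and receive-side unscrambling can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes a sequence number is the only housekeeping information, that every packet eventually arrives, and that duplicate copies are identical, so addressing, error control, and silence fill-in are all left out.

```python
def chop_stream(data: bytes, payload_size: int):
    """Split a data stream into (sequence number, payload) packets."""
    return [(seq, data[start:start + payload_size])
            for seq, start in enumerate(range(0, len(data), payload_size))]


def reassemble(packets):
    """Rebuild the stream from packets that arrive in any order, with duplicates."""
    by_seq = {}
    for seq, payload in packets:
        by_seq.setdefault(seq, payload)  # keep the first copy, drop duplicates
    return b"".join(by_seq[seq] for seq in sorted(by_seq))
```

The dictionary plays the role of the terminating buffer: packets are parked by sequence number and released in order once the stream is read out.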
14. Separating the logical from the physical addresses

An interesting difference from the circuit switching practice of the time was the new concept of separating the physical address from the logical address. This need came about in part from the requirements for a system designed for command and control communications, which sought to avoid single points of failure. As the military command structure is composed of individuals who are unique and cannot be replicated very easily, their vulnerability would violate the restriction that no single target be better than any other. The underlying idea is to play the old shell game of a single pea and multiple walnut shells. Think of the pea as the commander(s) and
Fig. 3.
the various points of entry into the network as the walnut shells. The commander can appear at any location on the network and start operating, and the network would quickly learn the new location and begin routing traffic accordingly (Fig. 3). By removing any connection between physical and logical addresses, a new freedom is created that is useful in many ways. Wherever you move, in essence you can now take your telephone number with you. This is unlike the telephone system of the day, where the telephone number referred to a specific physical pair of wires at the central office. This is one of the characteristics we take for granted in today’s Internet.

15. Suppression of silence

In most circuit-switched communication applications, silence is the usual message, as no information is transmitted most of the time (e.g., remote computer terminals, voice, two-way video). There is a huge economy to be gained by not sending long strings of 0’s or 1’s that contain no information, as is necessary in conventional circuit-switched networks. The magnitude of this economy is large, as the common facilities are more effectively shared. In packet switching, packets are generally not sent unless there is information to be sent. If there is no change in the data stream relative to the content of the last packet, why bother sending a blank packet?

16. Virtual circuits

The Multiplexing Station has the responsibility to provide the connected user with the missing non-transmitted packets to maintain an illusion for the user that his or her
computer is always connected. This concept is called a virtual circuit. There is no limitation to the number of virtual circuits that can be simultaneously maintained. In brief, the process fakes out the connection. The virtual circuit creates the illusion of a circuit always being there when you need it, but consuming no resources when not instantly needed. It is the high speed of the transmission and the switching that allows this sleight-of-hand to create an illusion that a physical connection is always present.

When considering the network from the viewpoint of the user at the Multiplexing Station, the network appears as a fuzzy cloud. The user does not particularly care which instantaneous path his or her traffic uses to get to its destination. The user need not be concerned about the transportation portion of the network. Rather, all the user sees is a virtual circuit to the chosen end destination. The user cannot tell the difference between a physical and a virtual circuit, but the economics permits ‘‘selling’’ the same ‘‘circuit’’ many times over, legally.

17. Where the name packet switching came from

In the early 1960s I used the term message block. In 1966 Donald W. Davies, of the National Physical Laboratory, unaware of my earlier work, independently came up with a closely related approach. Davies used the term ‘‘packet’’, a far better and more graceful term of British usage, from which we get today’s term ‘‘packet switching’’. Davies told me that he and a friend chose the name packet switching specifically to distinguish it from earlier era ‘‘message switching’’. He too realized that packet switching had magical new properties, and he wanted to distinguish it from the older ‘‘message switching’’ networks used to relay telegraph messages through a network. Davies’ concern was not misplaced. Recently there has been an awkward situation created by confusing message switching with packet switching [1].
Davies, on his deathbed, wrote a very carefully prepared paper to clarify this matter for the record [2].

18. Packet length

Parenthetically, the initial system description in the RAND papers used a packet length of 1024 bits. This is the same number that was independently selected by Davies. The two numbers were the same because, to the designer, this length represents a compromise. If the packet is too short, too much capacity will be devoted to the fixed-length overhead header, while a longer packet increases the probability of a single error requiring retransmission of an entire packet, with a resulting loss of channel efficiency. And unless traffic is handled in short pieces, the performance will not be as described.

19. Reasons for low error rates

Why is packet switching so reliable a data transmission scheme? There are two reasons: the redundancy of the routes allowed and the policy of keeping a ‘‘carbon
copy’’ of the transmitted packet until you are ‘‘absolutely positive’’ that the packet sent has been correctly received by the next recipient. If in doubt you ‘‘fail-safe’’ and retransmit the packet. You let the end point clean up the duplicates. The ‘‘absolutely positive’’ assumption derives from the use of redundant information in the packet in an error detection field. This ensures that the data was not corrupted in the transmission process.

The slight redundancy of bits contained in the CRC performs the same function as the redundant coding of the DNA molecule. The DNA molecule must be duplicated through many generations of copies. A slight inbuilt coding redundancy in structure prevents an incorrect copy of the molecule from ever being made, except in very rare instances. With packets, the original copy is always retained until there is certainty that the packet has been perfectly copied. Replication thus occurs generation after generation without error. There is always a residual low probability of error, but its value can be made arbitrarily small by a longer CRC or end-to-end verification.

20. Putting things into perspective

History does not begin when one enters a field. All systems are built on top of the works of others. Throughout this system design process I borrowed freely, using whatever technology best fit the objectives. I mention this to avoid any inadvertent impression that all the ideas described are totally original. In my 1964 series of detailed memoranda on the subject I devote an entire volume [3] to the matter of history and alternative approaches at the system level, which is helpful in separating the old from the new.

21. Contemporaneous references

The work described here was primarily defined by 1961, and fleshed out and put into formal written form in 1962. The idea of hot potato routing dates from late 1960.
The detailed series of RAND Research Memoranda that provided the engineering details was essentially completed in 1961–62, but its release was delayed to permit the entire series of volumes to be released at one time. Three references describing the evolution of this work are:

(1) RAND Paper P-1995, ‘‘Reliable Digital Communications Systems Using Unreliable Network Repeater Nodes’’, May 29, 1960.
(2) RAND Paper P-2626, ‘‘On Distributed Communications’’, November 1962.
(3) August 1964 series of RAND Memoranda.

These papers are all unclassified and publicly available, with copies sent by RAND to depository libraries around the world. These references are available from the RAND Publications Department, and the multi-volume RAND RMs are also available online via: (http://www.rand.org/publications/RM/baran.list.html)
22. Disclaimer

From time to time I am given credit for all sorts of things that I have not done. I am not responsible for the ARPANET. The initiator was Robert Taylor, and it was a project managed by Larry Roberts and implemented by Bolt Beranek and Newman, Inc. It grew and flourished through the effort of many graduate students around the country. I am also falsely charged with creation of the Internet. I deny the charge, as the Internet is the result of the work of many. The Internet is a growing, evolving and ever-changing organism with wonderful new ideas constantly being added.

The present paper can cover only a small part of the story of packet switching: only its beginning. Its development for use in the Internet and elsewhere was the work of many, many different people. No single person is responsible for more than a small piece of the present evolution of the system, which consists of many ideas woven together to form the present technology of packet switching and the Internet.

There are many books that carry this subject forward that can be recommended, with the more recent books benefiting from the earlier works. These include:

1. John Naughton, A Brief History of the Future, 1999 and 2000, Overlook.
2. Janet Abbate, Inventing the Internet, 1999, MIT Press.
3. George Dyson, Darwin among the Machines, 1997, Addison-Wesley.
4. Katie Hafner and Matthew Lyon, Where Wizards Stay up Late, 1996, Simon & Schuster.
5. Arthur Norberg and Judy O’Neill, Transforming Computer Technology, 1996, Johns Hopkins.

References

[1] K. Hafner, A Paternity Dispute Divides Net Pioneers, New York Times, November 8, 2001.
[2] D.W. Davies, An historical study of the beginnings of packet switching, Comput. J. Br. Comput. Soc. 44(3) (2001) 152–162 (Introduction p. 151).
[3] P. Baran, On Distributed Communications, Vol. V., RM-3638, The RAND Corp. Santa Monica, CA.