OSI Deprogrammer
Re-conceptualizing cyberspace

___

By Robert Graham 2023-10-01
(@erratarob, @erratarob@infosec.exchange)

Original doc here.

 

1. Abstract
2. Introduction
2.1. Reader requirements
2.2. Organization of this document
2.3. About the author
3. The OSI Model
3.1. The basic 7 layers
3.2. Protocols
3.3. Where it started
3.4. The Network Stack
3.5. Use as a framework
3.6. Multi-protocol stack comparison
3.7. Consensus definition of the 7 layers
3.7.1. Application Layer #7
3.7.2. Presentation Layer #6
3.7.3. Session Layer #5
3.7.4. Transport Layer #4
3.7.5. Network Layer #3
3.7.6. Data Link Layer #2
3.7.7. Physical Layer #1
3.8. Non-consensus views of OSI
3.9. Modern networks
4. (simple) Alternative network stack
4.1. Network-on-network
4.2. Protocols or sublayers
4.3. Upper OSI layer alternative
4.4. A Packet Network
4.4.1. A network
4.4.2. A packet
4.4.3. An address
4.4.4. A route or path
4.5. Differences between Ethernet and Internet
4.5.1. What runs underneath
4.5.2. Forwarding by address
4.5.3. Hop limit
4.5.4. Routing protocols
4.5.5. Transport (simple)
4.5.6. Naming
4.5.7. Configuration
4.6. RFC 791
5. (complex) Alternative network stack
5.1. My Network Stack
5.2. Sublayers and Protocols
5.3. Payload
5.3.1. Data
5.3.2. Requirements
5.3.3. Payload Protocols
5.3.4. Interfaces
5.4. Physical Transmission
5.4.1. Signaling
5.4.2. But sometimes inside the stack
5.5. Local Networks and Links
5.5.1. Link vs. network
5.5.2. Network vs. Internetwork
5.5.3. LLC and local “transport”
5.5.4. Beyond Ethernet+WiFi
5.6. Services
5.6.1. Service vs application
5.6.2. Other services than the web
5.6.3. Common features of services
5.6.4. Web
5.6.5. BGP
5.7. Control plane vs. data plane
5.7.1. Data plane software
5.7.2. Example: masscan
6. Alternative framework
6.1. What is an ontology?
6.2. It’s not theory so much as engineering
6.3. Physical transmission
6.3.1. History
6.3.2. Digital signaling
6.3.3. Analog Modulation
6.3.4. Radio
6.3.5. Forward error correction
6.4. Payload
6.4.1. User sessions
6.4.2. Data Encoding and Representation
6.5. A network
6.5.1. Links vs. network
6.5.2. Many to many
6.5.3. Path
6.5.4. Other concepts
6.5.5. Other networks
6.6. Bulk transport
6.6.1. Top of the network
6.6.2. Transport addresses
6.6.3. Ends
6.6.4. Congestion-control
6.6.5. Fragmentation and reassembly
6.6.6. Payload congestion
6.6.7. Retransmission and reliability
6.6.8. Establishing connections
6.6.9. Summary of transport
6.7. Security
6.7.1. Security is adversarial
6.7.2. Not just crypto
6.7.3. Encryption
6.7.4. User authentication
6.7.5. Authorization
6.7.6. Auditing, logging, and instrumentation
6.8. Naming and directories
6.8.1. Address lookup
6.8.2. Identity
6.8.3. Resources
6.8.4. Directory services
6.9. Parsing and formatting
6.10. Network management
6.11. Governance of public networking
6.12. Content delivery
7. Misconceptions
7.1. It’s not theory
7.1.1. What they teach you
7.1.2. Poorly understood terminology
7.1.3. Theory by command
7.1.4. Generalizations
7.1.5. Duplicate terminology
7.1.6. Buzzwords
7.1.7. Outdated
7.1.8. History
7.2. It’s not a framework
7.2.1. What they teach you
7.2.2. Assigning functions to layers
7.2.3. Assigning protocols to layers
7.3. Top 3 layers are fiction
7.3.1. What they taught you
7.3.2. Wrong anyway
7.4. Lower 4 layers are inaccurate
7.4.1. What they teach you
7.4.2. It still doesn’t match Ethernet/Internet
7.4.3. Entanglement vs independence
7.4.4. Rigidity
7.5. Layering
7.5.1. What they teach you
7.5.2. Things are layered, but there are no layers
7.5.3. Layer independence
7.5.4. Too many layers
7.5.5. Fixed layers
7.6. Packet headers
7.7. Local-only visibility
7.8. Control protocols
7.9. Ours is not the original OSI Model
7.10. Standard
7.11. The Application Layer #7
7.11.1. Service Elements
7.12. The Presentation Layer #6
7.13. The Session Layer #5
7.14. The Transport Layer #4
7.14.1. End-to-end
7.14.2. Internet Protocol is connectionless
7.14.3. Transport belongs on top
7.14.4. The endpoint
7.15. The Network Layer #3
7.15.1. There are multiple networks in the stack
7.15.2. The original Network was connection-oriented
7.16. The Data Link Layer #2
7.17. The Physical Layer #1
7.18. Exchangeable components
7.19. Later misconceptions
7.20. It was never useful
7.21. Truth
8. History of the Mainframe/Telecom Network
8.1. Steampunk circa 1880
8.2. Non-interactive batch jobs circa 1960
8.3. Non-interactive control systems circa 1960
8.4. Interactive terminals circa 1970
8.5. IBM SNA
8.6. Telecom X.25
8.7. Xerox PUP and XNS
8.8. ARPAnet and the TCP
8.9. Ethernet circa 1974
8.10. History of the physical layer
8.10.1. Baudot 5-bit digital telegraph
8.10.2. RS-232 serial link
8.10.3. Simplex, half-duplex, and full-duplex links
8.10.4. RS-422 and RS-485
8.10.5. USB – the universal serial bus
8.10.6. TTL, GPIO, I²C, SPI
8.10.7. UART – universal asynchronous receiver/transmitter
8.10.8. AppleTalk
8.10.9. History of the word protocol
8.11. OSI – Open Systems Interconnect
8.12. The original 7 layers
8.12.1. #7 - Application
8.12.2. #6 - Presentation
8.12.3. #5 - Session
8.12.4. #4 - Transport
8.12.5. #3 - Network
8.12.6. #2 - Data Link
8.12.7. #1 - Physical
8.13. Andrew Tanenbaum “Computer Networks” (1981)
8.14. C, Unix, and 32-bit microprocessors
8.15. Multiprotocol office networks in the 1980s
8.16. Interoperability and open systems
9. Proposals
9.1. Terminology
9.2. Teaching and textbooks
9.3. Professional certification
9.4. Wikipedia
9.5. Standards
10. Glossary
11. Some References

1. Abstract

This document argues that the “OSI 7 Layer Model”, the common way of describing the Internet, needs to be abandoned. It’s not just a lie, but unhelpful. It needs to be removed everywhere, except as a historic footnote about 1970s mainframes.

We’ve reached an absurd state of affairs where everybody knows the OSI Model is false, where everyone is confused by most of it. Yet, people still defend it, claiming some of it is helpful. Many remember some epiphany, where OSI helped them “get” a difficult concept. The problem is that these cases are almost always misconceptions, such as “layers”.

Its negatives far outweigh its positives. OSI is not theory, it’s not a framework, it’s not helpful, it’s not a standard, it’s not anything those who use it claim it to be. Anything you teach with the OSI Model can better be taught without it.

This document explains why everything they teach you about OSI is a lie.

This document is a dense read[1], containing hundreds of unfamiliar terms, many historical. I don’t intend for anybody to read it from beginning to end. Instead, I intend this to be a reference that thoroughly supports its claims. When university professors come up with arguments justifying OSI, I have a long chain of citations showing them wrong[2].

You can read the chapters in any order, but the most important is the Misconceptions chapter in the middle. That chapter debunks common beliefs, like “layering”.

2. Introduction

University professors have been using the OSI Model to teach computer networks for over 40 years, since even before the Internet was created. They know it’s not true[3], but wrongly think it’s helpful[4]. They wrongly think it’s some theoretical[5] basis or framework[6] for networking. They think it helps conceptualize[7] and explain the subject. They believe it’s a standard[8] for how networks are supposed to work.

They are addicted to it. Most professors can’t conceptualize the Internet without the model. It’s the classic problem with any advancement that breaks the old model: it takes a while for the oldtimers to die off before new concepts take hold.

To start with, the OSI Model is outdated. It described the mainframe and telecom networks of the 1970s. The office networks and Internet of the 1980s worked on different principles and made OSI obsolete.

Figure: IBM mainframe, late 1960s

The OSI Model was designed according to the IBM mainframe[9] and telecom X.25 networks that had appeared in the mid-1970s. Its design was political, not technical. IBM owned office networks. The state-run telecoms around the world owned the long distance infrastructure. These powerful entities had to be appeased, so the model described what they did.

OSI was designed primarily around the dumb terminal connected to a mainframe.

People today can’t comprehend the original model because they have no experience[10] with terminals and mainframes. This makes them treat OSI as a deep mystery. They don’t understand what OSI meant specifically, so they assume it meant generalities. They re-interpret the terminology as theory, as timeless concepts[11], when all it really meant was some narrow, specific item.

For example, “session” meant something related to attaching videotex[12] terminals to mainframes. It’s an issue so out-of-date that it can’t be explained to a modern audience. Back then, point-to-point links could be half-duplex, allowing only one side at a time to communicate. Thus, designers inserted a protocol layer to handle this. They could’ve used any term to describe this issue: dialog, interaction, ping-pong, chat, interlocution, intercourse – anything. But they chose “session”.

Since that thing disappeared 40 years ago, their “sessions” no longer exist. So nowadays, we imagine the term means something else. We imagine it’s some sort of timeless theoretical framework category that encompasses anything we might describe with the word “session”. We drop into this bucket anything other people have named “session”, like HTTP session cookies.

Everyone has a slightly different conception of what “session” actually means, because nobody is talking about what it originally meant. Every textbook or class notes describes it differently. There is no “standard” for what it actually means applied to the Internet, because it never meant anything that applies to the Internet. By the time the Internet appeared, links were always full duplex, so there was no need for the type of dialog control that OSI intended.

In short, OSI “session” meant something concrete that no longer exists on modern networks. So we instead pretend it means theory.

It’s like reading the confused writing of Nostradamus[13] and believing he’s actually making predictions of the future. We keep re-interpreting ambiguous phrases written back then to correspond to what’s going on now. We likewise stand in awe of the wisdom and prescience of the OSI Forefathers, pretending they anticipated modern networking, when they actually meant something completely different.

In 1981, when people started using OSI in college textbooks, it was at least plausibly useful. OSI was an international standard, and every computer vendor promised their network stacks would be changed to conform to the standard in the future. Academic papers often referred to the standard, declaring the layer where their future innovation would eventually fit.

But that never happened. Ethernet and Internet, the two major technologies we use today, violated the standard from the beginning. Ethernet was a full local network[14], not just a link, and thus didn’t fit in the OSI’s notion of a Data Link Layer #2. The Internet was “connectionless”, not “connection-oriented” as envisioned for Network Layer #3. Most importantly, it was an internetwork above OSI’s network layer, a conceptual layer that simply didn’t exist[15].

OSI was retconned to fix these problems. It’s like how the Star Wars movies changed the relationship between Luke Skywalker and Darth Vader. In the first movie, Vader killed Luke’s father. In the second movie, they decided that Darth Vader was Luke’s father. They claimed that since the identity of “Darth Vader” replaced “Anakin Skywalker”, from a certain point of view, it was true that “Vader killed Anakin”.

Much the same happened with OSI. Those who learned the model around 1980 went on to become professors teaching their own classes and writing their own textbooks. The model was increasingly inaccurate, so they kept retconning it, pretending it said something it really didn’t. Your teacher of networking is as much a liar as Obi-Wan Kenobi.

That’s why the bottom 4 layers of the model are so popular. We can discard the original model and pretend that they now describe Ethernet and the Internet.

Ethernet has two layers, “MAC” and “PHY”. The Internet has two layers (more precisely, protocols), “TCP” and “IP”.

Thus, there’s a rough consensus view of the lower 4 layers based upon [TCP, IP, MAC, PHY] that’s roughly the same from one textbook to the next. It’s not the original, de jure standard – reading the actual OSI Model isn’t going to help you pass a college exam – but at least the retconned layers are a de facto standard that’s the same from one college to the next.

The situation is quite the opposite for the upper 3 layers. There’s no match in modern networks, so nobody can agree on how to retcon those layers. Despite this, every textbook, every professor, every class teaches these layers – and each teaches them differently.

The only commonality is that, while they recognize the layers themselves don’t exist, they treat them as categories in a framework of theoretical concepts. Everyone defines these categories differently.

In short, there is no real modern standard. Nobody uses the written OSI standard (ITU-T X.200[16] aka. ISO/IEC 7498) as it was originally intended. They all have a non-standard, retconned version of OSI. So long as they define the retconned version to match Ethernet and the Internet, there’s some agreement. But for the most part, each textbook, class, or professor differs significantly from the others.

You see this in the texts for the CISSP, the most common professional certification in the cybersecurity industry. The study guide[17] writes a bunch of fictional things about the OSI model that don’t agree with other sources. Somebody who doesn’t really understand these terminologies did their best to describe it anyway. It’s not real, it doesn’t match the original OSI, it doesn’t match what other people claim about OSI, it doesn’t match the Internet as it exists today. It is pure invention of whoever wrote the guide. Yet, in order to pass the class/certification, the student has to regurgitate this special description on the test. Learning about OSI from other sources, especially this book, means you’ll have less chance of getting the right answer on the test. (To be fair, the entire CISSP certification is bad, written by those who aren’t experts in the subject matter; this situation isn’t unique to OSI or networking.)

OSI has passionate defenders, in particular, the teachers who have been teaching these lies for so long. But even average people defend it because they remember a couple of concepts that OSI helped teach. Students fondly remember that one thing that was an epiphany, that brought everything into focus[18].

They declare it doesn’t have to be perfect to be helpful.

This is wrong because it’s not actually helpful.

For one thing, they are ignoring all the confusion. This text is 200 pages long primarily because so few people know more than a few things about OSI. I have to explain what it’s supposed to mean, what they should’ve learned, before I can debunk it. Trying to use OSI instead of the [TCP, IP, MAC, PHY] models creates a huge cognitive load for the student, and no student ever really learns it. That’s why I’m writing this text: I talk to teachers and find they’ve never really learned all the things they are teaching.

The second reason the above notion is wrong is that the most “helpful” bits teach misconceptions. In particular, OSI gets its eponymous concept, layering, wrong. Yes, things are layered in the network stack, which is why this misconception persists. But no, layering doesn’t work as OSI says. The Misconceptions chapter lists a ton of things that people believed were helpful – but are wrong.

For example, OSI teaches that all the layers are part of a single network stack, with each performing a different function. It teaches that Layer #2 (like Ethernet MAC) and Layer #3 (Internet IP) perform different functions in this stack.

But the truth is that the Internet and Ethernet are independent networks. They are both networks, meaning that Ethernet MAC and Internet IP perform roughly the same functionality (forward packets in a path through their respective networks). Yes, they have differences, Ethernet is primarily local while the Internet is primarily global. Yes, they are layered, the Internet is an internetwork that’s layered on local networks, including Ethernet sometimes. But they are still independent networks, not parts of the same network. Ethernet carries plenty of traffic that isn’t Internet. The Internet runs over many local links and networks that aren’t Ethernet.

OSI teaches Layers #2 and #3 perform different functions in the same network. The real world equivalents perform the same functions in different networks.
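To make the parallel concrete, here is a toy sketch in Python (illustrative only – not real switch or router code). An Ethernet switch and an Internet router do the same basic job: look up a destination address in a table and forward the packet. They just consult different address types and different tables, in different networks.

    import ipaddress

    # Toy Ethernet switch: exact-match lookup of destination MAC -> output port.
    # (A real switch learns this table from the source addresses it observes.)
    mac_table = {"3c:22:fb:aa:bb:cc": "port 3"}

    def switch_forward(dst_mac):
        return mac_table.get(dst_mac, "flood out all ports")

    # Toy Internet router: longest-prefix match of destination IP -> next hop.
    route_table = {
        ipaddress.ip_network("203.0.113.0/24"): "next hop 10.0.0.1",
        ipaddress.ip_network("0.0.0.0/0"): "default gateway",
    }

    def route_forward(dst_ip):
        addr = ipaddress.ip_address(dst_ip)
        best = max((net for net in route_table if addr in net),
                   key=lambda net: net.prefixlen)
        return route_table[best]

    print(switch_forward("3c:22:fb:aa:bb:cc"))  # port 3
    print(route_forward("203.0.113.9"))         # next hop 10.0.0.1

Same verb – forward by destination address – in two different networks. That parallel is exactly what the single-stack OSI picture obscures.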

Every time OSI is used to teach students, it teaches the lie.

Luckily, not everyone is confused. After all, regardless of what teachers attempt to teach, a lot of students simply roll up their sleeves and learn from practice, working first hand with networks. People can learn the real-world reality of Internet TCP/IP and Ethernet MAC+PHY if they just ignore this nonsense teachers try to explain using OSI.

But enough do believe the lie. It’s a constant confusion among professionals who are supposed to be the experts, who can’t explain things, or are unable to do their jobs correctly, because their conceptualization is based on what OSI taught them and not how things really are.

We are in a situation like the Emperor’s New Clothes. It’s crazy how vast this lie has become. The upper 3 layers are where teachers make up their own lies; the bottom 4 layers are where they repeat the rough consensus lies. It shouldn’t be this hard to stand up, point out these things aren’t true, and laugh at the Emperor for being naked.

One reason OSI endures is that people don’t have an alternative. It’s easy to say their conceptualization is wrong, but that doesn’t help them re-conceptualize things the right way. Therefore, this document is overly thick, full of different ways of tackling this problem.

Layering doesn’t work as OSI suggests, but things are layered in the network stack. For those who want to visualize these layers, I present a simple model in its own chapter.

For those who want to discuss complicated theory instead of simple practice, I present another chapter describing the theoretical complexity.

This text argues that there has been a paradigm shift[19]. The OSI Model was created in the 1970s to describe networks of their time. This consisted mostly of terminals talking to mainframes and circuit-switching networks of telephone monopolies. In much the same way the personal computer made mainframes obsolete, the peer-to-peer networking of the Internet made OSI obsolete.

The debate is similar to the famous paradigm shifts in science:

  • Aristotelian science compared to Galilean science
  • Ptolemaic model of the solar system compared to the Copernican heliocentric model
  • Miasma theory of disease compared to the germ theory of disease
  • Lamarck evolution compared to Darwin’s theories of mutation and natural selection
  • Newtonian mechanics compared to Einsteinian relativity

The field of computer science has had many such paradigm shifts as Moore’s Law has repeatedly transformed the field. In some cases, the situation is like Newtonian mechanics – the old model continues to be useful despite being technically wrong. Other cases are like the Ptolemaic earth-centric model – it could technically still be used, but it’s so grossly wrong that it needs to be replaced.

Defenders of OSI want us to believe it’s more Newtonian (helpful though wrong), but this document argues OSI is more Ptolemaic (not even helpful).

College professors who learned the old paradigm aren’t aware how obsolete it has become. They learned to conceptualize the Internet according to this model, so naturally, OSI still conforms to how they think about the world. They acknowledge the model is flawed, because it sometimes conflicts with reality, but they are mostly blinded by their misconceptions. They think the network works differently than it actually works.

Professors are also blinded by the fact that it’s an international standard. This gives it an air of legitimacy. But this is nonsense. Standards are political documents. It’s like writing a law defining pi (𝜋). The TCP/IP protocols that form today’s Internet violated the model when they were created, and everything on top of those protocols conforms even less. At most, only layers #1 and #2 have any legitimacy, and even then, these layers teach misconceptions.

2.1. Reader requirements

The intended readers of this document are university professors and technology professionals – people who believe themselves to already be experts on this subject. Students with only a little experience with networking will find much of this content overwhelming.

At least in theory. In practice, my college professor friends tell me this document is overwhelming for them as well. The problem is that they are more knowledgeable about how the Internet really works than about how older networks worked.

I expect you to recognize TCP/IP as the name for the primary protocols of the Internet. I expect you to recognize IBM SNA and telecom X.25 as the standard protocols that existed in the 1970s before OSI and TCP/IP. I do have a History chapter that describes these in more detail.

I expect you to have used Wireshark to look at packets, so that when I show screenshots, you won’t be too discombobulated by them.

I expect some knowledge of OSI. However, everyone teaches OSI differently, so you should probably check the OSI chapter first, to see how others use the model.

2.2. Organization of this document

This document is organized into eight chapters. They can be read in any order – I’m not sure of the best order. Maybe History should go first? Maybe the list of Misconceptions first?

Introduction: This is a short summary of the major points. You should probably read this first.

The OSI Model: This document assumes the reader has heard of the OSI Model. But everyone’s knowledge is incomplete and different. This chapter attempts to present the consensus view – not the original model from the 1970s but the model that you are likely to find described in modern textbooks.

Alternative network stack: People ask “If not OSI, then what?”. There are two chapters, one with a simple alternative, and one with a complex alternative. It’s based upon the reality of the TCP/IP Internet that is layered on top of, and runs independently of, local networks like MAC/PHY Ethernet.

Alternative framework: OSI is falsely and grossly used as a theoretical framework (not just a network stack). This chapter presents alternative categories. Of particular interest is discussing how “transport” is a category that tends to be at the top of any network stack.

Misconceptions: This is the meat of this document. It’s a long list. Most come from some discussion I’ve had with a networking “expert” and why the thing they just said is wrong. Textbook writers and university professors should study this. It’s very dense – you are expected to think long about each one, not read them quickly.

History of Mainframes and Telecoms: This describes the original model, what it meant when drafted in the 1970s. It’s incomprehensible without understanding the history of mainframes/telecom up to that point.

Proposals: Here is what I say needs to be done. We need to remove OSI from our teaching materials (other than as history). It needs to be expunged from Wikipedia as a theory or a framework. Professional certifications need to stop rote memorization of the model. New standards need to stop pretending the OSI Model is still a standard.

2.3. About the author

I was chief architect at Network General in the late 1990s, where I wrote code for all the popular alternatives[20] to OSI and TCP/IP. A large part of this text comes from my experience in seeing networks from different perspectives, not simply how OSI or TCP/IP sees them.

Since then, I’ve written code for multiple network stacks, kernel code, user mode applications and so forth. I’ve spent years in data centers solving core problems of the Internet. As a hacker, I’ve written code that causes problems for network operators. For example, my masscan tool can scan the entire IPv4 Internet for a port in under 5 minutes.

It’s not just practical experience: I’ve also read a ton of academic papers over the years. I’m not an academic – I just release code and blog posts – so you won’t see my name on many academic papers.

3. The OSI Model

This document isn’t an introduction to networks. It assumes the reader has at least heard of “OSI” and can name a few of the layers.

However, different people have different conceptions of the model. Even if all colleges taught the same model (they don’t), a few years after college, each person remembers something slightly different anyway.

This chapter attempts to review the topic, so that all readers have the same idea what this book is talking about.

The goal of this chapter is to reflect what everybody knows about the OSI Model, the sort of thing you’ll get from querying ChatGPT. This sets the stage for the rest of the book, which teaches that what everybody knows is wrong. Thus, the descriptions in this chapter don’t reflect what I believe, but what everyone else generally believes.

3.1. The basic 7 layers

We are talking about the OSI 7 Layer Model[21] that looks like this:

7   Application Layer
6   Presentation Layer
5   Session Layer
4   Transport Layer
3   Network Layer
2   Data Link Layer
1   Physical Layer

It comes from an international standard (ITU X.200 aka. ISO/IEC 7498) first drafted around 1977[22]. This was written 6 years before the Internet was turned on in 1983, 4 years before TCP and IP were written up in RFCs, and 2 years before TCP and IP[23] were first imagined as protocols.

3.2. Protocols

The model identifies all the components of the network and how they interact with each other. These interactions are known as protocols.

This term has become overloaded. We mean not only the interaction itself, but also the packet headers used to transmit the interactions, and the name of the entire subsystem that implements it. Thus, TCP refers simultaneously to the code in the network stack, the headers on the wire, and the interactions (like the handshake at the start of a connection).

The two primary protocols of the Internet are TCP and IP – referred to together as TCP/IP. The term is treated as synonymous with the Internet itself, so in technical discussions we might use “TCP/IP” and “Internet” interchangeably.

In the 1980s, standards bodies defined specific protocols that corresponded to the OSI stack, like CLNP[24] as a network protocol and ISO/IEC 8073 as a transport protocol, but they were never fully implemented[25].

It’s one of the things that confuses people. OSI was intended as practice, a blueprint for future protocols. Since those protocols don’t really exist in the modern world, people assume it was always meant as theory instead.

3.3. Where it started

By 1981, books on networking appeared using the model, such as Andrew S. Tanenbaum’s “Computer Networks” university textbook, which showed the model on the cover. Academics want to teach theory first and practice second. The OSI Model deceptively looked like such a theory. Early academic papers often referenced this theory. This started the chain of events, where each new textbook/course is based upon the teachings of a previous textbook/course. Teachers pass on the knowledge they were taught. OSI became a meme.

This textbook deserves a lot of the blame. It was less about how networks worked and more about how people described such networks. Reading the book in hindsight reveals how little the author really understood networks. It regurgitates buzzwords from vendors or academic papers, but doesn’t really explain them.

But to be fair, it reflected the academic thinking of the time. Many papers erroneously assumed the model was going to be the blueprint of the future, so to understand such papers, this textbook would’ve been helpful.

3.4. The Network Stack

One use of this model is to describe the network stack. Data sent by an application goes down the stack, with each layer doing its own thing, often adding metadata or headers. On the receiving end, the process is reversed, with each layer processing incoming packet headers until only the final data reaches the application on the receiving end.

According to the way it’s taught, each layer is responsible for something different. The “Networking Layer” is responsible for routing packets through the network, while the “Transport Layer” is responsible for detecting packets that have been lost and retransmitting them (among other things).

The model teaches how each layer communicates with its peer layer on the other end. In order to detect lost packets and retransmit them, the “Transport Layer” is communicating with the peer “Transport Layer” on the other end of the connection. In between, various devices operating at different layers can forward data. An Internet router only needs to process things up to layer 3, and can ignore any layers above that, which aren’t used in routing decisions.

This is shown in the following diagram (their Figure 12) from the X.200 standard. This is also the diagram on the cover of Tanenbaum’s 1981 textbook.
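For readers who want to see this “going down the stack” concretely, here is a minimal sketch in Python. The headers are deliberately fake and incomplete (real TCP and IPv4 headers are 20+ bytes each, with checksums, sequence numbers, and so on); the point is only the shape of the idea: each step prepends its own header to whatever it was handed from above.

    import struct

    def add_tcp_header(data, src_port, dst_port):
        # Reduced to just the two port numbers for illustration.
        return struct.pack("!HH", src_port, dst_port) + data

    def add_ip_header(segment, src_ip, dst_ip):
        # Reduced to just the two 4-byte addresses for illustration.
        return src_ip + dst_ip + segment

    def add_ethernet_header(packet, src_mac, dst_mac):
        # Destination MAC, source MAC, then the encapsulated packet.
        return dst_mac + src_mac + packet

    data    = b"GET / HTTP/1.1\r\n\r\n"
    segment = add_tcp_header(data, 49152, 80)
    packet  = add_ip_header(segment, bytes([10, 0, 0, 1]), bytes([203, 0, 113, 1]))
    frame   = add_ethernet_header(packet, b"\x02" * 6, b"\x04" * 6)
    # The receiving end strips the headers in the opposite order on the way up.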

3.5. Use as a framework

The OSI Model is also used as a framework or ontology. Networking concepts are assigned to various layers in the model. A good example is this chart from Wikipedia.

This framework promises that it’s describing networking in general, but for the most part it’s only describing the Internet today. Alternatives to Ethernet and the Internet, both today and in history, don’t really follow this framework.

Most de-emphasize or ignore layers #5 and #6 since nothing plausibly similar to them exists on the Internet today.

For example, Douglas Comer’s popular textbook[26] still teaches what’s essentially the remaining layers, just retconned a little bit to conform to TCP/IP. It shares some of the same flaws[27] with the OSI Model.

We are also talking about the TCP/IP suite of protocols that runs the current Internet. You’ll often see these protocols categorized by which layer they “belong” to.

7   Application Layer    HTTP, SMTP, FTP
6   Presentation Layer   SSL, XDR
5   Session Layer        NetBIOS
4   Transport Layer      TCP, UDP, SCTP
3   Network Layer        IPv4, IPv6
2   Data Link Layer      Ethernet, PPP, SLIP
1   Physical Layer

3.6. Multi-protocol stack comparison

See also the History chapter on 1980s office networks.

In the 1980s, office networks had many alternatives to the Internet protocol suite of TCP/IP. TCP/IP wasn’t very popular, because it was a university research effort, not a real product. Customers preferred products supported by vendors.

Even though the OSI Model was supposed to describe only one specific network stack, people distorted it in an attempt to describe all the network stacks of the time.

A specific example is this poster from Network General. They made the popular Sniffer™ Network Analyzer of the time, the product that predates the now popular Wireshark packet-sniffer. This poster, or one like it, was found on the walls of many IT departments in big corporations.

This poster is nonsense[28], trying to fit new technology into the obsolete OSI Model. The relative placement of protocols – showing protocols layered on each other – is useful, but it would’ve been improved by removing the OSI parts, removing the upper three layers, and just putting the rest generically above the network stack.

3.7. Consensus definition of the 7 layers

This section describes the consensus view of each of the layers. It’s not my view, but a rough consensus of how other people describe these layers, which is largely wrong.

A machine learning tool like ChatGPT is excellent for this purpose. It reads everything written on a topic and averages it out. In other words, while individual teachers/textbooks describe these layers differently, the following ChatGPT descriptions (in italics) will be the nearest thing to a consensus.

The rest of this book criticizes the consensus. When you need a refresher to understand what this book criticizes, come back to this section. Just don’t believe that this section is truth.

The upper 3 layers are fictional, with no match in the real world. But the lower 4 layers have some match to today’s real world. You can use the following summary:

OSI Layer       Real world    What          Between
Transport #4    TCP, UDP      connections   processes
Network #3      IPv4, IPv6    packets       remote devices
Data Link #2    MAC, LLC      frames        local devices
Physical #1     PHY           bits          wires

3.7.1. Application Layer #7

Basics: This is a catch-all for everything not in another layer, or everything at the top of the stack. You are supposed to be able to identify HTTP as an Application Layer protocol, along with other things like SMTP (email), FTP (file transfer), and Telnet (remote terminal). Even things that are arguably part of other layers, like BGP or DNS, are placed here because they still run on top of TCP/IP.
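To make this concrete, here is a minimal sketch (assuming example.com is reachable on port 80 from where you run it): an “application protocol” like HTTP is just structured text that a program sends over a transport connection.

    import socket

    # HTTP rides on top of TCP: open the transport connection,
    # then speak the application protocol (lines of text) across it.
    s = socket.create_connection(("example.com", 80))
    s.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
    print(s.recv(200).decode(errors="replace"))  # status line and first headers
    s.close()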

From ChatGPT:

The Application Layer is the top layer in this model and is responsible for providing services and protocols that enable applications to communicate with each other over the network.

The Application Layer performs three main functions:

  • Network Process to Application Data Exchange: The Application Layer provides services that enable applications to access the network and exchange data with other applications. This includes protocols such as HTTP (HyperText Transfer Protocol), SMTP (Simple Mail Transfer Protocol), and FTP (File Transfer Protocol).
  • Application Service Access: The Application Layer provides access to network services that can be used by applications, such as directory services, email services, and database services.
  • Application-to-Application Communication: The Application Layer provides protocols that enable communication between applications running on different computers. This can include protocols such as Remote Procedure Call (RPC), which allows applications to call functions on remote computers as if they were local.

Overall, the Application Layer provides the services and protocols that enable applications to communicate with each other over the network, making it the topmost layer in the OSI model.

See also Chapter 7 for misconceptions about this layer, and Chapter 8 for the original OSI definition of this layer.

3.7.2. Presentation Layer #6

Basics: No such layer exists, but as a category, this is supposed to represent data representation or data encoding. One example is simple issues like ASCII vs. UTF-8. Some consider any encoding issue to be part of this layer, such as JPEG (pictures) or MPEG (video).
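The “data representation” issue people have in mind here is easy to demonstrate with a minimal sketch: the same text becomes different bytes depending on which encoding the two ends agree on, and decoding with the wrong one produces mojibake.

    text = "naïve"
    print(text.encode("utf-8"))    # b'na\xc3\xafve' -- 'ï' takes two bytes
    print(text.encode("latin-1"))  # b'na\xefve'     -- 'ï' takes one byte
    print(b"na\xc3\xafve".decode("latin-1"))  # 'naÃ¯ve' -- wrong encoding, garbled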

ChatGPT says the following:

The Presentation Layer is the sixth layer in this model and is responsible for ensuring that the data sent by the application layer is properly formatted for transmission over the network.

The Presentation Layer performs three main functions:

  • Data Formatting: The Presentation Layer translates data from the application layer into a format that can be understood by the network. This can include encoding data into a standard format, such as ASCII or Unicode, or compressing data to reduce its size.
  • Encryption and Decryption: The Presentation Layer can also provide encryption and decryption services to ensure the security of the data being transmitted. This can include using encryption algorithms to scramble the data and decryption algorithms to unscramble it at the receiving end.
  • Compression and Decompression: The Presentation Layer can also compress data to reduce its size for more efficient transmission over the network. This can include techniques such as Huffman coding or Lempel-Ziv-Welch (LZW) compression.

Overall, the Presentation Layer ensures that the data sent by the application layer is properly formatted and secure, and can be efficiently transmitted over the network to the receiving end.

Note that the Presentation Layer #6 was never defined to provide encryption or compression. It’s just that some people wrongly assume that they are also a data representation issue (they aren’t).

Figure: tweets expressing a definition that reflects the consensus view

See also Chapter 7 for misconceptions about this layer, and Chapter 8 for the original OSI definition of this layer.

3.7.3. Session Layer #5

Basics: No such layer exists, but many consider this a category for anything related to sessions. This is a circular definition, as people don’t really define what a session is, other than something handled by the Session Layer #5.

There is a lot of confusion here. In recent years, the definition of this layer has crept up to overlap with the Transport Layer. A lot of people now describe this as doing something with “connections” that should actually be at the layer below.

This is the layer where your professor will become the most vague, waving hands, using confusing theoretical terms, and avoiding any question about this layer, because they really don’t know. Nobody knows. This text mentions this layer a lot because nobody ever knew what it meant – not really, not even the authors of OSI.

ChatGPT describes this layer as:

The Session Layer is the fifth layer in this model and is responsible for managing and maintaining connections between applications running on different computers.

The Session Layer performs three main functions:

  • Establishment, Maintenance, and Termination of Sessions[29]: The Session Layer is responsible for establishing, maintaining, and terminating sessions between applications on different computers. This can include setting up a connection[30], managing data transfer, and ending the session when the transfer is complete.
  • Synchronization: The Session Layer can also synchronize data transmission between applications, ensuring that data is sent in the correct order and at the appropriate pace.
  • Recovery and Error Handling: The Session Layer can also recover from errors that occur during data transmission and provide error handling mechanisms to ensure the integrity of the data being transmitted.

Overall, the Session Layer is responsible for managing and maintaining the connection between applications running on different computers, ensuring that data is transmitted efficiently and reliably.

See also Chapter 7 for misconceptions about this layer, and Chapter 8 for the original OSI definition of this layer.

3.7.4. Transport Layer #4

Basics: You are expected to know that TCP and UDP are transport protocols. You are expected to define the layer in terms of TCP, noting that it creates connections, segments data, and retransmits lost packets. You are supposed to identify the Sockets API as the way most people write code to interact with the network – it’s a transport layer API.

The most important principle you should really know is that Transport is end-to-end. This is the most important point when trying to decipher what the original model intended, but for somewhat different reasons, it’s the most important point when describing how the modern TCP/IP Internet works.

By end-to-end we mean that the ends of communication are processes running on the computers at each end. In other words, the ends of communication aren’t the computers themselves, but software running on those computers. A process on one end establishes a connection to a process on the other end.
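Here is a minimal sketch of that end-to-end idea using the Sockets API. For brevity, both “ends” are processes in one script; normally they would run on different machines.

    import socket

    # Server process: an end is an (address, port) claimed by a process,
    # not the machine itself.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 8080))
    server.listen(1)

    # Client process: establishes a connection to that other end.
    client = socket.create_connection(("127.0.0.1", 8080))
    conn, peer = server.accept()

    client.sendall(b"hello")
    print(conn.recv(5))  # b'hello' -- delivered process-to-process

Routers in between forward packets but are not parties to the connection; only the two processes are.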

ChatGPT describes this layer as:

The Transport Layer is the fourth layer in this model and is responsible for providing end-to-end communication services between applications running on different computers.

The Transport Layer performs two main functions:

  • Segmentation and Reassembly: The Transport Layer breaks down data from the Session Layer into smaller segments for transmission over the network, and reassembles the segments at the receiving end to recreate the original data.
  • End-to-End Connection: The Transport Layer provides end-to-end communication services between applications running on different computers, ensuring that data is transmitted reliably and efficiently. This includes providing flow control mechanisms to manage the rate at which data is transmitted, and error control mechanisms to ensure the integrity of the data being transmitted.
  • The Transport Layer also provides two main protocols: TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). TCP is a connection-oriented protocol that provides reliable, ordered, and error-checked delivery of data, while UDP is a connectionless protocol that provides fast, unreliable, and unordered delivery of data.

Overall, the Transport Layer provides end-to-end communication services between applications running on different computers, ensuring that data is transmitted reliably and efficiently.

See also Chapter 7 for misconceptions about this layer, and Chapter 8 for the original OSI definition of this layer.

3.7.5. Network Layer #3

Basics: You are supposed to identify IPv4 and IPv6 as the protocols for this layer. You are supposed to know these are packets that are routed through the network. This is the layer that defines the “Internet”; it is the most essential layer of the entire model.

ChatGPT describes this layer as:

The Network Layer is the third layer in this model and is responsible for providing logical addressing and routing services to enable data to be transmitted between networks.

The Network Layer performs two main functions:

  • Logical Addressing: The Network Layer assigns logical addresses, also known as IP (Internet Protocol) addresses, to devices on a network to uniquely identify them. IP addresses are hierarchical and consist of a network ID and a host ID.
  • Routing: The Network Layer is responsible for routing data from the source device to the destination device using the most efficient path. This involves determining the best route for data transmission based on factors such as network congestion, bandwidth, and reliability.
  • The Network Layer also provides two main protocols: IP (Internet Protocol) and ICMP (Internet Control Message Protocol). IP is responsible for logical addressing and routing, while ICMP is used for error reporting and diagnostic purposes.

Overall, the Network Layer provides logical addressing and routing services to enable data to be transmitted between networks, making it a critical layer in the OSI model.

See also Chapter 7 for misconceptions about this layer, and Chapter 8 for the original OSI definition of this layer.

3.7.6. Data Link Layer #2

Basics: This defines a local connection, a link between two neighboring devices within the network, or a local network[31], like Ethernet.

We call packets “frames” at this layer, for two reasons. One reason is that it encapsulates the contents of the upper layer, like a picture frame around a picture. Another reason is that it constructs the packets out of bits from the lower layer, like the frame of a bicycle or the frame of a house.

In the real world, you are supposed to associate this with the MAC layer of Ethernet. This is where Ethernet MAC addresses come from.
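A simplified sketch of that framing (Ethernet II layout; the real frame also gets a preamble and a trailing 4-byte FCS checksum, typically added by hardware):

    import struct

    def ethernet_frame(dst_mac, src_mac, ethertype, payload):
        # 14-byte header: destination MAC, source MAC, 2-byte EtherType
        # (0x0800 = IPv4), followed by the encapsulated upper-layer packet.
        return struct.pack("!6s6sH", dst_mac, src_mac, ethertype) + payload

    frame = ethernet_frame(b"\xff" * 6,                  # broadcast destination
                           b"\x02\x00\x00\x00\x00\x01",  # made-up source MAC
                           0x0800,
                           b"...an IP packet...")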

ChatGPT describes this layer as:

The Data Link Layer is the second layer in this model and is responsible for providing reliable and efficient communication between nodes on the same network segment.

The Data Link Layer performs two main functions:

  • Framing: The Data Link Layer encapsulates data from the Network Layer into frames for transmission over the physical medium. This involves adding a header and a trailer to the data to form a frame, which includes information such as source and destination MAC (Media Access Control) addresses.
  • Media Access Control: The Data Link Layer is responsible for controlling access to the physical medium to prevent collisions and ensure that only one node is transmitting at a time. This is done using a protocol such as CSMA/CD (Carrier Sense Multiple Access with Collision Detection) for wired networks or CSMA/CA (Carrier Sense Multiple Access with Collision Avoidance) for wireless networks.

The Data Link Layer also provides two main sublayers: the Logical Link Control (LLC) sublayer and the Media Access Control (MAC) sublayer. The LLC sublayer is responsible for identifying different Network Layer protocols and providing flow control and error handling mechanisms. The MAC sublayer is responsible for addressing and accessing the physical medium, as well as error detection and correction.

Overall, the Data Link Layer provides reliable and efficient communication between nodes on the same network segment, making it a critical layer in the OSI model.

See also Chapter 7 for misconceptions about this layer, and Chapter 8 for the original OSI definition of this layer.

3.7.7. Physical Layer #1

Basics: This is where a stream of bits is transmitted onto the physical wire[32].

You are supposed to associate this with the Ethernet PHY layer.

ChatGPT describes this layer as:

The Physical Layer is the first layer in this model and is responsible for transmitting raw data over the physical medium.

The Physical Layer performs three main functions:

  • Physical Characteristics of the Transmission Medium: The Physical Layer defines the physical characteristics of the transmission medium, such as the voltage levels, the physical connectors, and the cable types used to connect devices.
  • Data Encoding and Signaling: The Physical Layer is responsible for encoding data into signals that can be transmitted over the physical medium. This involves converting digital data into analog signals for transmission over analog media, or using digital signaling techniques such as Manchester encoding for digital media.
  • Transmission and Reception of Data: The Physical Layer is responsible for transmitting data from the sender to the receiver and for receiving data from the receiver. This involves ensuring that the data is transmitted at the appropriate speed and in the correct sequence.

The Physical Layer also provides various specifications and standards for different types of transmission media, including wired media such as twisted pair cables, coaxial cables, and fiber optic cables, as well as wireless media such as radio waves and infrared.

Overall, the Physical Layer is responsible for transmitting raw data over the physical medium, making it a critical layer in the OSI model.

See also Chapter 7 for misconceptions about this layer, and Chapter 8 for the original OSI definition of this layer.

3.8. Non-consensus views of OSI

Since there is no written standard for the modern, retconned OSI[33], everyone is free to make up their own. This document debunks the consensus view of OSI as used today, as well as the original view defined in the 1970s. But there are also a lot of individual, non-consensus views that appear on the Internet and need to be debunked as well.

A good example is a document[34] from CloudFlare, one of the most respected Internet infrastructure companies. Judged in terms of how well it matches the consensus descriptions (shown above), it’s nonsense. Its descriptions of layers #1 and #3 are plausibly correct, but the rest is silly. You can see this in its diagram below:

Figure: Version of the OSI Model from https://www.cloudflare.com/learning/ddos/glossary/open-systems-interconnection-model-osi/ 

For example, Data Link Layer #2 is defined in this chart as “defines the format of data on the network”. It’s true from one point of view. We talk about the “format” of Ethernet packets. But by that definition, every layer “formats” data, as every layer has a format. It’s untrue to say that Layer #2 is specifically responsible for the “format” function.

When you read deeper into the CloudFlare document, you get statements such as the following: “If the two devices communicating are on the same network, then the network layer is unnecessary”. It’s technically true in one sense, in that two local devices can communicate directly over Ethernet without the Internet TCP/IP protocols. But they usually don’t – almost all communication in office networks goes across TCP/IP, with no knowledge on either end that the other side of the communication is local rather than around the world. Whether the company printer is on the same local Ethernet or located elsewhere, you still need TCP/IP to talk to it.

In short, this is a document that changes the consensus definition, describing things in a way that confuses people about the consensus.

3.9. Modern networks

You are expected to know that the first 4 layers of the OSI Model refer to Ethernet MAC/PHY and Internet TCP/IP.

You are expected to know that the Internet consists primarily of the TCP/IP protocols. You are expected to know that UDP is a simpler transport protocol than TCP. You are supposed to recognize other common Internet protocols, like DNS and HTTP.

You are expected to have experience with Ethernet+WiFi as the local network in the home and office, with the MAC and PHY sublayers. You are expected to know that while this is overwhelmingly how we build local networks on the edges of the network, it doesn’t extend past the router into the public Internet.

You are expected to know that there are a lot of other local networks and links, such as USB for computer peripherals, HDMI for video, or CANbus within automobiles.

4. (simple) Alternative network stack

The simplest alternative to teaching OSI is nothing – simply remove any mention of OSI without replacing it with anything. OSI isn’t theory, it’s not a framework, and it’s not modern practice, so there’s really no point in keeping it for any of these purposes. There’s no concept that isn’t better taught by not referencing OSI[35].

But still, some demand a theoretical framework. This text therefore presents two alternatives: this chapter shows a simple alternative, and the next chapter shows a much more complicated, theoretical alternative.

4.1. Network-on-network

OSI teaches that everything is part of a single integrated network stack. This was true for IBM mainframes and telecom X.25, but it is not true for modern cyberspace. Instead of a single network, we have multiple independent networks – most often the Internet and local networks. They are not integrated together but independent of each other.

In your home/office, you are typically running two separate networks: a local network that spans the home/office, and part of the Internet that spans the world.

The modern Internet is an “internetwork” – it interconnects local networks. It does this by being built on top of local networks.

This might be shown with the following diagram. These aren’t two layers of the same network, but different networks that are layered on each other. I draw the boxes staggered because they aren’t integrated. Local networks carry traffic other than just Internet traffic, and the Internet allows routers to be connected using things that look nothing like local networks, such as carrier pigeons.

Instead of showing these together (which implies they are designed to work together), maybe it’s better to show them separately, as in the following diagram.

The Internet is a complete network. It can carry any sort of payload, either more complex network services (like the web) or simple data. It doesn’t run over wires directly, but uses some sort of underlying network technology to connect routers. Packets go across the Internet from router-to-router. In between each pair of routers will be some local network or local links. On the edges, Ethernet+WiFi is popular. In the center of the network, across the backbones, other technologies are popular, like MPLS. This may be point-to-point wires, satellite transmissions, undersea cables, or even carrier pigeons. The Internet just doesn’t care.

Ethernet+WiFi is a complete network (wired and wireless technologies are integrated together to form a single local network). From their point of view, everything is just payload. Sure, this payload often consists of remote Internet traffic, but they can carry other local payloads at the same time, too. Historically, Ethernet carried alternatives to the Internet, such as AppleTalk, IBM SNA, Novell IPX, and others – all simultaneously. When IPv6 came along to replace[36] IPv4, Ethernet and WiFi equipment remained unchanged[37]. When some new internetworking technology appears 20 years from now to replace the Internet, your existing home/office infrastructure will not need to change.


Figure - tp-link switch

Ethernet+WiFi includes physical transmission, either the wire connecting you to the switch, or radio waves connecting you to an access-point. This physical transmission is independent from the forwarding of Ethernet packets. Your Ethernet cable connects you to the local switch, which forwards your packet out to a different port. After that point, your Ethernet packets are sent over physical links that often look nothing like your own Ethernet cable (maybe fiber optics, maybe different speed).

The two-layer diagram above staggers the two networks to reinforce the point that they are not exclusive. The Internet usually runs over recognizable local networks, but can run over anything[38]. Likewise, local networks like Ethernet usually carry Internet traffic, but can carry other things as well. The two layers don’t line up neatly; they aren’t parts of a unified whole.

With the telecom and mainframe networks of the mid-1970s, there were no “Local Networks”. Instead, there were simple “links” – Ethernet was not yet in widespread use. OSI was correct for its time, placing such links in a “Data Link Layer” beneath the “Network Layer”. But the Internet made that concept obsolete – what connects neighboring devices doesn’t have to be a “link”.

4.2. Protocols or sublayers

The two abstract networks described above, the Internet and local networks, can themselves be split into protocols or sublayers.

The Internet uses the term protocol for these subdivisions, such as how TCP runs on IP, where the “P” in both acronyms stands for “protocol”. There are many other “protocols” like UDP, ICMP, DHCP, and so on. The term DNS has an “S” meaning “system”, but is implemented as yet another “protocol”.

The Internet doesn’t really have layers (or sublayers), just protocols. We layer protocols on top of each other in an ad hoc manner. Sometimes there are a lot of extra layered protocols, such as when we use VPNs, and sometimes there are fewer.

The one place where the idea of layers exists is at the bottom of the network stack. We often call the components of Ethernet sublayers instead of protocols. The modern Ethernet PHY sublayer is as complex as the rest of the network stack combined even before signals[39] are transmitted onto the wire.

Whereas networks are layered on each other independently, the protocols/sublayers within a network will be dependent on each other. TCP depends on IP[40]. The MAC and PHY (and LLC[41]) of Ethernet are partly dependent upon each other.

We might show this with a diagram like the one below. The point is that the sublayers/protocols of one network are independent of the sublayers/protocols of the other network.

The problem with the OSI Model, the one that needs to be expurgated from texts, is combining the above two concepts into a common model. The protocols/sublayers of the Internet should not be combined with the protocols/sublayers of the local network. They are useful concepts when used apart, but confusing when combined together. The following sort of diagram is very bad.

OSI layer       Protocol    Grouping

#4 Transport    TCP         Internetwork
#3 Network      IP          Internetwork
#2 Data Link    MAC/LLC     Local Networks
#1 Physical     PHY         Local Networks

This is one of the biggest misconceptions people have today. Academics and professionals alike keep making the mistake of treating Ethernet and Internet as part of the same network rather than as independent networks. This is where they most often cite the OSI Model as helpful in their understanding of networks – but it’s a misunderstanding.

In summary, the alternative to the OSI Model is one of the following two:

  • Nothing – remove any mention of it[42].
  • Don’t combine the protocols/details of the TCP/IP Internet with the details/sublayers of local links like Ethernet – these are independent of each other, not integrated together.

Note that within a network, the individual protocols/sublayers are still accurate. There are numerous international standards for local links in industrial environments, cell phone networks, the power grid, and so on that use the “lower 2 layers of OSI”. This all still works – the LLC, MAC, PHY sublayers exist in these standards. The only thing that needs to be discarded is the pretense that these integrate with any more layers.

4.3. Upper OSI layer alternative

The subsection above describes a simple alternative to the lower 4 layers of the OSI Model. This subsection describes an alternative for the upper 3 layers.

There is none. People try to convert these three layers into three categories of a framework, but that’s a lie.

The upper 3 layers of the OSI Model simply don’t exist. It’s not simply that the layers don’t exist but that the corresponding functionality or theory doesn’t exist either. There are no sessions of the kind people try to associate with the Session Layer #5[43], and there is no network encoding/representation as described by the Presentation Layer #6[44]. Applications exist, but nothing like what was originally envisioned by the Application Layer #7 – things running on a mainframe.

These layers were intended to describe the terminals connected to mainframes in the 1970s. For example, terminals had to negotiate which command codes were used to draw things on their screens. Terminals from different vendors had wildly different command codes, and different products from the same vendor had different capabilities. Therefore, such control codes needed to be negotiated.

Such terminals no longer exist, not even in the modern remote terminal protocols like Telnet and SSH[45]. Hence there’s no alternative model for these things.

4.4. A Packet Network

Both the Internet and Ethernet are packet networks. There are other such networks, like MPLS and SDN. A theoretical model needs to define what a “network” is.

4.4.1. A network

A network is a mesh with nodes connected by links, such as the following diagram. It connects many things to many other things – a many-to-many network.

Figure: “On Distributed Communications Networks”, Paul Baran, 1962

The reason it’s called a network is to contrast it with a link[46] between only two devices. As soon as we introduce a third device – as soon as traffic forks in one direction or another, as soon as it passes through one device on its way to another – we have a network.

The original Ethernet cable was a network because many devices connected to the same Ethernet cable. The modern Ethernet cable is different: it just connects you to the switch, which then forwards packets. Your Ethernet wire is a link; the switch creates a network – a local network.

4.4.2. A packet

A packet is a sequence of bytes containing both a header with control information (like addresses) and data containing the payload. In other words, modern networks don’t carry streams[47] of bytes with no clear beginning or end, but packages of limited size (usually less than 2 kilobytes).

Conceptually a packet looks like a telegram message:

Header

From: 172.16.101.65
To:  142.250.217.238

Payload

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

You can eavesdrop on your own packets entering/leaving your computer with a tool like Wireshark. What a packet actually looks like is the following picture (captured from a local network):

The picture above shows a standard hexdump. You aren’t expected to read hex, but what you should see here is that there’s a header at the beginning of the packet followed by the payload. The highlighted 4 bytes are the destination address.
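
To make the header/payload split concrete, the following is a minimal sketch in Python that pulls a few fields out of a raw IPv4 packet at their fixed byte offsets. The packet bytes are whatever a capture tool like Wireshark or a raw socket hands you.

  import socket

  def parse_ipv4_header(packet: bytes):
      """Extract a few fields from the first 20 bytes of an IPv4 packet."""
      version = packet[0] >> 4                 # high nibble: 4 for IPv4
      header_len = (packet[0] & 0x0F) * 4      # low nibble: header length in 32-bit words
      ttl = packet[8]                          # hop limit / time-to-live
      protocol = packet[9]                     # 6 = TCP, 17 = UDP, 1 = ICMP
      src = socket.inet_ntoa(packet[12:16])    # source address
      dst = socket.inet_ntoa(packet[16:20])    # destination address (the highlighted bytes)
      return version, ttl, protocol, src, dst, packet[header_len:]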

4.4.3. An address

The most important part of a header is the destination address. This specifies where the packet is going. It’s a binary number that we humans typically ignore – you are expected to recognize that a thing is an address more than you are expected to read one. Some typical examples of an address are:

142.250.217.238
2607:f8b0:4002:817::200e
02-60-8c-de-ad-bf

Note that these are the human-readable forms of the addresses. They are just a 32-bit number, a 128-bit number, or a 48-bit number, respectively.
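
As a sketch of that equivalence, the following Python snippet converts each human-readable form back into the underlying number (the MAC conversion is hand-rolled for illustration):

  import socket
  import struct

  # IPv4: dotted-quad text -> 32-bit number
  ip4 = struct.unpack('!I', socket.inet_aton('142.250.217.238'))[0]
  print(hex(ip4))                          # 0x8efad9ee

  # IPv6: colon-hex text -> 128-bit number
  raw = socket.inet_pton(socket.AF_INET6, '2607:f8b0:4002:817::200e')
  ip6 = int.from_bytes(raw, 'big')

  # MAC: hyphenated hex -> 48-bit number
  mac = int('02-60-8c-de-ad-bf'.replace('-', ''), 16)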

4.4.4. A route or path

A packet is sent by passing it to the nearest router or switch. It then gets passed from hop to hop through the network until it reaches its destination. This looks something like the following:

The path[48] is not fixed. Different packets may follow different routes through the network. Network conditions change, such as links failing, in which case the next packet between these two points will find a different path through the network.

The tricky part is discovering the route/path. There is no centralized map of the Internet. Instead, each router figures out for itself the parts that are important to it. Routers frequently disagree. Changes are constantly happening, and it takes a while for knowledge of changes to propagate to all parts of the network.

The following uses the traceroute program to trace the route between my laptop and Google’s DNS server. It traces the route 3 times. Because of load balancing, packets can follow different routes through the network.
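
Traceroute itself is a simple trick: send packets with ever-larger hop limits, and record which router sends back the “time exceeded” complaint at each step. A minimal sketch, assuming a Unix-like system (the raw ICMP socket requires root, and it naively assumes the first ICMP message seen is the reply to our probe):

  import socket

  def trace(dest: str, max_hops: int = 30, port: int = 33434):
      dest_addr = socket.gethostbyname(dest)
      for ttl in range(1, max_hops + 1):
          recv = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
          send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
          send.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)  # expire after 'ttl' hops
          recv.settimeout(2.0)
          send.sendto(b'', (dest_addr, port))
          try:
              _, addr = recv.recvfrom(512)   # the router that dropped it answers via ICMP
              print(ttl, addr[0])
              if addr[0] == dest_addr:
                  break
          except socket.timeout:
              print(ttl, '*')
          finally:
              send.close()
              recv.close()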

When transmitting an Internet packet across a local network, we’ll see two headers: one for the local network (like Ethernet) and one for the Internet. Logically, it will look like the following, with one packet nested in another. In practice, the end result is just a bunch of bytes. While conceptually headers encapsulate inner data, in practice they simply prefix the rest of the packet.

Header

From: 3c-22-fb-55-7b-6a
To: 02-60-8c-de-ad-bf

Payload

Internet Header

From: 172.16.101.65
To:  142.250.217.238

Payload

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

When the local packet reaches the Internet router, all the local information will be stripped off – de-encapsulated. Only the encapsulated packet travels all the way through the Internet.
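
Because headers just prefix the bytes that follow, encapsulation is literally concatenation, and de-encapsulation is slicing the prefix back off. A sketch using the (made-up) addresses from the diagram above, with the checksum left unset for brevity:

  import struct

  payload = b'Lorem ipsum dolor sit amet...'

  # Inner packet: a simplified IPv4 header prefixed to the payload.
  ip_header = struct.pack('!BBHHHBBH4s4s',
      0x45, 0, 20 + len(payload),         # version/IHL, TOS, total length
      0, 0,                               # identification, flags/fragment offset
      64, 17, 0,                          # TTL, protocol (UDP), checksum (left zero)
      bytes([172, 16, 101, 65]),          # from 172.16.101.65
      bytes([142, 250, 217, 238]))        # to   142.250.217.238
  ip_packet = ip_header + payload

  # Outer packet: an Ethernet header prefixed to the whole Internet packet.
  eth_header = (bytes.fromhex('02608cdeadbf')    # to   02-60-8c-de-ad-bf
              + bytes.fromhex('3c22fb557b6a')    # from 3c-22-fb-55-7b-6a
              + struct.pack('!H', 0x0800))       # EtherType: IPv4
  frame = eth_header + ip_packet

  # De-encapsulation at the router: strip the 14-byte local header.
  assert frame[14:] == ip_packet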

4.5. Differences between Ethernet and Internet

An important principle of this document is that the Internet and Ethernet are different networks, rather than parts of the same network. The section above describes them together, how they implement the same theoretical principles. In this section, we describe how they are different.

4.5.1. What runs underneath

The biggest difference is what runs underneath. The Internet is defined to run over other networks. Ethernet is defined to run over physical wires.

This is shown in the diagram below.

As everyone knows, there’s an “Ethernet cable” that connects desktops/laptops to a switch. There is no “Internet cable”.

We split Ethernet into two sublayers: MAC and PHY. The MAC layer specifies the frame/packet that is forwarded between switches. The PHY specifies the physical cable that connects us to a switch. That cable only reaches as far as the local switch.

After the switch, there may be very different PHYs, such as a fiber-optic cable running at 10 gbps instead of a 1 gbps copper wire. More importantly, there may be a vastly different technology. DOCSIS cable-modems (those that don’t integrate their own Internet router) bridge the local Ethernet to the cable plant, carrying Ethernet MAC frames encapsulated inside a vastly different technology.

Likewise, while Internet traffic is intended to run on top of local networks, it can run raw over point-to-point links, such as with SLIP[49].

Thus, while Ethernet and Internet are defined differently in theory, in practice, they have roughly the same concerns.

4.5.2. Forwarding by address

An Ethernet network contains only a few thousand devices. Every node in the network (switches and bridges) knows about all devices. Each keeps a table storing the entire 48-bit address of every device, and uses the entire address when making forwarding decisions.

The Internet contains billions of devices. It’s not feasible for every node (the routers) to know the location of every device. Instead, a router only uses the first ~20 bits[50] of an address to forward packets – the subnet prefix. This limits routing tables to fewer than 1 million entries. Only when a packet reaches the target subnet does the last router use all the bits in the address.

Thus, the Internet forwards packets using only part of the IP address, while Ethernet forwards packets using the entire MAC address.

This has a number of implications. For example, moving an Ethernet device between ports on Ethernet switches doesn’t change the address. Conversely, when moving an Internet device to a different location, such as connecting to a random Starbucks WiFi network, an address change is needed. The subnet prefix defines a location, so changing locations needs a different subnet prefix.

That’s why Ethernet MAC addresses are hardware addresses. They can be assigned once when the hardware is manufactured. In contrast, Internet addresses are assigned dynamically every time the device is connected.
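
A sketch of the two lookups, with illustrative tables (not real ones): Ethernet does an exact match on the whole 48-bit address, while an Internet router does a longest-prefix match on the leading bits.

  import ipaddress

  # Ethernet: exact match on the entire address.
  mac_table = {'02-60-8c-de-ad-bf': 'port 3'}

  def forward_ethernet(dst_mac):
      return mac_table.get(dst_mac, 'flood')       # unknown destination: flood all ports

  # Internet: longest-prefix match on the leading bits only.
  route_table = [
      (ipaddress.ip_network('142.250.0.0/15'), 'interface A'),
      (ipaddress.ip_network('0.0.0.0/0'), 'default gateway'),
  ]

  def forward_ip(dst_ip):
      matches = [(net, port) for net, port in route_table
                 if ipaddress.ip_address(dst_ip) in net]
      return max(matches, key=lambda m: m[0].prefixlen)[1]   # longest prefix wins

  print(forward_ip('142.250.217.238'))             # interface A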

4.5.3. Hop limit

The Ethernet header and the IPv4/IPv6 headers are mostly the same. The major difference is a hop count or time-to-live field that prevents routing loops. Ethernet has no such field: if Ethernet switches are arranged in a loop, packets can be forwarded round and round forever. Thus, loops can be tolerated on the Internet, but never on Ethernet.

The Ethernet MAC packet format is shown below.

The IPv6 packet format is shown below. The traffic class and flow label aren’t really used, so the only real difference is the hop count.

The IPv4 header is more complex, but for the most part, people realized this added complexity wasn’t needed, so those features were removed for IPv6.
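
The guard itself is trivial. A sketch of what every Internet router does per packet – and what an Ethernet switch does not do:

  class TimeToLiveExceeded(Exception):
      """Raised when a packet's hop count runs out."""

  def forward(packet: dict, send):
      packet['ttl'] -= 1                  # every router decrements the hop count
      if packet['ttl'] <= 0:
          raise TimeToLiveExceeded()      # drop it; a real router also sends ICMP Time Exceeded
      send(packet)                        # otherwise hand it to the next hop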

4.5.4. Routing protocols

Ethernet switches have a basic algorithm for forwarding packets along a path. A switch looks at the source MAC address of packets arriving on a port. When it later sees that address as the destination of a packet, it knows which port to use when forwarding.

If it sees a destination without having first seen that address as a source – and hence doesn’t know which port to use – then it sends the packet to all ports (“floods”). Even for large Ethernet networks with thousands of devices, such flooding events are rare, as communication is almost always two-way.

The process repeats at every switch. Each switch discovers the direction of a destination without ever knowing the entire map of the network.
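
A sketch of that learning algorithm, simplified (no table aging, no spanning tree):

  mac_to_port = {}    # the switch's forwarding table, learned on the fly

  def handle_frame(src_mac, dst_mac, in_port, all_ports):
      mac_to_port[src_mac] = in_port                 # learn: the source lives on this port
      if dst_mac in mac_to_port:
          return [mac_to_port[dst_mac]]              # forward out the one known port
      return [p for p in all_ports if p != in_port]  # unknown destination: flood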

This would not be practical on the Internet, with its billions of devices. Instead, Internet addresses have prefixes, and routers use routing protocols (like BGP or OSPF) to tell each other where those prefixes are located.

These routing protocols use clever tricks to communicate as much information as they can with the least amount of overhead.

Whereas Ethernet switches must never have a loop or redundant paths, the feature-rich Internet routing protocols allow this to happen. When errors happen in routing protocols, a routing loop can form – which is why Internet protocols guard against it with the hop-count (aka time-to-live) field.

4.5.5. Transport (simple)

This textbook re-defines the word transport to mean the protocol at the top of a network. The word is used so heavily in networking that it can’t be avoided; the major change is that this book defines transport as the top of any network.

At the top of the Internet protocols are TCP and UDP. This book calls them transport protocols rather than Transport Layer protocols.

At the top of Ethernet we find the optional LLC protocol. It provides much the same features as TCP, but only across the local link. Since TCP does a better job end-to-end, we don’t see LLC in most[51] local traffic.

Since transport is at the top of the network, this is where you find the interface or APIs.

Transport protocols have some extra identifier, added to the network address, used to identify which of many processes/apps on the machine the traffic is associated with. For example, HTTP traffic is usually found on TCP port 80.

Transport protocols can either send individual packets alone, or set up a connection that tracks packets, resending them as needed.
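
The Sockets API sits exactly at this transport boundary: you name the destination as (address, port) and nothing below. A minimal sketch of a TCP client (the hostname is just an example):

  import socket

  # Transport address = (network address, port). Port 80 selects the web server process.
  with socket.create_connection(('example.com', 80), timeout=5) as s:
      s.sendall(b'GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n')
      reply = s.recv(4096)    # TCP handles sequencing and retransmission underneath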

The complex alternative chapter has a more sophisticated discussion of bulk transport.

4.5.6. Naming

Humans don’t like numeric addresses. We prefer human readable names. Therefore, networks provide some sort of translation between names and addresses.

On Ethernet, this is done with broadcasts. Local machines transmit packets to a special address called a broadcast or multicast address. These packets are seen by every machine on the local network – they flood the network. This works well because local networks have only a few thousand machines, and broadcasts aren’t that frequent. When you look for computers nearby (such as file servers) on macOS or Windows, what you are seeing is the local computer keeping track of the broadcasts by those servers.

On the Internet, broadcasts aren’t possible; otherwise your incoming network connection would be flooded with packets coming from 20 billion computers. Instead, there are separate servers that keep track of name-to-address mappings. This is called DNS or the Domain Name System. Each organization runs its own DNS server, so that Apple is in charge of every name ending in .apple.com (for example). Your local network provides a resolver that will translate names into numeric addresses. Thus, when you go to www.google.com, your computer will first ask the resolver for the numeric address, then continue communication with that numeric address.
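
In code, that resolver step is explicit: the name is translated to numeric addresses first, and only then does communication continue. A sketch (the printed addresses will vary):

  import socket

  # Ask the local resolver to translate a name into numeric addresses.
  for family, _, _, _, sockaddr in socket.getaddrinfo('www.google.com', 443):
      print(family.name, sockaddr[0])    # e.g. AF_INET 142.250.217.228
                                         #      AF_INET6 2607:f8b0:4002:817::2004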

Note that while naming features are implemented as protocols, they don’t belong to any particular sublayer. Protocols are layered on top of each other, but they really don’t exist as layers.

People find it confusing to place DNS into an OSI layer. The reality is that it doesn’t belong to any OSI layer. It’s usually placed in the Application Layer #7, but it’s best described as belonging to no layer at all, as in section 6.8.

4.5.7. Configuration

Ethernet is designed such that hardware is plug-and-play: no special configuration is needed. You simply plug a device into the switch and immediately start communicating.

In contrast, the Internet needs configuration. Before it starts communicating, it needs an IP address to be assigned, needs to find its local router, and needs to find a DNS resolver.

Ethernet only needs addresses to be locally unique. But strangely, it does this by assigning every device a globally unique MAC address.

The Internet needs a globally unique address on the public Internet. But strangely, it does this most of the time using a locally unique address (like 192.168.1.2) that’s then translated by the nearest router (a NAT or network address translator).

Ethernet resolves its configuration by talking over the local wire to the switch or by broadcasts on the local network.

Internet IPv4 and IPv6 usually use DHCP to handle their configuration issues. They may also use local broadcast packets.

4.6. RFC 791

This simplified model matches the Internet Model specified in RFC 791[52]. Specifically, the Internet is an internetwork running on local networks. The details of local networks are irrelevant. The local network and internetwork are independent, their details not part of the same model.

This model is every bit as rigorous as the OSI Model. Many texts claim this Internet model is somehow deficient, simplified, or incomplete because it leaves local networks undefined. That’s not true; it’s a better model precisely because it leaves them undefined.

OSI defined a single network that was mostly local. RFC 791 defined an internetwork. These are two different things.

One important concept here is protocols instead of layers. There’s some understanding here that TCP and UDP are alternatives for each other. There’s some understanding that ICMP is an integral part of IP, rather than layered on top. But while protocol layering happens, abstract layers don’t. There’s an Internet Protocol, not an Internet Layer.

The Local Network Protocol is a “network” not just a “link” as in OSI’s Data Link Layer #2. OSI was originally designed anticipating only point-to-point links between devices. Ethernet broke this model, creating a local network instead of a local link. But even without this, the Internet was designed to be an internetwork. OSI was originally designed to be a local private network, with each company having their own network not interconnected to others. The Internet was designed to internetwork those disparate networks, to be a public network.

In other words, the above diagram isn’t something that you compare side-by-side with OSI. Instead, it runs on top, with all of OSI merely the local network.

5. (complex)
Alternative network stack

The above chapter presents the simple model of how the Internet is layered on local networks. This should be what everyone uses to replace the OSI Model.

However, university classes want more theory. In this chapter, I present a more complex theory.

5.1. My Network Stack

The previous chapter presents a simple model; this chapter extends it with the following theoretical concerns:

  • We need to think in terms of what goes on top of the network, the payload.
  • The web, email, and a bunch of other technologies form their own independent network on top of the Internet.
  • We need to think more deeply about physical links as something separate from the network.
  • We need proper abstraction levels.

I present the following as a general description of the modern network stack.

What this model shows is three levels of abstraction.

  • At the highest abstraction, we see the network stack relative to things that are outside the network stack. Above the stack, users send payload using the network stack. Below the stack, we have transmission over physical wires.
  • The network stack itself is composed of layered networks. The diagram shows a typical [web/Internet/Ethernet] stack as an example because it’s extremely common, but there are other combinations.
  • Each network has its own sublayers or protocols. For example, the Internet is often referred to by the two most popular protocols of TCP and IP. Ethernet has LLC, MAC, and PHY sublayers. The web has HTTP and SSL protocols.

The OSI Model has the wrong abstraction level, combining the detailed with the general. Our model has three abstraction levels, from the very general to the very detailed.

At the most general point of view, we model how we use the network. The network carries payloads. A payload is independent from any network. For example, MPEG video streams are carried in local radio broadcasts, satellite, and across the Internet. It’s the same payload regardless of the underlying transport.

In much the same way, the physical transmission of bits is roughly the same regardless of how those bits are interpreted. The technology is constantly evolving, bringing faster speeds at lower prices.

Both of these, the payload on top and the physical transmission underneath, are fundamentally outside the network stack. This is difficult because both also overlap with the network. For example, VoIP is a payload that needs a low-latency network, and physical standards are often in the context of a specific local network. Nonetheless, I think it’s better to model them outside the network stack.

Our middle level of abstraction is the network stack. The typical stack that we see personally is the [web/Internet/Ethernet] that you are likely using right now to read this document online. Each of these – the web, the Internet, and a local network like Ethernet – is a network in its own right. At this level of abstraction, networks are layered rather than having layers.

It’s not the only way of stacking things. Consider VPNs and Tor: they add another network to the stack. Another example is ISP backbones, which have their own internal network (like MPLS) that’s layered on top of things like Ethernet and below the Internet. Major services like Twitter or WhatsApp create their own network on top of the web.

Adjacent layers are opaque, distant layers even more so. Internet apps have no clue about Ethernet, for example. You don’t specify MAC addresses in the Sockets API, just the IP address of the destination. The fact that your local network happens to use Ethernet is some distant and unknowable fact to the application.

The layers [web/Internet/local] don’t have a theoretical purpose. We don’t assign functionality to one layer rather than another layer. The layers aren’t part of a single network, but their own distinct networks. Sometimes there’s overlap in functionality.

The left-hand side is as much theory as we can have. As we move rightwards in the above diagram, things become more practical. The Internet has details like TCP and IP. Ethernet has details like MAC and PHY. These are very practical, specific to their context, and not at all generalizable.

5.2. Sublayers and Protocols

This document doesn’t teach layers. Things are certainly layered on top of each other, but that doesn’t mean that things belong to the same layer. For example, both TCP and ICMP are layered on IP, but they aren’t part of the same layer.

The major things that are layered are the networks themselves. The web runs on top of the Internet, which in turn runs on top of local networks like Ethernet and WiFi. These are independent networks, not part of the same network.

Within a network, there may be dependent parts. Sometimes they are best described as sublayers. Sometimes they are best described as protocols. While they are modularized, they are intended to work together.

  • The web has such things as SSL and HTTP as somewhat independent protocols, though HTTP/3 (QUIC) has integrated them into a single protocol.
  • The Internet has the TCP and IP protocols.
  • Ethernet has the PHY, MAC, and LLC sublayers as we describe below.

In our thinking, OSI is a seven sublayer model. It’s not defining layers so much as sublayers. All seven of the sublayers are designed to work together as part of a single network.

OSI is often used to teach how layers are supposed to be modularized and independent from each other, but then implicitly says they are all dependent and coupled with each other. In this alternative, the networks themselves are independent from each other, but the components of a network are dependent on each other.

It’s sublayers all the way down. Parts often have smaller parts. For example, the PHY sublayer of Ethernet is split into further sub-sublayers, which in turn are split into their own sublayers. High-speed Ethernet looks something like:

The thing to note here is the level of abstraction. The OSI Model has a single level, improperly combining the internal details of one network with the internal details of another. In our model, we have several levels of abstraction, from the generic to the specific. For any level of abstraction, sometimes you step out and describe things with less detail, and sometimes step in with more detail.

5.3. Payload

The payload is everything above the network, outside the network, on top of the network stack. We use the term in several places. In each case, it means the thing that’s largely unrelated to the thing below it.

For example, when looking at just Ethernet, we define it as carrying a payload. This payload could be a local video stream, but it can also be the Internet. The Internet itself carries just payload, but that payload is usually the web. In all these examples, the payload is whatever sits outside the network below it.

The key concept is that the network is bounded by both a top and bottom. The payload represents the top of anything as much as the physical transmission of signals represents the bottom.

A top boundary is what’s missing from the OSI Model. People imagine that we’ll keep adding new layers on top, like layer #8, layer #9, and so on. In reality, as we add new layers to the stack, we stick them in the middle and underneath. A lot of technologies, from SSL to MPLS, involve lifting everything up and shoving something else underneath.

In the diagram below, the left-hand side shows how the OSI Model draws the stack, with no upper boundary. To the right is how we should imagine the stack, with boundaries on both ends.

There are three main things within this payload category:

  • Data
  • Requirements
  • Protocols

The basic point is that a stack has a top. Any model without a top is missing something important.

5.3.1. Data

First and foremost, the payload of a network largely consists of just the data being carried. This is independent of the network. Data is data even when there is no network. For example, a PDF file is payload. We pass this file around via email, file copies, the web, text messages, and so on. But we also pass it around on USB thumbdrives, independent of anything that looks vaguely like a network. We also print them out and give paper copies to people.

Payloads can be more than simple files. For example, industrial control systems need to transfer large amounts of measurements of temperature, pressure, flow rate through pipes, radiation, and so forth. Likewise, they need to transfer commands in the reverse direction, to close valves, flip switches, and so forth. These aren’t files, but they are payload.

Fundamentally, data is something carried by the network, but is not part of the network.

A key theoretical principle of data is encoding or representation, how binary data represents things. One typical example is character sets, such as ASCII, EBCDIC, and UTF-8. Another is byte-order of integers, such as hi-lo (big-endian) or lo-hi (little-endian).

OSI teaches that such data representation is a function of the network. That’s because in the mainframe era, it was. Each type of computer represented data differently internally and a transformation was needed for two computers to talk. It was assumed that the network would be responsible for solving this problem.

In modern cyberspace, this is no longer true. Data is no longer tied to a single computer. Data has a universal form. Sure, sometimes legacy translation issues exist, but it’s not the network’s job to fix these issues.

Data representation is a property of the data, not of the network.
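
A sketch of what those representation choices look like in practice – the same values, different bytes, and none of it the network’s business:

  import struct

  n = 0x1234
  print(struct.pack('>H', n).hex())      # '1234'  big-endian (hi-lo)
  print(struct.pack('<H', n).hex())      # '3412'  little-endian (lo-hi)

  print('café'.encode('utf-8').hex())    # '636166c3a9' - the é takes two bytes in UTF-8
  print('café'.encode('latin-1').hex())  # '636166e9'   - one byte in a different character set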

5.3.2. Requirements

The payload imposes requirements on the network, such as throughput, latency, and reliability.

One of the most important requirements is latency or ping delay. This measures the time for a packet to cross the network plus the time for the response to come back – the round trip. Networks must have low latency for phone calls. When delay becomes too great, both sides start talking over each other. Each side perceives the other side as rudely interrupting them.

Traditional geostationary satellites have a ping time of around 600 milliseconds. That’s because their orbits must be 36,000 km (22,000 miles) high. Even at the speed of light, it takes a long time for a signal to travel that far and back.

SpaceX’s Starlink Internet instead works by putting thousands of satellites in low earth orbit only 500 km (300 miles) high. These reduce the latency down to around 30 milliseconds – which is roughly the same latency experienced by wired users. It’s 4000 km between LA and NYC, for example. In theory, an Internet phone call between LA and NYC can experience less latency going through Starlink because the route is straighter.

The key theoretical concept here is that while latency may be a property of the network, it’s a requirement of the payload.

A similar sort of discussion can be applied to congestion control. This is typically thought of as a purely network issue. However, as we see with services like Netflix, congestion is also part of the payload. As the network becomes more congested, Netflix has to transcode videos at a lower quality and lower bitrate. Netflix’s largest network costs are the transcoding of video to handle congestion.

5.3.3. Payload Protocols

Payload isn’t simply data but can also be protocols. At least, this is often the best way to model what’s going on.

Consider fragments of JSON exchanged via the web. Sometimes what the user intends is that these are files, copying them from one computer to the next. Sometimes the user intends these to be protocols, that the recipient performs some action based upon the contents.

Abstractly, all files are ultimately protocols. At some point, software will read and interpret the data in a file using techniques indistinguishable from network software. Even the simplest text file (with a .txt extension) may be parsed looking for URLs embedded in the text, or dealing with Unicode tricks.

What’s weird about this is that URLs are a protocol even when you open a document created 20 years ago and click on the link expecting it to work today. Similarly, XML and JSON files created 20 years ago are expected to work on the network today.

Sometimes, the distinction between data vs. protocol changes over time. The web started out as a simple protocol that used the network stack, largely outside the network stack. But then it became a complex network in its own right, clearly part of the network stack.

The best way to distinguish something as outside-the-stack or part-of-the-stack is whether it offers an API, or the complexity of that API. Once an API becomes complex enough, it’s probably better to model it as part of the network stack.

The question here isn’t what something is, but the best way to describe it. Sometimes it’s best to describe something as a user of the network stack, as payload. Sometimes it’s best to describe the thing as part of the network stack. The same thing can be described as both at the same time, whichever is helpful.

5.3.4. Interfaces

The word interface refers to where any two things interact. If something uses interfaces from something else, but doesn’t provide interfaces for others, then it’s probably payload.

One of the reasons we know the upper 3 layers of the OSI Model aren’t real is that while there is an interface at the top of the Transport Layer #4, there aren’t any other interfaces in the layers above. We have the Sockets API here, a Transport Layer interface, but no Session Layer interface or Presentation Layer interface.

Such interfaces could be to the operating system (like Sockets) or could be libraries (like OpenSSL).

When you write your own Sockets application, your code is probably payload. When your code starts offering its own interface, it’s probably part of the network stack.

5.4. Physical Transmission

A network stack has an unambiguous bottom where the bits are finally transmitted onto a wire using a transceiver.

In many ways, this is outside the network stack. The term network stack usually refers to the software running inside the operating system (macOS, Linux, Windows, etc.) inside the computers on either end of the network. The transmitted signals are outside the computer. Even OSI diagrams often show the Physical Layer this way, one part being the (largely software) layer, and the other part being the “physical media” outside the stack.

5.4.1. Signaling

The biggest reason for treating physical transmission as outside-the-stack is the fact that signaling is a bigger field than the rest of networking combined. Ethernet (MAC) and TCP/IP and the web all combined are far simpler than the science and physical engineering needed to build a transceiver to send bits down a wire. Networks get faster year by year due to a vast array of scientific discoveries – nothing in the software stack was discovered by scientists but just put there arbitrarily by engineers.

This text doesn’t discuss signaling.

5.4.2. But sometimes inside the stack

There are a lot of standards for local wires, like RS-232, Ethernet, CAN bus, USB, USB 3.0, HDMI, DisplayPort, ThunderBolt, and so on. They all combine physical transmission standards with higher layer protocols. Thus, physical transmission has always been considered inside-the-stack.

But at the same time, all of these evolve over time with different physical transmission standards underneath with the same protocols on top.

A good example is the radical transition of PCI to ThunderBolt. PCI was defined in the 1990s for adding hardware inside the computer. Then, in the 2010s, it was extended outside the computer and merged with USB (USB 4.0).

Even Ethernet is a great example. The Ethernet MAC frame format has become the lingua franca of local connections, used in devices that have no relationship with Ethernet physical transmission standards. A good example is DOCSIS cable modems – the fundamental modem (one not including an Internet router or WiFi) is just a local bridge, with traditional Ethernet signaling on one side and cable signaling on the other.

The point is that many local links start with the physical link being integrated with a network protocol, and then the network protocol continues to be used with radically different signaling technologies.

Thus, sometimes we model a technology like Ethernet or USB or CAN bus with network protocols integrated with low-level signaling, and sometimes we model them separately.

5.5. Local Networks and Links

This text often uses Ethernet as the prototypical example for all local networks and links, but there are a lot of other technologies.

The key points:

  • OSI defined layer #2 as links not networks, hence calling it “Data Link Layer”. TCP/IP, in contrast, defines the equivalent as “Local Networks”, not merely links. The latter is correct – what OSI defines as layer #2 is best thought of as a complete network.
  • This means OSI Data Link Layer #2 incorporates layers #3 and #4. It has a “network” component and a “transport” component. The only difference is that they are done locally across the wire whereas the official layers #3/#4 do it remotely or end-to-end across an internetwork.
  • Ethernet has three sublayers: PHY, MAC, and LLC. It’s a good model for most local networks even though it’s technically just the IEEE[53] Ethernet model.
  • There exists an LLC sublayer in OSI. It isn’t used for the TCP/IP Internet, so you’ve never really heard of it, but it becomes important for understanding history or networks that aren’t TCP/IP Internet.
  • Not all local networks look like Ethernet.

We discuss Ethernet specifically in the simple section above. Here we discuss a bit more of the theory of local networks.

5.5.1. Link vs. network

When OSI says Data Link Layer #2, they really do mean a link. They did not anticipate local area networks like Ethernet.

This comes from the mainframe and telecom networks prior to OSI which only had such links. They were either point-to-point (one device on each end), or in some cases, point-to-multipoint, controlled by a single primary device and many secondaries. A local area network like Ethernet wasn’t conceived of – the first Ethernet products didn’t appear until a couple years after the first OSI draft.

5.5.2. Network vs. Internetwork

When OSI said Network Layer #3, it really meant a local network, not an internetwork. One of the ways OSI was retconned was to change this definition, to pretend that network and internetwork now mean the same thing.

This local network was local to an organization – an organization network. Such networks could be very large, such as the world’s largest banks, car makers, manufacturers, oil extraction, steel refiners, and so on. Such networks could also be the basis for national telecom monopolies.

But OSI did not anticipate an internetwork, where a packet from deep inside one organization could be sent around the world, deep into another organization. Instead, it was envisioned that there would be gateways between organizations.

Email is a good example. If organizations wanted to send email to different organizations, they would pass through gateways. In the 1980s, before the Internet became the standard for cyberspace, there were a lot of such email gateway technologies. A lot of them consisted of modems that would dial-up other gateways to exchange messages (since they were short and didn’t contain the multimedia attachments they now have).

A popular pre-Internet design was BITNET for exchanging mostly email.

Even today, there exist things that are quasi-internetworks. A good example is the SWIFT payments system among the main banks. These interconnect organizations with gateways to exchange payments.

This is partly why in Internet terminology, routers are called gateways, because they’d interconnect different organizations.

5.5.3. LLC and local “transport”

Today, we define Data Link Layer #2 to fit Ethernet’s simple frame format. But the original definition was much more complex, encompassing full transport functionality: establishing connections, sequencing fragments, retransmitting lost packets, flow control, and so on.

The difference between Data Link Layer #2 and Transport Layer #4 isn’t this functionality, but that one does it across the local link and the other does it end-to-end.

The design for layer #2 comes from IBM’s SDLC. Back in the 1970s, with the invention of the 8-bit microcontroller, both sides of a point-to-point link could be smart devices. The previous point-to-point links transmitted streams of characters, with special character codes triggering actions, like “carriage-return” or “start-of-header”. The new SDLC transmitted packets instead, with the sort of binary formats we know today. These protocols supported fragmentation of data, sending data in sequence, acknowledgements, and so on.

This was later standardized as HDLC, and many organizations created slight variants for their own purposes. When telecoms created X.25, they used a variant they called LAPB.

When Ethernet was created, it didn’t have these features. For one thing, it was much more reliable than traditional point-to-point links – it was rare that a packet would be lost. For another thing, Ethernet was developed alongside other Xerox protocols like PUP and XNS, which did end-to-end retransmission of lost packets.

But without a protocol similar to SDLC, Ethernet couldn’t carry IBM and telecom protocols. Therefore, they added something similar to SDLC called LLC or Logical Link Control.

A decade after Xerox created Ethernet, it was standardized by the IEEE, which officially declared that OSI’s Data Link Layer #2 was divided into two sublayers, LLC and MAC.

You rarely see LLC today in packets because it’s optional. You don’t see it when looking at Internet packets, because such transport functionality is done end-to-end instead of locally.

But LLC still exists here and there. Standards bodies are still trying to make it happen. It exists in every WiFi packet, for example. The following is a screenshot of a raw WiFi packet as captured from the air. You see the raw WiFi headers followed by LLC, followed by the IPv4 header.
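
For the curious, the LLC header in those WiFi captures (in its common LLC/SNAP form) is only 8 bytes. A sketch of decoding it:

  import struct

  def parse_llc_snap(data: bytes):
      """Decode the 8-byte LLC/SNAP header found in WiFi data frames."""
      dsap, ssap, control = data[0], data[1], data[2]
      assert dsap == 0xAA and ssap == 0xAA            # SNAP: 'an EtherType follows'
      oui = data[3:6]                                 # organization code, usually 00-00-00
      ethertype = struct.unpack('!H', data[6:8])[0]   # 0x0800 means IPv4 comes next
      return ethertype, data[8:]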

LLC is confusing to students because except for raw WiFi packets, they’ll never see it in the real world. However, in order to appreciate history, it’s very important.

There is a whole lot of local networking in the world, from USB peripherals attached to your computer, to CANbus inside your car, to industrial control systems, where reliable connections across local links are important. So such features are still found in local network links, even if they’re never used to carry Internet traffic.

5.5.4. Beyond Ethernet+WiFi

This document describes Ethernet+WiFi because those are the technologies most people have easy access to in their home and office. But it’s important to note that these exist on the edges of the Internet and are often quite different from how things work inside backbones.

Consider cable Internet. Cable TV was created as a vastly different sort of network. In the beginning, it had to take many analog TV streams, send them to various neighborhood distribution points throughout the city, which in turn forwarded them through a cable shared by hundreds of homes.

Today’s standard still sends downstream data in the 6 MHz (8 MHz in Europe) channels originally designed for analog TV, though now those channels support higher bit-rates and are bonded together to deliver up to 10 gbps to customers.

Curiously, cable TV continues to use the Ethernet frame/packet format, even though the physical transmission looks nothing like Ethernet. This makes it easy to integrate things on both ends, so that a cable modem can be a simple Ethernet bridge rather than a complicated Internet router.

Thus, the cable Internet service looks something like the following stack:

Again, the stair-step is used to indicate that such things aren’t exclusive. The DOCSIS cables also carry raw MPEG video streams, for example.

Another common local network technology is the CAN bus, the network internal to cars. Cars have had dozens of microprocessors for decades, a fact highlighted during the pandemic when car shipments stopped due to the lack of chips. Notably, there may be multiple such buses in a car, rather than a single bus connecting all the components.

A quickly growing standard is LoRa[54] for low-speed, long-distance radio communications. Amazon.com[55] is integrating it into all its Alexa/Echo devices so that they can communicate things (like emergency alerts) even when the Internet is down.

Every cable you have in your home, like USB, HDMI, DisplayPort, and SPDIF, is potentially a local network. Each makes a different set of tradeoffs. For example, HDMI transmits video as a stream, whereas DisplayPort transmits video as a series of packets. HDMI uses additional wires for out-of-band communication (such as remotely turning on/off the TV), whereas DisplayPort sends such information in-band along with the video packets.

There are dozens more local network standards, such as for power plants, factory automation, satellite Internet, or Internet backbone traffic. They all have important differences, and reasons why they don’t look like Ethernet. They often follow the model of a MAC and PHY, but just as often, this is a poor description. Even on very high-speed Ethernet, the PHY sublayer is so complex that it’s best understood as further sublayers.

5.6. Services

Over 90% of traffic across the Internet’s backbones consists of the web. No model of cyberspace would be complete without describing it. We can generalize this a bit, as there are other services that run on top of the Internet.

What we define as a “service” is something that appears as a “network” to whoever uses it. In other words, the email system is layered on top of the Internet, but forms its own network that often[56] uses transport other than the Internet.

The web is a good example. A web address contains more than just a DNS name, and can work offline, such as linking files on a hard drive. There are more examples than just the web, such as RPC, file-systems, remote terminals, and email.

Our model includes a services layer on top of the Internet (which in turn runs over local networks):

But it should be stressed that, unlike the OSI Model, all these layers are optional, opaque, and there’s no reason to believe there’s exactly 3 layers at those positions. They are not part of the same network stack, but largely independent systems that don’t necessarily run in this configuration. Moreover, there’s no specific definition of a service, other than it looks like a network from the payload’s point of view – no two services implement any of the same functionality.

5.6.1. Service vs application

The first question we need to deal with is our conception of service compared to OSI’s application. They both run on top of the Internet. The simple difference is that this concept of service is still that of a network, upon which applications and payloads might run.

Today’s version of Application Layer #7 is a catchall, anything that doesn’t fit in other layers. It really has no specific definition. Is it part of the stack (inside-the-stack) or the part above the stack (outside-the-stack)?

In our alternative to the OSI model, we make the distinction that payload is outside-the-stack, and services are inside-the-stack. A service carries payload.

In terms of history, the mainframe view of an application has little relationship to the modern view. In the old days, the application ran on a centralized computer, with only a dumb terminal in front of the user. Today, the user has a supercomputer in their hand-held phone that does most of the processing of an application. The phone in your pocket has more raw compute power[57] than most of the servers it talks to on the other side of the Internet.

5.6.2. Other services than the web

While the bulk of Internet backbone traffic consists of the web, the bulk of corporate internal networks consists of file system and RPC traffic. Microsoft’s versions of these are as important as web traffic inside corporate, government, and military networks.

A file system connects a remote file server as a local drive. It’s the bulk of all network traffic, though most of it happens between devices located near each other (like on the same Ethernet switch). The two most popular file-systems are Microsoft’s SMB and Unix NFS, though occasionally you still find Apple’s AFP.

From a certain perspective, the file system was the first network. A single computer can run multiple apps at the same time that communicate with each other by reading/writing files. File systems allow apps to lock files, or even just a section of a file, as a form of synchronization. Such applications then work seamlessly when split among many computers, all of which connect to the same remote drive. In the 1980s, the most popular corporate email program was cc:Mail, which worked on this principle – it worked over many file systems[58] of the time.

At one time, people hoped to build network software without worrying about network details. This technique was known as remote procedure call or RPC. Software is built from (local) procedure calls. The idea here is that some could be declared to be remote, so that when software calls a procedure, the request is transferred over a network.

This was never really successful as a generic mechanism. As it turns out, the details are important and you don’t want to hide them. But before people realized this, RPC was integrated into Unix and Windows. A lot of operating-system features use them. Most of Microsoft’s system software is written on their RPC mechanism.

There are many versions of RPC. The earliest successful version was Sun RPC. Then came Microsoft’s version. When Java was invented, its own native RPC mechanism (RMI) was created. There are also object-oriented RPC mechanisms, like CORBA and Microsoft’s DCOM.

Email is its own network. It’s not an application running on the Internet so much as its own network that sometimes uses the Internet to transport email messages and sometimes uses other means. Large parts of email transmission happen in ways that are independent of TCP/IP Internet links.

In the 1980s, you’d sometimes see email addresses like “gandalf!milliways!rob@example.com”. This specified a route of non-Internet computers to move mail between before finally reaching the Internet. This is a technique called source routing, where the ends of the network know the route rather than the forwarders in between.

Today, we have huge email services like Google’s GMail.com or Microsoft Outlook.com. Their architectures pass email around an internal network that doesn’t necessarily use any Internet technologies. Of course, it probably does, but it’s largely opaque to outsiders.

Remote terminals form their own network. They were an early form of the web before there was the web. In corporations, terminals would allow users to switch sessions with much the same ease that people have multiple web browser tabs open. Corporate apps would consist of forms to fill out in the same way as web forms. Indeed, today’s conventions for forms come from those terminal apps, such as using [tab] to move between fields and the [enter]/[return] key to submit a form.

Most important were the teletex/videotex services like France’s Minitel. Users were able to access a wide variety of services, including porn, from dumb terminals and a dialup modem.

The upper three layers of the OSI Model were originally designed around terminals. For example, the Session Layer #5 was intended to deal with the limitations of terminals of the time, such as the fact that the dumbest terminals only had simplex links.

Today’s command-line apps are best modeled as running over this remote terminal layer.

5.6.3. Common features of services

The most important concept of the services layer in this model is that there is almost no commonality, that each needs to be understood in terms of itself rather than as fulfilling the needs of a common layer.

With that said, they do have a few commonalities.

The defining aspect of these is that they hide the underlying network. The service presents its own version of a network, with its own addresses and its own directory services.

A good example of this is how email uses DNS MX records. Email sent to an address like “something@gmail.com” doesn’t go to the machine named “gmail.com”. Instead, it goes to a server that handles email for that domain. In the screenshot below, we see that while “gmail.com” has IP addresses, email isn’t sent to those addresses. Instead, it’s sent to servers with names like “gmail-smtp-in.l.google.com”.

Figure: Email servers for gmail.com different than web servers for gmail.com
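
You can reproduce this lookup yourself. A sketch using the third-party dnspython package (pip install dnspython); the exact server names returned may differ:

  import dns.resolver    # third-party: dnspython

  # The A records: where the *web* servers for gmail.com live.
  for r in dns.resolver.resolve('gmail.com', 'A'):
      print('web  ->', r)

  # The MX records: where *email* for gmail.com is actually delivered.
  for r in dns.resolver.resolve('gmail.com', 'MX'):
      print('mail ->', r.preference, r.exchange)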

Another common aspect is users, with accounts, passwords, and sessions. The underlying Internet (or local network) is essentially anonymous, but these services generally require a username. Even if they allow some anonymous interaction (like much of the web), user identification is still an important aspect of the system.

With users comes security[59]. It’s something we are still trying to figure out. The latest version of SSL (TLS/1.3) was only standardized in 2018, and contains important improvements over the previous model for security. It’s something that probably shouldn’t be modeled – any understanding of the current state-of-the-art of cryptography will change in a few years as hackers break the old model.

An important component of these services is transactions and synchronization. The underlying networks are defined simply to get data from a source to a destination (packets and streams). For services, we care about request/response pairs. The simplest protocols simply match requests with responses, but there are a lot of more complex topics. Web stacks are constantly debating this, such as RESTful APIs vs. things like GraphQL or SOAP.

In truth, an entire textbook could be written on this subject. The success of big companies like Google largely comes from being able to synchronize transactions at scale.

5.6.4. Web

The justification for adding a third layer to our two-layer model (Internet-over-local) comes from the web. It’s a complex subject, for which entire textbooks have been written.

There are two major portions of the web:

  • The HTTP protocol itself (including things like SSL and QUIC)
  • The “stack” above HTTP, like JavaScript and PHP

There are three versions of the web protocol: HTTP/1, HTTP/2 (SPDY), and HTTP/3 (QUIC). For apps/services written on top of them, all these protocols behave the same. Underneath, they look quite different.

The reason for these changes has been transport issues. Web apps make multiple outstanding requests/downloads at the same time. In the early days, this was managed by creating several TCP connections at once. An important principle is to reduce latency, the time it takes to set up a connection before data is transferred. Now that we recognize SSL as an essential feature of the network, SSL setup is integrated with connection setup in QUIC. Doing both together reduces latency.

On top of HTTP is a completely different stack. When we say full stack developer we aren’t referring to people who understand the core HTTP protocol and below. We are instead referring to developers who can do front-end development with toolkits using JavaScript and CSS, as well as backend services with PHP, NodeJS, and SQL.

The fascinating thing is that there are two stacks above HTTP: the stack that happens on the client, and the stack that happens on the server. In the history of networking, both sides have been roughly symmetric. With the web, they are asymmetric. What happens on the client-side is almost an entirely different industry from what happens on the server side.

Consider CDNs aka Content Delivery Networks. Major websites, especially the video streaming companies, maintain servers in most major cities. Most Internet traffic travels less than 200 km[60] (about 125 miles), to the nearest large city. It’s a billion dollar industry that is separate from both the client/server stacks, as well as being independent of the underlying Internet transport. My model imagines there’s a web layer on top of the Internet, but there are entire layers on top of the web (or maybe underneath the web).

5.6.5 BGP

BGP (Border Gateway Protocol) is how Internet backbones and ISPs communicate routing information to each other. It runs on top of TCP. So is it a “service” in my model? Or part of the “Internet”?

A service like email, web, RPC, remote terminal must carry a “payload” and have “APIs” that are above the simple Sockets APIs.

BGP has none of these things. It carries no payload but routing information, and routing information has no use other than for telling routers where to forward packets.

BGP has other peculiarities as well, such as being used between neighboring routers on the Internet rather than being used end-to-end.

BGP could’ve been defined to run directly on IP in much the same way ICMP is defined, with its own scheme for fragmentation and retransmission. If that were the case, then we wouldn’t be questioning whether it was part of the Internet (like ICMP) or on top of the Internet (like web, RPC, etc.).

It’s the same sort of issue with IPv6 neighbor discovery protocol (NDP) and IPv4 ARP.

ARP has long confused people. Is it part of Ethernet? Or is it part of the Internet? Is it some sort of “convergence” sublayer that marries the Internet to Ethernet?

NDP is clearly just part of the IPv6 Internet, and not really related to any underlying local network. Thus, working backwards, we see that ARP, too, is clearly just part of the Internet, because it’s the same function. ARP should’ve been defined like NDP.

5.7. Control plane vs. data plane

This section presents a very different model of the network, unrelated to the picture we display above.

Our model of the network comes from what we see on the ends, the network stack in a Windows, Linux, or macOS device. An alternate model comes from looking at things in between, the routers and switches that forward traffic.

In this model we see a control plane and a data plane.

  • The data plane is the part that forwards traffic, often implemented in the simplest hardware possible, with logic gates rather than software. The simpler it’s done, the faster it can operate.
  • The control plane is the part that manages this, such as managing the routing tables, or in the case of the telephone company, establishing circuits (phone calls).

These concepts date back to the phone company and the invention of the transistor. In the 1960s, the telephone system digitized itself, creating the T-carrier system that forwarded streams of bits through 56-kbps or 64-kbps “circuits”. That was the data plane, using the simplest logic possible to forward streams of bits. Computers were the control plane, configuring those circuits.

Unix was developed as a control plane operating system. The intent was that a Unix computer would be combined with a dumb switch, controlling how the switch operated.

The historic oddity here is that the TCP/IP Internet confused the two concepts. The first Internet routers didn’t use hardware to forward packets. Instead, they were software running on a Unix machine. A packet would be received, and software would process it, looking up the address in the routing table, then sending it out the matching port.

In this new world, Unix was used for both the data plane (forwarding packets) and control plane (building routing tables).

Modern routers, like those from Cisco or Juniper, have reversed this. They now do forwarding in hardware at very high speeds, while using CPUs/software for control plane activities like managing routing tables.

5.7.1. Data plane software

Normally, “data plane software” would be a contradiction, as historically the data plane always meant hardware.

But in practice, the line has been blurred.

For one thing, a lot of packet handling is done by specialized CPUs running very tight programs, only a few instructions per packet. It’s really specialized hardware doing the heavy work, but still with a few software instructions being executed.

For another thing, there’s the Data Plane Development Kit or DPDK, a software package that turns any average PC into something that can reach data plane speeds.

Historically, an operating-system like Windows or Linux didn’t handle high-speed I/O well. The operating-system added a big overhead to processing each packet, regardless of how efficient the application software was.

A solution to this is to run software that bypasses the operating-system. Such software would have its own driver for the network card that passed packets directly to the application, rather than to the operating system. The application could then be optimized to process the packets.

5.7.2. Example: masscan

A good example of data plane software is masscan, a high-speed port scanner that can transmit millions of packets per second. It can either run through the operating-system as normal, or bypass the operating system completely.

When it goes through the operating system, profiling shows that most CPU time is taken up inside the operating-system handling packets, and very little is left for masscan itself. This makes masscan run slowly.

(Chart omitted; source: https://www.alchemistowl.org/pocorgtfo/pocorgtfo15.pdf)

When bypassing the operating system, all the processing done by the operating system disappears, with a small amount of overhead added for the user-mode driver.

The point here is that this converts the application into two parts: a data plane portion tightly optimized to around 400 cycles per transmitted packet, and a control plane consisting of everything else running on the computer.

6. Alternative framework

The chapters above describe an alternative model for the network stack. But the OSI model is also used as a framework. For example, there are no Session or Presentation layers in the network stack, but they are still used as categories in a framework.

A network stack shouldn’t be used as a categorization framework. It’s false to assign theoretical functionality to a layer.

But we want frameworks. We use frameworks to teach every subject. What’s an alternative framework? This section attempts to provide such an alternative.

This isn’t a very good alternative – it’s just that it’s better than OSI.

6.1. What is an ontology?

We humans like to organize knowledge. Wikipedia is a good demonstration of this. Every article describes related topics, how everything is put together. Articles fit into a hierarchy or taxonomy or timeline or framework. At one time, clicking the first link in an article, then the first link of that one, and so on, followed the hierarchy up and up until you hit “philosophy” at the root; this worked for 97%[61] of Wikipedia articles[62].

The academic word for this is ontology, putting everything in categories and describing relationships.

The OSI Model looked like such a thing. Some of the layers were close to categories. Therefore, the first computer networking books used it as such. They assigned all networking concepts to one of seven boxes.

The reality is that none of the layers were intended as categories. More to the point, when you read the fine print of the OSI Model, it doesn’t actually assign much to any particular layer. For example, the Transport Layer #4 is only defined to be end-to-end. The functionality listed must be done somewhere in the stack by that point, but not necessarily within that layer; it could also be done by lower layers.

6.2. It’s not theory so much as engineering

The goal of these Wikipedia-style ontological frameworks is to describe theory, like science. But for the most part, networking is practice. It reflects engineering choices rather than scientific principles. It’s less like a biological taxonomy and more like a military hierarchy: we humans chose to put things there.

It’s a struggle choosing pure theory vs. practice. Are we describing the TCP/IP Internet that we are all using today? Or are we trying to describe theory, how networks have existed in the past and how they might exist in the future?

The OSI Model pretends it’s theory because it’s the standard. The idea was that there would be 7 fixed layers and that over time, new standards would be invented to upgrade a layer. All networks of the future would follow this blueprint, so it described all future networks.

But the OSI Model was broken. It described the networks of the 1970s. No future network would ever work as the OSI Model described. We’d never have interoperable OSI network stacks, and we’d never get real Session #5 or Presentation #6 layers.

This alternative should be thought of mostly as describing how networks have worked up to this point. It shouldn’t constrain new innovations that either reinvent the things we have today, or create entirely new layers on top of what we have today.

6.3. Physical transmission

While most categories don’t correspond to layers, we can make an exception for the physical transmission of bits on the wire. As discussed in the chapter on the Alternative Network Stack, the physical transmission of bits on the wire is really outside-the-stack, below it.

This is a larger category than the rest of “networking” put together. It’s largely science, the physical properties of materials and signals, rather than merely engineering, arbitrary decisions made by humans.

6.3.1. History

It has a rich history. Telecommunications goes back to prehistoric times. At some point, humans figured out how to transmit information over long distances, if only by human messengers running on foot or riding horses.

The Indo-European language family likely comes from the domestication of the horse on the Pontic Steppe, allowing the transmission of culture, trade, and armies over large distances. Early empires like the Babylonians or the Romans relied upon road networks to maintain communications with far-flung provinces. Other early empires used the ocean to transmit information with ships.

In the 1800s we saw the development of semaphore networks and Morse code. The digital telegraph arrived in the 1840s, and the analog telephone around 1876. The Internet as we know it didn’t appear until Jan 1, 1983.

6.3.2. Digital signaling

One type of signal is just the positive/negative or high/low voltage that directly triggers transistors. These are the signals across the fiberglass circuit board between devices that are within centimeters of each other.

These sorts of signals were then exported from local chips to go across longer-distance wires. The famous RS-232 point-to-point serial link is a good example, using positive/negative voltages to transmit bits. The GPIO pins on a Raspberry Pi are another good example, using high/low voltages. These only work over short distances, a few meters.

The original Ethernet used digital signaling over several kilometers, but still needed a transceiver, an interface between the long-distance digital signal and the local digital signal.

6.3.3. Analog Modulation

Local signaling is digital; long-distance signaling is analog. We modulate[63] the digital data onto an analog medium. A transceiver takes a digital stream on one end and converts it to analog signals, doing the reverse on the other end. It can become a hugely complicated process, with multiple carrier waves, conversion of a symbol at a time instead of individual bits, and speed-of-light delays becoming problematic even over short distances.

One of the key theoretical issues is understanding the Fourier transform, how any signal can be described in terms of analog sine waves. It’s the basis for SDR or software defined radio that uses math in software to modulate/demodulate a signal.

Another key theoretical issue is understanding spectrum and the Shannon limit. The speed of data (bits-per-second) is limited by the width of spectrum (in Hz) multiplied by a factor that grows with the strength of the signal above the noise (the signal-to-noise ratio). That’s why we keep needing to upgrade cables (like to Cat6) every time we increase the speed of Ethernet: to both increase the spectrum of electric signals the cable can carry and reduce the noise from the environment.
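
The limit has a precise form, the Shannon-Hartley theorem: capacity = bandwidth × log2(1 + SNR). A quick sketch, with numbers chosen only for illustration:

    import math

    def shannon_capacity(bandwidth_hz, snr_linear):
        # Shannon-Hartley: C = B * log2(1 + S/N), in bits per second.
        return bandwidth_hz * math.log2(1 + snr_linear)

    # 500 MHz of usable spectrum with a signal 1000x the noise (30 dB)
    # tops out near 5 Gbps, no matter how clever the encoding.
    print(shannon_capacity(500e6, 1000) / 1e9, "Gbps")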

Basic physical properties of cables make a difference. They laid the first transatlantic undersea cables in the 1850s but found they couldn’t transmit information very fast due to such issues as the capacitance and inductance of the electric wires. Over thousands of kilometers, these things mattered even at speeds as low as 1 bit-per-second. Telegraph engineers developed complex equations[64] for figuring out how to deal with these physical limits to improve speed. What mattered when sending 1 bit-per-second across 4,000 kilometers of cable back then continues to matter when sending 10 gbps across 100 meters of copper cable.

10gbps is probably the limit for copper cables in office environments. NBASE-T working at 2.5gbps or 5gbps over copper seems to be the limit for the home. Strangely, it’s looking like homes will have faster WiFi speeds than Ethernet.

6.3.4. Radio

Radio communications (like WiFi or mobile phones) have their own categories. There is a lot of overlap since copper wires simply carry radio waves. But there are a number of issues unique to them. Multiple antennas (MIMO) and phased-arrays that can steer beams are one example.

6.3.5. Forward error correction

The original Ethernet was designed with a simple checksum/CRC on each packet to detect whether it was corrupted. If corruption happens, a higher layer protocol[65] would eventually retransmit it.

This worked because there was always quite a bit of headroom between the speed of the network and the maximum possible speed. The chance of bits being corrupted was very low.

As we’ve pushed toward ever higher speeds, we’ve removed most of the headroom. Today’s links, like 10gbps, are roughly at the limits of what the wires will allow. That means bits regularly get corrupted at unacceptably high rates.

The solution to this is forward error correction. It means adding a few extra bits that can be used to both detect and correct errors. It’s a tradeoff that pays off: adding 10% overhead for error correction can let us transmit many times faster.
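
As a toy sketch of the idea, here is the classic Hamming(7,4) code in Python: 3 parity bits protect 4 data bits, and any single flipped bit can be located and corrected. (Real links use far more efficient codes such as Reed-Solomon or LDPC.)

    def hamming74_encode(d):
        # d is a list of 4 data bits; returns 7 bits with 3 parity
        # bits woven in, able to correct any single flipped bit.
        d1, d2, d3, d4 = d
        p1 = d1 ^ d2 ^ d4
        p2 = d1 ^ d3 ^ d4
        p3 = d2 ^ d3 ^ d4
        return [p1, p2, d1, p3, d2, d3, d4]

    def hamming74_correct(c):
        # Recompute parities; the 3-bit "syndrome" gives the position
        # (1-7) of a single-bit error, or 0 if the word is clean.
        s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
        s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
        s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
        syndrome = s1 + 2 * s2 + 4 * s3
        if syndrome:
            c[syndrome - 1] ^= 1  # flip the bad bit back
        return [c[2], c[4], c[5], c[6]]  # the 4 data bits

    word = hamming74_encode([1, 0, 1, 1])
    word[5] ^= 1                      # corrupt one bit "in transit"
    print(hamming74_correct(word))    # -> [1, 0, 1, 1], error fixed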

Such error correction is everywhere. Memory (DRAM) in servers usually contains error correction, so that cosmic rays can flip a few bits without causing problems. Rotating-disk hard-drives have become crazy in the degree to which they rely upon error correction.

Ethernet has a crazy number of sublayers, several of them concerned with error rates.

6.4. Payload

While most categories don’t correspond to layers, we can make an exception for the payload. As discussed in the chapter on the Alternative Network Stack, the payload is really outside-the-stack, above it.

6.4.1. User sessions

One important part of the payload is user sessions. Users are outside the network; they log in to use the network.

Consider the act of creating a new user account with a website. This creates some sort of relationship or association with the website, even though communication hasn’t happened yet. This is some sort of “session”, though largely inactive.

Next, consider the act of logging into that account. Things are now more active, though in many cases, still not very active.

Consider how HTTP cookies work. When you log into a website (like Google.com), it stores a token on your hard-drive. That token represents an active connection even when you turn your computer off.
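
A sketch of what this looks like on the wire (hostnames and token values below are made up). The login response sets a long-lived cookie, and the browser presents it back on every later request, possibly years later:

    HTTP/1.1 200 OK
    Set-Cookie: SID=6b3f2c91ad; Max-Age=63072000; Secure; HttpOnly

    GET /inbox HTTP/1.1
    Host: mail.example.com
    Cookie: SID=6b3f2c91ad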

Indeed, you can back up your computer, destroy it in a fire, buy a new (and completely different) computer, restore from the backup, and then continue using this token/cookie as a logged-in session.

Such tokens/cookies can last for decades. At one time, Yahoo’s login cookies were cryptographic information that never expired, usable years later.

Another example is how a user can be logged into multiple devices, but still be part of the same session. You do a search on Google.com on one machine, and then when you go to a different machine, that’s listed as one of your recent searches. I’m usually logged into Twitter simultaneously on my laptop, desktop, and mobile.

The point is that I’m a user of the network, not part of the network. Thus, all these user activities are placed in the payload category.

6.4.2. Data Encoding and Representation

This issue is discussed in its own subsection below in this chapter. In many ways, it’s its own category. In other respects, it belongs in the payload category.

The point to make here is that traditionally, the Presentation Layer #6 taught that these issues were to be handled by the network stack. Back in the 1970s, with the proliferation of terminals and character-sets, this made sense. Today, it doesn’t.

It’s more profound than just that. Take video streaming, for example. There are some misguided folks who assume that we need to encode video differently for terrestrial broadcast, cable, satellite, or streaming over the Internet, meaning that encoding would be network dependent.

But the correct way of approaching the problem is that digital video is encoded the same (like MPEG) and that network-specific issues are exceptions to this rule.

My father, who worked for the Associated Press[66], tells the same sort of story. There were those who envisioned different payloads for different networks, compared to those who saw payload as independent of the network that transmitted it.

HTTP’s Content-Type struggles with this issue. On one hand, it’s useful metadata to attach to a document, noting that it’s an HTML document even if there’s no other indication of this. On the other hand, it still implies there’s a network function involved that negotiates content-types, which is what the Accept header does. That’s broken thinking coming from OSI’s Presentation Layer. In practice, almost everything ignores the Accept header.

6.5. A network

OSI’s Network Layer #3 is broken ontologically. It excludes Ethernet, which really is a network. It includes the Internet, which really is more of an internetwork rather than a network.

6.5.1. Links vs. network

The original idea was that there’s a distinction between links and a system whereby devices forward packets from one link to another, forming a network.

In other words, if you have 2 devices, you probably don’t have a network, but once you have 3 devices, you probably do. I say probably here because the boundaries are still blurred.

6.5.2. Many to many

Another way of looking at it is that whenever there are many things talking to many other things, then it’s a network.

That’s why we classify the web as a network. There are no links. The entities on either end aren’t even peers; a web browser is very different from a server.

But there is still a many-to-many relationship.

6.5.3. Path

If a packet follows a path or route, from hop-to-hop, across links, then it’s probably a network. The logic that figures out that proper path is networking logic.

Networks may be layered on each other. The Internet is a network. But the links underneath the Internet may themselves be a network, such as Ethernet.

In addition, complex services that don’t route packets may still be networks. A TV broadcast system is rightly called a network, for example.

In terms of the web, the concept of a path is weird. A lot of servers we interact with are just front-ends that route our requests to various backends, such as databases or caches. There’s also the complicated DNS infrastructure that routes requests to the nearest content-delivery-network node. Most Internet traffic travels only about 100 km (60 miles) to a data center in the nearest city where our Facebook, Apple, Google, and Netflix requests are handled.

In short, while the web appears to have no path, it has something along these lines, and it’s complicated.

6.5.4. Other concepts

The discussion above lists the items that clearly distinguish the network category, compared to things that aren’t networks. There are many topics we’d place in this category, such as the following:

  • Addressing: how the things communicating will identify each other.
  • Messages vs. streams: whether we transmit single messages or continuous streams between things.
  • Forwarding: how we decide which direction to forward things.
  • Switching: circuit-switching vs. packet-switching, as well as other alternatives like “cell switching”.
  • Directory-services: how we find a thing’s address.
  • Bulk transport: this is also its own category.
  • Multicasting and broadcasting: sending one thing to many recipients.
  • Mobility: a special infrastructure needed for devices that are traveling around the world, whose route is constantly changing.

Most of these topics are discussed elsewhere in this text. The point is simply to list the sorts of things that belong in this category.

Some are their own categories; in particular, Bulk Transport is a large category in itself, even though it’s also a subcategory.

6.5.5. Other networks

The flaw with OSI was defining a network to be only a specific thing that happened at layer #3. In our alternative model, networks are more general, and there can be many different networks layered on each other in a stack.

Some common networks you use every day would include:

  • Ethernet
  • MPLS
  • Internet
  • the web
  • Twitter
  • Netflix
  • CBS, NBC, and ABC (“broadcast” networks)

6.6. Bulk transport

A big part of this book is redefining the word transport.

6.6.1. Top of the network

OSI places the Transport Layer #4 somewhere in the middle of the stack. This book places transport at the top of a network. Each layered network has its own transport, while the most important is whichever network is at the top.

In other words, at the top of Ethernet is the transport LLC. At the top of the Internet is the transport TCP. When the web is implemented with QUIC, that becomes the transport. Transport percolates up to the top of any stack.

Another way of saying this is that transport is the last layer of the network stack before there is payload.

Being at the top means this is where the API resides. The API is how software interfaces with the network. The API for the TCP/IP network is known as Sockets, whereby programmers create end-points for TCP and UDP connections.
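
A minimal sketch of what that looks like in practice, using Python’s wrapper around the Sockets API (example.com is a placeholder host):

    import socket

    # Create an end-point, connect it to a transport address
    # (host, port), then just read and write bytes.
    s = socket.create_connection(("example.com", 80), timeout=10)
    s.sendall(b"HEAD / HTTP/1.1\r\nHost: example.com\r\n\r\n")
    print(s.recv(200).decode("latin-1"))
    s.close()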

6.6.2. Transport addresses

The end-points of network communications aren’t the machines/computers themselves, with their network addresses. Instead, they are software (processes, apps, services) running on those computers, which have an additional transport address along with the network address.

This can be seen with URLs. A URL might look like https://www.google.com:443/. This is because port 443 is the default transport address identifying the web server running on the machine. Every URL has a port number; when it’s omitted, a default is used: port 80 for HTTP and port 443 for HTTPS.

The client apps also have transport addresses (TCP port number) from which they send data. Responses are sent back to that port, to the client app. This port is randomly chosen for each connection, rather than a fixed number.
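
A quick sketch of the defaulting rule, using Python’s URL parser:

    from urllib.parse import urlsplit

    # The port is part of every URL even when not written out;
    # the parser reports None when the scheme's default applies.
    for url in ("https://www.google.com/",
                "https://www.google.com:443/",
                "http://example.com:8080/api"):
        u = urlsplit(url)
        port = u.port or {"http": 80, "https": 443}[u.scheme]
        print(u.hostname, port)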

6.6.3. Ends

Transport is inherently end-to-end.

This is the one place where the original definition of OSI Transport Layer #4 is actually correct: whatever transport does, it does it end-to-end.

6.6.4. Congestion-control

It is here that we become really theoretical about networking.

The most important property of any network is congestion. Without handling it, the network becomes overloaded and fails.

Throughout the early history of networking, it was assumed that congestion would be handled by the network itself. When routers encounter congestion, it means they have a packet destined for an outgoing link, but the link is full of other packets. Therefore, logically the right place to handle congestion is within the router.

The inspired thinking of the Internet was that this couldn’t possibly work. The only way to deal with congestion is on the ends. They have to slow down transmission of packets, or congestion will continue.

Therefore, the Internet is designed so that routers ignore congestion. When they can’t forward a packet, they silently drop it, telling nobody.

With the TCP protocol on the ends, the fact that the packet didn’t arrive at its destination (or the ACK didn’t arrive back) means that there was congestion somewhere in the path between the sender and receiver. TCP then slows down until packets are no longer dropped.

It’s a complicated dance. The way TCP works is that it’s constantly probing the maximum limit of speed until a packet is lost, then dropping back underneath the max limit. That way, when conditions improve, it can speed back up again.
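
A toy simulation of that dance (additive increase, multiplicative decrease), with a fake network whose capacity is hidden from the sender; real TCP adds slow start, fast retransmit, and much more:

    import random

    capacity = 40.0   # packets per round-trip the path can carry
    cwnd = 1.0        # congestion window: packets in flight per RTT

    for rtt in range(30):
        # The only signal the sender gets is packet loss.
        lost = cwnd > capacity or random.random() < 0.01
        if lost:
            cwnd = max(1.0, cwnd / 2)   # back off hard on loss
        else:
            cwnd += 1.0                 # otherwise keep probing upward
        print(f"rtt {rtt:2d}: cwnd = {cwnd:5.1f}{'  LOSS' if lost else ''}")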

6.6.5. Fragmentation and reassembly

Internet packets are limited to around 1500 bytes in size. Larger chunks of data, like files, need to be fragmented on one end and reassembled on the other. There may be other reasons for fragmenting data. With phone calls over the Internet, for example, not much data is transmitted, but it’s still split into packets: every tenth of a second worth of sound may be its own packet.

The deep theory is that fragmentation and congestion-control are intimately related, that you need one when you have the other.

Implicit to reassembly is that packets will be sequenced. Packets can arrive at a destination out of order. They can’t be correctly put back together unless the destination knows the original order.

This is combined with congestion control in something called windowing. It takes a while before an ACK arrives back from the destination. The sender can’t wait for an ACK before sending the next packet, so it sends many at once. It keeps track of how many are outstanding. This is most easily done with sequence numbers.
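
A sketch of reassembly by sequence number: fragments arrive in any order, get buffered, and are released only once the data is contiguous. (The fragment format here is made up for illustration.)

    def reassemble(fragments):
        buffered = {}   # sequence number -> data not yet deliverable
        expected = 0    # next sequence number we can release in order
        out = []
        for seq, data in fragments:
            buffered[seq] = data
            while expected in buffered:
                out.append(buffered.pop(expected))
                expected += 1
        return b"".join(out)

    arrived = [(2, b"lo!"), (0, b"he"), (1, b"l")]  # out of order
    print(reassemble(arrived))                      # b'hello!'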

6.6.6. Payload congestion

The promise of transport is that it handles things invisibly, that the payload doesn’t need to worry about the details of the network.

But the payload does need to know. Take Netflix as the prime example. It wants to transmit video streams as fast as possible. When there is congestion, it needs to transcode the video, producing a lower quality video at a lower bit rate. Congestion is something it needs to deal with intimately.

6.6.7. Retransmission and reliability

A common description of transport is that it guarantees that packets will arrive reliably, meaning, that it will retransmit lost packets.

This is part of congestion-control. Almost the only reason networks lose packets is congestion.

6.6.8. Establishing connections

To make congestion-control and fragmentation work, the network stack has to track packets in flight. To do that, they must be associated with a connection.

To do this, a transport protocol creates a handshake, sending control packets (meaning, packets without data) back and forth.

6.6.9. Summary of transport

To summarize, all the ideas of transport are interrelated, particularly with congestion-control.

  • Establishing connections.
  • Fragmentation and reassembly.
  • Congestion control and flow control.
  • Sequencing of data.
  • Reliability.
  • End-to-end.
  • Buffering.
  • API access.

These things all relate to bulk data, so this section refers to bulk transport rather than transport.

Historically, there have been two uses of the network. One was short commands/responses that would normally fit within individual packets. The other sends large chunks of data that could never fit within a packet. A third is streaming, which is sometimes modeled as many short transmissions, or as one large transmission, depending on the design.

Thus, this category of bulk transport can also be considered a subcategory of the network, as described above. Even if the network has sublayers/subprotocols, such as IP vs TCP, bulk transport is a necessary feature of the combination.

6.7. Security

This is a difficult category because it includes a ton of otherwise unrelated subjects.

6.7.1. Security is adversarial

A key concept of security is that it solves the problem of other people: deliberate adversaries, not accidents.

Take error correction and checksums as an example. They prevent accidental corruption of data, but provide no protection against deliberate corruption. An adversary can flip bits in a way that passes the checks, or modify the checksum as well so it seems no corruption has happened.
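
A sketch of why, using Python’s CRC-32: the check is keyless, so anyone who can rewrite the data can recompute the check to match.

    import zlib

    msg = b"pay alice $100"
    crc = zlib.crc32(msg)

    # Accidental corruption: a flipped byte no longer matches the CRC.
    corrupted = b"pay alice $900"
    print(zlib.crc32(corrupted) == crc)        # False: accident detected

    # Deliberate corruption: the adversary recomputes the CRC too,
    # so the receiver's check passes.
    forged = (corrupted, zlib.crc32(corrupted))
    print(zlib.crc32(forged[0]) == forged[1])  # True: tampering invisible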

6.7.2. Not just crypto

Security is not just cryptography and encryption. One huge network problem is distributed denial of service or DDoS. Some of the original ideas of the Internet stem from defending the network from nuclear attack, whereby packet-switching would automatically route around cities being destroyed by the bomb. But while the Internet is resilient when attacked from the outside, it’s less resilient when attacked from the inside. And everyone is now on the inside.

Engineers who focus on website reliability spend a lot of time coping not just with accidental failures but also with DDoS. Server load has to be planned not just according to the load users are expected to place on the systems, but according to what will happen when hackers launch a DDoS.

6.7.3. Encryption

Encryption is a difficult problem because, for one thing, it’s impossible in the way people naively think. The naive notion was that we’d just have encrypted links between routers by hardcoding the same key at each end. This proved impossible, because we can’t really distribute keys widely among routers in a secure fashion. Adversaries can always intercept such keys, so can always eavesdrop on encrypted data.

Instead, we use public-key cryptography. This allows us to exchange keys even in the presence of an eavesdropper. The spy can see everything we do as we work together to create a key, and still not be able to discover the key themselves.

This is the fundamental property of SSL, the standard encryption technique. At the start of the connection, both sides generate a key that will then be used for encrypting the connection.
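
A toy sketch of such a key exchange (Diffie-Hellman) with tiny numbers; real deployments use 2048-bit groups or elliptic curves. Everything below except a and b travels over the wire in the clear:

    p, g = 23, 5          # public modulus and generator
    a, b = 6, 15          # private values, never transmitted

    A = pow(g, a, p)      # Alice sends A = g^a mod p  -> 8
    B = pow(g, b, p)      # Bob sends   B = g^b mod p  -> 19

    # Each side combines the other's public value with its own secret.
    # The eavesdropper sees p, g, A, B, but cannot derive the key.
    print(pow(B, a, p))   # Alice: B^a mod p -> 2
    print(pow(A, b, p))   # Bob:   A^b mod p -> 2, the same key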

This defeats any passive eavesdropper, but not an active one that performs a man-in-the-middle or MitM attack. This attack pretends to be the counterparty to each side, and exchanges keys with both sides. It decrypts incoming packets and re-encrypts them (with the other key) for the outgoing side.

To fix this, we use another variation of public-keys and certificates. A certificate is used to sign the keys that are exchanged, so that when you interact with Google.com, you know it’s actually the real Google.com and not some fake trying a MitM attack. Google’s certificate is likewise signed by a certificate-authority. The list of valid authorities is included within every browser and operating-system.

In other words, you trust Microsoft to validate the trustworthiness of a number of certificate-authorities, and that these authorities can be trusted to know the difference between Google and an imposter trying to eavesdrop on the network.
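
A sketch of that machinery in action with Python’s standard library: the default context loads the system’s bundled certificate-authorities, and the handshake fails outright if the chain or the hostname doesn’t check out.

    import socket
    import ssl

    ctx = ssl.create_default_context()  # loads the trusted CA bundle
    with socket.create_connection(("www.google.com", 443)) as raw:
        with ctx.wrap_socket(raw, server_hostname="www.google.com") as tls:
            # Reaching this line means the certificate chain verified
            # and the name matched; otherwise an SSLError is raised.
            print(tls.getpeercert()["subject"])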

The point of this section is that things are more complicated than just encryption. Instead, to keep things secret, we need an entire cryptographic infrastructure.

6.7.4. User authentication

User-authentication is a complicated topic we still can’t figure out. We fumble around with passwords being stolen, broken multi-factor authentication, and things like OAuth.

The most common attack on the Internet comes from the fact that humans choose the same password for most websites. When a less secure website gets hacked, the passwords get stolen, and the hacker tries them on more secure websites. User-accounts therefore get hacked on the most secure websites through no fault of those websites, but because the user used the same password everywhere.

The fix for this is something called multi-factor (or two-factor) authentication. One such additional factor is sending an SMS message to your phone to verify that you are legitimately logging in.

6.7.5. Authorization

One of the fascinating things is that we still don’t know what authorization means.

The US anti-hacking law is known as the CFAA. It forbids intentionally accessing a computer without authorization. It seems clear what this means, but it’s not.

People keep posting things accidentally to public websites, things they don’t want public. Hackers often find such things accidentally, or deliberately search[67] for them. On one hand, the information is public, and people regularly access it without knowing it was accidentally posted. On the other hand, the hackers know it’s an accident, and that while technically public, they know the owner doesn’t want people accessing it. Are they authorized to intentionally access it? The courts aren’t sure.

A typical example is when you see something like articleID=167 on the end of a URL. Hackers will commonly simply increment the ID to 168 and see what’s there. Sometimes they find things they shouldn’t access. You see, it’s not public until there’s a link pointing to article 168. Naive companies might post article 168 announcing their quarterly earnings hours before they officially announce it, where their official announcement includes the link to that page. Smart hackers might access it early, and trade the company’s stock for vast profits before the public market knows the information.

Nobody knows whether this is legal or not. The entire reason the URL is editable is so that users can do precisely this sort of thing, to access website content in ways other than simply following links.

The point of this section is that authorization is a key part of the security category, but nobody is clear what this really means.

6.7.6. Auditing, logging, and instrumentation

The traditional cybersecurity model of the 1980s focused on internal threats, making sure that the employee isn’t secretly a spy. The model was to have an extensive auditing infrastructure, separate from the normal system, that recorded all their activities. In other words, even administrators of the system should not be able to evade the auditing features.

Today, the default is that administrators have full control. It’s a struggle to layer a separate auditing system on top. One reason ransomware spreads is the difficulty corporations have separating duties between those who merely administer the technical aspects of Windows computers and those who monitor the security aspects.

Since the 1980s we’ve broadened our view of what we audit. We log a lot of things not because they catch spies, but to detect the anomalies that happen when hackers break in. We have a vast logging infrastructure that’s far different from the original ideas of mere auditing.

This now includes an instrumentation framework, whereby third parties can get in on the action, looking at events that happen in the system for correlation, intrusion detection, and enhanced logging.

6.8. Naming and directories

This category covers DNS, but it’s vastly more complex.

6.8.1. Address lookup

The Internet famously has DNS that translates names into addresses. When you access google.com, your browser first needs to translate that into an IP address like 172.253.63.101.
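
That translation step, in a short Python sketch:

    import socket

    # The resolver step a browser performs before it can connect:
    # translate the name into one or more addresses (IPv4 and IPv6).
    for family, _, _, _, sockaddr in socket.getaddrinfo(
            "google.com", 443, proto=socket.IPPROTO_TCP):
        print(family.name, sockaddr[0])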

6.8.2. Identity

There is a difficult question about identity that has confused people since the birth of the Internet. Which is the true identity of the machine, the address or the name?

In the beginning, the Internet was designed with fixed addresses. A computer would be configured with an IPv4 address that would be its identity on the network.

When DHCP[68] was invented in the early 1990s, a decade after the Internet was created, it was bitterly opposed by purists, because dynamic addresses were no longer the identity. Every time the computer was turned on, it might have a different address. The address was just a temporary token rather than an identity.

The same is true for names like google.com. Every time you connect to that name, you’ll likely connect to a different address. Google’s DNS servers are constantly giving different results, directing you to the server nearest your physical location, and load balancing among many servers, directing you to the one with the least load.

Thus, neither side of the connection has a fixed address that is the identity of the computer.

There’s another problem with multihomed computers, those with multiple connections. Which of their addresses is the identity? When routers send back ICMP error messages, which should they use as the source IP address?

6.8.3. Resources

DNS is more than just a translation between names and addresses.

Consider email. When you send email to a domain like gmail.com, you don’t send to a server with that name. Instead, you send email to a server that handles email on behalf of that domain.

In DNS, this is known as an MX record. An MX lookup for gmail.com returns something like gmail-smtp-in.l.google.com.

Most of the email on the Internet these days is handled by services like gmail.com or outlook.com. Thus, if you do an MX lookup for twitter.com, you get a Gmail server like alt2.aspmx.l.google.com as the response.
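
A short sketch of such a lookup; this one assumes the third-party dnspython package (pip install dnspython), since the Python standard library can’t query MX records:

    import dns.resolver  # third-party: dnspython

    # Ask not "what is gmail.com's address" but "who accepts
    # mail on behalf of gmail.com".
    for rr in dns.resolver.resolve("gmail.com", "MX"):
        print(rr.preference, rr.exchange)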

6.8.4. Directory services

The idea of resources mentioned above can be arbitrarily extended to anything on the network or network-adjacent. Users are a resource, as are individual files. Each object may have unique roles and unique security requirements.

Managing Microsoft’s Directory Services for a network is more difficult than managing the underlying physical infrastructure.

6.9. Parsing and formatting

Elsewhere in this document, we discuss how encoding and representation of data is not a function of the network, that the Presentation Layer #6 teaches the wrong lesson here. For network payloads, data representation is a function of the payload itself. The payload like a PDF document determines for itself how it is represented by bytes, not the network. It’s a bad idea for the network to get involved in transforming data.

With that said, there is a network-adjacent category here. Traditionally, the concept has focused on how data is represented, how we take abstract concepts and represent them concretely with bits. Today, the focus is more the other direction, how we take a concrete set of bits and parse them to discover the abstract ideas.

The reason it’s so important is that it’s the primary way hackers exploit computers. The bugs in code that they exploit are usually part of this parsing process.
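
A sketch of the classic hazard, a length field the code must not trust (the two-byte record format here is invented for illustration):

    def parse_record(buf):
        if len(buf) < 2:
            raise ValueError("truncated header")
        length = int.from_bytes(buf[:2], "big")
        # The classic bug is omitting this check and reading (or
        # copying) past the end of the buffer.
        if length > len(buf) - 2:
            raise ValueError("length field exceeds actual data")
        return buf[2:2 + length]

    print(parse_record(b"\x00\x05hello"))   # b'hello'
    # parse_record(b"\xff\xffhi") raises instead of over-reading.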

A full discussion of this is beyond this text. The point is that data encoding/representation is a subset of the parsing problem, rather than the other way around. It’s mostly a security issue, but it’s also a payload issue, and can be considered a sub-category in either of those categories.

6.10. Network management

Back when they first defined the OSI Model and the protocols to fit within each layer, they were also terribly concerned with something called “Network Management”. It was a concern then. But somehow, it’s never been elevated to an academic discipline. For the most part, students learn from the point of view of writing software applications to use the network, not how to keep the network running.

Take for example making a call from Europe to Japan. If done through the telephone system, it incurs a fee from the telephone company, just for audio. If done with a mobile app like FaceTime or LINE, it’s free, even though it comes with 1080p video.

Our packets go through many companies between our computer and the destination on the other side. How does the money we pay our ISP get divided to pay all the intervening transit companies? The financial aspects of this are at least as important as the technical details of how packets are routed.

One of the problems with the “NetNeutrality” debate is the degree to which it sidestepped network management issues. The original paper that coined the term focused on this as the core principle it was debating, but then politics hijacked the term to describe anything but this.

The point of this section isn’t so much to describe “network management” as a separate category, only to point out that there exist categories that don’t align with the original 7 – that even the original OSI documents had separate categories.

6.11. Governance of public networking

This is really an extension of the “Network Management” category above. There is a difference between managing a public network and managing a private network.

From the beginning, “computer networks” always meant “private networks”, like for a corporation or university. That’s the big underlying assumption of OSI: they weren’t designing a network to be the cyberspace we have today with the Internet, but just another corporate network. The only difference was they’d have “open systems”, computers from different vendors that could talk to each other, which wasn’t possible with the “closed systems” vendors were selling at the time.

Even the TCP/IP Internet was designed this way. It was a private US DoD research network that let more and more people connect until everyone was connected.

“How the Internet works” is often more about these high-level governance issues than about the low-level technology.

Note that while the Internet accidentally became the public Internet, the backbone for cyberspace, the global telephone network was designed from scratch for that purpose. An interesting discussion here would be why the telephone network failed to become cyberspace, and this Internet thing succeeded instead.

6.12. Content delivery

By volume, most Internet traffic is now video content. This should therefore be a category in networking.

This is one of the justifications of the network-on-network architecture described elsewhere in this document. Netflix is a “network” built on top of both the web and Internet for streaming video to you. It needs to be understood as the entire network, such as with servers that transcode video to lower quality for lower bandwidth, rather than as a simple “application”.

7. Misconceptions

The thesis of this paper is that the OSI Model teaches misconceptions. This chapter lists the important ones.

When I talk to university professors, IT professionals, and hackers, they defend the OSI Model by explaining some difficult concepts it has taught them. When I discuss further what they learned, it soon appears they’ve learned a misconception[69]. The OSI Model doesn’t seem useful in teaching how the network works; instead, it misleads people.

Many things taught by OSI are partially true because we’ve retconned the model to fit reality. But this brings baggage.

Layering is a good example. You can use the lower 4 layers to describe the sublayers in the Internet and Ethernet (TCP+IP and MAC+PHY respectively). But this teaches a large number of misconceptions, starting with OSI’s claim that these are layers of a single network, rather than the reality that the Internet and Ethernet are independent networks.

This section lists the major misconceptions. They are the sorts of things that professors, teachers, and textbooks insist show where OSI is helpful in understanding the network; the sections below show how each is actually a misconception.

7.1. It’s not theory

Summary: The OSI Model is widely used as the theoretical basis for networking, as compared to the practice of the TCP/IP Internet. It’s not theory. It’s simply another practice; moreover, an outdated practice from 1970s mainframes that few understand.

7.1.1. What they teach you

Here is a typical description taken from the CISSP Guide[70] claiming that this is theory:

The actual OSI protocol was never widely adopted, but the theory behind the OSI protocol, the OSI model, was readily accepted. The OSI model serves as an abstract framework, or theoretical model, for how protocols should function in an ideal world on ideal hardware. Thus, the OSI model has become a common reference point against which all protocols can be compared and contrasted.

The Wikipedia[71] article describes it as:

The Open Systems Interconnection model (OSI model) is a conceptual model…

These statements are false. The OSI Model never attempted to define abstract concepts. It defined a concrete blueprint.

7.1.2. Poorly understood terminology

The main reason OSI is treated as theory is because people don’t know what it means. The model contains terminology that specifically applies to mainframe networks. People don’t understand mainframes, so when they encounter a term that was supposed to be specific to mainframes, they instead re-interpret it as meaning something generic to all networks.

This is what happened with “session”. When OSI named this layer, it had nothing to do with sessions in general, but with very specific issues connecting a terminal to a mainframe. They could’ve called this specific thing by any of a number of synonyms: Connection Layer, Channel Layer, Discussion Layer, Dialog Layer, Interaction Layer, Association Layer. Had we been a bit luckier, they’d have called it Intercourse Layer #5, and this dreary topic would’ve been more enjoyable for decades of students.

What we think of as sessions (as in HTTP) are actually called associations in OSI, and are assigned to the Application Layer #7.

This session never meant the abstract, theoretical concepts we now pretend it meant. It never encompassed things like HTTP session cookies, nor SSL’s session handshakes, nor NetBIOS’s session service. We just pretend OSI anticipated all things that we call session, and that they intended this to be theory.

It’s much the same way we reinterpret the Bible or the Constitution or Nostradamus. In the United States, we are trying to figure out what “regulated militia” or “right to bear arms” means[72], imposing our modern interpretation of what we want those words to have meant in the past.

People are seduced by their own ignorance. They don’t know what OSI originally meant by session, so believe it can mean anything. This leads to the problem of everybody interpreting it differently. Pick up varying texts, flip to where they describe the Session Layer #5, and you’ll see everybody pretends something different.

The same applies to pretty much every other OSI term. They all meant something specific – to mainframes. It’s unclear what any of them actually mean in other contexts, so everybody, every teacher, every textbook writer, makes up their own interpretation of what they might mean.

7.1.3. Theory by command

The OSI Model was not created by academics trying to understand natural processes, but by bureaucrats making political compromises. It was driven by standards organizations (ISO and ITU) that promoted the interests of then-monopoly IBM and the big national telecom monopolies.

It’s similar to how the Soviet Union decreed Lamarckism as the correct version of evolution (instead of Darwin’s natural selection). It’s also similar to how the Catholic Church decreed the Ptolemaic earth-centric model instead of Copernicus’s heliocentrism. In all these cases, the experts of the time decided upon a group consensus which then became official doctrine.

It’s not that the academics meekly submitted to government edicts. Instead, the topic was too new at the time for them to challenge it. While today we have decades of knowledge to show conclusively that OSI’s mainframe model is wrong, back then OSI was as good as any other description.

In any case, it wasn’t initially theory. Networking isn’t science but engineering. Things aren’t discovered but designed. OSI was one design, the Internet another. We’ve discovered a few bits of theory, like end-to-end principles, but such theory is rare.

7.1.4. Generalizations

OSI looks like theory because the model tries to generalize things.

Many standards are defined in multiple parts, with high-level documents about design goals and low-level documents detailing implementation. For example, DNS was first specified in two documents. RFC 1034[73] specified the concepts while RFC 1035[74] specified the implementation.

The goal of this generalization wasn’t theory, but practice. It set forth the requirements that needed to be fulfilled. Nobody considers RFC 1034 a theoretical approach to all naming services.

In much the same way, the OSI model specifies a set of requirements for each layer. They are specified generally enough that multiple protocols can satisfy the requirements, but it’s still not theory.

7.1.5. Duplicate terminology

We’ve retconned OSI to fit the Internet, pretending they say the same things. This isn’t helpful, because it’s just forcing the student to learn two different sets of terminology to refer to the same thing.

An example is the Network Layer #3. We now teach students this refers to the same thing as the Internet protocol (IP). We don’t teach what it originally meant, such as connection-oriented network service. We teach instead that it pretty much exactly matches how the Internet works.

It becomes a circular definition. The Network Layer is defined in terms of “whatever the Internet Protocol does”. At the same time, the Internet Protocol is defined as providing the services of the Network Layer #3.

If the student can’t express the difference between the OSI description and the Internet description, then it’s pointless having both. It just imposes a needless cognitive load on the student.

7.1.6. Buzzwords

The theory ends up being little more than unexplained buzzwords. For example, the CISSP Guide[75] describes the Transport Layer #4 functionality as:

This layer includes mechanisms for segmentation, sequencing, error checking, controlling the flow of data, error correction, multiplexing, and network service optimization.

It appears to pull these buzzwords from the official X.200 standard. But it doesn’t explain these terms. They are just random words copied without understanding their meaning. They sound plausibly like they should just be English terms, but they were originally defined to have specific meaning in the context of the OSI standard.

For example, network service optimization means nothing like what you’d assume from knowing the English words. Instead, OSI uses this to describe the fact that in the original networks[76], two computers would first establish a connection at the Network Layer #3, which the Transport Layer #4 would then subdivide. Today’s TCP/IP works differently, a connectionless network, and hence, the phrase is meaningless.

Professors present these terms to students, but can’t actually explain them. If students visit the professor’s office after class asking for clarification, they’ll get explanations that are even more abstract and theoretical, even more confusing[77]. It’s like when the family dog brings you a ball when you don’t want to play, so you throw it as far away as possible. Asking professors to explain the above text just leads to extreme vagueness in an attempt to get you to leave, not in an attempt to get you to understand.

7.1.7. Outdated

Whether OSI is theory or just practice, it’s still out-of-date. It described the networks of the 1970s, IBM’s mainframe networks and the X.25 networks of telecom monopolies.

The Internet was a paradigm shift that OSI failed to describe. People have since tried fixing the model, to make it look more like the Internet, but its fundamental concepts still don’t match.

In the future, when something comes along that’ll replace the Internet, OSI will fall even further behind.

The situation is like the Ptolemaic model (earth at center of universe) compared with the Copernican model (sun at center). Both predict the motions of the planets. Those who learned the old model see no reason to change, believing it to be the most useful way to explain the subject. It’s useful to them because that’s how they learned things. It’s not useful to students who are learning one model to explain a different reality.

7.1.8. History

After the major standards bodies (ISO and CCITT) published the first drafts in 1977, academics started referencing the model in their papers. Partly because they (falsely) believed it was theory, but mostly because they believed this is how future networks would work anyway. Even if you didn’t agree with the design, you’d end up having to work within the framework anyway. It was official.

When Andrew S. Tanenbaum wrote his “Computer Networks” textbook in 1981, he based the entire thing on the OSI Model, with chapters lining up neatly with each layer. He chose OSI for the reasons stated above, it looked like theory, and academics were using it as theory anyway. The various papers cited by the book all cited OSI.

Whatever the OSI authors themselves intended (a practical blueprint), it became treated as theory.

But while academics cited it, and pretended it was theory, they didn’t much use it. The Internet went in a fundamentally different direction. They were following the same pattern as they do today: cite OSI as the theoretical framework for things, then completely ignore it thereafter while violating that theory.

7.2. It’s not a framework

Summary: The seven layers are used as an ontological framework, with seven boxes to assign network concepts and protocols. Even though the top 3 layers don’t exist as layers on the network, they are still used as categories to stick things into. It was never intended to work this way, and really doesn’t, with things unhelpfully stuffed into arbitrary boxes for no reason.

7.2.1. What they teach you

The layers in the OSI Model serve two purposes. The first is as a specific design for protocols. The second is as some sort of taxonomy, hierarchy, classification system, category, framework, timeline, geographic location, or ontology. Concepts are assigned to layers even if there’s no intent for any protocol at that layer to handle them.

For example, there is no Presentation Layer #6 protocol, but nonetheless, people assign functionality to that layer, like encryption or compression. It’s a category.

A typical example of using it as a categorization system is the Wikipedia article on BGP that places it in the Application Layer #7.

Wikipedia is based heavily on such classification systems. Back around 2008, it was observed that if you clicked on the first link of an English-language Wikipedia page, you’d climb these hierarchies/frameworks until you got to the top, the article for Philosophy. This was called Getting To Philosophy[78]. It reflects this desire to assign everything to a hierarchy of knowledge.

The 7 layer model looks very much like such a system, but it really isn’t. For one thing, the real model assigns very little functionality to any layer. Most of how we now assign functionality to layers actually just uses TCP/IP as the model. Today’s Transport Layer #4 is almost never defined as OSI defines it, but as TCP defines itself.

Another example is how Wikipedia assigns functionality to layers.

For every function performed by the network, people try to assign it to the nearest possible layer – even if there’s no protocol or service at that layer which handles it.

7.2.2. Assigning functions to layers

The truth of networks is that most any function can happen at any layer.

Most of the functions we assign to Transport Layer #4, such as connections, fragmentation, flow-control, and retransmission are actually defined by OSI to happen at any layer. What makes this layer unique is that these are done end-to-end, not what these things are.

What we see as functions assigned to layers comes from reinterpretation. Functions of the Ethernet and TCP/IP protocols are assigned to the nearest OSI layers, and then the OSI layers are defined according to the functions of Ethernet and TCP/IP. Since TCP supports those features (connections, fragmentation, flow-control, retransmission), and we assign TCP to be Transport, we therefore claim those are the functions of the Transport Layer #4.

The real functions defined in layers are actually very small. For example, the OSI’s real definition of Transport Layer #4 is that it’s end-to-end, where the ends are defined as processes running on the two computers. That’s it, there’s nothing more.

The same is true of any layer. OSI assigns almost nothing to any particular layer, most of the functions modern texts assign to specific layers were handled by many OSI layers.

The minimal functionality of OSI is:

  • Application Layer #7: nothing
  • Presentation Layer #6: negotiate a common way of formatting things (not actually doing any formatting)
  • Session Layer #5: handle simplex or other constrained links for dumb terminals
  • Transport Layer #4: end-to-end between processes
  • Network Layer #3: packets from computer-to-computer through many hops, along a path
  • Data Link Layer #2: packets across a local link, between two hops
  • Physical Layer #1: transmission of bits aka signaling

That’s it. That’s the framework, only a few pieces of functionality. The other 99% of functionality in the stack doesn’t belong to a layer.

Using these 7 layers as a framework means functions are arbitrarily assigned to boxes. This is not an aid to learning but an impediment, because it relies upon rote memorization rather than reason. In other fields, such frameworks have reasons for assigning things to boxes: if you didn’t know which box something belongs in, you could reason it out by understanding the concepts. With the networking stack, you can only rely upon rote memorization.

Test givers love rote memorization because it’s easy to test[79]. But it doesn’t help the student.

Much of this assignment isn’t functions so much as words. The function relaying happens at any layer. We just call it routing at #3, bridging at #2, repeating at #1, and gatewaying at other layers. We pretend these are different functions, in order to justify different categories, but they aren’t.

In our chapters on alternatives to the OSI model, we show network layers, but in a way that makes it impossible to assign specific functionality to them. There are some things you could assign to the bottom of a network, and some things to the top, but there are no functional layers.

7.2.3. Assigning protocols to layers

In much the same way that functions are assigned to layers, so are protocols. That’s shown at the top of this section where BGP is assigned to the Application Layer #7.

The problem is that things like BGP are assigned to different layers, depending upon whether you use their protocol layer or function layer.

BGP transfers the network map between routers so they know which direction to relay packets. Thus, by function, it belongs to the Network Layer #3. However, BGP runs over TCP, and traditionally, protocols above TCP are assigned to the Application Layer #7. There is confusion about which layer to assign it to, and nobody really agrees.

We see the same problem with DHCPv6 vs. NDP. They have the same function, configuring the Network Layer #3 router address (and other details). But DHCPv6 runs over UDP, which classifies it as an Application Layer #7 protocol. In contrast, NDP runs inside of ICMP, which is traditionally a Network Layer #3 protocol. So even though they provide the same functionality, they are assigned to different layers.

Thus, framework categories don’t match real-world protocol stacks.

7.3. Top 3 layers are fiction

Summary: Many courses/textbooks ignore the upper 3 layers, acknowledging they are fiction, but others still pretend they exist.

7.3.1. What they taught you

The OSI Model has 7 layers. However, only the bottom 4 exist in any reasonable form on today’s Internet. This poses a problem for teachers. They choose one of two strategies:

  • Ignore the upper 3 layers and focus on the bottom 4.
  • Make something up to fill the upper 3 layers.

Responsible teachers choose the first. This figure shows what’s in two popular college textbooks, how they focus on the bottom 4 layers.

Other texts just make up fiction to fill these layers. A good example is the CISSP Guide[80], which teaches the full model.

Since they are not following any standard, what texts manufacture for the upper three layers is inconsistent. One text teaches one thing, another teaches a different thing.

7.3.2. Wrong anyway

It’s not simply that the upper 3 layers are fiction, it’s also that they teach the wrong concepts.

For example, while they acknowledge that the Presentation Layer #6 doesn’t exist as a distinct protocol or layer, teachers nonetheless claim the functionality exists, that this category handles encoding/representation. As we explain below, even replacing this layer with a category still doesn’t make it true. The network stack doesn’t handle encoding/representation – it’s not a function of the network.

7.4. Lower 4 layers are inaccurate

Summary: What OSI teaches is close, but not accurate. For example, today’s layer #2 is a local network and not a local link.

Summary: They teach that the bottom 4 layers are part of a single network, each performing a different function. This is how mainframes worked. This isn’t how the Internet works.

7.4.1. What they teach you

While the upper 3 layers of the OSI Model are fiction, the bottom 4 layers roughly map to Ethernet MAC+PHY and Internet TCP/IP.

OSI            Reality   My Model
#4 Transport   TCP       Internet
#3 Network     IP        Internet
#2 Data Link   MAC       Local Networks
#1 Physical    PHY       Local Networks

Sometimes teachers get weird and try to describe originalist OSI terminology (which is bad[81]), but most of the time, they just use Ethernet/Internet terminology (which is better).

7.4.2. It still doesn’t match Ethernet/Internet

One problem here is that Ethernet and Internet don’t actually match the original definitions of these 4 layers, the ones created back in the 1970s. To make things fit, they lie about what OSI says, and lie about how Ethernet and the Internet work. They have to fudge both sides to meet in the middle.

The biggest lie is pretending that Ethernet isn’t a network. Ethernet is a full-fledged network that forwards packets hop-to-hop based upon the destination address. That’s what happens inside an Ethernet switch. It’s what happens immediately on the other side of that Ethernet cable you plug into your desktop.

They create a language to promote this untruth, such as calling relaying “bridging” when Ethernet does it and “routing” when the Internet does it. The truth is that it’s the same functionality with only minor differences[82]. What you call it doesn’t change what it is.
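
To make the point concrete, here’s a toy sketch in Python (entirely my own illustration, not from any standard) of the relaying loop inside an Ethernet switch. Swap MAC addresses for IP addresses and the learned table for a routing table, and you have the skeleton of a router:

  # Toy learning switch: relaying by destination address, same as a router.
  mac_table = {}  # learned mapping: MAC address -> switch port

  def on_frame(src_mac, dst_mac, in_port, all_ports):
      mac_table[src_mac] = in_port          # learn where the sender lives
      if dst_mac in mac_table:
          return [mac_table[dst_mac]]       # relay toward the known destination
      return [p for p in all_ports if p != in_port]  # unknown destination: flood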

Likewise, the TCP/IP Internet is an internetwork, not the network envisioned by the OSI Network Layer #3. Again, you have to lie about what they originally meant in order to expand this layer to include the Internet.

It’s certainly true the Internet runs on Ethernet, but showing the details together in one model is the wrong level of abstraction.

The correct level of abstraction is to define the Internet on Local Networks (like Ethernet). We can then expand each of these, showing how each has sublayers performing the same functions.

In this way, the upper 2 layers are not entangled with the lower 2 layers.

7.4.3. Entanglement vs independence

This problem of entanglement is the biggest misconception I see in the professional community (IT workers, cybersecurity researchers, hackers). While they acknowledge that there’s independence between the Internet and local network, when they try to reason about things at the limits of their knowledge, they make the assumption that there’s dependency.

For example, somebody recently posted a video about how the IEEE 802 committee and IETF collaborate to define cyberspace, paraphrasing RFC 7241[83]. They missed the most important point: they don’t collaborate, but work independently from each other. They are best seen as competitors. The IEEE 802 committee standardizes local networks like Ethernet and WiFi, while the IETF creates Internet standards.

It’s not an accident that OSI created this misconception, that was its purpose. Their original intent was that these should work together as part of a single network blueprint.

That’s why the Internet model[84] was fundamentally better. It correctly teaches that the Internet is independent of any local network.

                     +-------------------------------+
                     |    Internet Protocol & ICMP   |
                     +-------------------------------+
                                     |
                        +---------------------------+
                        |   Local Network Protocol  |
                        +---------------------------+

That’s why in my Alternatives chapters I propose a simple two-layer replacement: the Internet layered on local networks.

Yes, each has sublayers. But the sublayers of one are unrelated to the sublayers of the other. They should not be entangled by combining the sublayers of each into a single model.

The diagram is staggered to show that they are independent, that the Internet can run over any links, even those that aren’t well behaved, like homing pigeons[85].

7.4.4. Rigidity

In the OSI model, there are only these 4 lower layers. They are highly rigid. This was the original design, because they were supposed to be 4 layers of the same network.

But the reality is that it rarely happens this way. On Internet backbones, there’s often another layer (like MPLS) between the local links and the Internet.

Teachers are constantly making excuses for OSI, saying it’s just a way of conceptualizing things even though the real world doesn’t work that way. But if we use the right level of abstraction, theory matches the real world. We should teach using a model that we can defend as “the real world really does work this way”.

An example is MPLS[86] used by ISPs to define their own network, underneath the Internet, but on top of local links. It might be modeled like the following. We now have three networks on top of each other, where addressing and relaying can happen in all three layers.

The same is true of VPNs. Conceptually, a VPN is just another “local network”.

My alternative model anticipates this. There aren’t 4 rigid layers of the same network, but a flexible system of heaping more and more networks on top of each other.

OSI is this way not because it was a useful abstraction, but because it was the original concrete intent. It wanted these bottom 4 layers to be exactly as stated, four parts of a single network, with no deviation. Reality diverged from this model almost immediately.

7.5. Layering

Summary: Things are layered in network stacks, but OSI’s conception of “layers” doesn’t exist.

7.5.1. What they teach you

Beyond enumerating the concrete layers, OSI teaches layering as an abstract concept.

For example, it teaches the following abstraction, where each layer is defined in terms of how it interacts with adjacent layers on this end, while at the same time talking to peer layers on the remote end.

This is a fair description, but no layer actually works purely this way.

7.5.2. Things are layered, but there are no layers

The most important principle of network stacks is that things are layered on top of each other. But this doesn’t mean there are inherent layers, that something belongs to a certain layer.

One thing we like to layer is entire networks. The Internet is an entire network. In an ISP, it might be layered on top of MPLS, which is also an entire network. MPLS itself may be layered on local links, including Ethernet, which is an entire network.

Another thing that is layered is protocols. They certainly run on top of each other, but that doesn’t mean they belong to a certain place in a stack.

Consider the diagram of MS-RPC[87], an important Microsoft protocol. It’s layered on a variety of protocols, which are in turn layered on top of each other.

There’s no consistent framework here. “Things” are simply layered on top of other “things”.

Some might claim that IPv4 and IPv6 are alternative “protocols” for the same “layer”, but that’s not really true. They are really two versions of the same protocol, too tightly integrated with the rest of the Internet stack, like TCP, UDP, ICMP, DHCP, DNS, and so on, to be swapped as independent modules.

The only places where you can actually declare a layer is at the abstraction level of an entire network, like the Internet layer or the Ethernet layer.

The upshot is that we don’t have generic ontological categories layered on each other. What we have is very practical protocols layered on each other, or whole networks layered on networks.

7.5.3. Layer independence

An important concept of layers is that they are independent from each other.

This is supposed to be OSI’s most important lesson, yet it fails to teach this because all its layers are part of the same integrated stack. While you are using OSI to pretend the layers are independent, the student is still hearing the truth that they are dependent. Every time you say a layer is responsible for functionality, you are declaring dependency among the layers, that one layer depends upon another layer fulfilling that duty.

Students usually learn this (wrong) lesson, and believe that everything in the stack is interrelated and coupled with each other.

That’s why this document’s alternative is better. The Internet and local networks really are completely independent from each other. The Internet doesn’t expect anything from the local network, other than packets going from one hop to the next. The local network technology, like Ethernet, doesn’t expect anything from protocols running on top.

But at the same time, the sublayers aren’t independent, within a network. TCP is highly dependent on IP, for example. This document is not going to attempt to teach independence here that doesn’t exist – while it’s very useful to sometimes think of these things alone, especially how IP datagrams are routed independently, at the same time, we have to acknowledge the dependencies.

That’s why the model of this paper proposes drawing boundaries around lines of independence, namely the Internet layered on top of local networks. Then, within each of these “layers”, explain the sublayers that are much more tightly bound to each other.

This high degree of coupling, this interdependence, isn’t an accidental flaw of the OSI model, but what it’s intended to teach all along. The original mainframe networks of the 1970s had layers that were all tightly coupled to each other. With mainframes, a single entity would control the entire unified network, whereas with the Internet, no such central control exists.

7.5.4. Too many layers

A popular defense for the OSI Model is that it doesn’t have to be flawless as long as it’s useful. But it’s clearly not as useful as people think. Students struggle to learn even the names of all 7 layers, much less whatever lessons are supposed to be taught with each layer.

A popular solution is to use mnemonics built from the first letter of each name, like “Please Do Not Throw Sausage Pizza Away” or “Please Do Not Tell Sales People Anything”. These match the letters PDNTSPA, the first letters of the 7 layers from Physical #1 up to Application #7.

The fact that students resort to such mnemonics demonstrates that OSI adds an unnecessary cognitive burden on the student.

Compare this to my simplified 2 layer model (Internet on local networks). The advantage is that every student can memorize it: it’s just the Internet layered on local networks. It’s not merely true, but also helpful.

7.5.5. Fixed layers

OSI pretends there are exactly 7 layers, like pretending that every building has exactly 7 floors, and that each floor has a purpose.

There are indeed floors with purpose. For example, buildings typically have one or more mechanical floors[88]. A lot of buildings place this on the 13th floor, because of superstition. There are likewise penthouses and basements, with obvious locations near the top or bottom of the building.

But there’s really no definition of exactly where such floors must be. A building may have a basement, then a sub-basement. Or, there may be floors above the penthouse.

The Internet doesn’t have a fixed number of layers. As your packets travel through the Internet, the number of layers below them changes dramatically. Likewise, the number of layers on top changes depending upon application. Then there are things like VPNs that layer the Internet on top of itself.

A building’s floors are stacked on each other, and network protocols are stacked on protocols. But there’s no class of floors or class of protocols that are stacked on other classes.

7.6. Packet headers

Summary: They teach that packets descend layers, with each layer adding a header. It sometimes looks that way in a packet-sniffer, but is not the correct abstraction. Instead, a protocol encapsulates something else inside its payload.

Many students have an epiphany when they see a packet-sniffer for the first time. Using such a tool they can see the packet headers that match layers.

The following is a picture from Wireshark, a packet-sniffer that captures packets from the local network. There are three panes to this display.

  • The top pane shows a list of packets that have been captured.
  • The bottom pane shows the raw hexdump of the packet.
  • The middle pane “decodes” the hexdump, showing what the raw bytes mean, such as which correspond to the IP header.

The following is a hexdump of the same packet as above, but I’ve colored each of the headers: Ethernet, IPv4, TCP, and then the SSL (TLS) Record header.
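
As a rough sketch of how those header boundaries fall out of the raw bytes (my own illustration in Python, assuming a plain Ethernet/IPv4/TCP frame with no VLAN tags):

  def split_headers(frame: bytes):
      eth, rest = frame[:14], frame[14:]    # Ethernet header is a fixed 14 bytes
      ihl = (rest[0] & 0x0F) * 4            # IPv4 header length, in bytes
      ip, rest = rest[:ihl], rest[ihl:]
      doff = (rest[12] >> 4) * 4            # TCP header length ("data offset")
      tcp, payload = rest[:doff], rest[doff:]
      return eth, ip, tcp, payload          # payload carries the SSL records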

OSI describes this in terms of the packet descending the network stack on one end, with each layer adding its own header. The process is reversed on reception, with each layer stripping off its own header before passing onto the next layer above.

Figure from CISSP Guide

This is a bad way to conceptualize what’s going on. This isn’t what’s happening in theory, and only occasionally works this cleanly in practice.

In particular, TCP’s payload is a stream of bytes, not packets. When a TCP payload is small and fits into a single segment, it appears to fit the above figure. But other times, a single TCP segment may carry multiple payloads, or a payload may span multiple TCP segments.

We see both happening in the Wireshark capture below. SSL (TLS) transmits data as a series of records. We see that a single TCP segment can contain several SSL records, and an SSL record can span multiple TCP segments.
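
To make the framing concrete, here’s a minimal sketch in Python (mine, not Wireshark’s logic) of walking SSL/TLS records in a reassembled TCP stream. Nothing in it knows or cares where the packet boundaries fell:

  import struct

  def iter_tls_records(stream: bytes):
      offset = 0
      while offset + 5 <= len(stream):
          ctype, version, length = struct.unpack_from("!BHH", stream, offset)
          if offset + 5 + length > len(stream):
              break                     # record continues in the next TCP segment
          yield ctype, version, stream[offset + 5 : offset + 5 + length]
          offset += 5 + length          # records pack back-to-back, ignoring packets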

In the packet below, we see five SSL records in a single TCP/IP packet. Four of these records are wholly contained within the packet, while the fifth spans the end of this packet and the start of the next.

To draw this with the above diagram, things look like the following:

This misconception has real-world importance. Take the Snort IDS[89] as an example. It tries to detect hacker activity by searching packets for signatures – a pattern of bytes unique to a specific attack. It treats each TCP segment as containing only one SSL record, and can’t detect attacks where this isn’t true.

The keyword “depth” is measured from the start of a packet payload rather than the start of the stream. For example, the following is a signature for the famous OpenSSL Heartbleed attack, matching an SSL Record header of |18 03 03 00 40| in the first five bytes of the packet (depth:5 keyword).

alert tcp $EXTERNAL_NET any -> $HOME_NET [21,25,443,465,636,992,993,995,2484] (msg:"SERVER-OTHER OpenSSL TLSv1.2 heartbeat read overrun attempt"; flow:to_server,established; isdataat:68; isdataat:!69; content:"|18 03 03 00 40|"; depth:5; metadata:policy balanced-ips drop, policy max-detect-ips drop, policy security-ips drop, ruleset community, service ssl; reference:cve,2014-0160; classtype:attempted-recon; sid:30525; rev:4;)

But, as we saw above, an SSL Record header can appear anywhere in the TCP payload, not merely at packet boundaries. TCP does not have a concept of packet boundaries; packets are something only we humans see. The following is a packet exploiting Heartbleed that doesn’t trigger the above signature, because it inserts a 7-byte TLS record in front, pushing the second record containing the attack deeper into the packet, beyond the depth:5 limit.
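
In raw bytes, the trick looks something like this sketch (illustrative only; the exact exploit body doesn’t matter, just the benign record in front):

  # A 7-byte benign TLS record: 5-byte header plus a 2-byte alert body.
  benign  = b"\x15\x03\x03\x00\x02" + b"\x01\x00"
  # The record the rule matches on, with a placeholder 0x40-byte body.
  attack  = b"\x18\x03\x03\x00\x40" + b"\x00" * 0x40
  payload = benign + attack
  # The signature bytes now begin at offset 7, outside the depth:5 window.
  assert payload.index(b"\x18\x03\x03\x00\x40") == 7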

In today’s network stack, this model of adding headers to packets really occurs only in two spots. One is where the kernel adds a combined TCP/IP header, and the other is where it adds an Ethernet header.

This is consistent with the alternative two-layer model, that the Internet runs over local networks like Ethernet.

Even though we see a succession of headers in a packet, the actual layer boundaries don’t quite fall on the lines that we appear to see. They are an accidental artifact rather than designed to work like this.

7.7. Local-only visibility

Summary: The OSI model pretends all the layers work together as a single network, that everything conforms to the same model. In truth, the local network only extends as far as the local Internet router. After that point, anything about the local network is stripped off, and we have no visibility into how the rest of the network is constructed.

The OSI standard includes the following diagram. Versions of this are used nearly everywhere, such as the cover of Tanenbaum’s 1981 textbook. This misconception teaches that the layers are the same throughout the network. This was the original intention. Firstly, OSI was designed for private networks, within an organization, where both ends were controlled by the same entity. Secondly, OSI really intended that every stage of the network would conform to OSI.

The following diagram is from a college textbook, showing all the parts integrated as one whole.

Jim Kurose, Computer Networking: A Top-down Approach, 7th edition 

The truth is that we only have visibility into our local network, or into what happens on top of the Internet. The details of what exactly happens to our packet are opaque to us. I redraw the above model where the left-hand side conforms to our network, but the rest of the network is unknown, sometimes with fewer layers, sometimes with more.

Figure: diagram from the original OSI standard, changed to highlight independence

People imagine that an Internet packet contains all 7 layers. It doesn’t. It contains only the Internet Protocol portion. Any headers and trailers[90] that the local network adds are not part of the Internet packet.

I encounter this misconception a lot. For example, one text describes a packet passing through a router as “changing the Ethernet addresses”. They used a packet-sniffer on both sides of a router and saw the same packet, just with different Ethernet addresses. But in fact, the two Ethernet headers were unrelated – the header was stripped from the incoming packet, then a new one was added to the outgoing packet. That they looked similar was just coincidence; had the two sides used different local network technologies, they wouldn’t have looked the same.

This reflects the fundamental issue of the OSI Model making things too transparent. We are not supposed to know what happens on the other side of a router, yet it shows us. This leads to misconceptions.

7.8. Control protocols

People learn layers in terms of how data is transmitted, but there’s a lot of confusion about how control information is transmitted.

  • We need DHCP to configure IP addresses.
  • We need ICMP for control messages.
  • We need ARP and NDP to find MAC addresses on Ethernet.
  • We need DNS to translate names to addresses.
  • We need a sublayer in Ethernet to auto-negotiate the speed of the link.
  • We need SYN and FIN packets in TCP to create and destroy the connection.
  • We need certificate management in SSL that goes completely outside the network stack.
  • WiFi needs to associate with access points.
  • At all layers, sometimes we also need authentication for security.
  • At all layers, there’s a lot of network management going on, tracking usage for such things as billing and monitoring.

How data is transmitted is the easy part, control is the hard part.

It’s thoroughly misrepresented when trying to apply the OSI model to TCP/IP. Even though things like DHCP run on top of UDP, they aren’t part of the “application layer”; the function of DHCP is solidly part of the IP protocol. The same applies to BGP: even though it runs across TCP connections between routers, it’s solidly part of the IP “layer”.

7.9. Ours is not the original OSI Model

Summary: The original model referred to IBM SNA mainframes and telecom X.25, whose concepts don’t apply to modern networks like the Internet and Ethernet. What they are teaching you is a retconned version of OSI.

The OSI Model is officially described in either ITU X.200[91] or ISO/IEC 7498[92]. It’s the same standard, published by different organizations. ISO focused on standards for computers, while ITU focused on standards for telecommunications. Since this is “computer telecommunications”, they worked together on a common standard.

When your professor gives you reading materials, they don’t suggest you actually read these documents. They are largely incomprehensible. The problem is that modern students don’t have the historical context to understand any of this. Techies still use serial links and terminal-like command-lines, but these have none of the issues that plagued early networks.

Instead of using the original, professors use a retconned version of the model.

As repeated elsewhere, “retcon” means “retroactive continuity”. It’s like how in the second Star Wars movie, Darth Vader was changed to now be Luke’s father, despite the original movie saying Vader killed Luke’s father.

There is no standardization among the retconned versions. Every professor and textbook makes up their own version of what they think the OSI Model should’ve said.

There’s some consistency among the lower four layers. They pretend the lower four layers match the two layers of Ethernet (MAC+PHY) and the two protocols of the Internet (TCP+IP).

There’s no consistency among the next two layers, Session #5 and Presentation #6. Good textbooks ignore these, but most textbooks make up crazy things that don’t match anybody else’s description.

The top layer, #7, is sort of a catchall. Even though it’s described in wildly different ways, nobody really cares, because it’s simply “everything else”.

This text has a chapter on the History of networking that tries to give a background for understanding the original OSI in its original context. The more you learn about what OSI really meant, the more you realize it doesn’t actually describe today’s networks.

7.10. Standard

Summary: They aren’t teaching the original OSI standard, but their own non-standard interpretation. In any case, it’s not a standard that people should follow.

As mentioned above, the OSI you learn in class is not the actual standard, but your professor’s or textbook’s interpretation of what they think the standard should say. They present it as “here’s what OSI Model says” when the truth is “here’s my model”. They are trying to pass off their own model of networking as the official OSI Model.

The consequence is that nobody is following the original standard. Sure, they cite the standards ITU X.200[93] or ISO/IEC 7498[94], but they don’t expect that you’ll ever read them and discover that what the standard actually said is not what they claimed the standard said.

But the real misconception here is that people feel we should follow official standards, that even if current networks don’t comply with the standard, we should change them so they do.

In other words, it’s not the OSI standard that’s wrong, but the real world of the Internet and Ethernet.

We are experiencing this with Tesla’s NACS, or North American Charging Standard. It wasn’t the official standard, but simply Tesla’s custom connector. But since most electric cars in North America are Teslas, as are most charging ports, it’s still the de facto standard. In the summer of 2023, competing car makers and charging networks started announcing a switch to Tesla’s de facto standard, away from the official de jure CCS1 standard backed by the Biden administration. Only after all this did the official SAE standards organization start crafting an official standard around NACS.

When SSL was first developed in the 1990s, there was a lot of debate from people rejecting it because it didn’t fit the model. It wasn’t just that it was in the wrong layer, but it didn’t conform to the X.800[95] Security Architecture either.

It wasn’t a rational debate; there’s no reason why SSL should conform to any OSI standards.

In the end, SSL did adopt the X.500 series for certificates (X.509). Though this is pretty much nominal, as the certificate standard has since been changed to fulfill the needs of SSL primarily, with everything else a secondary consideration.

It’s hard to argue that something is false when a standard says everyone has agreed it’s true. But it’s like standards claiming the value of π (pi) is 22/7[96]. Such standards argue that this approximation (3.1428 vs. 3.1415) is good enough, and thus it’s “truth”. But sometimes it’s not enough. NASA’s standard for π is how it’s represented in a 64-bit floating-point number, or 3.141592653589793[97], precise enough that the error for the most distant objects in our solar system will be less than a centimeter. But for other needs, even that’s not precise enough, and you shouldn’t use it just because it’s the official “standard”.

This discussion is not just political but technical. There’s a widespread misconception that packets can’t appear on the network because that would violate the standard.

What happens when you put a SYN and FIN flag in the same TCP packet? In some people’s minds, that can’t happen, because it would violate the standard[98]. But of course it can: you can craft the packet yourself, send it to a target, and see what happens. Port scanners do this because different targets respond differently, so you can fingerprint them based upon how they respond.
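
A minimal sketch of crafting such a packet, using the third-party Scapy library (requires raw-socket privileges; the target address below is a placeholder):

  from scapy.all import IP, TCP, sr1

  # "SF" sets both the SYN and FIN flags in a single TCP packet.
  probe = IP(dst="192.0.2.1") / TCP(dport=80, flags="SF")
  reply = sr1(probe, timeout=2)     # send it, capture whatever comes back
  print(reply.summary() if reply else "no response")
  # Different stacks answer differently, which is what fingerprinting exploits.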

OSI was never a theory or a correct framework. Part of the reason people treated it as such is that it was the standard. In other words, if everyone had adopted the standard, then it would have become the theory and framework. The academic papers and early textbooks around 1980 are wrong because nobody actually adopted that framework; they would’ve been right if everybody had.

But OSI was a failure. The original plan was never finished, having been made obsolete by the Internet. Despite the fact that the official standards documents exist, it’s a defunct standard. It’s a lie, not truth. No matter how many standards organizations insist it’s the truth, it’s still a lie.

7.11. The Application Layer #7

This layer is also described in Chapter 3 (consensus description) and Chapter 8 (historical description).

Summary: It was never clearly specified whether layer #7 was inside the stack or outside the stack, where human users fit into this, and so on. It ends up being all of this and none of this.

OSI never figured out what the Application Layer #7 was supposed to mean, and we still live with this confusion. It was there because IBM’s SNA (and competitors) put something there. It’s largely a catchall, where you put things you can’t assign to any other layer, but there’s no theoretical description of this layer.

The question is whether applications live inside the Application Layer, as part of the network stack, or whether applications live on top of the network stack, outside it.

In other words, are programmers expected to learn how to implement network concepts when building applications? Or are they expected to learn how to use underlying service/library APIs that already implement the concepts?

Take SMTP (email) and HTTP (web) as an example.

When writing email software, programmers typically implement SMTP themselves. SMTP thus largely lives on top of the network stack, outside it.

When writing web software, programmers typically use libraries (or services), like OpenSSL, NodeJS, nginx, and so on. HTTP and SSL thus live as part of the network stack, inside it.

A completely different way of looking at the problem is that the lower 6 layers of OSI are meant to hold interchangeable protocols. It’s like how you can exchange TCP for UDP, or exchange IPv4 for IPv6. That’s the fundamental principle behind an OSI layer: not a place to assign concepts, but a place to assign alternate standard protocols.
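
In code, swapping transport protocols really is that mechanical. A sketch using the standard Sockets API:

  import socket

  tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # TCP
  udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # UDP
  # One constant swaps the transport protocol. There is no analogous
  # constant that swaps one application protocol for another.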

You can’t really exchange “application” protocols for each other. You can’t exchange SMTP for HTTP, for example. You want either an email application or a web application, and can’t implement a web browser using SMTP.

All Network Layer #3 protocols should do roughly the same sort of thing, while Application Layer #7 protocols should do completely different things.

The Application Layer has become the Miscellaneous Layer, the box where everything is assigned that doesn’t fit within the network proper. This is just confusing and weird.

In our “Alternatives…” chapters, we make this distinction clear, differentiating between things that use the network stack (payload) vs. things that are part of a network stack (services or networks).

7.11.1. Service Elements

To some extent, OSI did eventually intend that this layer would be inside the stack, defining things that would be used by apps running on top of the stack.

For example, the Association Control Service Element, or ACSE, was designed to handle user logins for all the applications. (What most people call sessions, OSI actually called associations.)

In addition, if your app needed to transfer files, it was supposed to use a standard protocol like FTAM, or File Transfer, Access and Management.

But as noted above, this is just a catchall. The 6 layers below this had a set of otherwise equivalent/analogous protocols. But in layer #7, there’s a whole suite of unrelated things.

On the Internet, this sort of thing never appeared. Each protocol like SMTP or HTTP handles everything itself. There’s no suite of layer #7 components from which you build a single app; you build the app based on one protocol.

7.12. The Presentation Layer #6

This layer is also described in Chapter 3 (consensus description) and Chapter 8 (historical description).

Summary: This layer doesn’t exist. There is no protocol that implements this, other than a defunct official standard that was only ever used in a functionless mode.

Summary: This category doesn’t really exist, either. Representation/encoding is not a property of the network as envisioned in the 1970s, but a property of the data.

Back in 1977 when the OSI model was first conceived, computing was dominated by remote terminals connected to mainframes. There were many different types of terminals, with different capabilities and different manufacturers. They had different control codes[99] for doing things like placing the cursor at a location on the screen. They had different character-sets, most famously IBM’s EBCDIC vs. the standard ASCII, but also different 8-bit extensions to the 7-bit ASCII. Even 5-bit Baudot was still in use.

Before networks, networks didn’t exist (sic)[100]. These days, people assume that everything will be in a standard format, like JPEG or PDF. But why would they? Data wasn’t transferred between computers. The entire lifetime of data would be on that one computer, from creation, through use over many years, to eventual destruction. The only software that would read the data would be the same software that wrote the data. The only exception was the aforementioned text, when a computer would send data to a teletype to be printed out.

Thus, computers were free to represent data any way they wanted. Every computer represented data differently internally – because there would never be a need for external representation. Some computers didn’t even have 8-bit bytes but had things like 18-bit words.

To network different computers together would thus require translating one format to another, such as from a machine with 36-bit words using EBCDIC to a device with 8-bit bytes using ASCII. This was naturally seen as the role of the network, an inherent feature of networking. And this is what teachers teach is the function of the Presentation Layer #6.

But today, data never lives on just one computer, it lives on the network, and is passed from computer to computer, often radically different computers. I’m drafting this on a macOS laptop, but you are probably reading it on a Windows desktop or an Android phone.

Except for some legacy “internal representation” concerns (like “locale”), data has a consistent external representation. A PDF file has the same format regardless of machine. IBM mainframes have a radically weird character-set called EBCDIC and Windows machines have internal character-sets called codepages (along with 16-bit Unicode). But none of this changes how PDF files are formatted, it’s the same file regardless of platform.

The key lesson is this: data representation is a feature of the data, not of the machine or the network. It’s not a layer in the network stack, and it’s not even a category in an ontological framework. Everything you learn about data encoding in your network class is based upon outdated 1970s thinking.

This broken thinking is still with us. For example, Apple created their own programming language called “Swift”. It’s very good. Somebody decided to port this language to IBM mainframes, and of course they included charset translations, such that source files would be translated to EBCDIC automatically. This is nonsense. Swift uses the UTF-8 character-set; it’s a property of the source file, not of whatever platform is being used to compile it.

Terminals and character-sets were the original Presentation Layer issues, but people soon tacked on compression and encryption. These are humorously not even the same kind of thing. This is the flawed thinking that starts with a very specific goal, then unreasonably tacks on everything else that sounds even vaguely similar.

In the original mainframe networks, you needed to negotiate terminal codes and character-sets. It was a hard requirement that affected almost all communications at the time. It was reasonable for them to stick this into a layer.

But none of this thinking applies to either compression or encryption. Indeed, the X.800 Security Architecture standard points out that these sorts of things can be done in any layer.

Originally, the Presentation Layer didn’t even do the translation itself. Its purpose was simply to negotiate a representation acceptable to both parties. It was the responsibility of the application to actually do the translation. The correct answer to “which layer encodes data” is “Application Layer #7”, not “Presentation Layer #6”. But to pass the test, you’ll have to give the wrong answer.

They did get around to defining an official Presentation Layer protocol (X.216[101] and X.226[102]), but it was never used as such. The protocol was complex, because any feature from layers #2 through #5 had to be controllable in the interface to layer #7. But all this protocol adds to the rest is simply negotiating an acceptable transfer syntax.

In practice, either it negotiated something predetermined (like ASN.1) or didn’t negotiate anything specific at all. I spent a lot of time with protocol analyzers in government-regulated industries (like the power-grid) and I never found any practical negotiation.

In particular, the protocols they eventually standardized in the 1980s didn’t even do what they were originally intended to do in the 1970s, which was to negotiate character-sets and terminal-codes. The FTAM and VT protocols negotiated the use of ASN.1 to represent their protocols, and then did their own things to deal with character-set encodings and terminal control codes.

The upshot is this: every time they mention the Presentation Layer #6, it’s a lie. Even when teaching the original intent, it never actually turned out that way with official OSI protocols. In the modern world, the concept doesn’t even work, because data encoding is not a function of the network.

7.13. The Session Layer #5

This layer is also described in Chapter 3 (consensus description) and Chapter 8 (historical description).

Summary: The Session Layer was never defined to manage sessions.

Summary: What we call “sessions” today is what the OSI model called “associations”.

Summary: Some recent texts have started confusing the Session Layer with the Transport Layer.

A surprisingly large number of tests ask students to answer that “the Session Layer handles sessions”. It’s surprising firstly because this is so obviously true[103], just look at the terminology. Why ask such obvious questions? Has any student ever gotten this wrong?

It’s surprising secondly because it’s actually false: according to the original OSI documents, the Session Layer didn’t manage sessions. The right answer for the test is still to say this layer handles sessions, but it’s the wrong answer in reality. It’s the application that manages the session, according to OSI.

Nobody really knows what the Session Layer #5 actually does. Your teacher likely didn’t know. When you pass on your knowledge to students, you won’t know. When students come to you with questions, you’ll become vague and handwavey, describing things in even less defined terms, admitting things don’t work precisely this way. It’s like a dog who comes to you with a tennis ball when you don’t want to play, so you throw it as far away as possible to frustrate the dog. The further away you throw the ball, the more peace you’ll have. In much the same way, instead of straight answers to students asking “what is the Session Layer”, you try to answer in a way that sends the student away so confused they can’t think of another question to ask.

The reason for the confusion is that none of these words have any specific meaning. When OSI named this layer, it had nothing to do with sessions in general, but with very specific issues connecting a terminal to a mainframe. They could’ve called this specific thing by any of a number of synonyms: Connection Layer, Channel Layer, Discussion Layer, Dialog Layer, Interaction Layer, Association Layer, Intercourse[104] Layer, Relations Layer, and so on. All of these words would’ve sufficed for their purpose, Session is an arbitrary choice.

The functionality of the Session Layer was a political compromise. By 1977, when the OSI Model was drafted, telecom X.25 was up and running with teletex/videotex services. These were extremely dumb terminals, often with a CRT little different from a television set (meaning 40 columns by 20 lines). They had severe limitations.

For example, these extremely dumb terminals were often connected with half-duplex lines, meaning they could either send or receive, but not both simultaneously (which would be duplex). Therefore, the Session Layer would allow both sides to negotiate the fact that a half-duplex line was being used, then provide synchronization primitives so that each side knew which mode they were in, whether they were currently receive-only or send-only[105].

The official documents call this dialog management. That’s what the original Session Layer #5 actually does, manages dialog, where “dialog” doesn’t mean any generic dialog, but this specific limitation of network links and terminals of the 1970s.

The OSI wanted to make the existing telecom X.25 network an acceptable part of their future standards, so they included this as a layer.

Of course, it was obsolete almost immediately. 8-bit microprocessors were plummeting in price, such that by the time OSI was officially standardized, nobody wanted half-duplex communication anymore. The biggest teletex/videotex network would be France’s Minitel in the 1980s, and it never really needed the features of the Session Layer. And of course, nobody else did, either.

It would exist as a protocol in the official OSI stacks, but it was a protocol that neatly removed itself. In practical use, where government regulators have forced industries (like the power grid) to use OSI protocols, the features of the Session Layer (like synchronization, checkpointing, simplex operation) aren’t used. It’s technically true that it exists, in that every connection includes a Session Layer negotiation at the start, but it’s not used after that. It’s a vestigial protocol.

So we are forced into this weird situation where we have a Session Layer #5 in the official OSI 7 Layer Model diagrams that does not now exist, never existed, and nobody can explain what it’s doing there.

But this hasn’t stopped people redefining it. They now imagine anything plausibly called a “session” now belongs to this layer. Even “connections” that should just be a feature of the lower Transport Layer #4 are now sometimes claimed to be managed by this layer.

NetBIOS exists on Windows networks. It has a feature it calls “session”. Therefore, many people put it in the Session Layer #5. But by “session” it really just means “connection”; its more correct location would be the Transport Layer #4.

Likewise, HTTP has “session” cookies and SSL has “session” tokens. Even though they use the word, neither are related in any fashion to the Session Layer #5.

This shows the problem that people use “session” to describe a lot of things, many of them conflicting. By “session” we don’t mean anything in particular.

We did have a sort of session back in 1977 with terminals connected to mainframes. A user could log onto an application on the mainframe, such as “payroll”, establishing a session. They could then leave that session active while switching to another session on a different mainframe for “accounts billable”. In OSI terminology, these are called associations, and they happen in the application.

The reality with “session” is that it’s not really a thing, that whenever you see it, it means something radically different. NetBIOS sessions, SSL sessions, and HTTP sessions all mean completely different things to each other. You can’t put them into the same category in your ontological framework. You certainly can’t put them into a single layer.

As a wholly separate discussion, it should be pointed out that some people have begun confusing the Session Layer #5 with Transport Layer #4. Since there is no real-world Session Layer protocol, and no real-world functionality that comes close to what was intended by OSI, people’s description of this layer mutates more than the other layers.

7.14. The Transport Layer #4

This layer is also described in Chapter 3 (consensus description) and Chapter 8 (historical description).

Summary: The general idea of transport is something that belongs on the top of the stack, not the middle. TCP and UDP are the top of the Internet network, and LLC is the top of the Ethernet network.

7.14.1. End-to-end

In the original definition of OSI, the only definition of the Transport Layer #4 is that whatever it did, it did it end-to-end.

People today associate this layer with the Internet’s TCP protocol, which creates a connection, segments data, sequences segments, retransmits lost data, handles flow control, and so on. So they define the Transport Layer #4 as the place where such functions are handled.

But in truth, OSI defines these as things that can be handled at any layer. Instead, what’s unique to this layer is that whatever is done, it’s done end-to-end.

This doesn’t even map to the Internet’s idea of end-to-end. In OSI, the lower Network Layer #3 establishes a connection between machines, whereas Transport Layer #4 creates connections between processes running on those machines.

7.14.2. Internet Protocol is connectionless

OSI was based upon IBM mainframe and telecom X.25 networks, both of which had a connection-oriented Network Layer #3. This means that the computers on either end create a connection between them. No matter how many different apps, or how many browser windows, only a single computer-to-computer connection would exist at Layer #3.

The purpose of Layer #4 would then be to subdivide that computer-to-computer connection into multiple endpoint-to-endpoint connections.

This is why OSI uses the word “multiplexing”, because it split the Network connection into multiple Transport connections. That’s also why it uses the word “optimization”, because this layer has to prevent one of those Transport connections from hogging all the bandwidth available to the Network connection.

The Internet Protocols IPv4 and IPv6 are connectionless. There is no established connection. Routers simply forward packets based upon the destination address, without any concept of packets being related to each other.

It’s only TCP that understands the concept of a connection. Each connection is independent. Each tries to grab all the available bandwidth, with congestion from one connection handled just like congestion from any other. In other words, whether there’s one computer in your home watching two streams of Netflix, or two computers each watching a different stream, either way, they are constrained by the same amount of bandwidth.

It’s always tedious when textbooks mention multiplexing and optimization as part of the Transport Layer. They don’t exist in the modern network.

7.14.3. Transport belongs on top

OSI puts transport somewhere in the middle of the stack (layer #4). The reality is that it describes things that belong at the top of the network.

Layer #2 provides transport services in something called LLC. But when we run the Internet on Ethernet, those services move up to TCP at layer #4 (no longer using LLC).

Likewise, when we use HTTP/3 (QUIC), transport features move upward in the stack (to layer #7) and we no longer use TCP.

The lesson is that as we add more layers to a network stack, transport functionality keeps moving upward.

We also saw this historically. ARPANET, from the early 1970s, was designed around TCP, a transport protocol at the top. The Internet Protocol (IPv4) was added underneath TCP. The network has a natural bottom (physical transmission) and a natural top (transport). As networks evolve, more and more stuff gets put into the middle.

7.14.4. The endpoint

Transport is end-to-end, but what precisely defines an “end”?

A lot of documents claim the ends are devices on the edges of an internetwork. For example, the CISSP Guide claims:

[Transport] establishes communication connections between nodes (also known as devices)

This is false. It’s not the device, but a process on the device. The word process means something like an app or a service.

Outside the computer, a connection is identified by its sockets, consisting of the IP addresses and port numbers on both ends.

Inside the computer, the socket is represented as a resource handle or file descriptor. Inside the operating-system (Windows, Linux, macOS) is a big table that matches sockets to processes.
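
This is easy to see from the Sockets API. A small sketch (the host name is a placeholder):

  import socket

  s = socket.create_connection(("example.com", 80))
  print(s.getsockname())   # local (IP, port): this process's end
  print(s.getpeername())   # remote (IP, port): the far process's end
  print(s.fileno())        # the descriptor the OS maps back to this process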

In short, transport is between endpoints within devices, not the devices themselves.

7.15. The Network Layer #3

Summary: OSI teaches that the network happens only in one place in the network stack. In fact, it happens in multiple places. The web is a network running on top of the Internet. Local networks like Ethernet are networks that run underneath the Internet.

Summary: The original OSI defined the Network Layer #3 as being connection-oriented. The Internet broke that model, using connectionless datagrams instead.

7.15.1. There are multiple networks in the stack

OSI teaches that the “network” happens only at one layer in the 7 layer stack. As our alternative model above shows, it’s a lie. There are many networks[106] stacked on each other.

Teaching OSI therefore demands teaching the lie that the other things are not networks. Every network class starts with pretending Ethernet is not a network, that the Internet is the only network.

7.15.2. The original Network was connection-oriented

This is described in the section above under the Transport Layer #4, as that layer was designed assuming this layer was connection-oriented.

Connection-oriented means that a connection must be established between the computers on either end, plus all the routers/switches/relays in between. The route/path a packet will take is negotiated ahead of time. Every packet between those two computers will then follow the same route.

Back in the 1970s, both IBM’s mainframe and telecom X.25 networks were connection-oriented. Even the original ARPANET TCP was designed this way.

The major issue was congestion. Establishing a connection first allowed bandwidth to be allocated ahead of time so that it could be guaranteed. For example, if you needed a guaranteed 64-kbps for a voice phone call, this could be allocated at the start of the connection. If the bandwidth isn’t available, then the connection fails at the start rather than halfway through, which was considered superior. If the connection succeeds, then it’s guaranteed not to be congested afterwards.

The alternative is a connectionless network, which simply forwards packets individually, without reference to a connection. Routers were therefore much simpler. They didn’t need to keep track of connections; they just had a routing table. Whenever a packet arrived, the router would relay it in the correct direction.
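
A toy sketch of that per-packet decision (my own illustration; real routers use far faster lookup structures than this longest-prefix scan):

  import ipaddress

  routes = {                                     # hypothetical routing table
      ipaddress.ip_network("10.0.0.0/8"):  "eth1",
      ipaddress.ip_network("10.1.0.0/16"): "eth2",
      ipaddress.ip_network("0.0.0.0/0"):   "eth0",  # default route
  }

  def next_hop(dst: str) -> str:
      addr = ipaddress.ip_address(dst)
      matching = [net for net in routes if addr in net]
      return routes[max(matching, key=lambda net: net.prefixlen)]

  # Each packet is looked up independently; no connection state anywhere.
  print(next_hop("10.1.2.3"))   # -> eth2 (longest prefix wins)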

Connection-oriented requirements, such as retransmitting packets due to congestion, would have to happen at a higher layer.

While ARPANET, IBM mainframes, and telecom X.25 all used a connection-oriented network, other groups built connectionless networks, namely Xerox with its XNS/PUP network, and French researchers with their CYCLADES network.

The ARPANET researchers were so inspired by this that around 1978 they changed their architecture, transforming the ARPANET into the Internet.

The consequence is that things like congestion are now handled in different layers than anticipated by OSI. In the OSI model (including IBM SNA, telecom X.25), congestion was handled at layer #3, but in the new Internet, congestion was handled end-to-end with TCP – or even higher, in apps.

It was a radical network design, and it never really worked right until almost a decade later, in the late 1980s. This is rather amazing – we look at this as some sublime, almost magical insight into an alternate network design. In truth, it took a lot of refining to make it work. If they hadn’t come up with solutions in the late 1980s, it’s likely we’d have the OSI Connection-Oriented Network Protocol as the basis of the modern Internet.

The OSI Model standards we have today have been updated, allowing for a connectionless Network. It’s part of the retconned effort that tries to fit the model to the Internet.

Note that the Internet still struggles dealing with congestion for phone calls, where sometimes packets are dropped and you don’t hear something from the other side. Despite such disadvantages, it’s still ultimately superior.

A good historical note is RFC 787[107], which details this conflict between the connection-oriented OSI and the connectionless Internet.

7.16. The Data Link Layer #2

This layer is also described in Chapter 3 (consensus description) and Chapter 8 (historical description).

Summary: This was designed to match SDLC, it never really applied to Ethernet or anything else underpinning the Internet.

The OSI Model intended the Data Link Layer #2 to describe point-to-point links. In other words, when it says “links” it actually means “links”, and not something like Ethernet.

In the beginning, links were dumb, with dumb devices (not computers) on either side, such as a teletype. Early cables could have as many as 25 wires for sending control signals out-of-band. The modern concept of network protocols is about sending control information in-band, alongside the data. This needs smart computers to parse the data, separating headers from payload. But when devices are dumb, this isn’t as practical.

Take flow-control as an example, where the receiver tells the sender to slow down. Today, this is done in-band, inside network packets that are parsed on the other end. Back in the old days, this needed to be a separate wire, communicating this state out-of-band (meaning, not on the data wires). A dumb device could then automatically slow down. Thus, a paper-tape reader could feed data to a printer – paper tape readers work faster than printers can print, so this control wire would signal frequently telling the paper tape reader to slow down.

With computers on either end of a link, this complexity of extra wires isn’t needed. You can just send such control information as part of the data stream.

IBM created a protocol called SDLC[108] to do this. With the invention of 8-bit microprocessors, devices that were previously dumb now became smart, so there was probably a smart device on either end of a link.

Ethernet never fit this model. It wasn’t point-to-point, and was instead more complex supporting hundreds of devices connected to a wire. Teaching OSI means lying about the truth, denying that Ethernet is a network and insisting it is instead just a link.

7.17. The Physical Layer #1

This layer is also described in Chapter 3 (consensus description) and Chapter 8 (historical description).

Summary: it’s not really a layer in the stack, but something that exists beneath the stack.

This is perhaps the only accurate layer, because it’s hardware. Hardware makers are much more concerned with following standards.

The only thing I think is wrong is the degree to which people underestimate the complexity here, that it’s much greater than all the rest of networking put together. Also, it’s not really a layer in the stack so much as the layer underneath the stack.

7.18. Exchangeable components

OSI was a standards group. The purpose of a layer was to define the boundaries of a replaceable component. The purpose wasn’t to provide a theoretical basis where functionality was supposed to be implemented.

In other words, the purpose of the Transport Layer #4 was to define the point at which end-to-end reliability happened. By this it means that users of the Transport Layer would see such reliability. It doesn’t mean that the Transport Layer handles reliability – it may be handled elsewhere in the stack.

This is what happened with telecom X.25. They handled connections and reliability in the lower layers. Therefore, it wasn’t handled in the Transport Layer protocol used with X.25.

It’s also what happens with ISOTP, which implements a shim between what applications expect from a transport protocol and Internet TCP. You don’t find the complete OSI stack anywhere today, but ISOTP over the Internet is common in highly regulated industries (like the electric power industry).

This idea of exchangeable protocols has a mixed history. It’s mostly false, because when you change protocols you almost always have to change everything on top. Using IPv6 instead of IPv4 requires that applications be rewritten to handle it.

On the other hand, there are some successes. When you do that rewrite, you can do it in a way that makes the code agnostic to whether IPv4 or IPv6 or IPfuture is being used. Instead of adding IPv6-specific features to the Sockets API, they added IP-generic features. I do this with my networking tools: they support IPv6, but there’s nothing in the code that mentions IPv6 specifically.
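A minimal sketch of what that looks like with the Sockets API, here in Python: the code connects by name without ever mentioning an address family, letting getaddrinfo() return whatever the name resolves to.

    import socket

    def connect(host, port):
        # Ask the resolver for candidate addresses of any family
        # (IPv4, IPv6, or whatever comes next) and try each in order.
        last_err = None
        for family, socktype, proto, _, addr in socket.getaddrinfo(
                host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
            try:
                s = socket.socket(family, socktype, proto)
                s.connect(addr)
                return s
            except OSError as err:
                last_err = err
        raise last_err or OSError("no addresses found")

    s = connect("example.com", 80)  # same code for v4-only or v6-only hosts

Nothing here names IPv4 or IPv6; swap the underlying network and the code doesn’t change.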

A good analogy for this is how paper sizes are standardized. When printing a document, you can choose whichever size best fits your purpose.

OSI wanted to build the same thing: that you could arbitrarily mix and match components according to your own best interest. In practice, it couldn’t work this way. What people actually wanted was a completely different stack, or added layers (like MPLS or HTTP/3 QUIC).

7.19. Later misconceptions

A lot of misconceptions are taught early, but some are developed years later.

For example, a lot of people think TCP is part of the Session Layer. Nobody taught them this; it’s what they piece together from the bits they remember. The difference between “session” and “connection” has blurred, primarily because there isn’t any difference except for the arbitrary made-up ones professors teach. Thus, “Session” sounds like the thing that TCP does.

This is the consequence of teaching them a lot of arbitrary things that aren’t true. It becomes jumbled in their minds. They have a hard enough time remembering the things that are true without getting overloaded by things that aren’t.

7.20. It was never useful

People defend their education by claiming OSI was once useful.

It was never useful. It was obsolete the moment it was written. The problems described in this document have been there from the very start. The OSI model was written to describe the mainframe and telecom networks that existed at the time, and did not anticipate the Ethernet and Internet networks that were released shortly after.

OSI was written by people who didn’t understand networking. They focused on terminals talking to mainframes because that was the bulk of their own personal experience. That’s how OSI was designed.

In 1981, when Tanenbaum wrote the first college textbook on computer networks, TCP/IP hadn’t even been released yet. The book was reasonable for its time, in that most academic papers of the day referenced the OSI Model in some fashion. Tanenbaum’s book would’ve been a useful introduction for reading those papers.

But those papers were trying to extend the mainframe model of networking. The early TCP/IP documents didn’t mention OSI. Ethernet partially did, but was already retconning it, pretending OSI said something different in order to make it fit.

The Internet is something that evolved despite OSI, not because of it.

7.21. Truth

You know something is deeply wrong here by the fact that no networking class expects the student to read the original OSI standard (ITU X.200[109] or ISO/IEC 7498[110]). Indeed, reading and understanding these documents is counterproductive – you’ll no longer get the right answer on the tests.

What teachers use is their own modern interpretation. It’s like teaching students Romeo and Juliet by using West Side Story.

Everybody has a slightly different modern version. There’s a rough “consensus OSI” among textbooks and teachers where the main things are the same, but there’s variation on the smaller details. What they are often actually teaching is simply the TCP/IP Model, but using the nearest OSI terms.

Teachers hide the lies. They don’t warn students that reading the actual OSI will hinder rather than help them on the test. They don’t warn students that this isn’t how the modern TCP/IP Internet works.

It’s weird that we are even having this conversation, that deliberate lies can ever be justified.

8. History of the Mainframe/Telecom Network

People don’t appreciate how grossly out-of-date the OSI Model is because they don’t understand the historical networks. They see something like sessions and think it describes some transcendental, eternal principle about networks. In reality, the term was coined to refer to how terminals talked to mainframes. What we think of as sessions today is described by OSI using other terms, like associations. Everything OSI teaches about sessions is grossly out-of-date.

OSI was already obsolete the moment it was written in 1977. It was true that it described the important networks that existed at the time, mainframes in data centers and remote X.25 connections. But technology had already moved on, making those old networks increasingly obsolete.

From the beginning, people have been busy retconning OSI. The word retcon means retroactive continuity. It’s a term from long-running series of fiction, like comics, novels, movies, or TV series. It describes a process where later authors reinvent what the early works actually said. For example, in the first Star Wars movie, Luke is told Darth Vader killed his father Anakin. But in the second movie, it turns out Darth Vader is Luke’s father, and Obi-Wan explains why, from a certain point of view, it could be said Vader killed Anakin.

The OSI Model works much the same way. What it actually says has been changed to conform (as much as possible) to the modern TCP/IP Internet. People pretend that what was said back then was intended all along to apply to the Internet. It’s not true – what was originally intended was something very different from the modern Internet. Even with retconning, it still does a poor job of describing the Internet.

This is how technical details evolved into theory. People ignored what OSI actually intended and reinterpreted it to mean something abstract and theoretical. This is easy to do when you don’t understand what was originally intended. It’s like how the OSI Session Layer #5 came to encompass a lot of things. Nobody knew what it meant, so they were free to assign anything to that layer.

8.1. Steampunk circa 1880

Our history starts around 1880 with the inventions of the telephone and digital telegraph, as well as electromechanical tabulating machines.

Telegraphs based upon analog signaling existed before this, such as the famous Morse Code. It sounds almost digital, but it’s not, as the lengths of dots and dashes are variable.

But by the 1880s, electromechanical teleprinters and teletypewriters were invented that could handle binary data. Electromechanical means motors, gears, switches, solenoids, etc. The binary logic of today’s computers could be implemented using coils of wire wrapped around moving iron rods to open and close switches.

These devices could print incoming messages to paper without a human operator. They encoded text with a 5-bit “Baudot” character-set. This later evolved into 7-bit ASCII, which in turn evolved into the variable-length UTF-8 encoding.

Electromechanical devices were the basis of “computing” in the steampunk era before WW I. This was before transistors, before vacuum tubes. There was a century of electromechanical computers and calculators from Babbage’s early Difference Engine in the 1820s, to tabulating machines of the early 1900s, through WW II fire control computers (like the Mark 1) and encryption machines (like Enigma).

Data could be carried by voice wires, but the reverse wasn’t true: digital lines could not carry analog voice traffic[111]. Therefore, it was voice networks that became the basis for something we called telecommunications. Telecommunications meant transmitting anything over long distance, including radio signals, submarine cables, and so on.

In the beginning, telegraph companies and telephone companies each had their own networks, stringing their own copper wires within cities and long-distance lines between cities. Eventually these companies merged. In most countries, they also merged with the postal system, so were known as PTTs or “post, telegraph, and telephone” companies. In this text, I call them telecoms[112], which is short for telecommunications companies. Some independent telegraph companies still existed, but they largely leased their wires from the telephone companies rather than stringing their own.

But the difference between data and voice remained.

In 1964, Paul Baran published his famous report “On Distributed Communications”[113]. It’s widely hailed for seeing into the future, anticipating the packet-switched Internet as it appeared 20 years later. But in reality, it was rooted in the telegraph networks of the past. It envisioned a network forwarding what would essentially be telegraph messages. His network could’ve been built purely from electromechanical computers.

Not only telecommunications but also computing comes from the late 1800s. The 1890 census was counted (tabulated) using punched cards and an electromechanical “computer”. These were certainly nothing like the computers we imagine today. They weren’t programmable, for one thing. But they still “computed”.

8.2. Non-interactive batch jobs circa 1960

Early computers didn’t have screens or keyboards. They had panels with blinking lights and switches. For computation, people submitted batch jobs, usually on punched cards, paper tape, and/or magnetic tape. A “job” contained both the data and the code to run. The results of the computation would be returned on new punched cards, paper tape, magnetic tape, or printed paper that humans could read, like payroll slips or invoices.

Early pictures will sometimes show teletypes. These usually weren’t connected directly to the computer. Instead, they were connected to punched-tape or punched-card machines. A user might type on the keyboard to punch a tape, then feed that tape into the computer’s reader. Early computers were fundamentally non-interactive.

One of the most important early applications of networking was “remote job entry”, the ability to submit such batch jobs remotely, getting the results back remotely. Even through the 1980s, many computers didn’t have the interactivity that we associate with all computers today – not even terminals.

8.3. Non-interactive control systems circa 1960

As described above, “computation” was done as batch jobs. But a different class of computer was always on, always running software. These were control systems, attached to external devices, like a factory. They would collect input from various places, crunch numbers, and send commands back out to control physical actuators and valves. They would also blink warning lights if necessary.

Even today, you can buy new “current loop” converters from Amazon.com. These are devices that convert ancient 1880s telegraph signals to modern 1960s-style serial lines.

This is the source of the “blinking light” control panels from early movies featuring computers. You see the computers sitting in the background doing work. But what you don’t see is a terminal attached to them.

Some did have a CRT display, but it was a non-interactive display rather than an interactive terminal. It’s like the movie WarGames with those huge displays showing incoming Russian missiles. Even though WOPR in the movie did allow for interactivity (that’s the plot, after all: guessing the password JOSHUA), the bulk of the computer’s work wasn’t interactive.

8.4. Interactive terminals circa 1970

Eventually, people figured out that they could connect teletypes directly to the computer, instead of simply using them to punch tape. They would type on the keyboard sending characters directly into the computer, which would respond by sending characters back to be printed on paper.

This started with the earliest computers in the 1940s, but was too expensive for most tasks. It takes a lot of computing power (relative to the computers of the time) to process a single keystroke. It was cheaper to use a terminal to punch cards or tape, then feed the cards/tape into the machine all at once. By the late 1960s, we saw interactive terminals appearing in office environments, first as consoles inside the computer room, then moving out into offices onto average workers’ desks.

This was the first command-line, which we still have on modern computers today. Even if most users don’t use the command-line, it’s still the basis for how techies manage and hack computers.

Printing to a scroll of paper is inefficient. For one thing, the paper eventually gets expensive. For another thing, it means interacting only one line at a time – there’s no moving the cursor up and down lines.

At some point, terminals were invented using a CRT screen. The CRT or cathode ray tube was the technology televisions used for most of a century before everyone switched to LCD/LED screens.

I think of 1968 as the year of the interactive terminal. It’s when Doug Engelbart gave his “Mother of All Demos”, using a minicomputer to present a (correct) futuristic prediction of what computers would look like. But it’s also the year of the movie Hot Millions, the first time in entertainment we see a hacker sit down in front of an interactive terminal and type things.

Hot Millions (1968) where the hacker tells the computer to print checks he cashes.

Terminals were limited by the technology of the time. Consider a screen of 80x25 characters. This requires 2 kilobytes of memory, which was extraordinarily expensive then, costing as much as $100,000 in today’s money. Instead, you’d have oddities like the “acoustic delay line”, whereby the characters were encoded as vibrations traveling down a wire. The system constantly read pulses out of one end to draw on the CRT screen and re-introduced them at the other end. Walking on a shaky floor or kicking the cabinet caused the screen to become corrupted.

But by the early 1970s, Intel and other companies were shipping cheap RAM chips and 8-bit microprocessors. This quickly transformed computing to the standard where office workers would spend their time at a screen and keyboard. There was still no mouse, just a screen of text, but it was something we modern people would recognize.

This interactive terminal is the reason for the upper three layers of the OSI Model. The Session Layer was primarily about the primitive states of terminals at the time. The Presentation Layer was about the fact that different terminals supported different command-codes and text encodings. The Application Layer was focused on how applications ran on mainframes but were used from terminals.

The primary applications during this time, the late 1970s, were:

  • remote job entry
  • virtual terminals
  • file transfer
  • email
  • filling out forms
  • text editing

Since the upper three layers of the OSI Model were designed around terminal communication, we don’t have them in modern networks. Most educators ignore them. But others try to stick things in these layers. If you don’t know how such terminals worked, the layers seem to describe mystical and eternal properties of networks. So some educators try to make something of them – claiming that while the layers technically don’t exist, the concepts still do. The concepts really don’t, as described in the Misconceptions chapter.

8.5. IBM SNA

The first draft of the OSI Model was inspired by IBM’s SNA[114][115].

Back in the 1970s, IBM nearly monopolized the computer industry, with over half of the industry’s total revenue. Whatever IBM did was the standard – everyone else tried to emulate that standard, to be compatible with it. Many computing innovations of the era came from IBM labs[116]. It wouldn’t have made sense to create a network standard that wasn’t compatible with IBM’s standard[117].

IBM sold powerful mainframes. These were famously huge, taking entire rooms for a single computer. As less powerful mini-computers appeared on the market, IBM adopted them to offload tasks from the central mainframe. Their purpose was to act as peripheral controllers, controlling things like disk drives and printers.

In addition, this time saw the birth of the smart terminal. Prior to 1970, computers either were not interactive at all, or used a teletype terminal and a command-line. By the late 1960s, devices appeared with full screens, where text could be placed at any location on the screen. This enabled complex applications, usually based around human employees filling out forms on the screen. Today’s web forms are descended from these applications, such as using the <tab> key to move between fields and the <return> key to submit the form.

The first struggle was getting two devices to talk locally. The second struggle was then to get packets hop-to-hop through a remote network.

The local problem was complicated by the fact that there were hundreds of ways to connect two devices locally. It was a hard problem due to the simple technology of the time. As technology progressed, people kept inventing new ways to string a wire between two devices. Many of these devices didn’t even have a CPU or software, and were controlled by the signals on wires. The full RS-232 cable has 25 wires even though it strictly needs only 3 (transmit, receive, and ground). Today’s computers send control information in-band, along with the stream of bits, which software reads on the other end. Devices in the 1970s needed separate control wires.

IBM eventually invented a protocol called SDLC[118] as a layer on top of physical links. Whatever the actual physical differences, they all appeared the same at the SDLC layer. Variants of this were then standardized, such as HDLC, LAPB (for X.25), and LLC (for Ethernet).

SDLC was a local transport layer. It had packets, could resend lost packets for reliability, could fragment packets, used flow-control, could interleave multiple packet streams, and so on.

It is what the OSI designers intended by Data Link Layer #2. But it’s largely unused today. Today, OSI Data Link Layer #2 is taught to refer to only Ethernet’s MAC sublayer, ignoring LLC and this rich history of SDLC.

Today’s Ethernet preserves SDLC as a sublayer called LLC. Formally, it looks like the following.

Ethernet
    #2 Data Link Layer
        LLC
        MAC
    #1 Physical Layer
        PHY

In today’s networks, you’ll still see LLC headers occasionally, such as when you sniff raw WiFi packets. It’s largely just a vestigial header, adding 8 bytes of useless data to frames. It’s used in its simplest mode, meaning it doesn’t handle fragmentation, flow control, or the other features it was designed for.
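You can see this vestigial header for yourself. Here’s a minimal sketch using the scapy library, assuming a WiFi adapter already placed in monitor mode under the (hypothetical) interface name wlan0mon:

    from scapy.all import sniff, Dot11, LLC, SNAP  # pip install scapy

    def show_llc(pkt):
        # WiFi data frames carry an LLC header in its simplest
        # "unnumbered information" mode: DSAP=0xAA, SSAP=0xAA, ctrl=3,
        # followed by a SNAP header naming the real payload type.
        if pkt.haslayer(Dot11) and pkt.haslayer(LLC):
            llc = pkt[LLC]
            print(f"LLC dsap={llc.dsap:#04x} ssap={llc.ssap:#04x} ctrl={llc.ctrl}")
            if pkt.haslayer(SNAP):
                print(f"  SNAP ethertype={pkt[SNAP].code:#06x}")  # 0x0800 = IPv4

    sniff(iface="wlan0mon", prn=show_llc, count=10)

Every data frame shows the same constant 0xAA/0xAA/0x03 values – eight bytes of header that never vary, doing no work.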

It’s hard to express how important this was. In 1977 when OSI was drafted, they had only progressed through layers #1 and #2. That’s the part they understood. The layers after that point were largely speculative.

The rest of IBM’s SNA protocol suite depended upon SDLC. You couldn’t run it without SDLC, or something that looked the same (such as LLC on Ethernet).

OSI adopted this flawed thinking. Their design for the first 4 layers integrated them together, all parts of a common network stack, with the upper layers dependent on the lower layers. The design of Ethernet and TCP/IP took the opposite approach. Ethernet was designed to be independent of anything layered on top, and TCP/IP Internet was independent of any links/local-networks running below.

The rest of SNA’s protocol stack was based largely upon terminals (and other peripherals, like disk drives and printers).

One of the confusions about the modern OSI Application Layer #7 is whether it’s part of the network stack, or the thing that runs on top of the network stack. In IBM SNA, it was part of the network stack. The “VTAM” service provided APIs for application writers that took care of all the basic features of applications on behalf of programmers. A lot of user interaction was done on “communications controllers”, 16-bit minicomputers attached to the mainframe, with the mainframe itself only handling the backend “transactions”.

This document tries to split this Application Layer #7 into two things: either a payload outside the network stack, or a service layer inside the stack.

As a final note, SNA could never have evolved into an “Internet”. It was designed for building networking within an organization, not for building a network that spanned organizations. It’s closer to USB, connecting peripherals to a central computer, than it is to TCP/IP.

8.6. Telecom X.25

The telephone system was the first cyberspace.

It was based upon “circuits”. In the beginning, these were copper wires, electrical circuits where a voltage applied at one end came out the other end. Switchboard operators used patch cables to connect one electrical circuit to another.

Wikipedia: https://en.wikipedia.org/wiki/Telephone_switchboard

With the invention of the transistor, during the 1960s the telephone network upgraded to a digital network, using the “T carrier system”. Instead of copper wires carrying analog voice signals, wires now carried streams of bits. Instead of switchboards, we now had digital “switches” that would forward streams of bits. Copper wires could support multiple streams of bits, such as a T1 line that carried 24 simultaneous streams of 64-kbps each (24 × 64 kbps = 1.536 mbps which, plus framing overhead, gives the familiar 1.544-mbps T1 rate).

Computers were far too primitive to route packets like today’s Internet. All they were capable of doing was redirecting streams of bits.

It was a good model for voice, because a phone call meant a constant stream of analog voice encoded into a digital stream. But it was bad for computer data, which tended to be silent most of the time, sending data in bursts. A customer still paid for the digital circuit even when they had nothing to send.

This is especially a problem with terminals. They want a connection, but spend most of that time not transmitting data. It became habitual to dial up a connection, use the terminal for a brief time, then shut down the connection. Terminal applications were optimized for this, maintaining a “session”[119] even when the network connection was temporarily shut down.

Packets instead of streams are inherently better for computer data. Telecoms developed X.25 as a tradeoff, still using virtual circuits, but allowing packets to be sent instead of streams of bits. Customers could now be charged only for the data[120] sent.

The key thing about a circuit is that every switch between the source and destination (caller and callee) had to reserve resources for the circuit. They all had to know of the circuit’s existence.

The Internet doesn’t work that way. Instead, the sender just sends packets. Routers in between process each packet individually, as if it were the only packet they’d ever see between that source and destination. If the packet is part of some sort of “connection”, that’s something only the ends know (“end-to-end”), not something routers know.

This was a big debate in the 1970s. The telephone companies believed “packet-switching” wouldn’t scale into the future, while the people building the Internet believed “circuit-switching” wouldn’t work.

In terms of OSI, the Internet’s layer #3 is “connectionless”, while the X.25 layer #3 is “connection-oriented”. In fact, what we now consider Layer #4 was handled by Layer #3 in X.25.

The OSI Transport Layer #4 was intended not to handle transport connections itself so much as to guarantee that such things were handled by the network stack at that point.

This mostly worked. There are a lot of regulated environments (like the power industry) that are forced to use OSI Transport Layer #4. They successfully deploy these apps either on X.25 or on top of TCP (using something called ISOTP, a layer on top of TCP that emulates the OSI Transport Layer #4).

Like with IBM SNA, the telecom X.25 stack was monolithic. All the parts depended upon each other. They had their own physical connections, their own datalink (LAPB, which was similar to IBM’s SDLC), and so on. It’s only above the OSI Transport Layer #4 that you can consider things independent (people run other networks, including the TCP/IP Internet, over X.25 links). Even then, they still offered applications that were tied to their stack.

The telephone companies had their own applications based on the X.25 backbone: teletext services. These consisted of terminals built from a TV, an 8-bit microcomputer, and a telephone modem. The most famous were the Minitel services, sponsored by the French government to put one in every home.

By “teletext” we mean special fonts with graphical characters from which rudimentary pictures could be drawn.

The OSI Session Layer was modeled after these teletext terminals. It handled the special flow-control issues necessary for such simple terminals and connections. Some were simplex or half-duplex, unable to communicate in both directions at once.

That’s why the Session Layer doesn’t exist. The world quickly moved to personal computers and much smarter terminals, and the needs of the Session Layer disappeared. It was never really implemented for actual shipping teletext terminals, and it was never used as intended by the real-world systems that attempted to use the OSI protocol suite.

Thus, even though the OSI Model was created with X.25 specifically in mind, X.25 doesn’t map cleanly onto it:

  • The Physical Layer #1 consists of serial protocols that look nothing like Ethernet. They aren’t even much like the clean RS-232 serial protocol we have today.
  • Its Data Link Layer used LAPB (derived from IBM’s SDLC), which isn’t used in modern protocol stacks.
  • Its Network Layer was connection-oriented with virtual circuits.
  • There’s no Transport Layer.
  • The Session Layer was sorta modeled after its needs, but nobody ever really used it.

8.7. Xerox PUP and XNS

The Internet as we know it was inspired by work at Xerox.

In 1968, Doug Engelbart demonstrated the computer of today, a workstation with a graphical screen and mouse. This was known as the Mother of All Demos, and many of the researchers behind it later moved to Xerox’s PARC lab. It required a minicomputer costing hundreds of thousands of dollars, and was impractical at the time. It wasn’t until the Macintosh shipped in 1984 that it became practical.

Figure: The Mother of All Demos[121]


Xerox also developed Ethernet to work with this workstation. They not only developed Ethernet as a local network, but also their own internetworking protocol called PUP or PARC[122] Universal Packet, which was essentially complete in 1974. PUP had limited addresses, an 8-bit network number and an 8-bit host number. This limited the internetwork to 64k devices.

Later in the 1970s, Xerox upgraded this to XNS IDP/SPP, which had a 4-byte network number and a 6-byte host number – where the host number was the same as the Ethernet MAC address. XNS was far ahead of its time, having in the 1970s features that wouldn’t become standard on personal computers for another decade. A good example is how it built applications on top of its own RPC protocol, named Courier.

The Internet’s TCP/IP copied a lot of ideas from PUP and XNS.

In 1981, the DoD paid consultants to add both the Internet TCP/IP protocols and the XNS IDP/SPP protocols to BSD Unix. Either could’ve become the basis for the world-wide cyberspace we see today. However, the DoD only funded routers that would forward IP packets, not IDP packets.

While not used on the DoD backbone, XNS would dominate office networks of the 1980s. Microsoft’s LAN Manager system often used XNS (as well as other protocols). Novell used a minor variation of XNS, which they called IPX/SPX, as the basis for their wildly successful office network system.

The remarkable feature of XNS is that it was truly a peer-to-peer protocol, envisioning an office environment with powerful computers on people’s desktops. The rest of the network designs of the 1970s focused instead on connecting terminals to mainframe-style computers. Even TCP/IP was initially afflicted with this narrow view – they tried to imagine more, but it was terminal access and file transfer that drove their thinking.

8.8. ARPAnet and the TCP

ARPAnet was created in 1969 as part of a DoD funded research effort. The goal was to create something that interconnected DoD research sites around the country.

In the beginning, the protocol was known only as the TCP[123]. Only around 1979 did the TCP split into the TCP/IP protocols we know today. TCP’s original design was to create a stream of bytes, like a local point-to-point cable, but remotely over a complex network. The details of that complex network changed over time while the byte-stream model remained roughly the same. That’s why services defined for the old TCP ARPAnet could work with only minor changes on the new TCP/IP Internet.

They didn’t know what they wanted to do with the protocol. The early applications were:

  • Telnet (remote terminal)
  • FTP (file transfer protocol, using Telnet as the control channel)
  • RJE (remote job entry)
  • SMTP (simple mail transfer protocol)

And that’s pretty much it.

The driving factor in its success was its inclusion with Unix in 1983. The DoD paid a consulting firm to add a TCP/IP stack to BSD 4.2 Unix. When the Internet was turned on, it already had many vendors providing compatible equipment. Universities already had computers compatible with the new network and were eager to connect.

8.9. Ethernet circa 1974

Around 1970, devices were connected with point-to-point cables like RS-232. Interconnecting many devices meant stringing a cable directly between each pair of them. Wiring up 10 devices would require 45 point-to-point cables.

Figure: 12 devices with a point-to-point link between each pair of devices

Another way to interconnect those 10 devices would be to buy an 11th computer, then simply wire the other 10 computers with point-to-point links to the central computer, then have it forward traffic as necessary. This would be a “router” or “bridge” or “switch”. It’s what we do today with the Ethernet switches in our homes, but back then, it was both unreasonably expensive and unreasonably slow.

Figure: 12 devices connected through a 13th device, how Ethernet works today but too expensive in 1970s

A third option would be to string a single cable among all 10 computers. It was vastly cheaper than the other options, but suffered from the problem of contention. It wasn’t normally done because there was no way to stop one transmitter’s signals from overlapping with another’s, corrupting the data.

Figure: 12 devices that all connect to a common wire

Ethernet invented a solution to this problem of collisions, called CSMA/CD – carrier-sense multiple-access with collision-detection. “Carrier-sense” means a station listens until the wire is free before transmitting, and “collision-detection” means it continues listening in case another station starts transmitting at the same time. In that case, the station would back off and try again a random amount of time later.
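The backoff algorithm is simple enough to sketch. Classic Ethernet uses truncated binary exponential backoff: after the n-th collision in a row, a station waits a random number of slot times between 0 and 2^n - 1, with the exponent capped at 10. A rough Python sketch:

    import random

    SLOT_TIME = 51.2e-6  # seconds: the classic 10-mbps Ethernet slot time

    def backoff_delay(collisions):
        # After the n-th consecutive collision, wait a random number
        # of slot times in [0, 2^n - 1], capping the exponent at 10.
        # After 16 failed attempts, the frame is dropped entirely.
        if collisions > 16:
            raise RuntimeError("too many collisions, frame dropped")
        exponent = min(collisions, 10)
        slots = random.randint(0, 2 ** exponent - 1)
        return slots * SLOT_TIME

    # Each successive collision doubles the average wait, spreading
    # the stations out in time until one of them wins the wire.
    for n in range(1, 6):
        print(n, backoff_delay(n))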

Part of the problem here is that the connection to this cable had to be cheap. What costs pennies today cost more than $1000 in 1970s dollars. They weren’t designing the optimal solution to the problem so much as a solution with an acceptable cost. Even sensing signals correctly was difficult.

Xerox invented Ethernet around 1974 to fit within these constraints; DEC, Intel, and Xerox later standardized it jointly. It used a single thick coaxial cable that could stretch for a couple of kilometers. Every computer had a transceiver that connected to the cable. A transceiver not only translated the internal bits from the computer into signals for the wire, but could detect such things as signals colliding.

Around 1980, the IEEE created their 802 group for standardizing local area networks. The IEEE 802.3 standardized a version of Ethernet compatible with the DIX (DEC-Intel-Xerox) standard.

Around the same time 3Com started selling Ethernet hardware. When the IBM PC became popular, 3Com Ethernet adapters for the computer became very popular.

What’s important about this is that in many ways the OSI model predates Ethernet.

Ethernet is a network that only tries to interconnect offices, buildings, and campuses. It’s a local area network. It can be used to carry more advanced network stacks like TCP/IP, and a bunch of other stacks like XNS, Vines, AppleTalk, DECnet, IBM SNA, and telecom X.25.

But this was foreign to the OSI concept. OSI envisioned a network stack whereby the local connections (like the Data Link Layer #2) were more integral with and dependent upon the remote internetworking features (the Network Layer #3). It doesn’t actually envision something like Ethernet being totally independent of anything else, or an internetworking standard like TCP/IP that’s independent of the links below it.

8.10. History of the physical layer

This document mentions serial links like RS-232 many times. They are still used today, but they are also an important part of history.

Most of the network stack is notional, written in software. It’s at the PHY or physical layer that networking becomes real, where bits[124] are transmitted over wires[125].

It’s unclear where the history of the PHY layer starts. We could choose slaves carrying cuneiform tablets along established roadways in ancient Sumeria (3500 BC). Another good choice would be the optical telegraph network in France around 1800, which featured encryption, side-channel attacks, and man-in-the-middle attacks.

8.10.1. Baudot 5-bit digital telegraph

But for the purpose of this text, we are choosing the Baudot telegraphs from around 1880. They used electromechanical devices with gears, solenoids, and switches.

Morse Code (from the 1840s) was analog. While the dots/dashes could be written to paper, a human operator was still needed to read them. A mechanical device couldn’t convert them directly to text.

In 1880, a digital system was invented, using 5 bits per character (upper-case only). A mechanical system would print incoming messages directly to text that any human could read, with no training.

Better yet, signals could be regenerated. The longer the wire, the more a signal degrades. The benefit of a digital system over analog is that repeaters can regenerate a perfect stream of bits.

The problem with the Baudot telegraph is that in order to move electromechanical equipment on the other end, it needed to transmit power, which is voltage times current. Today, we call them current loop systems, and they still exist.

8.10.2. RS-232 serial link

The RS-232 serial link is still important in modern times. It’s not something the average user sees on their desktop (having been supplanted by USB), but is widely used in industrial and office environments.

Around the 1950s we saw the invention of the transistor. We call such devices solid-state, meaning no moving parts (unlike electromechanical devices, with their moving gears and solenoids) and no vacuum tubes. They can be triggered simply by changing voltage – without transmitting power/current.

The RS-232 standard was introduced in 1960 to take advantage of solid-state data transmission. You can still buy new devices on Amazon.com that convert the older Baudot signals to “modern” RS-232 signals.

You can likewise buy new from Amazon.com an RS-232 to USB converter.

In short, you can (in theory) connect a Baudot teletype from the 1880s to the latest modern computer and make it work. Moreover, so much of this older equipment exists in the world that they still sell new converters to make it work.

The thing to notice about RS-232 is that technically, it needs only 3 wires, but the original standard specified 25 wires. The extra wires/pins are used for control or signaling.

The bare minimum is a transmit wire, a receive wire, and ground. Data is transmitted by relative voltage to the ground wire.

The extra wires were for out-of-band signaling. For example, the other end of the wire might not be ready to receive data. Therefore, two wires were needed for one side to express the desire to send data, and the other end to signal readiness to receive data. Since communication was bi-directional, four wires were needed for this purpose, two in each direction.

We call this a handshake or protocol. Additional wires were needed to signal such conditions because the devices were built from simple transistors.

RS-232 was called a serial cable because a single wire carried the stream of bits in each direction. There were also parallel cables that used many wires to send data, such as 8 wires, one for each bit in a byte. An example was the Centronics port, which had the same number of pins as the full RS-232 spec (25). Since it transmitted data in only one direction, to a printer, it could get rid of most of the control pins and use them to send data instead. It was approximately 8 times faster than RS-232 for the same level of technology.

But in general, parallel cables never became popular. It’s easier to send data at a higher speed across a single wire than to keep 8 parallel wires synchronized. All links are essentially serial links.

Today, most signaling is sent as in-band protocols, carried as data that software handles. Thus, most of the pins aren’t used, and the typical RS-232 connector has only 9 pins. Even then, it’s possible to connect things using only the bare minimum of 3 pins.

8.10.3. Simplex, half-duplex, and full-duplex links

Today, it’s taken for granted that links will be bidirectional. Back in the early days, this was expensive and therefore optional.

One solution was simplex links that transmitted data only in a single direction, such as to a printer. This cut the needed number of wires in half, cutting the price in half.

Another solution was the half-duplex link, which also carried data in only one direction at a time, but where the sides could trade places. One side would transmit for a while, then send a signal switching sides, and then receive.

Today’s links are almost always full-duplex, sending and receiving at the same time. This is usually done with one set of wires for transmit and a different set for receive. In some cases, like gigabit Ethernet, advanced electronics allow sending and receiving at the same time on the same wires.

Today, everything is full-duplex, so we don’t care about this distinction. But in OSI terms, the Session Layer was intended to solve this. It was designed for terminals that needed simplex or half-duplex operation.

8.10.4. RS-422 and RS-485

The original RS-232 transmitted signals relative to the ground wire. This limits speed and cable length, because the ground voltage fluctuates. All the other pins are also relative to ground, disturbing things further.

To fix this, data is transmitted over a pair of wires, encoded as the difference in voltage between the two. One pair is needed to transmit and another pair to receive. This dramatically increases the speed, from 115.2-kbps over short cables to 10-mbps over long cables.

This was standardized as RS-422 in 1975. It uses a basic 9-pin connector. Two pairs are used to transmit and receive data. Another two pairs are used to indicate readiness to the other side. The ninth pin is the ground. Otherwise, data is transmitted like RS-232, so a single device could support both at the same time, with a simple connector change. RS-232 could be extended over long distances with a simple converter that changed it to RS-422.

RS-422 isn’t really used today, but is the stepping stone to AppleTalk (below) and RS-485.

RS-422 is primarily a point-to-point standard like RS-232. But many uses of such systems are point-to-multipoint, where a single primary device needs to control many secondary devices. (USB is defined as point-to-multipoint.)

RS-485 is just a multipoint version of RS-422. It has a slightly different electrical specification to allow for the fact that multiple devices can transmit on the same wires. If two transmit at once, the data is corrupted, so some software is needed to prevent this, directing which device can transmit when.

RS-485 is important today because it’s still incredibly popular in industrial equipment.

Today it’s common to see devices that handle RS-232, RS-422, and RS-485 all at the same time.

8.10.5. USB – the universal serial bus

In the 1990s, we saw a proliferation of connectors on the back of a computer. There were already multiple ports for keyboard, mouse, RS-232, and Centronics. New devices, from light-pens to game controllers, needed their own expansion card. Apple had its own separate standard connectors.

The industry worked together to unify these – to make a universal port. This would, of course, be a serial port. It would allow multiple devices to be connected to the same port at once, and thus would be a bus (point-to-multipoint) rather than a point-to-point link.

Hence, the Universal Serial Bus or USB was born, shipping around 1998. The early standard allowed for 1.5 mbps as the minimum speed for things like keyboards and mice. Today’s USB 4 goes up to 80 gbps in theory, though in practice most devices don’t go faster than 5 gbps.

There were several major innovations over things like RS-232.

The first was the assumption that microcontrollers would exist on both ends. Even the simplest mouse or keyboard needs an 8-bit CPU to conform to USB. This was a major change in thinking, as the original standards like RS-232 allowed for dumb devices on both ends. You could connect two teletypes together with no microprocessor involved. But by this point in history, everything had microcontrollers anyway, so nobody cared.

The second was that the port would often need to power attached devices. The early standard allowed just enough power for keyboards and mice (0.75 watts), then 2.5 watts, and today, 240 watts[126].

The third innovation was the connector. Typical connectors assumed that you’d screw in the connection – obviously, you don’t want a mission-critical cable to fall out. But the USB connector was designed to be reinserted many times with nothing holding it in but friction.

The interesting aspect of this is thinking about the layered model when connecting a USB Ethernet adapter. There is a complex network between the computer and the Ethernet device, layered underneath all these communications. It just doesn’t fit the OSI Model, and nor should it. When things are layered, they should be opaque to each other, and they shouldn’t have to conform to a preconceived list of which layers should exist.

8.10.6. TTL, GPIO, I²C, SPI

The discussion so far has been about long-distance serial communications, sometimes as far as kilometers/miles.

But these days, networks have shrunk to the size of a single computer. Your laptop contains a small integrated network interconnecting smart devices, like the camera, keyboard, trackpad, battery, moisture sensor, boot ROM, and so on. A simple computer like a laptop can easily have 30 microcontrollers.

There are a number of standards for this.

The reason we discuss them is because of the Raspberry Pi. It has a row of pins that can be used for such communications. These pins can be configured in different ways.

By Evan-Amos - Own work, Public Domain, https://commons.wikimedia.org/w/index.php?curid=49898536

The original way of connecting things was called TTL, for transistor-transistor logic. This isn’t a serial communication standard, just a way of connecting transistors together. Depending upon the technology, a transistor expects voltage in a certain range: +5 volts for older transistors and +3.3 volts for newer CMOS transistors. It’s defined for use by dumb transistors on either end.

But we can also use it for communication between relatively smart devices, between pins on microcontrollers. It’s pretty ad hoc and non-standard.

Raspberry Pi pins can be configured as GPIO or general-purpose I/O. They work by software simply turning them on or off. For example, they might be used to turn on an LED. The idea is that software should only flip them infrequently, like less than 100 times a second. But it can be done so fast, millions of times a second, that the pins can also be used for communication. This is called bit-banging, reflecting the fact that it’s inelegant, using things in ways other than originally intended.
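A minimal sketch of bit-banging, using the RPi.GPIO library and an arbitrary choice of pin 18: a byte is clocked out by hand, with software timing standing in for real UART hardware.

    import time
    import RPi.GPIO as GPIO  # available on a Raspberry Pi

    PIN = 18                 # arbitrary choice of GPIO pin
    BIT_TIME = 1 / 9600.0    # pretend to be a 9600-baud serial line

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(PIN, GPIO.OUT)

    def bitbang_byte(value):
        # Clock one byte out of the pin, least-significant bit first,
        # by toggling it in software -- no UART hardware involved.
        for i in range(8):
            GPIO.output(PIN, (value >> i) & 1)
            time.sleep(BIT_TIME)  # software timing is the weak point

    bitbang_byte(0x41)
    GPIO.cleanup()

The fragility is visible in the time.sleep() call: the operating system can delay the program at any moment, garbling the timing, which is why dedicated hardware normally does this job.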

A better standard is I²C or inter-integrated circuit protocol. There are usually microcontrollers on either end, but it can also be used to talk to relatively dumb devices. With a small number of gates, a simple device like an EEPROM can receive a packet, extract an address, and then return the requested data. It can be implemented using only 2 wires for bidirectional communication, and usually runs at less than 1-mbps.

You’ll find this all over circuit boards, on the Raspberry Pi pins, and even video cables. It’s over I²C that your computer discovers the resolution of your monitor, whether older VGA or modern HDMI.
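A minimal sketch of an I²C transaction from a Raspberry Pi, using the smbus2 library. The address 0x50 is the conventional address of the EEPROM holding a monitor’s EDID data, but it stands in here for any I²C peripheral:

    from smbus2 import SMBus  # pip install smbus2

    DEVICE = 0x50  # conventional EDID EEPROM address; any I2C device works

    with SMBus(1) as bus:  # /dev/i2c-1, the usual bus on a Pi
        # Each transaction is a tiny packet -- device address, register
        # number, then data -- all over just two wires (SDA and SCL).
        first_byte = bus.read_byte_data(DEVICE, 0x00)
        print(f"register 0 = {first_byte:#04x}")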

SPI or serial peripheral interface is even faster, but uses 4 wires instead of 2. It’s used internally for things like SD cards. When you connect a “HAT” onto a Raspberry Pi, it’ll often configure the pins to use SPI running at ~10-mbps.

8.10.7. UART - universal asynchronous receiver/transmitter

An important invention in the history of computer networking is the UART, especially when combined with a microcontroller to create a smart UART.

To start with, there’s synchronous communications, consisting of a constant stream of bits. A good example is the output from a video card sending data to a monitor.

Asynchronous communication is when the line is mostly silent. Incoming data must announce itself; a short sequence of bits is received, then communication ends.

In the original wire services sending news in the 1930s, the incoming signal needed to give the receiving teletype time to start its motors and gears before it was then capable of printing to paper. When stories arrived, you’d hear the chug-chug-chug of the startup process, followed by the sound of printing to paper.

Modern networks are always asynchronous, of course, sending packets at random times. But that’s a different sort of asynchronous than meant here.

Before there were network packets, there were just teletypes. One end might be a human typing on a keyboard. Each character (byte) would be sent asynchronously. Each character started with a start bit, followed by the character itself (which could be as few as 7 bits for ASCII), followed by a parity bit, followed by a number of stop bits. Today’s RS-232 links are almost always 8-N-1, meaning 1 start bit, 8 data bits, no parity, and 1 stop bit.

You need a circuit that’ll receive the incoming stream of bits and stick them in a buffer. Likewise, you need a circuit that’ll format outgoing bits.

A typical UART receives the incoming stream, converts it to characters, stores them in a small buffer, then generates an interrupt. By the time the CPU gets interrupted, there are probably multiple characters in the buffer. The CPU then does the work needed to pull the bytes out and process them.
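To make the framing concrete, here’s a toy software version of what that circuit does, assuming a list of line samples taken once per bit (a real UART samples many times per bit to find the middle of each one):

    def decode_8n1(bits):
        # Decode an idle-high 8-N-1 stream: the line rests at 1, a
        # start bit of 0 announces a character, then 8 data bits
        # arrive least-significant first, then a stop bit of 1.
        chars, i = [], 0
        while i < len(bits):
            if bits[i] == 1:            # line is idle, keep waiting
                i += 1
                continue
            data = bits[i + 1:i + 9]    # 8 data bits after the start bit
            value = sum(bit << n for n, bit in enumerate(data))
            assert bits[i + 9] == 1, "missing stop bit"
            chars.append(value)
            i += 10                     # start + 8 data + stop bits
        return bytes(chars)

    # 'A' is 0x41 = 0b01000001, sent least-significant bit first
    line = [1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1]
    print(decode_8n1(line))  # b'A'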

In the 1970s, UARTs started to come with microcontrollers, often a Z-80 microprocessor. This upgraded the device to a serial controller. They not only had the power to receive incoming characters, but also to handle entire incoming packets.

With IBM’s Bisync and SDLC link protocols, it was usually the smart UART that would handle that part of the stack, freeing the main CPU.

When Apple put an RS-422 port in its original Macintosh, it used one of these smart UART chips, with upgradeable firmware. The port could then be used for a wide variety of purposes, from connecting to an IBM mainframe to running Apple’s own AppleTalk local area network. This cost an extra $5 per Macintosh.

These days, when we talk about UARTs, such as in the context of a Raspberry Pi, we are probably talking about slow byte-by-byte communication rather than high-speed packet networking.

8.10.8. AppleTalk

When Apple shipped its first Macintosh, it included an RS-422 port. Instead of a simple UART, it used a smart serial controller (discussed above).

At the time, local networking cards like Ethernet cost $1000. Apple avoided this high cost by creating a network using the RS-422 port. A simple $50 connector and cheap telephone wire could network many local Macintoshes together, including a laser printer. It was slow, 230.4 kbps, but it just worked, and it was extremely cheap.

Apple was criticized at the time for not having a real business computer with Ethernet or Token Ring network connections, but in practice it had the best network of the 1980s. It just worked, cheaply, in an environment where everything else was expensive and difficult to keep running.

Things like DHCP, mDNS, and UPnP protocols are designed to make things just work, but in many ways, they are still inferior to AppleTalk’s original protocols.

8.10.9. History of the word protocol

To understand the Data Link Layer #2, we first need to understand the word “protocol”.

Obviously, the technical term is somehow derived from diplomatic protocols of the 1800s, the formal specifications of when to bow to a monarch, salute a military officer, roll out the red carpet, who sits at the head of the table, where your salad fork is placed next to the plate, and when to shake hands.

In the early 1960s we had serial links, meaning a cable that would transmit a stream of bits between two computers. The subsection above discusses RS-232, which we’ll use as the model for all such links.

The full standard RS-232 cable had 25 wires. Only 2 carried the transmit/receive signals. The others carried control information.

For example, when connecting a teletype to a mainframe, the teletype would raise the voltage on pin 20. This told the mainframe that the teletype (or teleprinter) was now ready to receive data to print to paper.

This is a handshake or protocol. It required just a few transistors in order to handle it. Each control wire meant a different thing.

You could (and often did) connect two dumb devices with an RS-232 cable, neither of which had a CPU or ran software. It was all implemented using a few transistors, which were expensive at the time.

Another way of sending control information is in-band signaling. This means that signals are sent on the data wires instead of separate control wires, which would be out-of-band signaling.

Specifically, this was done by sending bytes/characters with special meaning. In ASCII, the first 32 values (0 through 31) had special meanings.

For example, sending the value 13 to the teletype would cause a “carriage return”, causing the print head to move back to the start of the line. Sending a value of 10 would cause the printer to scroll to the next line. Thus, when printing text, every line would end in a CR-LF pair.

These in-band control values could still be handled by simple transistor logic rather than a CPU and software.

Thus, ASCII was not just a character-set, it was also a protocol.

By 1967, IBM was definitely using the word protocol to refer to its Bisync protocol, which likewise used similar special characters. There were two versions, one using EBCDIC and one using ASCII. A protocol converter could be used to convert one to the other, so that a carriage return in EBCDIC would become a carriage return in ASCII.

The first devices labeled protocol analyzers seem to have brought out all 25 pins of an RS-232 connection, showing which pins were active. Later protocol analyzers pulled apart messages from links like Bisync.

Note that during this time, such links were character-oriented. Each character started with a start bit, followed by 5 to 8 bits encoding the character (Baudot, ASCII, extended ASCII), followed by an optional parity bit, followed by 1 or 2 stop bits. Thus, such protocols were character-oriented rather than bit/byte-oriented.

There was some notion of text headers. The first 4 character codes in ASCII:

  1. SOH – start of header (which contains To:, From:, etc. text fields)
  2. STX – start of text (the message)
  3. ETX – end of text
  4. EOT – end of transmission

Thus, with this early notion of protocols, we already had the notion of headers and payload.
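To make this concrete, here’s a sketch of a message framed with those ASCII control characters, the way Bisync-era protocols did it (the header fields are made up for illustration):

    SOH, STX, ETX, EOT = b"\x01", b"\x02", b"\x03", b"\x04"

    def frame(header, text):
        # In-band framing: control characters mark where the header
        # and the payload each begin and end.
        return SOH + header + STX + text + ETX + EOT

    msg = frame(b"TO:NYC FROM:BOS", b"SHIP 100 UNITS MONDAY")
    print(msg)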

The thing to notice here is that these protocols were dumb (no CPU or software involved) and character oriented (part of the character-set definition). This principle goes back to the 1880s with the invention of the teletype, the Baudot code/protocol, using electromechanical logic instead of vacuum tubes or transistors.

With the creation of cheap memory and the 8-bit microprocessor, suddenly the devices on either end of a link were both intelligent rather than dumb.

It’s at this point that things were upgraded to the notion of protocols we have today, with the expectation that they would be handled by software, and that they would be bit/byte-oriented instead of character-oriented.

Instead of text-oriented messages, we now had packets.

One of the first such protocols was IBM’s SDLC, upon which the Data Link Layer #2 is based. Other early protocols include the ARPANET’s and the early TCP. Xerox developed a suite of protocols called PUP that later became XNS.

It’s at this point that we get the modern usage of the word, where “protocol” refers primarily to the header bytes at the front of a packet.

8.11. OSI – Open Systems Interconnect

In 1974, IBM was selling its “SNA” networking products. By 1976, telecoms began offering X.25 service. By this time, Xerox had developed its PUP network stack, and many universities had experimental networks like ARPAnet TCP and CYCLADES.

This spawned political battles.

  • There was a fight for the computer “packet-switched” model against the telephone “circuit-switched” model of the telephone companies.
  • All the other computer makers fought against IBM’s domination of the market.
  • Everyone not funded by the US DoD wanted something other than ARPAnet TCP[127].

Various standards groups and subcommittees under the aegis of the ISO and CCITT/ITU-T were started. ISO wanted to standardize “computer” networks while ITU-T wanted to standardize “telecom” networks.

In 1977, Charles Bachman created the first simple proposal of 7 layers based upon Honeywell’s version of IBM SNA. In 1978, Hubert Zimmerman fleshed this out into a complete draft. It is in 1978 that we see others referencing this OSI Model, such as in academic papers. It’s roughly what was later standardized in 1984.

When the model was drafted, they could only imagine the way that mainframes and telecoms worked. They heeded some of the work on TCP and XNS, but ultimately, it didn’t make much impact.

The first 4 layers are designed as an integrated stack, because that’s how IBM SNA and telecom X.25 were defined.

The upper 3 layers were ill-defined, even then. We see the influence of IBM SNA and telecom X.25 in the upper part of the stack. They were focused on remote terminal access to mainframes, so that’s what the upper three layers of the OSI model are based on.

The reason OSI went nowhere is that by 1983, we had personal computers running full TCP/IP and XNS stacks talking peer-to-peer – without terminals or mainframes. We saw an explosion of office environments running IBM PCs and university environments running Unix workstations. We saw an explosion of application-layer protocols that looked nothing like the remote-terminal and file-transfer of OSI.

Over the years, OSI has been “retconned”. People liberally reinterpreted what was written, to the point that modern OSI frequently disagrees with the original X.200 OSI. While schools teach OSI everywhere, nowhere is the actual original X.200 standard taught. Its language is incomprehensible, because it describes the networks of the 1970s, which nobody today has any experience with.

8.12. The original 7 layers

Here are the definitions of the original 7 layers, as intended back then.

8.12.1. #7 - Application

Contrast with current consensus definition of Application Layer #7 described in Chapter 3, and the misconceptions about this layer in Chapter 7.

An application was something that ran on the mainframe. Users would access it via terminals. The application then would control peripherals, like printers, tape drivers, databases, and so forth.

As defined by IBM SNA, this was an asymmetric relationship, with the application running on the mainframe as the master and everything else subservient. IBM called the other side of a connection a Logical Unit.

It wasn’t until the Internet model became successful that IBM added peer-to-peer connections, with two equal applications on either side of a connection instead of a primary/secondary. This was known as APPN or Advanced Peer-to-Peer Networking, using Logical Unit 6.2.

A mainframe was at the center of the network running many applications. The rest of the network usually consisted of minicomputers and microcomputers, each usually dedicated to a single task.

Nowadays things are quite different. When you interact with an app on your phone, backed by services on the Internet, most of the computation is actually done on the phone, with very small transactions going across the network.

In other words, in the old days if you were using something like Twitter, the application would be on the mainframe. Nowadays, the application is on the phone.

8.12.2. #6 - Presentation

Contrast with the modern consensus of Presentation Layer #6 in Chapter 3. Also look at the misconceptions about this layer in Chapter 7.

This layer meant two things in the original OSI definition: translating terminal codes, and translating file formats (usually character-sets).

The 1970s saw an explosion of the terminal, a device that could display a screenful of text (often 80x25 characters). Different terminals supported different control codes. A control code told the terminal how to display a character, such as writing it in bold or giving [x,y] coordinates for where to display it on the screen (rather than the default, on the bottom of the screen).

On modern Linux/Unix, negotiating terminal control codes is done with a package like curses or ncurses. Back in the 1970s, this was done with elements in the network stack. Thus, the Presentation Layer #6 was specified to handle this.
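
To make this concrete, here’s a minimal sketch (in Python, my own illustration) of what terminal control codes look like today. Because essentially every modern terminal understands the same ANSI escape codes, a program can emit them directly; no layer of the network negotiates them.

    # ANSI escape codes, understood by essentially every modern terminal.
    CSI = "\x1b["                     # "Control Sequence Introducer"
    print(CSI + "2J", end="")         # clear the screen
    print(CSI + "5;10H", end="")      # move the cursor to row 5, column 10
    print(CSI + "1m" + "bold text" + CSI + "0m")   # bold on, text, attributes off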

Another transformation was character-sets. Back in the old days, every computer had a different character-set. A common problem was converting between IBM’s EBCDIC character-set and the standard ASCII. Another common problem was non-English languages, which used variations of EBCDIC or ASCII.

Text files needed to be converted when transferred between computers.
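
As a hedged sketch of what that translation involves: Python still ships codecs for IBM’s EBCDIC code pages (such as cp037), so the conversion is now a one-line application concern rather than a network-layer function.

    ebcdic = b"\xc8\x85\x93\x93\x96"     # "Hello" encoded in EBCDIC (code page 037)
    text = ebcdic.decode("cp037")        # decode EBCDIC -> "Hello"
    data = text.encode("utf-8")          # re-encode as universal UTF-8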

This turned out to be anachronistic. Modern terminals support the standard ANSI codes, so nobody negotiates control codes. File formats and character-sets (UTF-8) are universal rather than tied to a machine.

It was always a wrong idea to assume the network would handle translation.

8.12.3. #5 - Session

Contrast this with the modern consensus description of the Session Layer in Chapter 3 and misconceptions in Chapter 7.

This was created purely to deal with older terminals used in teletex/videotext applications in Europe, like France’s Minitel. It has no other meaning. “Sessions” as envisioned by OSI just don’t exist anymore.

8.12.4. #4 - Transport

Contrast this with the modern consensus description of the Transport Layer in Chapter 3 and the misconceptions in Chapter 7. Also look at this book’s definition of transport in Chapter 6.

Both IBM SNA and telecom X.25 were designed with a “connection-oriented network layer”, meaning computers first established a connection between themselves, and then apps would create multiple transport connections on top of that one network layer connection.

In other words, for a pair of computers, there would be a computer-to-computer connection underneath, then multiple app-to-app connections on top.

The Internet doesn’t work like that. There is no computer-to-computer connection (using IP), only the app-to-app connection (using TCP).
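
A small sketch illustrates the point (Python; example.com is just a placeholder destination). The two app-to-app connections below are fully independent; there is no shared computer-to-computer connection beneath them, and the IP layer keeps no connection state at all.

    import socket

    a = socket.create_connection(("example.com", 80))   # one app-to-app connection
    b = socket.create_connection(("example.com", 80))   # a second, unrelated one
    # a and b share nothing except that their packets happen to carry
    # the same pair of IP addresses.
    a.close()
    b.close()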

The original OSI specification for the Transport Layer #4 lists a bunch of functionality, but none of it was unique to that layer. Instead, the only truly unique item of the transport layer was that all of this was done end-to-end.

8.12.5. #3 - Network

Contrast this with the modern consensus description of the Network Layer in Chapter 3, and the misconceptions in Chapter 7.

Both IBM SNA and telecom X.25 were designed with a “connection-oriented network layer”.

That means when two computers wanted to talk, they did the equivalent of creating a phone call. Every router, switch, or relay in between needed to reserve resources. If resources weren’t available (due to congestion), the connection would fail.

The Internet works nothing like that. Each packet between two computers is sent individually. Routers handle each packet alone. We call them datagrams, because from the router’s point of view when processing the packet, everything is contained  in just that one packet. If there is some requirement that several packets be associated with each other, like data spread across multiple packets, that’s not the concern of the modern network layer.
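
A minimal sketch of a datagram (Python; 192.0.2.1 is a reserved documentation address): a single self-contained packet, sent with no prior setup, carrying everything a router needs to forward it.

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # UDP: connectionless
    s.sendto(b"stand-alone message", ("192.0.2.1", 9999))  # one packet, no handshake
    s.close()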

When the OSI Model was revised in 1994, the network layer was retconned to also allow the Internet method of connectionless network layer.

8.12.6. #2 - Data Link

Contrast this with the modern consensus definition of Data Link Layer in Chapter 3 and the misconceptions in Chapter 7.

The original Data Link Layer #2 was modeled almost entirely after IBM’s SDLC, and not the Ethernet MAC.

In the beginning, there were dumb links between devices. For example, you could connect two teletypes directly together, typing on one to print out on the other. When links were dumb, you needed extra wires to communicate connection information, such as the slower device telling the faster device to slow down.

That’s why the RS-232[128] standard has 25 pins, even though strictly only 3 are needed (signal wires for each direction, plus ground).

When both sides of the link are smart, such as using 8-bit microcontrollers, then the extra pins are no longer necessary. Instead of out-of-band signaling using wires, in-band signaling can be used, with packets. For example, the slower side of a link can send a packet telling the faster side to slow down.
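
The classic example of this in-band signaling is the XON/XOFF convention from serial links. The toy sketch below (Python, my own illustration, not any particular protocol) shows the idea: the control signal travels in the same stream as the data, rather than on a dedicated wire.

    XOFF = b"\x13"    # "stop sending" (Ctrl-S)
    XON  = b"\x11"    # "resume sending" (Ctrl-Q)

    def flow_control_byte(buffer_nearly_full: bool) -> bytes:
        # The slower side injects a control byte into the reverse stream,
        # instead of raising a voltage on an extra RS-232 pin.
        return XOFF if buffer_nearly_full else XON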

As part of its SNA network stack, IBM defined SDLC or synchronous data link control for this purpose. Telecoms then integrated a version of this for X.25.

SDLC had roughly the same function as TCP does today: establishing connections, fragmenting/reassembling large chunks, doing flow control, resending lost packets, and so on. But it only did this on the local link.

But none of this is used on modern networks. Instead of doing it locally, we just rely upon it being done remotely, end-to-end across the network.

For the most part, today’s networks don’t have what was originally intended as the Data Link Layer. It really doesn’t exist. We now stick Ethernet’s MAC layer there, but it’s really not an equivalent.

8.12.7. #1 - Physical

Contrast this with the modern consensus definition of the Physical Layer in Chapter 3, and the misconceptions in Chapter 7.

As mentioned elsewhere in this textbook, the physical layer (where digital bits are translated to analog signals) has always been the biggest component of networking. Rather than a quick summary here, there’s an entire subsection on the history of this layer earlier in this chapter.

8.13. Andrew Tanenbaum “Computer Networks” (1981)

In 1981, a university professor named Andrew Tanenbaum published a textbook called “Computer Networks”. He used the OSI Model as a blueprint, dedicating a chapter to each layer.

It’s here that we see the creation of the OSI ontological framework, of assigning everything to a layer where it “belongs”. It looks like theory. Academics love to teach theory. So OSI became that theory, even though it was never intended as anything more than a practical blueprint for a specific design.

Since OSI was already outdated, the textbook was already “retconning” things, changing the interpretation to fit the real world. It was liberally interpreted to apply to anything in networking.

Later textbooks from Douglas Comer and Richard Stevens followed this format. OSI became entrenched – all future textbooks were written by people who had themselves learned from these textbooks. OSI became a meme. Even though modern university textbooks try to remove the more obviously false stuff, they are still copying the implicit model.


Figure: “Computer Networks, 1st Ed. (1981)” front cover

8.14. C, Unix, and 32-bit microprocessors

Today, we live in a homogeneous environment where all operating-systems look like Unix (even Windows), are written in C, and run on 32-bit (or 64-bit) microprocessors with memory protection.

In 1980, this wasn’t the case. There were a lot of 8-bit or 16-bit processors. Even high-end systems looked odd, like 36-bit CPUs. Each vendor of hardware wrote their own custom operating-system in assembly language.

Unix was written in C. For the first time, we had an operating-system that could be compiled for any CPU. It was popular on the 32-bit VAX superminicomputer, but at the very low end we saw the use of the Motorola 68000 (68k).

The 68k was really a 16-bit processor internally. To add 32-bit numbers, they were split in half and passed through the ALU twice. But it had 32-bit registers, and a 24-bit address bus (capable of addressing 16 megabytes of physical memory). It didn’t have the memory protection needed for Unix, but that could be provided by an additional chip.

The early 1980s saw an explosion of high-end workstations and micro-minicomputers based upon the 68k running Unix, such as the fabled Sun Microsystems.

This set the stage for the late 1980s explosion of RISC chips. Now that hardware makers didn’t need to write their own operating systems or software, they were free to explore radical designs in CPUs.

What’s important about this history is that OSI originally envisioned each hardware vendor writing its own stack, all of which needed to communicate with the same protocols. What really happened with TCP/IP is that many vendors adopted some version of the BSD network stack. The TCP/IP Internet became popular not because many people followed the standard, but because they all shared the same implementation.

8.15. Multiprotocol office networks in the 1980s

The late 1970s saw many computers based upon 8-bit microprocessors, such as the famous Apple ][ and TRS-80 computers released in 1977, and the IMSAI computer seen in WarGames. But despite some business features, such as the VisiCalc spreadsheet application, they weren’t used that much in office environments.

The end of the decade saw the release of the Intel 8086 (1978) and Motorola 68000 (1979) processors. These were “16/32 bit” hybrids, 16-bit processors internally but which could handle memory protection and enough memory (around a megabyte) to run full operating-systems like Unix and bigger applications for the office.

It was the IBM PC shipped in August of 1981 that began a wholesale transformation of the office environment. Worker desks were cleared of their clutter and occasional typewriter in order to make way for a desktop computer. Today, it’s inconceivable that an office desk would not have a computer, or at least, space for a laptop.

Networking soon appeared to interconnect these computers.

There was an explosion of local area networks (LANs) like Ethernet. IBM made a competing LAN called Token Ring. A cheaper alternative was known as ARCnet. There were many more products from different companies – each company wanted to lock customers into a custom design rather than compete[129] with open standards.

But for the most part, they all operated essentially like Ethernet at various price points and speeds. Differences that seemed critically important at the time have blurred in the hindsight of history. For the most part, just think of everything as Ethernet.

These office networks were multiprotocol. Ethernet wasn’t designed to carry Internet traffic, it was designed to carry any traffic – for example, Novell IPX packets alongside Internet packets.

Multiple protocols could coexist on Ethernet because of the EtherType field. A value of 0x8137 indicates Novell IPX, a value of 0x0800 indicates Internet traffic.
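
A sketch of how a receiving stack demultiplexes by EtherType (Python; the frame bytes here are made up for illustration, but the 14-byte header layout is the real one):

    import struct

    frame = bytes(12) + b"\x81\x37" + b"payload..."    # dst(6) + src(6) + EtherType(2)
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype == 0x0800:
        print("IPv4 - hand the payload to the Internet stack")
    elif ethertype == 0x8137:
        print("Novell IPX - hand the payload to the IPX stack")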

There were a lot of stacks co-existing with each other: TCP/IP Internet, Novell IPX, Microsoft LANManager, Banyan Vines, AppleTalk, DECnet, IBM SNA, and telecom X.25.

There were even multiprotocol routers that would simultaneously route packets for more than one of these protocols across the same long-distance links. A company might lease a circuit from the telephone company to connect two offices, then route multiple protocols across the same link, like TCP/IP, Novell IPX, and AppleTalk.

All these, even TCP/IP Internet, were designed to be private networks for the organization. Every large corporation had their own Novell IPX network, but they weren’t interconnected.

But even though TCP/IP was designed for private networks, people started linking them together. ARPANET was a private network using TCP/IP to connect DoD (military) research sites. The National Science Foundation (NSF) funded a similar network to interconnect university computer-science departments that didn’t have DoD funding. They connected this network to the ARPANET. Others kept doing the same, until they accidentally formed the Internet.

The most popular office product of the time was the file-server, which continues to be important today. Desktop computers would connect to these servers remotely over the network, so that the remote drives appeared as local drives[130].

The most popular file-server was Novell NetWare (whose equivalent to TCP/IP was known as IPX/SPX). By the early 1990s, most of the world’s office traffic consisted of Novell’s IPX/SPX.

Microsoft had its own file-server product called LAN Manager, using the now infamous[131] SMB or Server Message Block protocol. In the 1980s, it was sold through resellers rather than directly. Each reseller would choose a different network stack underneath the SMB file-server protocol, such as a custom protocol called NetBEUI, Xerox XNS, or even Novell’s IPX/SPX. With the release of WinNT and Win95, Microsoft just included SMB directly into the operating-system rather than as a standalone product.

The file-server formed its own network. Many applications were written that logically just interacted with files on the disk drive. Operating-systems supported locking features, where an application could lock a range of bytes in a file. This was used to synchronize apps on the network, so they could all read/write the same file without conflicting with each other.
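
As a hedged sketch of the mechanism (Python on Unix, using fcntl; the 1980s APIs differed, but the idea is identical): lock a byte range, update the record, release the lock.

    import fcntl

    with open("shared.db", "r+b") as f:                 # a file on the file-server
        fcntl.lockf(f, fcntl.LOCK_EX, 100, 4096)        # lock 100 bytes at offset 4096
        f.seek(4096)
        f.write(b"updated record")                      # safe: no other app can touch it
        fcntl.lockf(f, fcntl.LOCK_UN, 100, 4096)        # release the range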

An early example of this was cc:Mail, the most popular email application around 1990. It wasn’t technically a networked application, as it interacted with a big file on the file server. It worked regardless of the underlying network protocol, whether it be Novell, Microsoft LANmanager, or Banyan Vines.

The point of this lesson is that while today’s local networks carry mostly just Internet traffic, back in the day, they carried a wide variety of other traffic.

8.16. Interoperability and open systems

OSI gets its name from this concept, “open systems”. It means systems from different vendors that can interoperate with each other, as opposed to closed systems from a vendor that only worked with its own products. That’s the intent of standards: that different companies make products conforming to the same standard.

Some vendors like IBM and HP eventually made OSI-compliant products, but they were closed-OSI. It was difficult getting OSI products from one vendor working with those of another vendor. Customers never wanted OSI products, but they were often mandated by government regulations (GOSIP). Thus, OSI products that didn’t actually interoperate were perfectly adequate for meeting regulatory requirements.

Interoperability was the obsession of the 1980s and early 1990s. The major trade show of the industry was known as “Interop”, which started as a small conference where people brought products together in order to test interoperability – that the TCP/IP stack on my computer could talk to the TCP/IP stack on your computer.

9. Proposals

Here are things I propose be done.

9.1. Terminology

This book is primarily about theory and practice, which OSI doesn’t match; we need to stop using it.

A wholly separate idea is terminology. Words mean whatever people mean by them.

Take the word “hacker” as an example. There’s a community of techies who style themselves as “hackers”, who will tell you “well actually, hacker only means a gifted techie, not a criminal, use cracker for cybercriminals”.

The reality is that hacker means whatever NYTimes, CNN, or Associated Press mean by the word. It always has connotations of criminality or witchcraft. Nobody owns the word, nobody has the authority to define what it should mean. Dictionaries are authoritative only because they document what it does mean, how most people use it.

The same applies to terminology derived from OSI. What matters is what people mean by the terms, not where they came from.

I’m specifically referring to the phrases layer #2, layer #3, and layer #4. People don’t really refer to OSI with these terms; many don’t even pay attention to where they come from. They use these terms to refer to local networks (usually Ethernet), the Internet protocol, and end-to-end transport, respectively.

There’s no real reason to try to expunge such terminology from the language.

People use these terms along the lines of the table below. Note that we don’t need to include OSI names like “Data Link Layer”, which nobody really says.

Layer #4    TCP, UDP    NAT         Internetwork
Layer #3    IP          router      Internetwork
Layer #2    MAC         switch      Local Network
Layer #1    PHY         repeater    Local Network

Importantly, we need to keep stressing that these aren’t part of one stack, but two stacks. In other words, stress that this two-stack split is the foundation of the model.

Another common term is “transport”, as in Transport Layer Security. We need to stress this refers to the top of the stack rather than any particular layer. It’s generally layer #4, but not necessarily, such as when using QUIC, which is a third network layered on top.

9.2. Teaching and textbooks

The fundamental proposal is to stop lying. I read textbooks and materials and am astonished by the number of falsehoods they preach. It’s all based on the same idea, that lying doesn’t matter as long as it helps[132] the student learn. This is confused thinking: that as long as students are learning lies, they’re learning something, and that this is better than learning nothing.

First, abandon the top three layers, even as categories or analogies. They are complete fiction, anybody teaching them is absolutely wrong. This is not a controversial proposal, because they are mostly abandoned or de-emphasized already.

Second, I propose using the real names for what we have now. OSI isn’t theory. You aren’t teaching a principle like the “Network Layer” and an implementation like the Internet protocol (IP)[133]. They are instead just two names for the same thing – the second name is superfluous.

#4    Transport    TCP
#3    Network      IP
#2    Data Link    MAC
#1    Physical     PHY

Third, and the most controversial of my proposals: collapse these four layers into two, downgrading the originals to sublayers.

Showing the sublayers together is the wrong abstraction. If you break apart these layers in the same diagram, it needs to show two independent stacks. The #1 confusion among students and professionals is assuming that Ethernet is part of the Internet instead of being a completely separate network. The independence of the Internet and Local Networks has to be stressed over and over.

Elsewhere in this text, I present an even more complex network stack. This is just my model for deprogramming professionals, experts, and teachers of the OSI cult. I don’t propose this as the actual model we use for introductory textbooks. I think it’s a good model, I’ve put a lot of thought into it, and I’m an expert on this. But just because it’s accurate doesn’t mean it’s helpful. Teachers with experience teaching students would know better what’s helpful.

There’s probably more OSI terminology worth preserving, so if you have any suggestion, leave comments on social media (e.g. @erratarob on Twitter).

9.3. Professional certification

Professional certification is tricky. It’s less about understanding[134] the subject and more about understanding the jargon. If a profession uses confusing language to describe networks, it’s still the correct language of the profession no matter how inaccurate in the real world.

In my own industry, the most popular certification is known as the CISSP[135]. Study guides using OSI are full of untrue statements. For example, the official study guide teaches students about the Session and Presentation Layers – gross lies about these things.

Other topics are flawed as well. I can pick on most any other subject, from cryptography to secure coding, to point out flaws. However, no other section is as flawed as the OSI chapter.

My proposal is the same as above: teach the reality of networks, namely that the Internet is layered on local networks. Don’t teach buzzwords that students will never understand.

9.4. Wikipedia

Wikipedia doesn’t represent truth so much as consensus. If everyone agrees something is true, then isn’t it true? If all the reputable authorities agree something is true, then isn’t it true?

The article on NetBIOS is a great example. All the reputable authorities agree it’s part of the “Session Layer #5”. That’s partly because the NetBIOS specification clearly says it provides “session” features. And it’s partly because NetBIOS runs on top of TCP, which everyone agrees is “Layer #4”, so it must be “Layer #5”? You can find documents from Microsoft, Cisco, Oracle, and IBM that all agree NetBIOS is part of the Session Layer. These are the top experts in networking, and they all agree.

They are all factually wrong, as this book explains thoroughly. OSI never meant anything in particular by “session”, and what NetBIOS actually means by “session” is “connection”. It’s clearly a Transport Layer #4 protocol, not Session Layer #5.

Wikipedia wants frameworks, taxonomies, timelines, and other ontologies. Discussing each topic in how it relates to other topics is an essential part of Wikipedia. It demands something like the OSI Model to pigeonhole things.

But the current use is lies. Wikipedia’s framework is both arbitrary and not really in agreement with any consensus. Sure, they cite some authority every time they assign something to a layer, but it’s still lies.

Take, for example, the protocol-layering diagram on the Wikipedia page for BGP.

Its function is that of the Network Layer #3, but it runs on top of the Transport Layer #4, implying it belongs to the Application Layer #7. Different authorities disagree.

The correct answer is that BGP belongs to Layer #3. However, there aren’t enough authorities supporting the right answer.

My ontological framework is vastly superior to the one they have based on OSI.

My proposal is this.

First, Wikipedia needs to get a better ontological framework for classifying things. I propose one here. I’m a top expert and I’ve put a lot of thought into it. I think it’s good. But I don’t insist it’s the best one – just better than the false framework they have now.

Second, it needs to get rid of those lies. DHCP belongs in the Internet layer, for example.

Third, it needs to expurgate 90% of the mentions of the OSI Model from its articles. They repeat misconceptions and lies rather than truths. NetBIOS has nothing to do with OSI, and there is no reason to tie the two together. That’s going to take experts like me writing up things that Wikipedia editors can cite as references.

To be fair, Wikipedia does a pretty good job with this nonsense. There are examples of total lies, like assigning NetBIOS to the Session Layer. But there are also a lot of things it gets right. It’s a mish-mash, sometimes an editor with a clue writes the right thing, and sometimes one of the clueless blindly cites sources that get it wrong – reputable sources that get it wrong.

9.5. Standards

Standards organizations need to recognize they aren’t going to make OSI happen. The standard needs to be officially deprecated. It would be helpful to point out that nobody is following the OSI Model, instead of the current pretense that everything is compliant.

The IEEE still develops new Ethernet and WiFi standards. They describe Ethernet as fitting into the right layer of the OSI model, even though no stack looks like this.

This can be removed without changing anything. If you like the Data Link and Physical Layers, that’s fine. It’s just that everything above them should be undefined – it should always have been left undefined. Ethernet should not impose the requirement that its payload follow any sort of standard.

Payload

Data Link

Physical

OSI standards exist outside the 7 Layer Reference Model. X.509 SSL certificates are driven by the ITU SG17 standards group, which is technically part of OSI. They are encoded in ASN.1, another OSI standard. LDAP (the Lightweight Directory Access Protocol, primarily used by Microsoft) comes from OSI DAP.

Like Ethernet, they all pretend to fit within the OSI 7 Layer Model despite doing nothing of the sort. X.509 references the X.800 “Security Model” that tries to fit things within the layer model.

But while it claims conformance, it also smartly says that anybody can do these things outside the 7 layer model. Which is how it works on TCP/IP, outside OSI.

10. Glossary

This section lists some of the more common terminology in this paper, as used here in this paper. It is far from comprehensive; I assume readers will simply consult Wikipedia for any term.

+access-point - A WiFi access-point is the device with radio on one side and a wired network on the other. An access-point can either be a router itself, forwarding Internet traffic, or just a bridge to Ethernet, with routing done by other devices. [Wikipedia]

+ACK - See acknowledgement.

+acknowledgement - After data is sent, some sort of signal or packet can be sent in the reverse direction as an acknowledgement that data was received. If the sender fails to get an acknowledgement in time, they will resend the data. Most network protocols implement some form of acknowledgment, but some don’t, requiring the programmer creating apps to do this themselves. [Wikipedia]

+address - An address specifies the destination of a message or packet, where it should be forwarded to. Examples are MAC addresses for Ethernet, IP addresses for the Internet, and email addresses. Addresses are often mistakenly described as the “identity of a host or node on a telecommunications network” [Wikipedia]. This is false. It’s like how your phone number is not your identity, you might have multiple numbers, you might change your number, and so on. While it is sometimes useful to think of addresses as identities, in most cases, it’s harmful.

+API - “Application Programming Interface”. See interface. Most software is written to take advantage of pre-existing software, such as that provided by the operating-system, libraries, or web frameworks. The API is the interface to such code, with specific rules programmers follow. For networking, the most popular API is Sockets, provided as part of the operating-system. Another popular API is OpenSSL, for creating encrypted connections on top of the Sockets API. Web services are described as having APIs, where the programmer of the client accesses well-documented services on the server. For example, a Twitter API to get the tweets of “@erratarob” would consist of a web request to the URL GET https://api.twitter.com/1.1/statuses/user_timeline.json?user_id=erratarob&count=10, which then returns a JSON document in response.

+AppleTalk - This was developed by Apple in the early 1980s to interconnect early Macintoshes and LaserWriter printers. It was a local network technology using simple telephone cables at 230-kilobits/second, along with a simple internetworking protocol to interconnect subnets. That all computers (such as Macintoshes) should come with built-in networking was a radical idea in the early 1980s. Part of it still exists on top of Internet protocols, namely the file sharing protocol. [Wikipedia]

+application - This doesn’t mean anything in particular. Some people insist it refers to something concrete, even though they can’t usefully say what that is.

+ASN.1 - “Abstract Syntax Notation 1”, a method of describing complex/abstract data structures in ways that can be concretely represented with bytes to send on the wire. It’s notable because when students are taught Presentation Layer #6, ASN.1 is frequently mentioned.

+ASCII - “American Standard Code for Information Exchange”. This was an early character-set for teletypes. It supports only 7 bits, meaning that it encodes only 128 characters. This is barely enough for upper-case, lower-case, numbers, punctuation, and basic control-codes. Since bytes are 8-bit, there are a number of versions that support languages with other characters, such as the LATIN-1 extension that supports most European languages. The important aspect of ASCII is that it’s both a character-set and a protocol: the control-codes included such things as a “start of text” byte that would indicate the start of a message. Such control bytes were an early text-based protocol for devices that didn’t have CPUs/software. Simple logic could process a byte with a special meaning. [Wikipedia]

+AT&T - This was the huge US telecom monopoly of the 1970s/1980s. Today, it’s a big Internet provider and telephone company in the US. In the past, it controlled the circuits for long distance (wide-area) telecommunications. The government broke it up into regional companies (“Baby Bells”) in the 1980s. Half have merged back into AT&T, while others merged with Verizon and CenturyLink/Lumen.

+autonomous-system - On the Internet, a group of subnets and routers under a single authority is called an autonomous system. Routers exchange route information by autonomous system numbers rather than subnet IP addresses to reduce the amount of information needing to be exchanged. [Wikipedia]

+backbone - The Internet backbone consists of companies that primarily interconnect ISPs, data centers, and other backbones. Your Internet traffic primarily goes from your home/business, to your ISP, to a backbone, possibly across other backbones, then to a data center, then to the website you are talking to. [Wikipedia]

+bit - “binary digit”. The bit is the fundamental unit of digital information, representing one of two values, either [0, 1], [on, off], [false, true], [red, blue], and so on. All networking traffic is eventually transformed to bits and sent through wires, fiber optics, or radio waves. [Wikipedia]

+BITNET - In the 1980s, this was a type of internetwork. This was a service connecting corporate and university computing centers, primarily for exchange of emails. Most of the nodes on the network were big IBM computers. [Wikipedia]

+BGP - “Border Gateway Protocol”. BGP is the primary way that routes are mapped on the Internet. When two organizations connect, their routers use BGP to exchange routing information. Each router creates its own map of the Internet based upon the information received via BGP. BGP is primarily used on the border of organizations, with different protocols used inside the organization’s networks. [Wikipedia]

+bridge - A bridge is just another name for a device that relays packets (see relay). Most uses refer to Ethernet relays that forward packets based upon MAC address. An Ethernet switch is also called a bridge. [Wikipedia]

+broadcast - This refers to sending a packet to all devices on a local network. For this reason, local networks are sometimes called a “broadcast domain”. Broadcasts wouldn’t work on the Internet as a whole, because that would require flooding the Internet with billions of packets. Broadcasts are primarily used for local devices/services to find each other. [Wikipedia]

+bus - The word “bus” refers to multiple things attached to the same wire. Typically, wires only interconnect two devices. When they connect more than two, problems arise, like messages on the wire colliding when two devices transmit at the same time. While today’s Ethernet is point-to-point, it was originally defined as a bus. CANbus, the network found in cars, is a bus network. [Wikipedia]

+byte-order - A single byte can represent numbers up to 255. Larger numbers need multiple bytes. A multi-byte value can be sent left-to-right or right-to-left. In other words, consider the two-byte number 0x55AA. You can send this as 55 AA, or AA 55, depending upon which byte-order you prefer. This is also called endianness. [Wikipedia]
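
A quick sketch of that 0x55AA example using Python’s struct module:

    import struct

    struct.pack(">H", 0x55AA)    # big-endian:    b'\x55\xaa'
    struct.pack("<H", 0x55AA)    # little-endian: b'\xaa\x55'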

+cable Internet - There are several ways of providing Internet service to homes and small businesses, using fiber optics, telephone wires, or cable TV wires. When using cable TV wires, we call this “cable Internet”. A cable-modem is used to connect the home to the cable. The technology that encodes the signal is known as DOCSIS. It uses the same technology as cable companies use to encode digital television channels. The most popular cable Internet provider in the United States is Comcast. Most cable Internet is now fiber-to-the-neighborhood – fiber optics carry traffic to a box in the neighborhood, with copper coax cables splitting off from there. As Internet speeds increase, the copper coax cables are getting better, but the cable companies are also putting fewer subscribers on each cable. [Wikipedia]

+cable-modem - A device for connecting a home or small-business network to the Internet. The most basic unit is an Ethernet bridge, relaying Ethernet packets onto non-Ethernet technologies of cable TV. But most customers choose more complex cable-modems that include both a router (isolating the home network) and WiFi. [Wikipedia]

+character-set - A character-set maps numbers to characters (letters, digits, punctuation, symbols). The most common character-set, Unicode, chooses to represent the letter ‘A’ with the number 65. The competing EBCDIC character-set chooses to represent ‘A’ with the number 193. In order to communicate text over a network, both sides have to agree upon the same character-set, otherwise things will look garbled. The Japanese, who have traditionally used very different characters, call this problem “mojibake”. These days, almost everything has converted to Unicode, so it’s not a significant problem. It’s mentioned in this text because it was once assumed that every computer would have a different character-set, and the network would be responsible for translating between them. Today, everything is Unicode, and when it isn’t, it’s somebody else’s job to translate, not the network. [Wikipedia]

+CAN bus - “Car Area Network bus” or formally “Controller Area Network bus”. Most every car has a CAN bus that connects various components together, such as the engine, anti-lock brakes, airbags, the transmission, the entertainment system, and so on. It’s a fairly low-speed network. Modern cars now also include high-speed Ethernet for transmitting such things as hi-def video from multiple cameras, for self-driving features. [Wikipedia]

+CCNA - “Cisco Certified Network Associate”, a popular professional certification from Cisco, the largest vendor of networking gear. This book will not help you pass that test, but will instead likely confuse you with heterodox claims. [Wikipedia]

+certificate - This verifies the identity of a website (and other network services), guaranteeing that if you visit www.google.com, that it’s actually Google and not a hacker pretending to be Google. Web browsers and operating-systems include lists of trusted authorities (“certificate authorities” or “CAs”) that cryptographically sign certificates to attest to their authenticity. [Wikipedia]

+certification - There are many “professional certifications” by various vendors and non-profit organizations that certify a professional understands networking. This may be network-specific, such as Cisco’s CCNA (Cisco Certified Network Associate), or part of some other IT certification (such as the CISSP or Certified Information Systems Security Professional, focused on cybersecurity rather than just networking). Generally, certification requires a test as well as “credits” from taking educational courses. This book will not help you pass a certification exam; we describe them here only to demonstrate how even the experts are wrong about OSI.

+checksum - This is a technique for detecting when data has been corrupted. It simply treats data as a series of numbers and adds them all together. The resultant value is sent along with the data. The receiver runs the same algorithm. If the result doesn’t match the checksum sent with the packet, then corruption must have happened somewhere on the network. [Wikipedia]
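
A toy version (Python; my sketch, not the exact Internet checksum algorithm, which uses one’s-complement arithmetic):

    def checksum(data: bytes) -> int:
        # Treat the data as a series of numbers, add them all together,
        # and truncate to 16 bits.
        return sum(data) & 0xFFFF

    assert checksum(b"hello") == checksum(b"hello")   # intact data matches
    assert checksum(b"hello") != checksum(b"jello")   # corruption detected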

+circuit - In the beginning, telephone and telegraph companies would use physical circuits, copper wires with analog repeaters. As the telephone network was digitized in the 1960s, physical circuits became virtual circuits sending streams of bits. Establishing a circuit, or making a phone call, would consist of contacting each switch in turn and reserving resources. A circuit was constantly transmitting bits, regardless of whether the user had something to send.

+circuit-switching - As described above, early telegraph and telephone systems used circuits. Establishing a circuit meant contacting every switch along the route, reserving resources for the circuit. This was known as “circuit switching”, in contrast to the “packet switching” of the Internet. [Wikipedia]

+CISSP - “Certified Information Systems Security Professional”, the most popular professional certification in the cybersecurity/infosec community. It’s mentioned frequently in this document due to its grossly inaccurate descriptions of networking in general and the OSI Model in particular. Reading this book will definitely not help you pass their test, which requires you to repeat the inaccurate claims of their training materials. [Wikipedia]

+CLNP - “Connectionless Network Protocol”. This was the internetworking protocol the OSI standards groups defined as the alternative to IP. The original intention of OSI was to use a connection-oriented protocol similar to X.25. However, the popularity of the Internet caused them to retcon the OSI Model and design something that worked substantially like the Internet.

+congestion-control - Like automobile traffic on roads, networks experience congestion. A router in the core of the network may receive traffic from multiple incoming links that overload an outgoing link. Controlling congestion is one of the most important issues in networking, if not the most important. The Internet controls congestion end-to-end, such as with TCP or QUIC. When a packet fails to arrive at the destination, it’s assumed that it was dropped somewhere on the network due to congestion, so the transmitter slows down until packets are no longer dropped. Routers inside the network don’t need to do anything complex when faced with congestion – they silently drop the packet and hope that the ends notice. See also [Chapter 6]. [Wikipedia]  
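
As a toy sketch of the end-to-end idea (Python; loosely in the additive-increase/multiplicative-decrease style that TCP’s algorithms follow, with all real-world details omitted):

    def adjust_rate(rate: float, packet_lost: bool) -> float:
        if packet_lost:
            return rate / 2     # assume congestion somewhere: back off hard
        return rate + 1.0       # no loss: gently probe for more bandwidth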

+connection - In networking, this refers to two sides talking to each other. It’s something that can happen in many places in the networking stack, though we most often talk about it happening with TCP. There’s usually a start, middle, and end to a connection. We call alternatives like UDP connectionless, meaning a single packet is sent without first establishing a connection – but even then, we’ll sometimes refer to the back-and-forth exchange of such packets as a connection.

+connectionless - There are two types of network traffic, connection-oriented and connectionless. In connection-oriented traffic, packets are sent back and forth to establish the connection, after which data is sent. In connectionless traffic, packets containing the data are simply sent with no preceding setup. These are often called datagrams because they stand alone, like a telegram in the early telegraph networks. One use for datagrams is broadcasts or multicasts, where data is sent to multiple recipients at once. Another use is short request/responses. A good example of this is DNS, which requires only a single packet for a request and a single packet for a response. [Wikipedia]

+connection-oriented - See connectionless.

+control-code - Control-codes are special characters in character-sets like ASCII that cause events to happen, such as making a teletype scroll to the next line, or even just beep. For more advanced terminals with screens, more complex control-codes could do more complicated things, such as placing text anywhere on the screen, not just the bottom line (which is what happens by default, as terminals emulated teletypes). It was once thought that a network layer was needed to translate control-codes: the Presentation Layer.

+CRC - “Cyclic Redundancy Check”. This is a type of checksum that uses more complex math than simply adding up all the numbers, doing a better job of detecting corruption in packets. See checksum for more information. [Wikipedia]

+CRT - “Cathode Ray Tube”. This was the technology used to build televisions from the 1930s to around 2000, after which they were replaced by LCD/OLED/plasma flat panels. They were also used to build computer screens, first for vector graphics, then for raster graphics. In the 1970s and 1980s, the word “CRT” was used synonymously for terminal. [Wikipedia]

+cyberspace - According to William Gibson, who coined the term, cyberspace represents all the computers in the world, whether they are directly connected or not. This is larger than just the Internet, which represents only the computers connected to each other. [Wikipedia] [Neuromancer] [Snow Crash]

+datagram - This is a synonym for packet, often with the connotation that it is stand-alone instead of part of a connection. An example is a DNS request, where the entire request fits inside one packet, and the entire response fits inside another. In contrast is an HTTP connection, where responses from a server must be sent in many packets. The Internet Protocol (IP) is a datagram protocol, because from the router’s perspective, it processes packets individually, one by one as they arrive, ignoring any concept of an ongoing connection. It’s the higher layer protocol TCP on the ends (see end-to-end) that is aware of an ongoing connection. If a programmer wants only simple datagrams (such as for DNS requests), they use UDP or the User Datagram Protocol instead. UDP offers some features of TCP, such as port numbers and checksums, but removes any concept of a connection. [Wikipedia]

+data link - This is a term coined by OSI. A traditional point-to-point link consists of a physical cable between both ends, with simply bits (or bytes) being transmitted as a stream. When those bytes are formed into packets, with protocols and headers, then it becomes a data link. The protocols LLC and SDLC are prime examples of this. Ethernet’s MAC layer has been retroactively defined to be part of the data link.

+defragmentation - See fragmentation.

+DHCP - “Dynamic Host Configuration Protocol”. This protocol is used to assign IP addresses to computers. When you connect to a WiFi, it’ll use DHCP to assign an address appropriate for the local subnet. DHCP is the only method for doing this with IPv4, but with IPv6, DHCP is optional, with other methods also supported. [Wikipedia]

+DNS - “Domain Name System”. The Internet uses numeric addresses (32 bits for IPv4 and 128 bits for IPv6) to route packets. These are too hard for humans to use, who prefer names instead, like www.twitter.com or www.wikipedia.org. DNS is the system that translates between names and addresses. When you click on a link in a browser, it’ll first resolve the name into an address, then communicate with that address. If DNS fails, then it’ll appear as if the Internet is down, when in fact, the routing of packets may still work. A lot of public Internet failures have actually been DNS failures. [Wikipedia]

+DOCSIS - This is the technology used by cable Internet to transfer Internet data inside technology originally designed for cable TV. Its upper layer looks like Ethernet, while its lower layer looks completely different from Ethernet. DOCSIS has many layers, one of which embeds those Ethernet frames inside MPEG streams – the same MPEG streams that normally carry video. This text uses DOCSIS as a demonstration that in the real world there are many more sublayers than in the OSI Model, and networks are layered on networks. [Wikipedia]

+endian - See byte order.

+end-to-end - End-to-end means that things are handled on the ends of the network rather than in the middle. This can be a technical concern, such as how congestion control is handled on the ends rather than by routers in the middle. It’s also a philosophical point of view: the telephone companies imagined that services like “video phone calling” would be handled by the telephone company inside the network, but with today’s Internet, companies only route packets. It’s the device on the ends that handles video calls, like Facetime. [Wikipedia]

+error-checking - This term doesn’t refer to anything specific. However, many textbooks insist that some layer is responsible for error-checking. They might be referring to checksums or CRC checks. They might be referring to acknowledgements. In cryptography, it might refer to hashes.

+error-correction - This doesn’t refer to anything specific. It may refer to something called “forward error correction” in the Physical Layer #1 (PHY) that sends enough extra bits to correct any small error. It may refer to how transport protocols like TCP retransmit packets if they fail to arrive at their destination.

+Ethernet - A technology developed in the early 1970s to interconnect multiple computers to the same wire. Since then, the technology has changed, we now connect a computer via a single wire to an Ethernet switch. Because it defined a network, it violated the OSI Model. [Wikipedia]

+file-system - The term file-system refers to the hard drive within your computer. This document also uses it to describe services like Unix NFS and Microsoft SMB that make a remote server look like a local drive. [Wikipedia]

+flow-control - This is what we call congestion-control across a single data link, or across a single TCP connection. See congestion-control.

+flow-label - IPv6 was defined with an extra field in the header that could be used by apps on the ends of the network to communicate information to routers for special handling. I’ve never seen it used.

Figure: CloudFlare description of the OSI Model

+formatting - Formatting is one of the funniest words under discussion here, because it means different things while always meaning the same thing. Some descriptions say Data Link Layer #2 is responsible for “format”, because it formats packets for the local wire. Others say it’s Presentation Layer #6, assigning that layer data formats like ASN.1. Thus, you have something like the CloudFlare version of the OSI Model where they declare “format” as a feature of both Layer #2 and Layer #6. It’s all nonsense; “format” is something that happens everywhere in a network stack.

+FTP - “File Transfer Protocol”. This is one of the original applications of the early Internet, along with Telnet, Remote Job Entry, and SMTP for email. It’s still used, but most people now transfer files via HTTP instead. [Wikipedia]

+fragmentation - The Internet sends data in packets up to around 1500 bytes in size. Larger, bulk data needs to be split across multiple packets, then combined back together when received. This is called “fragmentation and reassembly”. It implies a number of features, such as adding sequence numbers so fragments can be reassembled in the right order. It implies tracking which fragments have been received, retransmitting those lost in the network. It implies congestion-control, to slow the sending of fragments.
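
A hedged sketch of the core idea (Python, my own illustration): number the fragments, then reassemble by sequence number even when they arrive out of order. Real protocols add retransmission and congestion-control on top.

    def fragment(data: bytes, size: int = 1500):
        return [(seq, data[offset:offset + size])
                for seq, offset in enumerate(range(0, len(data), size))]

    def reassemble(fragments):
        return b"".join(part for seq, part in sorted(fragments))

    frags = fragment(b"x" * 4000)                       # fragments 0, 1, 2
    assert reassemble(reversed(frags)) == b"x" * 4000   # arrival order doesn't matter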

+frame - Another word for packet, but in the context of Ethernet. We typically say “Ethernet frame” and “Internet packets”.

+gateway - A synonym for a relaying device, like a router. In particular, when configuring your Internet settings via the command-line, the nearest router will be labeled the “gateway”. That’s really the only modern use, though sometimes email servers will be called “email gateways”. [Wikipedia]

+HDLC - This is the official ISO standard version of IBM’s SDLC. See SDLC. [Wikipedia]

+header - The bytes at the start of a packet containing protocol information, such as an address telling where the packet should be relayed/forwarded. [Wikipedia]

+hexdump - This is the standard way of displaying a packet, where every 4 bits are represented by a character in the range [0123456789ABCDEF], and every 8-bit byte represented by two such hex digits followed by a space. [Wikipedia]
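
A minimal implementation in the style described (Python):

    def hexdump(data: bytes) -> str:
        return " ".join(f"{b:02X}" for b in data)

    hexdump(b"\x08\x00\x45\x00")    # -> '08 00 45 00'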

+hop-count - See TTL.

+host - A term from the early days of the Internet referring to any computer attached to the Internet. This was to contrast devices on the edges using the network with the devices inside the network like routers.  [Wikipedia]

+HTTP - This is the protocol for the web, a type of network layered on top of the Internet. The most notable feature, from the perspective of this text, is its independence from the underlying Internet. It has several versions, most recently HTTP/3 (version 3). [Wikipedia]

+IBM - “International Business Machines”. IBM was an early pioneer, back when computing devices were mere electromechanical tabulating machines, and was the dominant computer vendor in the industry from around 1955 to 1985. Around 1980, it had roughly 50% of the entire computer market – it was almost synonymous with computing. It sold big computers to the government, military, and big businesses. That business largely stopped growing around 1980, with all further growth in the industry going to other companies. It’s notable because the OSI Model was designed to reflect IBM’s early networking. See also SNA. [Wikipedia]

+ICMP - “Internet Control Message Protocol”. These are packets that contain control information instead of data. For example, when a router drops a packet because the TTL/hop-count has been exceeded, it sends back a message to the sender using ICMP. [Wikipedia]

+IDP - “Internet Datagram Protocol”. This was part of Xerox’s competing internetwork protocol stack known as XNS. It’s similar to IPv4 and IPv6. The major difference was the style of addresses. An address was 10 bytes, consisting of a 4 byte network number and a 6 byte local address, matching the Ethernet MAC address. Routers would only use the 4 byte network number for routing, then use the local address once the packet had reached its destination network.

+internetwork - A network of networks. Traditional networks were limited by location or by organization. An internetwork layers an entirely new (and independent) network on top of local/organizational networks, using a completely different technology. In other words, a home network would  be the lower layer, constrained by the limits of the home, while the Internet is a network that spans the globe.

+IP - “Internet Protocol”. This is the protocol that forms the modern Internet. It’s a 20 to 40 byte header that contains the destination address where the packet is being sent plus a few pieces of control information, such as the TTL/hop-count. When routers receive a packet, they look at the destination address, and relay the packet in the appropriate direction to that destination. See also IPv4 and IPv6.

+IPv4 - “Version 4 of the Internet Protocol”. There are two versions of IP (Internet Protocol). The first was created in 1980 and is called IPv4, the second created in 1995 and is called IPv6. The major difference is the size of the address with IPv4 only able to address 4 billion unique devices on the public Internet. See IP.

+IPv6 - “Version 6 of the Internet Protocol”. See IPv4. [Wikipedia]

+IPX - Novell’s “Internetwork Packet eXchange” protocol, a network stack as feature rich as Internet’s TCP/IP. Had Novell tried, it could’ve created a rival internetwork. IPX is nearly identical to the earlier XNS protocol stack from Xerox.  [Wikipedia]

+ISO - The international standards organization whose members consist of national standards organizations. The United States is represented at the ISO through ANSI, the American National Standards Institute. Early efforts for standardizing computer technologies were done under the auspices of the ISO. [Wikipedia]

+ISO/IEC 7498 - This document[136] from the ISO specifies the OSI Model. It’s the same content as the ITU-T X.200 document. The two organizations had started their own networking standards efforts and then merged them into one, hence the two documents. The ISO focused on “computer” networking while the ITU focused on “telecommunications” networking. They realized these were much the same thing, so they merged their efforts.

+ISP - “Internet Service Provider”. An ISP provides services on the edge of the Internet, connecting consumers and businesses to the backbone. The largest ISPs in the United States are AT&T, Comcast, Charter, Verizon, and CenturyLink/Lumen.

+ITU - This is the “International Telecommunications Union” first established in 1865 to define standards for telecommunications networks. [Wikipedia] The ITU, specifically the ITU-T subdivision, is responsible for the X.200 series of OSI standards. The ITU-T was named the CCITT during the development of the standards, but later changed its name.

+job - Today’s computers are always turned on, running many programs at a time. In the early days of computers, they were usually turned off, only turned on in order to run a single set of calculations, then turned off again. The task they were to run was labeled a “job”. One of the first uses of networking was to submit jobs remotely over the network.

+LAN - “Local Area Network”. In the 1980s, there were fundamentally two types of technologies: one for the office network inside a building or campus, and one for long distance communications. The word LAN referred to that local network, whereas WAN referred to the long distance network. Today, the technologies have merged, so there’s no real distinction. [Wikipedia]

+LAP-B - This is a variant of IBM’s SDLC protocol that was included with X.25.

+latency - This is the time it takes for a packet to travel from one end of the network to a target destination. The delay from San Francisco to New York is at least 30 milliseconds due to speed-of-light delay. Traditional geosynchronous satellites in high-orbit have a minimum of 600 milliseconds of delay, making gaming and voice/video phone calls impractical. Low earth orbit satellites like Starlink have around 30 milliseconds added delay. [Wikipedia]

+library - Software is built from modules. A popular module used across many different software products is called a library. A library offers a well documented interface. A typical example is OpenSSL, which is a complete implementation of the SSL standard. When creating networked applications, a programmer will typically program to the OpenSSL APIs rather than the Sockets APIs. Thus, instead of creating a TCP connection to a target, an app will create an SSL encrypted connection, with the underlying TCP connection created implicitly by the library. Libraries are one of the major forms of APIs, contrasted with operating-system APIs. Whereas operating-system APIs are considered inside the network stack (like Sockets), library APIs are usually considered to be on top of the network stack, separate.

+link - See point-to-point.

+LLC - “Logical Link Control”. The pre-Internet network standards of IBM SNA and telecom X.25 used a protocol across point-to-point links to improve the robustness of communications, such as re-transmitting packets lost due to corruption from electromagnetic interference. When Ethernet was invented, it had no equivalent, and thus, SNA and X.25 could not work over Ethernet. The protocol LLC was defined to handle this. This isn’t used with Internet traffic, because such retransmissions are handled end-to-end instead of locally. LLC is nonetheless an interesting vestigial feature of Ethernet. [Wikipedia]

+load balancing - A technique that splits traffic across multiple computers. Even though it may appear you are talking to a device with a single IP address, you may in fact be talking to multiple computers, with a front-end device distributing the load. An alternative is DNS-based load balancing, where you appear to be talking to the same named computer, but traffic is sent to multiple IP addresses. [Wikipedia]

+LoRa - This is a low power radio communications standard that’s becoming increasingly popular, especially for talking to smart devices in office buildings and the home. [Wikipedia]

+MAC - “Media Access Control”. Ethernet has two fundamental parts: the part that sends streams of bits (PHY) and the part that collects these bits into packets, using addresses (MAC addresses) in the first part of the packet in order to forward the packets toward their local destination. Most people claim this is layer #2 in the OSI Model, but with Ethernet switches, this book claims it is best thought of as Layer #3. Ethernet is a full network of its own. The Ethernet MAC layer is often paired with Ethernet PHY, but has also become a lingua franca of local networking, used on top of other technologies, like WiFi, DOCSIS, MPLS, and so on. [Wikipedia]

+MAC/PHY - In much the same way as TCP/IP is synonymous with the Internet, this book frequently uses MAC/PHY as synonymous with Ethernet. The MAC sublayer and PHY sublayer of Ethernet are its two most important components. When using this pair, this book is reflecting the modern notion that the Ethernet MAC sublayer is equivalent to layer #2. But this book suggests things are more complicated, that the MAC layer is best thought of as layer #3 as well.

+MAC address - This is a 6-byte number that was originally defined for Ethernet. Devices on a network need a unique address, which is often assigned manually or automatically. Ethernet solved this by creating a globally unique address burned into the hardware chips, so that each Ethernet chip would have a unique hardware address. This solved the problem of having to configure addresses. The first three bytes, the OUI, are assigned to a vendor. [Wikipedia]
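
As a small illustration (a sketch with a made-up address, not any particular vendor’s), splitting a MAC address into the vendor prefix and device suffix is simple string work:

    mac = "a4:d2:3e:12:34:56"        # example MAC address (made up)
    octets = mac.split(":")

    oui = ":".join(octets[:3])       # first 3 bytes: vendor prefix (OUI)
    suffix = ":".join(octets[3:])    # last 3 bytes: unique per device

    print("OUI:   ", oui)            # a4:d2:3e
    print("suffix:", suffix)         # 12:34:56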

Figure - from https://www.faa.gov/about/history/historical_perspective A picture of an early IBM mainframe taking up an entire room

+mainframe - This refers to the room-sized computers that dominated business computing from the late 1950s through the early 1980s. This business was dominated by IBM. The chapter on History goes into a heavy discussion of mainframes. The OSI Model was designed to describe how mainframes worked, namely with a single powerful computer in the center of the network with peripherals and terminals attached to that network. The point of this entire book is that this model of mainframe networking doesn’t describe the modern Internet, that using the mainframe model to learn the Internet just confused people. [Wikipedia]

+Minitel - This was a service provided by the French government from 1980 to 2012, where they put an 8-bit dumb terminal in most every home to supply videotex online services. In this text, it’s used to describe the history of the telecoms and the direction they were going, in contrast to the Internet that went in a very different direction. [Wikipedia]

+MPLS - “Multiprotocol Label Switching”. This is a type of network that’s often seen sandwiched between local networks (like Ethernet) and the Internet. A large ISP or backbone will use MPLS to simplify routing across their private network. In other words, Internet routing technology is used on their borders with other Internet companies, but packets are routed using MPLS internally. This book uses MPLS to show how we don’t have a single network stack with many layers, but multiple networks layered on each other. [Wikipedia]

+MS-RPC - “Microsoft RPC”, see RPC. [Wikipedia]

+multicast - These are special addresses (both Ethernet and IP addresses) that, when traffic is sent to them, reach multiple devices. It’s a common way to discover devices on the local network, like the “all routers” address being used to send control packets to all routers that might exist locally. A hotel TV network might use multicast over typical networking technology, where tuning to a channel means receiving a stream of multicast packets. A protocol (called IGMP) notifies the nearest router which “channel” the TV is tuned to, and therefore which multicast streams need to be forwarded to it. [Wikipedia]
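
A minimal sketch of joining a multicast group using Python’s standard socket module; the group address 239.1.1.1 and port 5000 are arbitrary examples, and real deployments (like the hotel-TV case above) differ in the details:

    import socket
    import struct

    GROUP, PORT = "239.1.1.1", 5000    # example multicast group and port

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))

    # Joining the group triggers an IGMP membership report, telling the
    # nearest router which "channel" this host wants forwarded to it.
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    data, sender = sock.recvfrom(1500)    # blocks until a packet arrives
    print(len(data), "bytes from", sender)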

+multiplexing - This is a funny term: wherever you see it, you need to run away. It means multiple things sharing the same connection. As networking technologies slowly evolved in the 1970s, every time you added something on top of existing technology, you realized that multiple users of the higher layer needed to share the resources of the lower layer. So each time they did this, they had to add a “multiplexing” feature to the existing lower layer to support what they were doing at the new higher layer. One such case was the original OSI idea of a Network Layer #3. It would establish a connection between machines first, then allow multiple concurrent Transport Layer #4 connections to share the single layer #3 connection. Thus, the OSI document says that layer #4 is responsible for multiplexing the network connection. This has been widely misinterpreted as meaning that multiplexing is a layer #4 feature. It’s not; it’s a feature of every layer.

+multiprotocol network - This refers to the fact that a local network like Ethernet can simultaneously support many internetwork protocols at the same time, such as IPv4, IPv6, XNS, IPX, AppleTalk, and so on. Office networks in the 1980s were heavily multiprotocol. These days, with just the Internet, the term is no longer relevant.

+NAT - “network address translation”. The IPv4 Internet has run out of unique addresses. To compensate for this, NAT is a type of router that allows multiple computers behind the router to share one public IPv4 address. On the private network, addresses in ranges like 10.x.x.x or 192.168.x.x are used. The NAT then translates the addresses when packets flow through the router. NAT works by tracking outbound connections, but can’t accept incoming connections, because it doesn’t know which private computer should receive the connection. In this way it serves as a firewall, allowing outbound connections but blocking inbound connections. [Wikipedia]
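
A toy sketch of the core idea, no more than a table mapping private (address, port) pairs to public ports; real NATs track far more state (timeouts, TCP flags), and the addresses and ports here are made up:

    # Toy NAT table: (private_ip, private_port) -> public port.
    nat_table = {}
    next_public_port = 40000

    def outbound(private_ip, private_port):
        """Assign (or reuse) a public port for an outbound connection."""
        global next_public_port
        key = (private_ip, private_port)
        if key not in nat_table:
            nat_table[key] = next_public_port
            next_public_port += 1
        return nat_table[key]

    def inbound(public_port):
        """Map a reply back to the private host. Unknown ports are dropped,
        which is why NAT incidentally blocks unsolicited inbound connections."""
        for key, port in nat_table.items():
            if port == public_port:
                return key
        return None

    print(outbound("192.168.1.10", 51000))  # 40000
    print(inbound(40000))                   # ('192.168.1.10', 51000)
    print(inbound(40001))                   # None: no prior outbound connection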

+NDP - “Neighbor Discovery Protocol”. It’s a protocol in IPv6 that enables a device to discover the local router, and other information about the local network. It handles some of the same functionality as ARP and DHCP. [Wikipedia]

+NetBIOS - Through the 1980s and 1990s, this was a base layer for Microsoft networking that ran on top of multiple networking stacks, such as raw Ethernet (local networks only), the Internet, as well as other competing network stacks of the time like Xerox XNS and Novell IPX. It was a dominant protocol in the 1980s, but is almost never seen today. It’s notable in this document because most educated sources falsely pigeonhole it into the OSI Session Layer #5 when the closest equivalent is the Transport Layer #4. [Wikipedia]

+network - This is something that happens when 3 or more computers try to talk to each other, as opposed to simply a link when only 2 computers are trying to communicate. [Wikipedia] [Wikipedia]

+network address translation - See NAT.

+network stack - See stack.

+NFS - “Network File System”. This is a file server protocol used by Unix and Linux systems since the 1980s. This book describes this to point out how it should be thought of as its own network, rather than an application. It’s its own network layer on top of the Internet. It competes with Microsoft’s SMB file server protocol.  [Wikipedia]

+Novell - This was the largest vendor of products for office networks in the 1980s and early 1990s, based around its “NetWare” file-service. It faded in the late 1990s, partly because of the rise of the Internet, and partly because Microsoft started providing file-services built into Windows. It’s notable for its IPX network stack that rivaled the Internet’s TCP/IP, and could easily have formed a competing internetwork had Novell tried. [Wikipedia]

+OpenSSL - This is the most popular library that implements the SSL protocol. SSL is a key component within web network stacks. It’s implemented as a library API instead of as an operating-system API. [Wikipedia]

+operating-system - Example operating-systems are Linux, Windows, macOS, and iOS. An operating-system consists of a “kernel” underneath “user-mode” apps and services. The kernel creates a virtual machine for each user-mode program, keeping them separate, partly for reliability, and mostly for security reasons. What’s interesting for this book is how the Internet TCP/IP stack is implemented in the kernel, but the web is implemented in user-mode programs like the web-browser and web-server. [Wikipedia]

+OSI - “Open Systems Interconnection”. A set of standards developed around 1980 defining a set of networking technologies. They were based upon how mainframe and telecom networks worked in the 1970s, which is very different from how the Internet works today. OSI was supposed to be the future of networking, but isn’t used today. By “open” it means that the specifications are available for anybody to read, so that different vendors can make their own software that nonetheless follows the same standard and can interoperate. By “interconnection” it means simply “networked”.

+OSPF - “Open Shortest Path First”. This is a routing protocol for networks within an organization, widely used in corporations. The name refers to computing the shortest path first: the network can be wired up with redundant loops, but the protocol lets routers compute loop-free paths, sending traffic in only one direction rather than around a loop. [Wikipedia]

+OUI - “Organizationally Unique Identifier”. The first 24 bits (3 bytes) of a MAC address, assigned to an organization, such as A4:D2:3E being assigned to Apple. This makes MAC addresses globally unique: each OUI prefix is assigned to a specific vendor, who then assigns a unique suffix to each piece of hardware they manufacture (like Ethernet chips). [Wikipedia]

+packet - Early computer networking transmitted streams of bits (or characters). An alternative (and the way things are done today) is to group bits/bytes into packets of around 60 bytes to 1500 bytes, add additional bytes for protocol headers, then send the packets through the network. It’s similar in concept to a telegram from the 1800s that had the name of the destination at the top and the body containing the text message. Breaking data up into packets, instead of streaming bits, simplifies computer networks in many ways. [Wikipedia]

+packet-sniffer - This is a device (or software) that eavesdrops on network traffic. Since network traffic is sent in packets, and “sniffer” evokes the idea of spying on the traffic, the term “packet-sniffer” was formed. The most popular packet-sniffers are tcpdump on the command-line and Wireshark as an app. [Wikipedia]

+packet-switching - Early digital networks based upon the telephone system transmitted streams of bits known as virtual circuits. The network was formed from circuit-switches that would manage these virtual circuits and forward data. The term “packet-switching” was coined as an analogy, but with packets being forwarded instead of circuits. A packet-switch is therefore just another name for a router, in the context of old-time telecommunications. [Wikipedia]

+payload - Anything carried by the network. Confusion happens because the contents of payload are often related to the network, so some people treat some payloads as part of the network stack.

+peer-to-peer - This refers to a situation when two sides of a network connection are equal with each other. This is the natural state of the Internet today, but wasn’t the norm for mainframe networks. The OSI Model wasn’t really defined with peer-to-peer networking in mind. Today’s use of the term peer-to-peer is different, referring to things like BitTorrent that transfer files in a network without any particular server. It’s related, but with a different emphasis. [Wikipedia]

+peripheral - A device like a printer, modem, tape deck, disk pack, or other device connected to a computer. Peripherals can be dumb, controlled by wires attached to a computer, or smart, controlled by commands sent over network links. With today’s computers, almost anything attached via USB is considered peripheral. In the early days of computers, the idea of a “network” was connecting peripherals and terminals to a mainframe.

+PHY - This abbreviation refers to the physical part of networking, the point at which a transceiver converts digital bits into analog waves on a wire. It’s paired with a MAC, the portion that separates a bit stream into packets. [Wikipedia]

+physical - We use this word to refer to where the signal hits the physical wire, either a copper cable, fiber optic, or antenna.

+ping - A tool that tests whether a remote computer is reachable, sending a packet (an ICMP Echo Request) and timing how long the reply takes. See also latency.

+point-to-point - Early communications technologies consisted of a single cable with devices on either end – a point-to-point cable. This is still how most things work, such as an Ethernet cable connecting a computer to a switch. The definition of a network is where we’ve moved beyond just two devices communicating.

+presentation - This doesn’t mean anything in particular. Some people insist it refers to something concrete, even though they can’t usefully say what that is.

+protocol - This word originally comes from international relations, such as how ambassadors were to interact with host countries, governing such things as the proper handshake or bowing. For example, ambassadors do not bow to a foreign monarch, as they are not one of the monarch’s subjects. When two electrical devices talked, they similarly needed rules for handshakes. This was often done with separate wires, such as signaling on one wire that a message was about to start, followed by sending the message on data wires. This was out-of-band signaling. As devices became smarter, protocols became in-band, sending special characters to do the signaling. Later, protocols referred to large structures of data (many bytes) that surrounded the data, as headers in a packet. It’s confusing because while the term originally referred to actions, it now refers to the contents of a packet. This is because signaling the start-of-message is the same concept regardless of whether it’s done as an action (signaling out-of-band on a different wire) or as content (signaling in-band as a special character code).

+PTT - “Postal, Telegraph, and Telephone”. Until the 1990s, most countries had a single government entity responsible for post (mail), telegraphs, and telephones. These were known as the PTTs. The United States was different: the government ran the post, but private companies like AT&T and Western Union handled the other services. With the rise of the Internet, most countries have privatized these functions. For this book, the interesting historical fact is how the PTTs attempted to drive the adoption of OSI instead of the Internet. [Wikipedia]

+PUP - “PARC Universal Packet”, an early internetworking technology developed at Xerox’s PARC research center. This was highly influential on the later Internet. [Wikipedia]

+QUIC - This is the transport protocol underneath HTTP version three, or HTTP/3. It integrates SSL as part of the protocol. It’s based upon UDP instead of TCP, because it handles traditional transport features itself, optimized for such things as multiple concurrent streams to a webserver, and fast startup times to lower the effective latency when visiting a webserver. Google’s Chrome and webservers support QUIC, with adoption by others growing slowly. [Wikipedia]

+RIP - “Routing Information Protocol”. This was the first routing protocol defined for the Internet, based upon Xerox’s XNS routing protocol. It’s only usable in small organizations.

+relay - This is a general term for a device that forwards traffic. It can be an Ethernet switch, a bridge, an Internet router, or an email gateway. This book insists that all relaying is essentially the same, whereas other textbooks insist that things like Ethernet switching and Internet routing are fundamentally different.

+reliability - Networking technologies do two things to make sure that a packet arrives correctly at its destination. The first is a checksum or CRC to verify the packet hasn’t been corrupted. The second is sequence numbers and acknowledgement to verify the packet hasn’t been lost in the network. If it has been lost, then the packet will be retransmitted. Such reliability can happen on a local network, such as with SDLC or LLC, or it can happen on the Internet, with such protocols as TCP.

+resolver - Packets are routed according to numeric addresses (like 142.250.189.132), but humans refer to things on the Internet with names (like www.google.com). The process of finding the numeric address is known as resolving the name. Your computer can do its own resolving, but typically sends requests to another computer called a resolver. The resolver is part of the basic configuration information when connecting to a network. Most people accept whichever resolver is suggested by their ISP, while others use well-known Internet resolvers like 8.8.8.8 (provided by Google), 1.1.1.1 (provided by CloudFlare), or 9.9.9.9 (provided by Quad9, co-founded by IBM). [Wikipedia]
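
Programs normally trigger resolution through a single library call; a minimal Python sketch (the host name is just an example):

    import socket

    # Ask the system's configured resolver to translate a name into addresses.
    for family, _, _, _, sockaddr in socket.getaddrinfo("www.google.com", 443,
                                                        proto=socket.IPPROTO_TCP):
        print(family.name, sockaddr[0])    # e.g. AF_INET 142.250.189.132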

+retcon - “retroactive continuity”. This describes where later works of fiction in a series pretend that something different happened in earlier works. An example is Sherlock Holmes: the character was killed off, and when the author later continued the series, the story was changed so that he had actually survived. It’s a great analogy for OSI, as the original version didn’t describe the Internet, so people now pretend it meant something different so that it can describe the Internet. [Wikipedia]

+retransmit - If a packet is lost on the Internet, sometimes it’ll be retransmitted. It requires that the packet can be uniquely identified (such as with sequence numbers), and that acknowledgement packets are sent in the opposite direction. This can happen with technologies for local links. It can happen as part of the Internet stack when using TCP. This can be done instead by protocols that run on top of the Internet stack, such as with QUIC aka. HTTP/3. This is considered a transport feature.

+RFC - “Request For Comment”, the documents for Internet standards. The name reflects the de facto nature of the early Internet, as people simply built whatever they liked and asked others in the community to comment. This contrasts with the de jure standards of OSI, where people agreed upon official standards before building anything. [Wikipedia]

+routing-protocol - A routing-protocol is how routers communicate the network map to each other. It builds a table which is consulted when packets arrive, so that the router knows in which direction to forward the packet. The best known protocol is BGP, which ISPs and backbones use to communicate route information between themselves. Routers within an organization typically use a different internal routing-protocol. See also route and router. [Wikipedia]

+route - The path between two points that a packet will follow.

+router - Just another name for a relay. It’s typically used in particular to refer to an Internet Protocol relay, as opposed to an Ethernet bridge/switch. [Wikipedia]

+RPC - “Remote Procedure Call”. This was a technology popular in the 1980s and 1990s for building networked applications that promised to make the network almost invisible to the programmer. Software is built from “procedure calls”; this technology allows some procedure calls to be made remotely. It’s mentioned in this book because RPC effectively forms a network layer on top of the Internet. [Wikipedia]

+RS-232 - This is the most popular standard for connecting two devices with a cable, and it has been around since the 1960s.

+SDLC - “Synchronous Data Link Control”. This was a protocol developed in the early 1970s by IBM for creating “smart” links between devices, when both sides contained a CPU running software. Prior to SDLC, one side of a link was typically a dumb device (or both sides lacked a CPU and software), where “protocols” were either separate wires or special characters. What’s important in this book is how SDLC became Layer #2 of the OSI Model. It’s really impossible to understand the true meaning of the “Data Link Layer” without fully understanding SDLC, and it’s impossible to understand SDLC without considering history and IBM mainframes. [Wikipedia]

+segmentation - See fragmentation.

+sequencing - This is the process of adding a sequence number to packets. It serves several purposes. One purpose is that of a unique ID, whereby the recipient sends an acknowledgement packet with the same sequence number; when the sender fails to receive the acknowledgement, it resends the packet. Packets can arrive in a different order than sent, so another purpose of sequence numbers is to put packets back into their original order, especially when reassembling fragments of data.
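
A toy sketch of the reordering purpose, with made-up packet contents: whatever order the network delivered the packets in, sorting by sequence number restores the original data.

    # Packets arrived out of order; sequence numbers restore the original order.
    arrived = [(2, b"world"), (0, b"hello "), (1, b"there ")]

    reassembled = b"".join(data for seq, data in sorted(arrived))
    print(reassembled)    # b'hello there world'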

+session - This book mentions the word “session” a lot because it’s a prime example of the misinterpretation of the OSI Model. It meant something specific, unique to technology of the time. But that concept was soon obsolete and nobody really understood what was originally meant. Therefore, they re-interpreted the word to mean things completely different than what was originally intended.

+session cookie - A small bit of data sent by the web server in a response. The web browser then sends it back to the server on subsequent requests. A typical example is that once a user logs in, the server sends a token representing that login session. It all happens invisibly to the user, though browsers do allow users to view the cookies set by servers if they want. It’s mentioned in this text only to describe how these have nothing to do with what was originally intended by the Session Layer #5. [Wikipedia]
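
A minimal sketch of the round trip using Python’s standard library; httpbin.org is a public test service used here purely as an example. The server sets the cookie in one response, and the library replays it on later requests, just as a browser would:

    import http.cookiejar
    import urllib.request

    # The cookie jar stores whatever Set-Cookie headers the server sends,
    # and replays them on subsequent requests to the same site.
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

    opener.open("https://httpbin.org/cookies/set?session=abc123")
    for cookie in jar:
        print(cookie.name, "=", cookie.value)    # session = abc123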

+simplex or +half-duplex - In the early days (before around 1980) many links consisted of a single wire that could transmit in only one direction (simplex), or in only one direction at a time (half-duplex). It’s mentioned in this textbook because of the misconceptions about the Session Layer #5. It was intended to handle such links in a path, but by 1980, those sorts of links had largely disappeared from practical computer networks.

+SLIP - “Serial Line Internet Protocol”. When connecting to devices with a serial line, SLIP just sends the raw IP datagrams back-to-back, with almost no additional information. [Wikipedia]

+SMB - “Server Message Block” protocol. This is the protocol that forms the basis of Microsoft Windows networking, created in the early 1980s. It is used in this book to show the many ways different networks can be layered on each other. SMB is best modeled as a “network”, not an “application”. [Wikipedia]

+SMTP - “Simple Mail Transfer Protocol”. This is one of the earliest major Internet protocols, with roots in 1970s ARPANET email, for the purpose of transferring email. It’s still the way email is transferred across the Internet today. One reason this book mentions it is because email forms a network that runs on top of the Internet, largely independent of the Internet. [Wikipedia]

+SNA - “Systems Network Architecture”. This is the early network stack IBM designed for mainframe networks in the early 1970s. It works under completely different principles than today’s Internet. It’s frequently mentioned in this book because OSI was based upon SNA, which is why it’s a poor model for the Internet. [Wikipedia]

+sniffer - See packet-sniffer.

+socket - A socket is used in two different contexts. On the wire, it refers to the combination of source/destination addresses and source/destination ports, plus which transport protocol is being used (TCP or UDP). This “five tuple” uniquely identifies a connection or stream of packets. Another use for this word is the “descriptor” or “handle” used in code to refer to this connection, which is created with the socket() system call. [Wikipedia]
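
A minimal sketch showing both senses with Python’s socket module (example.com is just a placeholder): the descriptor created by the socket() call, and the five-tuple that exists once the connection is made.

    import socket

    # Sense 1: the handle/descriptor created by the socket() call.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # TCP
    s.connect(("example.com", 80))

    # Sense 2: the five-tuple identifying the connection on the wire.
    src_ip, src_port = s.getsockname()
    dst_ip, dst_port = s.getpeername()
    print(("TCP", src_ip, src_port, dst_ip, dst_port))
    s.close()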

+Sockets - Aka “Berkeley Sockets”. This is the standard transport API for writing software that uses the raw Internet. One of the reasons it’s discussed in this book is the way that “transport” is really the top of the network stack, where you’ll have an API. [Wikipedia]

+SPP - “Sequenced Packet Protocol”. This was part of Xerox’s competing XNS internetwork protocol stack, being roughly equivalent to TCP. Whereas TCP sends a sequence of individual bytes, SPP sends a sequence of packets.

+SSH - “Secure Shell”. This is currently the most popular protocol for remote terminals. It’s mentioned here because remote terminals like Telnet were one of the first major uses of telecommunications and the Internet during the 1970s. Much of the OSI Model can best be understood as providing functionality for remote terminals. More to the point, SSH shows how those ideas were wrong – remote terminals did not evolve the way OSI designers predicted. [Wikipedia]

+SSL - “Secure Sockets Layer”, the primary method for encrypting network traffic. SSL not only encrypts traffic to protect it from passive eavesdropping, but also protects it from active interception with a man-in-the-middle attack, whereby somebody puts a device in the middle that pretends to be the server to one end of the connection, and the client to the other end. It does this with certificates that verify the identity of the other end. With certificates, you can be certain that www.google.com is actually Google and not a hacker pretending to be Google. If a website uses SSL, it doesn’t mean it’s secure from being hacked, only that the network traffic to the server is secure from being intercepted. SSL is officially named TLS.

+stack - Also “network stack”. Networks are layered on each other. Within a network layer, there may be sublayers, as more complex functionality is layered on simpler technology. For example, the Internet is defined by the simple IP protocol with more complex transport protocols like TCP layered on top. Layering implies some independence; in other words, layers are a type of module. This text mentions the network stack in contrast to the OSI Model, where in both cases, layering happens, but in vastly different ways. For example, a typical network stack doesn’t have a fixed number of layers, and the layers you see in your own computer won’t be the same as the layers in the computer you are connecting to. [Wikipedia]

+standard - A standard consists of guidelines that many different people follow. A good example is railway tracks. If they are the same width (plus other factors), then a train can travel from state-to-state, country-to-country, without having goods/passengers offloaded at each crossing. Standards can be de jure (some authority dictates them) or de facto (everyone happens to be following them). OSI attempted to impose de jure standards on everyone, forcing them to build networks like mainframes. The Internet evolved from de facto standards, where everyone who wanted to connect to the Internet willingly adopted those standards. There are many official standards organizations in the world, all working under the international standards organization known as the ISO.

+stream - This is a descriptive word with no particular meaning, whose precise meaning depends upon context. On a local link, we might describe a stream of bits. At the TCP protocol, we often describe it as a stream of bytes. For something like Netflix, it would be a stream of video.

+sublayer - This would be a layer within a layer. One of the points of this book is that layers have sublayers, and a layer may itself be a sublayer of something larger. In particular, layers #1 and #2 of the OSI Model are best described as sublayers of Ethernet, and layers #3 and #4 are best described as sublayers of the Internet protocols.

+subnet - This has a couple of related meanings. One meaning is any arbitrary subdivision of a network. Another refers specifically to Internet addresses, where the first bits address a subnet, and the remaining bits a specific device on the subnet. Routers only look at the subnet portion of an address when forwarding packets, except for the very last router that’s actually on the subnet. Another meaning is the local broadcast domain of the local Ethernet, which is usually congruent with the Internet’s concept of a subnet, but may not be. [Wikipedia]
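
A small sketch with Python’s ipaddress module, using example addresses: the prefix identifies the subnet, and the remaining bits identify a device on it.

    import ipaddress

    net = ipaddress.ip_network("192.168.0.0/16")    # the subnet prefix
    addr = ipaddress.ip_address("192.168.5.77")     # a device on that subnet

    print(addr in net)                              # True
    print(net.network_address, net.prefixlen)       # 192.168.0.0 16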

+SWIFT - A network for transferring banking transactions. It’s an example of a network layered on top of other networks. [Wikipedia]

+switch - This has many meanings. In the history of telecommunications (circa 1900), switches were mechanical things operated by humans to connect/disconnect copper wires. Later telephone switches were electromechanical devices, controlled by pulses from rotary telephones. With the digitization of the telephone system in the 1960s after the invention of the transistor, telephone switches became digital devices forwarding streams of bits in virtual circuits. Since the early Internet was built on top of telephone company circuits, the word “switch” came to mean what happened beneath the Internet, while Internet relays were called “routers”. Thus, today, we have Ethernet “switches” for local networks in the home and business which forward Ethernet packets based upon MAC addresses, and Internet “routers” running at a higher layer. This book stresses how all packet relays serve fundamentally the same purpose, but that individual details differ. Thus, Ethernet is a network layer that relays packets, and we prefer to call those relays “switches”. And the Internet is another network layer that relays packets, where we prefer to call relays “routers”. [Wikipedia]

+telegraph - The telegraph system of the 1800s formed that era’s cyberspace. People were able to send short text messages (around the size of an SMS or tweet) to people across the country, and eventually, across the ocean. They are interesting in this book because of their history with character-sets and control-codes, such as the eventual development of ASCII.

+teletype - (Also a brand, Teletype) A teletype is a combination of a teleprinter and typewriter (keyboard). It was used for sending telegraph messages. Unlike previous systems that required extensive operator training, a teletype could be used by anybody, typing characters like any other typewriter. In the 1960s, with the Teletype Model 33, it became common to connect a teletype to computers, in order to interact with the computer. (Without a teletype, interaction was done with batch jobs.)

+Telnet - This is a remote terminal protocol. In the early Internet of the 1970s, it was the most important protocol, cited in early documents as one of the main protocols layered on top of the Internet. Due to the lack of built-in encryption, Telnet has largely been replaced by SSH. [Wikipedia]

+TCP - “Transmission Control Protocol”. This is the upper portion of the Internet protocol suite, providing end-to-end connection-oriented service with congestion-control.

+TCP/IP - This refers to the two most essential protocols of the Internet stack, the IP internet protocol and the TCP transport protocol. In many technology discussions, the name “TCP/IP” is synonymous with “the Internet”. At one time (in the early 1970s), these were combined into a single protocol, the TCP. In the late 1970s, they were split into two, the lower IP protocol for routing packets, and the upper TCP protocol for handling end-to-end connections. [Wikipedia]

+telcom - See telecom.

+telecom - A telecommunications company. Before the Internet, the world-wide cyberspace consisted of the world’s telecoms. During the height of WW II, Hitler could call Churchill on the phone. In most countries, they were state-owned monopolies known as PTTs or Post, Telegraph, and Telephone companies, with governments monopolizing all three businesses. In the United States, there were many telephone and telegraph companies, with AT&T monopolizing long-distance telecommunications.

+telecommunications - Back around 1900, there were telephone and telegraph networks. These were combined into the word telecommunications. In the 1970s, telecommunications networks were distinct from computer networks. In today’s Internet, the distinction isn’t important. Sometimes traditional telephony happens over the Internet, sometimes the Internet runs over traditionally telephone-oriented circuits. A telecommunications company was known as a telcom or telecom. Standards were set by the ITU or International Telecommunication Union. The word typically refers to low-level physical transmission, not really including higher layer protocols like the Internet. The word typically refers to long distance communications, rather than local networks in the home/office. [Wikipedia]

+teletext - This was a service for sending text along with television signals to TVs or set-top-boxes with 8-bit processors. It’s described in this document as merely an extension to videotex services like Minitel. [Wikipedia]

Figure - from Wikipedia

+terminal - This word typically applies to a device containing a keyboard to send text, and something to receive text, either printing to paper (teletype) or displaying on a simple screen (historically a CRT, or cathode ray tube, the technology used for televisions of the time). Presumably, the name comes from the fact that it is on the end (terminus) of either a computer network or telecommunications network. A remote terminal was the most common application of early networking in the 1970s, such as the Telnet protocol. Every make of terminal used different control codes; therefore, a goal of the Presentation Layer was for apps to negotiate which control codes to use for the terminal. [Wikipedia] [Wikipedia]

+TFTP - “Trivial File Transfer Protocol”. This is a version of FTP (see FTP) that transfers files over UDP instead of TCP. The consequence is that there is no congestion control, so it will likely cause problems on the network if used for frequent file transfers. It’s almost entirely used for things like firmware updates to IoT devices and local routers. [TFTP]

+TLS - “Transport Layer Security”, the official name for SSL. See SSL. [Wikipedia]

+Token Ring - This was a local network technology designed to compete with Ethernet, developed by IBM. It has some technical differences that IBM advertised as advantages, but it really wasn’t any better. It existed purely to be a de jure standard that IBM de facto controlled. It’s interesting in this book as discussion of early office networks, how there were alternatives to Ethernet at the time. [Wikipedia]

+Tor - “The Onion Router”. This is a privacy network on top of the Internet that forwards traffic through multiple gateways to hide the origin of the traffic. At its core, it’s still establishing a TCP connection, but with a hidden address. The address appears to be of only the last “exit” node in the chain. The book mentions Tor as an example of an entire network layered on top of the Internet, how a fixed number of layers isn’t a good model. [Wikipedia]

+traceroute - Internet packets have a hop-count or TTL field that is decremented every time a packet is relayed through a router. When it reaches zero, the packet is discarded, and a message (using ICMP) is sent back to the sender. When a packet is originally sent with a large TTL value, it’ll only reach 0 if there’s a loop. A clever trick is to send deliberately low values (like 1, 2, 3, 4, ..). By recording the IP addresses of the routers sending back error messages, one can “trace” the route the packets follow. [Wikipedia]
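
A heavily simplified sketch of the trick in Python: send probes with deliberately small TTL values. Reading the ICMP error replies requires a raw socket (and usually root privileges), so this shows only the sending half; the destination is an arbitrary example.

    import socket

    DEST = ("8.8.8.8", 33434)    # example target; 33434 is the traditional port

    for ttl in range(1, 6):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # A probe with TTL=n is discarded by the n-th router, which then
        # sends back an ICMP "time exceeded" error revealing its address.
        s.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        s.sendto(b"probe", DEST)
        s.close()
    # Collecting the ICMP replies requires a raw ICMP socket, omitted here.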

+traffic-class - This is a field in the IPv6 header that’s really uninteresting. It’s rarely used. It’s mentioned in this book only to emphasize how otherwise simple the IPv6 header is, containing addresses, the hop-count, and which protocol header follows next.

+transceiver - This is the part that translates a digital stream on one end to an analog signal on the other. Most of what we think of as “networking” happens at this point. In the hierarchy of network concepts, this is at the root. The field of electronic signals describing this is greater than all the rest of the network, far greater than what’s described in this text. For Ethernet in particular, the transceiver is the PHY, which connects on its digital side to the MAC.

+transcode - This means to convert from one format to another. This happens as part of video streaming services like Netflix to adjust for bandwidth. As bandwidth decreases, the video needs to be recompressed to a larger extent, lowering its quality. It’s mentioned in this book because the normal way of describing the network stack is that congestion control is something that happens within one location in the stack. In reality, applications regularly need to deal with congestion themselves. It’s also mentioned in this book because Netflix’s transcoding servers form an independent network on top of the Internet.

+transmit  or +transmission - This simply means to send something over a link or a network. At the lowest level, it consists of converting digital bits into analog signals and boosting that signal. At the highest level, it means simply using the network. See also TCP.

+transport - As this book describes it, transport is the thing at the top of a network, with end-to-end connections. As OSI describes it, the Transport Layer #4 is something in the middle of a network stack. The Sockets API using TCP and UDP is the archetype of a transport service.

+TTL or +time-to-live - Packets are transmitted with a maximum limit to the number of hops they can traverse, the maximum number of times that a packet can be relayed through a router. Each time a router relays a packet, it decreases the number by one. If it goes to zero, the router assumes there’s a routing loop, and drops the packet instead of relaying it. In such cases, the router sends a packet back to the sender (ICMP) informing them. One reason this book mentions TTL is to highlight the simplicity of the IP protocol headers. It’s the major field in the header after the addresses themselves. [Wikipedia]

+UDP - “User Datagram Protocol”. Internet routers famously treat each packet individually, as if it were the only packet ever to be sent from that source to that destination. That multiple packets may be related is only known on the ends, with protocols such as TCP. TCP first sends packets back and forth to establish a connection before any data is sent. When all the data is truly contained in just one packet, we call it a datagram. Datagrams exist within the network stack, but are also visible to the user (where “user” means the programmer writing an app) in the form of the “User Datagram Protocol” or UDP. UDP has the same port numbers and checksums as TCP packets, but nothing else. It doesn’t have the sequence numbers needed to track multiple packets. Nor does it have any startup handshake for a connection; it just sends all the data in one packet. One reason for using UDP is when building DNS, where a transaction consists of a single lookup packet and a single response packet. Another reason is for broadcasts on the local network, where one packet can be sent to many destinations, and the concept of a connection isn’t practical. This book (and everyone else) uses UDP to contrast with TCP: they are both transport protocols, but with different characteristics serving different needs. [Wikipedia]
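
A minimal sketch of the contrast in Python: one datagram each way, with no connection setup. The “server” and “client” run in the same process purely for demonstration, and the port number is arbitrary:

    import socket

    # "Server": bind to a port and wait for a single datagram.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 5353))

    # "Client": no handshake; one sendto() carries all the data.
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.sendto(b"lookup example.com", ("127.0.0.1", 5353))

    data, addr = server.recvfrom(1500)
    print(data, "from", addr)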

+Unicode - This is the universal character-set used by most computers and data formats these days. It has roughly 1 million code-points (aka. characters) for nearly all the languages used these days, as well as ancient languages. It’s interesting to this text because historically, different computers used different character-sets, and it was assumed that the network would need to translate. The reality is that nearly all computers now use some form of Unicode, and data uses a specific encoding, such as UTF-8.

+UTF-8 - “Unicode Transformation Format – 8-bit”. This is the standard character-set encoding used in computers today. Because Unicode has over a million characters, it can take several bytes to encode a single character. It’s interesting for this textbook because it was once thought that character-set translation should be something that the network should do, because different computers had different character-sets. It’s now assumed that every computer will use some form of Unicode, and that most data will be encoded with UTF-8. In any event, character-sets are properties of the data, not of the computer. [Wikipedia]
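
A small sketch of the point that encoding is a property of the data, not of the computer: the same Unicode text can be encoded in more than one way, with UTF-8 being the usual choice.

    text = "café 日本語"

    utf8_bytes = text.encode("utf-8")      # the common wire/file encoding
    utf16_bytes = text.encode("utf-16")    # a different encoding, same text

    print(len(utf8_bytes), len(utf16_bytes))    # different byte counts
    print(utf8_bytes.decode("utf-8") == text)   # True: round-trips cleanly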

+videotex - This describes a service over the dial-up phone network using 8-bit dumb terminals within homes. It displayed either text or simple text-mode graphics. It’s described in this document because the Session Layer #5 functions were largely defined to support things needed by videotex. The three major videotex products were the German BTX, the French Minitel, and the British Prestel. This service ran over an X.25 infrastructure. [Wikipedia]

+VPN - “Virtual Private Network”. This provides a network on top of a network. It’s used in this book to point out the way networks are layered on each other, that they are layered differently from each other, and that there is no fixed number of layers. [Wikipedia]

+WAN - “Wide Area Network”. This is an obsolete term which people need to stop using in the present. In the 1980s, there were fundamentally two types of technologies: one for office networks inside a building or campus, and one for long-distance communications. The word LAN was used to refer to the local network, whereas WAN referred to the long-distance network. The long-distance network was still private to the organization and not something like the Internet. Today, the technologies have merged, so there’s no real distinction. Today’s descriptions of WANs, like on Wikipedia, fail to reflect what the term originally meant. In many cases, it referred to things like X.25 circuits for connecting a terminal to a remote mainframe. In other cases, it represented just long-distance links between routers, on top of which the Internet worked. It was never a technology like LANs, where you simply connected a bunch of computers together. [Wikipedia]

+web - Also known as the “World Wide Web” or “WWW”. As this text describes it, the web is essentially its own network layered on top of the Internet. [Wikipedia]

+WiFi - The thing to know about WiFi is that it’s a local network technology compatible with Ethernet. People mix and match WiFi devices with Ethernet devices, “bridging” them together in one local network. It’s a completely independent technology from the Internet, which is why older WiFi devices that don’t support IPv6 can still be used with IPv6 in bridge mode. Typically, though, most WiFi access-points are used in router mode, where they route IPv4 and IPv6 packets instead of bridging. [Wikipedia]

Figure: Wireshark packet-sniffer and protocol-analyzer.

+Wireshark - The most popular packet-sniffer. With this tool, we can see the structure of packets, such as protocol headers.  [Wikipedia]

+worm - A worm is software that infects a computer by itself, without human interaction. In contrast, computer viruses require interaction with a human, such as clicking on a program to run it. Because they are not slowed down by the human element, worms can grow exponentially fast, destabilizing the entire Internet. Worms infecting Unix operating systems were common in the 1990s, worms infecting Windows have been common since.

+X.25 - This was the early vision for cyberspace by the telecom companies; it was essentially the Internet before there was an Internet. Its design was radically different from the modern Internet, using virtual circuits rather than packets as its fundamental unit. [Wikipedia]

+X.200 - Two different standards organizations documented the OSI Model. The ITU-T described it in the X.200 document[137], and the ISO described it in the  ISO/IEC 7498 document. The two documents specify the same standard. Individual standards related to OSI are in the X.2xx series.

+X.509 - An ITU standard for certificates in SSL. [Wikipedia]

+Xerox - This was the major innovator in networking technology around 1970, inventor of Ethernet and many internetwork technologies. It’s famous for having created the precursor to the modern desktop computer in the early 1970s, with networking, a mouse, and a windowing system. This was the inspiration for the Apple Macintosh, and later Microsoft Windows. [Wikipedia]

+XNS - “Xerox Network System”. Xerox invented the local area network with Ethernet, and also invented its own internetwork technology known as XNS. It worked similarly to the Internet’s TCP/IP network stack, with a basic datagram protocol on the bottom (IDP) and a transport protocol on the top (SPP). Had things gone slightly differently, today’s cyberspace might’ve been based upon SPP/IDP instead of TCP/IP. [Wikipedia]

11. Some References

The following is a random assortment of links, mostly dealing with the history of networking. It’s hard to appreciate what OSI means today without understanding what it meant when it was written.

This is not an exhaustive list of references used to create this document. It’s just random notes I encountered along the way.

Discussion of the Bachrach memo criticizing Ethernet.
https://www.reddit.com/r/reddit.com/comments/1xz13/in_1974_xerox_parc_engineers_invented_ethernet/
https://bachrachtechnology.com/wp/the-1974-bachrach-ethernet-memo/

“A Protocol for Packet Network Intercommunication”, by Vint Cerf and Bob Kahn. This paper from 1974 defines the TCP. While it embodies the same goals as the TCP/IP we have today, it’s a completely different implementation. It’s pretty unrecognizable according to how we see networks today. It would later be influenced by Xerox PUP, Cyclades, and Ethernet.
https://www.cs.princeton.edu/courses/archive/fall06/cos561/papers/cerf74.pdf

“History of Computer Communications”, a history that comes from interviewing 80 techies involved in the early networks. An important note is that there’s no single history, as different people tell different stories of what happened. Each eyewitness sees a different version of the events.
https://historyofcomputercommunications.info/section/1.0/introduction/

“Engineering and Technology Wiki”, another source of first-hand accounts with those involved in creating the technology.
https://ethw.org/Main_Page 

This paper from 1980 was written by Hubert Zimmerman, one of the primary authors of the OSI Model. It fills in the OSI Model, describing in practical terms what the official documents only describe vaguely. If you attempt to read the actual standard and want to know what those terms were intended to mean, this document explains a lot of them.
https://cseweb.ucsd.edu/classes/wi01/cse222/papers/zimmerman-osi-itoc80.pdf

Computer History Museum - Interview of Hubert Zimmerman
https://archive.computerhistory.org/resources/access/text/2018/01/102738698-05-01-acc.pdf

Interview with Charles Bachman. One of the key quotes is “OSI movement was more oriented towards business transaction processing and moving files back and forth, but essentially different than the highly interactive use of the Internet today.” It reinforces the point that OSI and the Internet are fundamentally different networks.
https://ethw.org/Oral-History:Charles_Bachman

The 7 Layer Burrito, a humorous take on the model.
https://web.archive.org/web/19990826193318/http://www.europa.com/~dogman/osi/

LAN Standards: A Status Report. This shows the conflicts in making Ethernet fit into a standard, as well as the effort IBM spent trying to kill Ethernet in favor of its own LAN standard.
https://www.sciencedirect.com/science/article/pii/S147466701764401X

The ALOHA System—Another alternative for computer communications. This created a “packet radio” system in Hawaii, where many people would be attempting to transmit on the same radio channel. It had the same problem of preventing collisions when two people tried to transmit at the same time. These are unremarkable radio problems; radio has always suffered from multiple people transmitting on the same channel. It enters our story because some brilliant engineers realized the simple fact that a copper wire is just a radio channel, and hence the same techniques could be applied to prevent collisions on a cable – creating Ethernet.
https://www.clear.rice.edu/comp551/papers/Abramson-Aloha.pdf

IEEE interview with Bob Kahn on birth of internetworking
https://spectrum.ieee.org/bob-kahn 

Cisco OSI Model Reference Chart
https://learningnetwork.cisco.com/s/article/osi-model-reference-chart
This demonstrates all the problems described in this text.

Elements of Networking Style by M.A. Padlipsky
https://amazon.com/dp/0132681293
Written by one of the early ARPANET engineers, contains essays about conflicts with OSI.

Patterns in Network Architecture by John Day
https://amazon.com/dp/0132252422
Strips away any model and looks at architectural issues for networks, going  back to the early ARPANET.

The World In Which IPv6 Was A Good Design
https://apenwarr.ca/log/20170810
Some ideas about IPv4 and IPv6

The Actual OSI Model by J. B. Crawford
https://computer.rip/2021-03-27-the-actual-osi-model.html
Discussion trying to look at the model on its own terms. I don’t think it’s as helpful as this text. For example, you can’t look at the definition of the Data Link Layer and make any sense of it until you understand IBM’s SDLC.

End-to-End Arguments in System Design by Saltzer, Reed, and Clark
https://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf
An early essay about the benefits of end-to-end thinking.

OSI Deprogrammer chat on Hacker News
https://news.ycombinator.com/item?id=38004699
Discussion of this book. You might find helpful comments or rebuttals here.


[1] The main criticism from reviewers is that this document is too dense. It needs to be: it’s telling the experts in the industry that they’ve been wrong for 40 years.

[2] OSI is ultimately just a few minutes of lecture and a few pages in the textbook in a university course. This text is like using an anvil to swat a fly.

[3] It’s not simply false, but the fact that teachers know it’s false and teach it anyway. This makes it a lie.

[4] They justify the lies by claiming it helps explain concepts to students – but it’s not helpful.

[5] Professors don’t understand what OSI meant specifically, so they assume OSI was talking about theoretical generalities.

[6] Framework, ontology, taxonomy – any of these words would work in this context. It’s something where we can pigeonhole concepts/terminology into boxes.

[7] People conceptualize, or build a model in their head, about how things work. No such model is accurate, but even inaccurate models can be helpful. The thesis of this text is that other models are more helpful than the OSI Model.

[8] Firstly, nobody follows the standard. But more importantly, nobody really understands what was originally meant, so their explanations of what the model says don’t agree. At least if every textbook or professor provided the same explanation, then it’d arguably be a standard, but they don’t.

[9] Throughout the 1960s and 1970s, IBM monopolized the computer industry with big powerful computers called “mainframes” that took up entire rooms, accessed remotely in other parts of the office with terminals.

[10] Using the command-line or ssh doesn’t count. This sort of experience only helps a little.

[11] This is probably the biggest hurdle to overcome in this text: people assume everything they don’t understand must be some sort of theory. You can’t prove it’s not theory without first teaching them what it originally meant, which leads to this long, dense text trying to teach what’s really useless information since networks don’t work that way.

[12] https://en.wikipedia.org/wiki/Videotex 

[13] https://en.wikipedia.org/wiki/Nostradamus 

[14] Today’s orthodoxy is that Ethernet matches only the first two layers, below the Network. In fact, it implements (with LLC) the first 4 layers of the OSI Model, as originally envisioned.

[15] To repeat: the Internet isn’t the Network layer, but the Internetwork layer. Since such a layer doesn’t exist in OSI, we pretend the Internetwork and Network layers are the same.

[16] https://www.itu.int/rec/T-REC-X.200-199407-I/en 

[17] (ISC)² CISSP Official Study Guide, 8th Ed. - https://www.amazon.com/dp/1119475937 

[18] An epiphany that, as we teach later in this text, is probably a misconception.

[19] This term was coined by Thomas S Kuhn in his book The Structure of Scientific Revolutions (1962). His text applies to what’s going on here, describing how people resist changes in the prevailing paradigm.

[20] SNA, X.25, XNS, IPX, Vines, NetBIOS in all its forms, SunRPC, DCE-RPC, all of SMBv1, and so on. In the section below is a post from the 1990s showing which protocols the Sniffer could handle – I wrote code for each and every one. I also drew that poster.

[21] OSI Model - https://en.wikipedia.org/wiki/OSI_model 

[22] The model that we know today was largely complete by 1977, at which time it became unofficially adopted as a way of describing networks. It wouldn’t become a final, official standard until 1984. I give the number 1977 as its origin date because that’s when I find academics starting to cite it.

[23] ARPANET split TCP and IP into separate protocols around 1979, based upon the ideas of Xerox protocols and Cyclades network. Before that, the design of ARPANET looked more like OSI.

[24] https://en.wikipedia.org/wiki/Connectionless-mode_Network_Service 

[25] By fully implemented I mean fully interoperable among vendors. At most, there was some compatibility with routers, like Cisco’s support of CLNP, because they had to work with end-systems. But for the upper layers, it’s hard finding any interoperable applications.

[26] Computer Networks and Internets (6th ed). - https://amazon.com/dp/0133587932 

[27] …namely, that these aren’t layers of a single stack working together, but that there are multiple networks independent of each other.

[28] I was “Chief Architect” for the company at the time and am the primary author for that poster.

[29] This is a rather recursive definition: the Session Layer handles sessions, and the definition of a session is something handled by the Session Layer. This is why nobody understands it, because they are left wondering what is really meant here.

[30] This sort of definition has left a lot of people confused, interpreting this to mean the Session Layer manages connections, where connections are actually in the lower Transport Layer.

[31] That it’s been retconned to mean a “local network” in addition to a “local link” is the most profound change in the model between how it was defined in the 1970s and how it’s used today. Originally, Ethernet switches/bridges were considered part of the network layer, forwarding packets along a route. 

[32] An antenna is a wire.

[33] Yes, the ITU X.200 and ISO/IEC 7498 documents are a written standard for OSI, but nobody uses them. And they don’t match the consensus view anyway.

[34] “What is the OSI Model?”
https://www.cloudflare.com/learning/ddos/glossary/open-systems-interconnection-model-osi/ 

[35] Of course, many disagree, and believe it has been useful in describing some things, but this is almost always a misconception. The chapter on Misconceptions lists all the reasons why people think OSI is helpful and why they are wrong.

[36] …or work alongside of …

[37] IPv6 was created over 20 years after Ethernet was invented, but still works on the oldest Internet technologies. You didn’t need to upgrade Ethernet equipment to use IPv6.

[38] https://www.rfc-editor.org/rfc/rfc1149 - A Standard for the Transmission of IP Datagrams on Avian Carriers

[39] How signals are transmitted (and received) on the wire is more complex than the rest of networking put together.

[40] When IPv6 was defined, TCP had to also be redefined. The necessary change for TCPv6 was small, changing how the checksum is calculated, but still reflects that these protocols are dependent on each other rather than completely independent.

[41] LLC is a sublayer of Ethernet that exists for other reasons, but isn’t used when carrying Internet traffic, so is often ignored.

[42] Other than as either a historic footnote, or the fact that students will see it mentioned elsewhere.

[43] As the Misconceptions chapter describes, the OSI Session Layer #5 doesn’t even handle sessions – those are handled by the Association Control Service Element or ACSE in the Application Layer #7.

[44] As described in the Misconceptions chapter, encoding/representation is a property of the data and not of the network.

[45] Telnet and SSH still support negotiating terminals, and support hundreds of different terminal types. In practice, everyone just uses an ANSI/VT100 terminal emulator. A lot of command-line programs assume that terminal and won’t work with any other – in theory they could be written to work with any terminal, but aren’t.

[46] The distinction between a network and a link is something implicit in the OSI Model, between the Network Layer #3 and Data Link Layer #2. The problem is that it fails as a model. Ethernet is simultaneously a link and a network. Links can run over a network as much as networks run over links. The distinction between networks and links is therefore very important, just not what OSI describes.

[47] After the invention of the transistor in the 1950s, the telecoms created the T carrier system, a network that forwards traffic as a stream. A phone call starts a stream that continues until you hang-up. The HDMI standard likewise transmits data to the TV/monitor as a stream of bits. The alternative standard, DisplayPort, is packet based. Everything else is packets. The original 2G cellphone standard was a stream of bits, then updated with 2.5G GPRS packets. Every mobile network from 3G onward has used packets instead.

[48] IBM SNA, the inspiration for OSI, called the network layer the Path Control Layer. I sometimes think that’s a better description.

[49] RFC 1055 - SLIP (Serial Line IP) - https://www.rfc-editor.org/rfc/rfc1055

[50] This is the average size. Internet addresses have a variable prefix length, from fewer than 8 bits to more than 24 bits, depending upon the address. Addresses are written with the prefix length appended, so 10.0.0.0/8 means the first 8 bits are the prefix and 192.168.0.0/16 means the first 16 bits are the prefix.
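
As a quick illustration, a sketch using Python’s standard ipaddress module (the two networks are the arbitrary examples above):

    import ipaddress

    net8 = ipaddress.ip_network("10.0.0.0/8")       # first 8 bits are the prefix
    net16 = ipaddress.ip_network("192.168.0.0/16")  # first 16 bits are the prefix

    print(ipaddress.ip_address("10.200.1.1") in net8)     # True
    print(ipaddress.ip_address("192.168.55.7") in net16)  # True
    print(net8.prefixlen, net16.prefixlen)                # 8 16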

[51] You really only see LLC in raw WiFi packets.

[52] https://datatracker.ietf.org/doc/html/rfc791 

[53] The IEEE also defined 802.4 Token Bus and 802.5 Token Ring that fit this model, but they are historical footnotes. Today, all we care about is Ethernet and WiFi.

[54] LoRa stands for “long range”. It operates in the unlicensed 900 MHz band at speeds below 0.3 Mbps. https://en.wikipedia.org/wiki/LoRa

[55] https://en.wikipedia.org/wiki/Amazon_Sidewalk 

[56] Often historically – today’s email is primarily Internet based, but you can still get email to non-Internet nodes.

[57] This is only technically true. The claim counts GPUs (graphics processors) and TPUs (tensor AI processors) in the total count of operations-per-second. But even so, it’s practically true that in almost any app, more of the computations are performed on the phone than on the server.

[58] Like Novell NetWare, Microsoft LanMan, and Unix NFS

[59] There are two forms of security. The first is the rules we define for what people can do, like requiring passwords or encrypting data. The second is what hackers do when bypassing the rules, such as buffer-overflows. In the above text we mean only the first: authenticating users and encrypting data sent over the network. A secure website using HTTPS means people can’t eavesdrop on data sent to the website across the network, like credit card numbers. It doesn’t mean that the website itself won’t commit fraud and misuse your credit card number.

[60] I arrive at this number informally, having worked with companies around the world analyzing their traffic. I haven’t done a formal, systematic survey.

[61] They no longer do. The effect has become so famous that those editing articles now try to prevent it from happening.

[62] https://en.wikipedia.org/wiki/Wikipedia:Getting_to_Philosophy 

[63] https://en.wikipedia.org/wiki/Modulation 

[64] https://en.wikipedia.org/wiki/Telegrapher%27s_equations 

[65] Like LLC or TCP

[66] The Associated Press is a wire service that has been transmitting news articles since the era of telegraph technology in the 1800s.

[67] This process is called Google dorking – using Google features, like the site: and filetype: search operators, to look for things that people might not be aware they’ve made public.

[68] DHCP is the Dynamic Host Configuration Protocol. When you connect to a network, it’s the most common way your machine gets assigned an IP address.

[69] To be fair, this is what happens with most subjects. Education is often incomplete, so people fill in the blanks with assumptions and reverse-engineering. At the limits of everybody’s knowledge will be misconceptions.

[70] (ISC)² CISSP Official Study Guide, 8th Ed. - https://www.amazon.com/dp/1119475937 

[71] https://en.wikipedia.org/wiki/OSI_model 

[72] Not you, of course, I’m referring to other people who appear not to understand what these mean.

[73] https://www.rfc-editor.org/rfc/rfc1034 - DNS - Concepts and Facilities

[74] https://www.rfc-editor.org/rfc/rfc1035 - DNS - Implementation and Specification

[75] (ISC)² CISSP Official Study Guide, 8th Ed. - https://www.amazon.com/dp/1119475937 

[76] IBM mainframe networks and the telecoms’ X.25 networks

[77] You should try this. Pull out the CISSP study guide, or even the original X.200 document, and go to your professor’s office and ask them to explain “network service optimization”. Sure, you’ll fail the course, but it’ll be fun.

[78] Getting To Philosophy no longer works, because Wikipedia editors are now aware of the phenomenon and deliberately disrupt it. But it demonstrates the principle: we like to place knowledge in categories.

[79] A large number of tests ask students to answer that the Session Layer manages sessions. It’s the easiest possible question to get right, though ironically, the expected answer is wrong. It’s what the test giver believes is true, but according to the original OSI it’s false, because sessions as we know them didn’t really exist back then.

[80]  (ISC)² CISSP Official Study Guide, 8th Ed. - https://www.amazon.com/dp/1119475937 

[81] Layer #2 was created to describe SDLC, so you can spend an hour on it and students will still be confused, because SDLC doesn’t exist on the modern Internet. It’s just one example of how trying to teach the original meaning only confuses students.

[82] Specifically, Ethernet forwards packets based upon the full Ethernet address, while the Internet forwards packets using only part of the IP address (the subnet prefix).
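
A toy sketch of the difference, in Python. The tables here are hypothetical and nothing like real switch or router internals; they only show exact-match versus longest-prefix-match lookup:

    import ipaddress

    # An Ethernet switch: exact match on the complete 48-bit address.
    mac_table = {"3c:22:fb:90:12:34": "port1",
                 "a4:83:e7:00:56:78": "port2"}

    def ethernet_forward(dst_mac: str) -> str:
        return mac_table[dst_mac]

    # An Internet router: the longest matching prefix of the IP address wins.
    # (Real routers also have a default route for addresses matching nothing.)
    routes = {ipaddress.ip_network("10.0.0.0/8"): "slow-link",
              ipaddress.ip_network("10.1.0.0/16"): "fast-link"}

    def internet_forward(dst_ip: str) -> str:
        addr = ipaddress.ip_address(dst_ip)
        best = max((net for net in routes if addr in net),
                   key=lambda net: net.prefixlen)
        return routes[best]

    print(internet_forward("10.1.2.3"))   # fast-link, via the more-specific /16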

[83] https://www.rfc-editor.org/rfc/rfc7241

[84] https://www.rfc-editor.org/rfc/rfc791 

[85] https://www.rfc-editor.org/rfc/rfc1149 

[86] MPLS stands for Multi-Protocol Label Switching, https://en.wikipedia.org/wiki/Multiprotocol_Label_Switching 

[87] Microsoft’s RPC is a fundamental part of the Windows operating system, both for internal applications on a single Windows system, as well as for building applications that cross multiple systems.

[88] https://en.wikipedia.org/wiki/Mechanical_floor 

[89] An IDS or intrusion detection system eavesdrops on a network wire, like a packet-sniffer, but searches for signs of hacker activity.

[90] …or fragmentation, compression, encryption, or other transformation

[91] ITU X.200 - https://www.itu.int/rec/T-REC-X.200-199407-I/en

[92] ISO/IEC 7498 - https://www.ecma-international.org/wp-content/uploads/s020269e.pdf

[93] X.200 - https://www.itu.int/rec/T-REC-X.200-199407-I 

[94] ISO/IEC 7498 - https://www.ecma-international.org/wp-content/uploads/s020269e.pdf 

[95] https://www.itu.int/rec/T-REC-X.800-199103-I 

[96] July 22 is the official Pi Day, not March 14: 22/7 is a closer approximation of π than 3.14.

[97] https://www.jpl.nasa.gov/edu/news/2016/3/16/how-many-decimals-of-pi-do-we-really-need/ 

[98] https://www.rfc-editor.org/rfc/rfc793 

[99] We all use the standard DEC VT100 control-codes on remote terminals for things like SSH today; they are now hard-coded into applications. But there are still libraries like curses and features in SSH to negotiate the use of different control-codes. If you want to connect to an IBM mainframe instead, you’ll likely use an entirely different program that emulates IBM tn3270 terminals.
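
To see what “hard-coded” looks like in practice, here are a few of the VT100/ANSI sequences themselves (a Python sketch; the escape sequences are standard, the message is made up):

    CSI = "\x1b["                          # Control Sequence Introducer (ESC [)
    print(CSI + "2J" + CSI + "H", end="")  # clear the screen, cursor to top-left
    print(CSI + "31m" + "error: cable unplugged" + CSI + "0m")  # red text, then reset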

[100] This is tautological, but I stress it because of the implications. Today, all software and hardware is designed with networking in mind. Back in 1970, it was all designed with a different assumption: that there would be no network, no transfer of the data off the computer.

[101] https://www.itu.int/rec/T-REC-X.216-199407-I/en 

[102] https://www.itu.int/rec/dologin_pub.asp?lang=f&id=T-REC-X.226-199407-I!!PDF-E&type=items 

[103]  My theory is that tests contain gimme questions so students can at least get one thing right.

[104] So much confusion could’ve been avoided, and so much more fun could’ve been had, if only they’d named it the Intercourse Layer.

[105] OSI tries to teach that layers are independent of each other, but in fact, all the layers are intertwined. Here we see layer #5 dealing with problems that should’ve been contained to layer #2.

[106] Internet, Ethernet, web, MPLS, Tor, VPN, etc.

[107] https://www.ietf.org/rfc/rfc787.html 

[108] SDLC stands for Synchronous Data Link Control

[109] X.200 - https://www.itu.int/rec/T-REC-X.200-199407-I 

[110] ISO/IEC 7498 - https://www.ecma-international.org/wp-content/uploads/s020269e.pdf 

[111] Not until the invention of the transistor and PCM encoding of voice traffic into a 64-kilobit-per-second stream (8,000 samples per second × 8 bits per sample) did it become possible for data to carry voice, around the year 1960.

[112] Much of Europe and Asia had state-run PTTs, but the United States had AT&T, a private company, as well as many competitors to AT&T. Thus, I use the word telecoms to refer to both PTTs and private companies like AT&T.

[113] https://www.rand.org/content/dam/rand/pubs/research_memoranda/2006/RM3420.pdf

[114] SNA stands for Systems Network Architecture. No, it’s not a helpful name.

[115] Actually, it was based upon Honeywell’s competing version of SNA that they called HDNA, but this distinction isn’t terribly important.

[116] For example, RISC processors were invented by IBM, with its 801 processor. IBM didn’t do anything with it, as it was mired in corporate politics, but it was the inspiration for both the Stanford and Berkeley RISC processors.

[117] From a certain point of view, it wouldn’t make sense to make something that didn’t work with IBM’s mainframes. But from another point of view, mainframes were already obsolete. The Internet’s TCP/IP was completely incompatible with IBM and still eventually won to become the standard.

[118] SDLC stands for Synchronous Data Link Control. These words don’t really mean much.

[119] No, this isn’t the OSI Session Layer #5.

[120] Though many were still billed for the circuit, just a lot less, with the primary costs coming from data sent.

[121] https://computerhistory.org/blog/net-50-did-engelbart-s-mother-of-all-demos-launch-the-connected-world/

[122] PARC is the Palo Alto Research Center, the Xerox offices where these technologies were being developed.

[123] https://www.cs.princeton.edu/courses/archive/fall06/cos561/papers/cerf74.pdf 

[124] Or symbols containing multiple bits.

[125] Fiber optics is a wire. An antenna is a wire.

[126] This is insane, by the way. Go back in time and tell the original USB engineers we’d be getting 240 watts across USB, and they’d assume that by 2020 we’d also have figured out how to go faster than the speed of light.

[127] TCP/IP wouldn’t appear as we know it until around 1979, but large groups opposed it nonetheless because it was sponsored by the American military.

[128] RS-232 - https://en.wikipedia.org/wiki/RS-232, pay particular attention to “3-wire RS-232”.

[129] Token Ring was a trick by IBM to provide a standard that only it (primarily) supplied. Thus, the IEEE had the 802.3 standard for Ethernet and the 802.5 standard for Token Ring. There was also an 802.4 Token Bus standard, but nobody used that.

[130] Famously, MS-DOS and Windows computers would boot from the drive labeled C:, and when they mounted a remote drive, it would appear locally with a letter like F:.

[131] In 2017, the Internet partially failed from the WannaCry and NotPetya worms, which exploited a bug in SMB dating back to the early 1990s.

[132] Made even more offensive by the fact that it doesn’t help.

[133] Made especially confusing by the fact that the original IPv4 protocol violated the original OSI Network Layer.

[134] Whereas academics focus on theory, professionals focus on practice. It’s a flaw for each that they don’t spend enough time on the other.

[135] Certified Information Systems Security Professional, sponsored by (ISC)².

[136] OSI Model - https://www.ecma-international.org/wp-content/uploads/s020269e.pdf 

[137] OSI Model - https://www.itu.int/rec/T-REC-X.200-199407-I