this page last updated: Mon May 19 12:22:31 PDT 2014
Fiber-based networks (which are more reliable, lower latency and higher bandwidth than wire networks) introduced the possibility of implementing dynamic network reconfiguration without sacrificing network stability. In particular, it became possible to use table-driven forwarding rules at the link level.
Before this time, IP defined network routes at layer 3 -- the network routing level. Link level protocols were "stateless" in that they needed only to manage the transfer of data between two fixed end points. The "state" defining the end points didn't change (or it could be rebuilt using a broadcast via ARP). Thus all information pertaining to the route that data must take as it traverses the network was originally confined to layer 3. This information is managed in a per-hop "routing table" that indicates the point-to-point network link that data must take to make its next hop.
With ATM and SONET, however, table-driven link-level protocols made it possible for an abstract "network link" to be implemented as a routed "path" across intermediate link nodes. These original technology-specific protocols eventually informed a standard for such link-level state management called MPLS.
Like most techniques in networking, the idea of manipulating layer 2 network state is not unique. MPLS defines "label based routing" as a methodology for doing a table look up in the switch to determine where a packet is going based on a tag (or label) in the packet. Since it is a layer 2 protocol, it can (and usually does) rewrite the tag before the packet egresses based on the "path" through the layer 2 network that the packet needs to take (the table specifies the output port and the next tag).
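To make the label look up concrete, here is a minimal sketch of a label-swapping table for a single switch. It is not taken from any MPLS implementation: the table contents, the `forward_labeled` name, and the dictionary packet format are all assumptions made for illustration.

```python
# Hypothetical label table for one switch:
# incoming label -> (output port, next label)
LABEL_TABLE = {
    17: (3, 42),
    99: (1, 7),
}

def forward_labeled(packet, table=LABEL_TABLE):
    """Return (output port, rewritten packet), or None if no entry matches."""
    entry = table.get(packet["label"])
    if entry is None:
        return None  # no path has been set up for this label
    out_port, next_label = entry
    # Rewrite the label before the packet egresses; the payload is untouched.
    return out_port, {**packet, "label": next_label}

result = forward_labeled({"label": 17, "payload": b"hello"})
```

The point of the sketch is that the look up is an exact match on a small integer, which is what makes it cheap compared to the layer 3 look up.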
It turns out that "normal" IP network switches implement some of this functionality as well. First, it is important to understand the difference between a router and a switch.
For IP, a router must implement a "longest-prefix-match" algorithm to determine the outbound network interface. IP addresses come in classes and, of course, IP address ranges can be subnetted. Thus the job of the router is to find the "best" (longest) match between prefixes in its routing tables and the destination address. Indeed, the original motivation for MPLS and its predecessors was that longest-prefix-match was too complex to implement in an ASIC. Today's routers use more complex ASICs and do the routing entirely in hardware, but originally, routing was a software activity.
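As a rough illustration of what longest-prefix-match has to do, the sketch below scans a small routing table and keeps the matching entry with the longest prefix. The routes and interface names are made up, and a linear scan is used only for clarity; real routers use specialized data structures or hardware for this.

```python
import ipaddress

# Hypothetical routing table: (prefix, outbound interface)
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),    # default route
    (ipaddress.ip_network("10.0.0.0/8"), "eth1"),
    (ipaddress.ip_network("10.1.0.0/16"), "eth2"),
    (ipaddress.ip_network("10.1.2.0/24"), "eth3"),
]

def longest_prefix_match(dest, routes=ROUTES):
    """Return the interface of the most specific route containing dest."""
    addr = ipaddress.ip_address(dest)
    best = None
    for network, interface in routes:
        if addr in network:
            # Keep the match with the longest prefix seen so far.
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, interface)
    return best[1] if best else None
```

Here an address such as 10.1.2.5 matches all four entries, and the /24 entry wins because it is the most specific.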
A switch does not implement hierarchical addressing. In fact, originally (and for so-called "dumb" switches today) there was no table look up at layer 2. In a switch, a packet that came in on a port was sent unceremoniously to all other ports. Since the switch was stateless, it couldn't "know" which outbound port a packet should use, so it simply sent it to all of them.
Managed switches, however, are smarter (and more scalable) than their dumb counterparts. They pay attention to ARP broadcasts (and, more generally, to the source MAC address of each frame they receive) and record MAC address-port mappings in forwarding tables. Because the MAC address is not hierarchical (it is strictly point to point), a switch need not implement longest-prefix-match and thus the table management is very fast and cheap to implement in hardware.
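The difference between the two kinds of switch can be sketched in a few lines. The class below is a toy model, not any vendor's implementation: it records the port on which each source MAC address was seen, forwards to a single port when the destination is known, and falls back to the "dumb" behavior of sending to all other ports when it is not. The class and method names are assumptions.

```python
class LearningSwitch:
    """Toy model of a switch that learns MAC address-port mappings."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out on."""
        # Learn (or refresh) where the sender lives.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the segment the frame came from
        return [out_port]

sw = LearningSwitch([1, 2, 3])
first = sw.receive(1, "aa", "bb")   # "bb" unknown, so the frame is flooded
reply = sw.receive(2, "bb", "aa")   # "aa" was learned on port 1
second = sw.receive(1, "aa", "bb")  # "bb" was learned on port 2
```

Note that the table look up is an exact match on the full MAC address, which is why no prefix matching is needed.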
Switch tables are managed internally using only local information. That is, because a switch is theoretically between two end points of a link, it need not consult with other switches when it builds its tables (say to avoid routing loops). However, it wasn't long before network architects discovered that manipulating these tables via a control interface could prove useful (say because a spanning tree algorithm was malfunctioning).
OpenFlow is a standard control protocol for managing forwarding table information at layer 2. Instead of allowing the internal switch logic to build and rebuild its tables, OpenFlow specifies that each switch has a controller (one controller can serve many switches) that is responsible for managing the switch's tables.
The advantage of OpenFlow is that the controller is programmable, meaning that it can implement policies that determine what table updates to send to the switches it controls. For example, it is possible to implement Access Control Lists for MAC address forwarding that change dynamically. The model is that a switch will forward packets using its hardware and tables if there is a table match. If there is not, it will contact the controller and ask for a table entry for the packet. Thus, by keeping an ACL and invalidating switch tables when it changes, an OpenFlow controller can implement policy in the layer 2 path.
The disadvantage of this approach is that the controller's logic (at least some portion of it) may need to be executed while the packet is in flight. Unless pre-computed tables are loaded into the switches, the typical interaction is for a switch to contact the controller the first time it sees a packet with a MAC address it doesn't recognize (i.e. for which there is no table entry) to get a table entry. This table initialization adds a performance overhead to the initial packet reception since the controller is a software entity and may be remote from the switch.
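The interaction between a switch and its controller can be sketched as follows. This is a toy model and not the OpenFlow wire protocol or any real controller API: the `Controller` and `Switch` classes, the `table_miss` and `revoke` methods, and the (source MAC, destination MAC) ACL format are all assumptions chosen to illustrate the table-miss round trip and ACL invalidation.

```python
DROP = "drop"

class Controller:
    """Software policy engine; in practice it may be remote from the switch."""

    def __init__(self, allowed_pairs, port_of):
        self.allowed_pairs = set(allowed_pairs)  # ACL of (src MAC, dst MAC)
        self.port_of = dict(port_of)             # dst MAC -> output port
        self.switches = []

    def table_miss(self, src, dst):
        """Decide the table entry for a flow the switch has not seen."""
        if (src, dst) in self.allowed_pairs and dst in self.port_of:
            return self.port_of[dst]
        return DROP

    def revoke(self, src, dst):
        """Change the ACL and invalidate any stale switch table entries."""
        self.allowed_pairs.discard((src, dst))
        for sw in self.switches:
            sw.flow_table.pop((src, dst), None)

class Switch:
    def __init__(self, controller):
        self.controller = controller
        self.flow_table = {}       # (src MAC, dst MAC) -> port or DROP
        self.controller_trips = 0  # counts the slow-path round trips
        controller.switches.append(self)

    def forward(self, src, dst):
        key = (src, dst)
        if key not in self.flow_table:
            # Table miss: only this packet pays for the controller round trip.
            self.controller_trips += 1
            self.flow_table[key] = self.controller.table_miss(src, dst)
        return self.flow_table[key]

ctl = Controller({("aa", "bb")}, {"bb": 2})
sw = Switch(ctl)
sw.forward("aa", "bb")   # miss: asks the controller, installs the entry
sw.forward("aa", "bb")   # hit: forwarded from the table
ctl.revoke("aa", "bb")   # policy change invalidates the entry
after = sw.forward("aa", "bb")
```

In this model the second packet of a flow never reaches the controller, which is why the overhead falls on the initial packet, and the revocation forces the next packet back through the policy check.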
Another perceived disadvantage of OpenFlow is the possibility for layer 2 network chaos. Switch table entries are no longer strictly based on local information. If all switches are controlled by a single controller, the controller can keep the tables consistent, but in a scaled network setting, controllers may need to agree on policies to prevent loops, partitions, etc. Great care must be taken if this agreement is not transactional.
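One way to picture the consistency problem is to treat the table entries for a single destination as a next-hop map across switches and check whether following them ever revisits a switch. The sketch below is only an illustration of that check; the `find_loop` name and the switch-to-switch map format are assumptions, and a real controller would have to reason about many destinations and concurrent updates.

```python
def find_loop(next_hop, start):
    """Follow next hops from start; return the looping switches, or None.

    next_hop maps a switch name to the next switch for one destination.
    A switch that is absent from the map delivers the packet locally.
    """
    path = []
    seen = {}  # switch -> its position in path
    node = start
    while node in next_hop:
        if node in seen:
            return path[seen[node]:]  # the cycle portion of the path
        seen[node] = len(path)
        path.append(node)
        node = next_hop[node]
    return None

consistent = {"s1": "s2", "s2": "s3"}
inconsistent = {"s1": "s2", "s2": "s3", "s3": "s2"}
```

With the second map, a packet entering at s1 bounces between s2 and s3 forever, which is the kind of outcome two uncoordinated controllers could produce.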
The basis for this latter argument is the notion that once switch behavior can be programmed, it is possible to implement different isolated network architectures over the same physical hardware. To see why this architectural control is important, consider the problem of provisioning a network for a set of VMs in a cloud. For the purposes of illustration, I'll use the AWS nomenclature in the following example.
Recall that when a user creates one or more instances they are assigned to a security group that corresponds to an isolated layer 3 and layer 2 network between the VMs. If there is just one VM, it is the only host in the security group. The security group also specifies firewall rules that describe layer 3 routing into and out of the group. All VMs inside the group communicate as if the network were physically dedicated to them.
Using "normal" networking protocols (i.e. without OpenFlow and/or SDN) creating a security group involves the following steps:
For layer 3, each security group gets its own subnet. The router must allow the cloud to install and remove layer 3 routes corresponding to different subnets as security groups are created and destroyed.
With SDN, however, it is possible to define these functions in terms of a new network architecture. Rather than defining a relationship between security groups, VLANs, and IP routes, the SDN-controlled network can define "roles" for different programmable components (presumably using a combination of OpenFlow and layer 3 route control) in the network.
For example, it is possible to build a layer 3 network between VMs that does not use an intermediate router. Instead, each host becomes an "edge router" that can send packets to a layer 2 network that sets up secure "circuits" between edge routers. In this model, each host hosting a VM maintains a routing table entry for the security group that corresponds to a network or subnet. The layer 2 network, then, accepts commands from the cloud to set up a set of virtual circuits (implemented using switch forwarding tables) between all pairs of hosts participating in a security group. When a VM routes a packet, it goes to the host first (to make sure the VM isn't spoofing its layer 3 address) and then the host forwards the packet to the layer 2 circuit switch.
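To get a feel for what "all pairs of hosts" implies, the sketch below computes the set of circuits the cloud would have to request for one security group. The `circuits_for_group` function and the VM-to-host placement map are hypothetical; the sketch only enumerates host pairs and says nothing about how the switch tables themselves would be programmed.

```python
from itertools import combinations

def circuits_for_group(vm_to_host):
    """Return one (host, host) circuit per pair of hosts in the group."""
    # Several VMs may share a host, so deduplicate hosts first.
    hosts = sorted(set(vm_to_host.values()))
    return list(combinations(hosts, 2))

placement = {"vm1": "hostA", "vm2": "hostB", "vm3": "hostA", "vm4": "hostC"}
circuits = circuits_for_group(placement)
```

A group spread over n hosts needs n(n-1)/2 circuits, and every circuit must be set up or torn down in step with the edge routers' routing table entries as VMs come and go.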
The advantage of this approach is that there need not be a centralized router that is programmed with routing table entries. The disadvantage is that the layer 3 edge routers and the virtual circuit set up and tear down must be coordinated.
However, the proponents of SDN point out that architecturally the cloud is implementing a limited form of software defined networking when it provisions security groups. There is a controller (the cloud) that implements policy (isolated layer 2/layer 3 network and firewall rules) using programmable network devices (hosts for VLANs, router, and firewall device). They reason that these activities are independent of cloud computing (cloud computing is a special case) and thus should be implemented as their own service that the cloud uses.
The scalability of the approach is also an argument that gets made although far less convincingly. Many SDN papers hypothesize a new hierarchical separation of concerns for The Internet. The reasoning is that the current approach relies on a consistent set of layer 2/3 protocols everywhere the Internet is to go. If The Internet were designed as a common set of "core" protocols (say the current IP protocols) and edge routers that can tunnel new protocols through the core, innovations in networking will be possible. It is true that an overlay approach using tunnels is more flexible. The feature of The Internet that is most compelling, however, is its ability to remain stable at a global scale. It is not at all clear that the additional flexibility offered by SDN will scale to Internet sizes.