Open Virtual Network (OVN), together with Open vSwitch (OVS), is a software-defined networking (SDN) solution that can be accelerated in hardware. It provides an abstraction for many important virtual network elements and operates at the layers below the CMS, making it a platform-agnostic integration point.
In this article, we will take a close look at OVN, an important part of Canonical’s networking strategy. We’ll explain the gap that OVN fills, its key features and strengths—especially how it benefits from specialised hardware—and its overall architecture. We will also cover how Canonical integrates OVN with the rest of its portfolio.
Motivation
Ever since the limits of physical reality shifted CPU manufacturers' competition from clock speed to core count in the mid-2000s, the industry has been hard at work figuring out how best to harness this new form of CPU performance growth.
Around the same time, public clouds entered the market and hardware-assisted virtualisation appeared in off-the-shelf hardware.
Fast-forward to today and the phone, laptop or workstation you're reading this article on has a CPU with so many cores that advanced process management is required to utilise them fully.
The most prevalent technique for software to harness the power of multicore CPUs remains horizontal scaling, which is implemented by running multiple copies of a software component, typically encapsulated in containers and virtual machines (VMs), across one or more physical machines.
With that in mind, it becomes clear that the endpoint of the network is no longer a physical piece of equipment: rather it is the hundreds of VMs and thousands of containers running on the physical equipment, each of which has individual network service and policy requirements.
But how do you effectively manage network service and policy for thousands of entities across one or more physical machines? This is where OVN comes into play.
Managing the numerous containers and VMs running on every machine requires management software such as Kubernetes, LXD/MicroCloud and OpenStack. Collectively referred to as “Cloud Management Software (CMS)”, what these solutions have in common is that they can delegate implementation of the networking for containers and VMs to OVN.
OVN features
OVN makes use of OVS, a production-quality multilayer switch platform that opens the forwarding functions to programmatic extension and control. OVS in turn provides integration down to hardware acceleration through its support for multiple data path providers and flow APIs: SwitchDev/TC at the kernel level, and eBPF/XDP and DPDK rte_flow in userspace.
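To make this concrete, here is a minimal sketch (not part of the article's own tooling) of how hardware offload is typically switched on in OVS on a transport node. It assumes the ovs-vsctl CLI is available and that the NIC driver supports TC flower offload; ovs-vswitchd normally needs a restart afterwards for the change to take effect.

```python
"""Minimal sketch: enable OVS hardware offload on a transport node.

Assumes ovs-vsctl is installed and the NIC driver supports TC flower
offload; ovs-vswitchd normally needs a restart afterwards (not shown).
"""
import subprocess


def vsctl(*args: str) -> str:
    """Run an ovs-vsctl command and return its trimmed output."""
    result = subprocess.run(
        ["ovs-vsctl", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


# Ask OVS to offload datapath flows to the NIC via TC.
vsctl("set", "Open_vSwitch", ".", "other_config:hw-offload=true")

# Read the setting back from the OVS database to confirm it was recorded.
print(vsctl("get", "Open_vSwitch", ".", "other_config:hw-offload"))
```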
Hardware accelerated data path features include:
- Access Control Lists (ACLs)
- Layer 2 Switching
- Layer 3 Routing, including policy-based and ECMP routing
- Layer 4 Load Balancing
- NAT
- GENEVE and VxLAN encapsulation
Distributed control plane features include:
- ARP/ND
- Bidirectional Forwarding Detection (BFD)
- Control Plane Protection (CoPP)
- DHCP
- Instance DNS resolution
- IPv6 Prefix Delegation
- IPv6 RA
- MAC Learning
OVN hardware acceleration
With the expanding amount of data each entity has to store and process, the demand on the network grows relentlessly. 100, 200 and 400 Gbps ports are commonplace. 800 Gbps ports are just around the corner, and within a few years we will see 1.6 Tbps ports being deployed in every data centre. Concomitantly, the number of CPU cores we can fit in a single server is steadily increasing (see previous figure) to withstand the growing number of virtual machines and containers.
This poses considerable challenges to server networking capabilities, which can be solved by hardware acceleration.
To get a grasp of this scaling issue, let's first look at how the time window for processing packets on the CPU shrinks as the bit rate increases. The theoretical time available to process a complete standard Ethernet frame of 1500 bytes follows an inverse linear relationship:
$$t_{\mathrm{processing}} = \frac{S}{R} = \frac{1500 \times 8}{R} = \frac{12000}{R}$$

where S is the size of the Ethernet frame in bits and R is the data rate in bits per second.
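As a quick worked example (a sketch using only the formula above, not code from the article), the snippet below prints the per-frame time budget at a few common port speeds.

```python
# Per-frame processing budget t_processing = 12000 / R for a 1500-byte frame.
FRAME_BITS = 1500 * 8  # standard Ethernet frame size, in bits

for gbps in (10, 25, 100, 400):
    rate_bps = gbps * 1e9                     # data rate R, in bits per second
    budget_ns = FRAME_BITS / rate_bps * 1e9   # time per frame, in nanoseconds
    print(f"{gbps:>4} Gbps -> {budget_ns:7.1f} ns per frame")
```

At 100 Gbps the budget is already down to 120 ns per full-size frame, and it is smaller still for shorter packets.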
This theoretical time available to process each standard Ethernet frame can be compared with a single core's clock speed and the CPU load incurred per frame. According to Raumer et al., the total number of CPU cycles required to process each frame is fairly stable across packet sizes, and they find a value of 2203 cycles for a standard Ethernet frame.
We can thus derive the maximum number of nanoseconds available to process each packet at a given processor frequency, again governed by an inverse linear relationship:
$$t_{\mathrm{available}} = \frac{C}{f} = \frac{2203}{f}$$

where C is the number of required CPU cycles and f is the core frequency in GHz.
Considering both relations, and keeping in mind that processor frequencies plateau around 4 GHz, the authors found that 40 Gbps is the practical maximum for processing packets entirely in software on a single, fully dedicated core. Even with the latest processors offering dozens of cores, servers cannot process traffic at wire speed on the highest available network port speeds. There are ongoing efforts to further optimise network processing in software, such as DPDK and AF_XDP, but software alone cannot scale as fast as port speeds.
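The back-of-the-envelope sketch below combines the two relations (using the 2203-cycle figure and an assumed 4 GHz core) to show where a single core's time budget runs out. It is a deliberately naive model; the crossover it produces lands in the tens of Gbps, the same order of magnitude as the practical limit the authors report.

```python
# Compare the per-frame time budget with the time a single core needs per
# frame, using the two relations above.
CYCLES_PER_FRAME = 2203   # per-frame CPU cost reported by Raumer et al.
FRAME_BITS = 1500 * 8     # standard Ethernet frame size, in bits
CORE_GHZ = 4.0            # assumed core frequency, near the practical plateau

needed_ns = CYCLES_PER_FRAME / CORE_GHZ  # time one core spends on each frame

for gbps in (10, 25, 100, 400):
    budget_ns = FRAME_BITS / (gbps * 1e9) * 1e9
    verdict = "keeps up" if budget_ns >= needed_ns else "falls behind"
    print(f"{gbps:>4} Gbps: budget {budget_ns:6.1f} ns vs needed "
          f"{needed_ns:5.1f} ns -> single core {verdict}")
```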
With this development in mind, one of the most important values OVN/OVS provide is a data path implementation that can be fully accelerated by the NIC, enabling packets to appear directly in the virtual machine or container endpoint. This allows precious CPU time to be spent on your applications.
Hardware acceleration for the data path features listed in the "OVN features" section above is in use in production environments today; supported hardware includes the NVIDIA ConnectX-6 Dx.
OVN hardware offload
Taking hardware acceleration one step further, it is also possible to offload the control plane software onto dedicated infrastructure or data processing units (IPUs and DPUs, respectively), embodied in the latest generation of smart NICs such as the NVIDIA BlueField, Intel IPU E2100 or AMD Pensando Elba. In addition to the performance improvement from freeing up precious CPU cycles for business workloads, this also gives administrators the capability to completely isolate control of the network from the hosted workloads, which in turn simplifies integration and increases security. OVN running on the smart NIC can manage routing of the highest complexity and scale, with each individual workload only seeing its own simple routing table.
OVN architecture and fundamentals
OVN is a distributed system composed of several components. They are detailed from top to bottom in the figure below:
- The OVN northbound database (NB) keeps an intermediate representation of the network configuration received from the CMS through the OVN integration API. Its database schema directly supports the same networking concepts as the CMS, such as logical switches and routers, and access control lists (ACLs).
- A translation service, ovn-northd, converts the logical network configuration stored in the NB into logical data path flows.
- The OVN southbound database (SB) stores these logical data path flows in its logical network (LN) tables. Its physical network (PN) tables contain reachability information discovered by an agent running on each transport node. The Binding tables link the logical network components’ locations to the physical network.
- The agent running on each transport node is ovn-controller. It connects to the SB to discover the global OVN configuration and status from its LN and populates the PN and Binding tables with local information.
- Each transport node also hosts an Open vSwitch bridge dedicated to OVN’s use, called the integration bridge. The ovn-controller connects to the local ovs-vswitchd, as an OpenFlow controller, to control the network traffic and to the local ovsdb-server to monitor and control OVS configuration.
More details on the OVN components can be found in the upstream OVN architecture documentation.
OVN conveniently uses ovsdb-server as its database without introducing any new dependencies.
Each instance of ovn-controller maintains an update-driven connection to the database and always works with a consistent snapshot of its contents. If connectivity is interrupted, ovn-controller catches back up to the latest consistent snapshot of the relevant database contents and processes them.
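To see these components on a running deployment, each layer can be queried with its own CLI. The read-only sketch below is illustrative only; it assumes ovn-nbctl and ovn-sbctl can reach the OVN databases and that ovs-vsctl is run on a transport node.

```python
"""Read-only sketch: inspect the OVN and OVS layers described above."""
import subprocess


def show(cmd: list[str]) -> None:
    """Run a command and print its output under a shell-style header."""
    output = subprocess.run(
        cmd, check=True, capture_output=True, text=True
    ).stdout
    print(f"$ {' '.join(cmd)}\n{output}")


show(["ovn-nbctl", "show"])  # logical switches, routers and ports (NB)
show(["ovn-sbctl", "show"])  # chassis and port bindings (SB)
show(["ovs-vsctl", "show"])  # local OVS bridges, including the integration bridge
```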
Configuration data flow
In the OVN design, network configuration flows southbound, from the CMS to each of the transport nodes. The CMS is responsible for presenting administrators with the configuration user interface. Any change made through it is communicated to the NB via the API; ovn-northd then determines the lower-level details and passes them along to the SB, also updating the desired configuration version number. In turn, the SB changes are communicated to the ovn-controller agents running on the transport nodes, which update the VMs, containers and Open vSwitch configuration accordingly.
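As an illustration of this southbound flow, the sketch below declares a logical switch and port in the NB, then lists the logical flows that ovn-northd derives from them in the SB. The switch and port names are hypothetical, and a working OVN deployment with ovn-nbctl and ovn-sbctl is assumed.

```python
import subprocess


def run(cmd: list[str]) -> str:
    """Run a command and return its output."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


# 1. A CMS-style change: declare a logical switch and a port in the NB.
run(["ovn-nbctl", "ls-add", "demo-switch"])                # hypothetical name
run(["ovn-nbctl", "lsp-add", "demo-switch", "demo-port"])  # hypothetical name
run(["ovn-nbctl", "lsp-set-addresses", "demo-port", "00:00:00:00:00:01 10.0.0.11"])

# 2. ovn-northd translates the NB contents into logical flows stored in the SB.
print(run(["ovn-sbctl", "lflow-list", "demo-switch"]))
```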
Status information flow
Conversely, status information flows northbound, from the transport nodes to the CMS. Each ovn-controller agent running on a transport node updates its configuration version number in the SB to reflect that a requested configuration change has been committed. ovn-northd monitors the configuration version number of each transport node in the SB and copies the minimum value into the NB to reflect the progress of changes. Any NB observer, including the CMS, can therefore find out when all participating transport nodes have applied the updated configuration.
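A convenient way to observe this mechanism is to ask the NB to wait until the hypervisors have caught up, as the sketch below does; it assumes direct access to the NB database and relies on the version counters described above, which are stored in the NB_Global table.

```python
import subprocess


def nbctl(*args: str) -> str:
    """Run an ovn-nbctl command and return its output."""
    return subprocess.run(
        ["ovn-nbctl", *args], check=True, capture_output=True, text=True
    ).stdout


# Block until the hypervisors report that they have applied the latest NB
# configuration (i.e. the hypervisor counter has caught up with nb_cfg).
nbctl("--wait=hv", "sync")

# Show the version counters used to track propagation.
print(nbctl("--columns=nb_cfg,sb_cfg,hv_cfg", "list", "NB_Global"))
```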
OVN and Canonical products
OVN can serve as the SDN of several CMS, including OpenStack, LXD and Kubernetes. Canonical bundles it with Charmed OpenStack, Sunbeam and MicroCloud, simplifying their deployment and initial configuration. Moreover, Canonical Kubernetes can also leverage OVN through CNI (Container Network Interface, the CNCF specification for container connectivity) plugins that implement it, such as KubeOVN or OVN-Kubernetes, thus providing a familiar architecture and command-line interface. The latter benefits from all of the hardware acceleration and offloading capabilities and is maintained by the OVN organisation itself.
While it is theoretically possible for multiple CMS to share a single distributed OVN database, this should be avoided. Each CMS has a unique, potentially incompatible, view of its network requirements, and with each of them writing its network configuration details directly into a single network model without any arbitration layer, the result will be inconsistencies and unwanted overlaps. Instead, each OVN deployment must be kept dedicated to its respective CMS. Consistent routing information sharing and policy enforcement between them can be achieved using industry-standard routing protocols, such as the Border Gateway Protocol (BGP). In fact, the best practice is increasingly to leverage BGP as the data-centre underlay, thus benefiting from the reliability, scalability and security attributes of the “routing protocol of the Internet”. But that will be the topic of a future blog post…
Summary
Canonical OVN provides easy-to-use, reliable and predictable software-defined networking. This SDN abstracts the details of the data centre network infrastructure and seamlessly interconnects your workloads regardless of the underlying CMS. With built-in support for secure tenant isolation and enhanced performance through hardware acceleration and offloading, it is a cornerstone of Canonical’s networking vision. Stay tuned for future content, including white papers that will examine OVN internals and operations in more detail. We’ll cover topics such as how to run OVN on NVIDIA BlueField DPUs and how to use these to apply and manage distributed security controls.