If you've heard of VMware NSX, you might have heard of a Distributed "this" and a Distributed "that". This post is here to help you understand exactly what "this" and "that" are. In my last post, I dove into the NSX Distributed Firewall. Today, we'll be looking into the NSX Distributed Router and what makes it tick. So without further ado, let's take a look at it together and demystify what this Distributed Logical Router thing really is.
What is a DLR?
The Distributed Logical Router, or DLR, does what its name says: it routes. It can do so using static routes or dynamic routing protocols such as OSPF and BGP, so you can get some fairly complex routing going fairly quickly and integrate DLRs with your physical routers using these protocols. It also has another interesting feature, "Layer 2 bridging", which allows you to take an NSX VXLAN Logical Switch port group and a regular (VLAN) distributed port group and merge the two into a single Layer 2 network. Finally, it offers firewall functionality for its Control Plane component, the "Control VM" (more on this component below).
Knowing all this, let's break down the different parts of a Distributed Logical Router :
– Data plane : The Data plane of the Distributed Router resides in the kernel of the ESXi hosts participating in a given NSX Transport Zone. It runs as part of the NSX VIB software packages installed on those hosts. (P.S. : A "Transport Zone" determines which ESXi clusters Logical Switches and DLRs span.)
– Control Plane : Every time you deploy a Distributed Router, NSX also deploys a "Control VM", which is the control plane for the DLR. You can access it over SSH or from the VM's console, but most of the time you won't need to unless you're troubleshooting an issue.
– Management Plane : The management plane for Distributed Logical Routers is, like every other NSX component, part of the vSphere Web Client. You'll find everything you need to configure and manage your Distributed Router Instances under the NSX Edges tab.
Where does a DLR fit into my architecture?
The recommended use case for Distributed Logical Routers is East-West traffic. East-West traffic is traffic that stays at the same level of your network architecture. In most cases that means traffic that doesn't leave your datacenter, or even your virtual infrastructure. Any traffic that does leave is called North-South traffic, and VMware typically recommends you use Edge Services Gateways for it. The main benefit of using Distributed Routers is the optimized data path that network traffic takes when routed by a DLR. You might have heard of traffic "hairpinning" when a VM needs to talk to another VM on the same host but in a different port group. Normally, in this scenario, the network traffic would need to go all the way up to the physical network to be routed onto the other VM's VLAN. With a DLR though, the traffic only goes from VM1 to the ESXi host's kernel and then straight to VM2. That is a huge benefit for most people and will save you a significant amount of (1) packet transit time and (2) network bandwidth on the physical network infrastructure. Here's a visual example of all that jibber-jabber :
Credit : http://bradhedlund.com/2013/09/27/seven-reasons-vmware-nsx-cisco-ucs-and-nexus/
Definitely keep this in mind when designing your network!
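To put the hairpinning comparison into something you can poke at, here is a minimal Python sketch of the two paths described above. It is a toy model only: the hop labels (esxi-vswitch, physical-switch, esxi-kernel-dlr and so on) are names I made up for illustration, not NSX object names, and the number of physical hops depends entirely on your own topology.

```python
from typing import List


def routed_path(src_vm: str, dst_vm: str, distributed: bool) -> List[str]:
    """Return the ordered hops for inter-subnet traffic between two VMs on one host."""
    if distributed:
        # With a DLR, the host's kernel routes the packet locally.
        return [src_vm, "esxi-kernel-dlr", dst_vm]
    # Without a DLR, the packet hairpins through the physical network.
    return [
        src_vm,
        "esxi-vswitch",
        "physical-switch",
        "physical-router",
        "physical-switch",
        "esxi-vswitch",
        dst_vm,
    ]


def physical_hops(path: List[str]) -> int:
    """Count the hops that consume physical network resources."""
    return sum(1 for hop in path if hop.startswith("physical-"))


if __name__ == "__main__":
    for mode in (False, True):
        path = routed_path("VM1", "VM2", distributed=mode)
        print(f"distributed={mode}: {' -> '.join(path)}")
        print(f"  physical hops: {physical_hops(path)}")
```

The point of the sketch is simply that the distributed path never touches a hop on the physical network, which is where the latency and bandwidth savings come from.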
You can still use DLRs for North-South traffic as well, but you won't necessarily reap the benefits that come with using DLRs and you might even hit a limitation of the DLR that will simply prevent you from doing so. One such limitation is that you can connect an NSX Logical Switch to ONLY one Distributed Logical Router at a time. That might force you to complicate your virtual network design to accommodate this limitation.
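If you keep your network design in a script or an inventory file, the "one DLR per Logical Switch" limitation is easy to check before you hit it in the UI. The following is a rough sketch of such a check. The Topology class and the switch and router names are hypothetical and have nothing to do with the NSX API; it only models the rule stated above.

```python
from typing import Dict


class LogicalSwitchAlreadyAttached(Exception):
    """Raised when a Logical Switch would end up on a second DLR."""


class Topology:
    """Design-time bookkeeping of which DLR each Logical Switch is attached to."""

    def __init__(self) -> None:
        self._switch_to_dlr: Dict[str, str] = {}

    def attach(self, logical_switch: str, dlr: str) -> None:
        current = self._switch_to_dlr.get(logical_switch)
        if current is not None and current != dlr:
            raise LogicalSwitchAlreadyAttached(
                f"{logical_switch} is already connected to {current}"
            )
        self._switch_to_dlr[logical_switch] = dlr

    def dlr_of(self, logical_switch: str) -> str:
        return self._switch_to_dlr[logical_switch]


if __name__ == "__main__":
    design = Topology()
    design.attach("web-ls", "dlr-a")
    try:
        design.attach("web-ls", "dlr-b")
    except LogicalSwitchAlreadyAttached as err:
        print(f"Design problem: {err}")
```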
To better understand the different use cases for DLRs and other great architectural information about NSX, I strongly suggest you take a look at this awesome document : Architecting a VMware NSX® Solution for Service Providers from the vCloud Architecture Toolkit for Service Providers.
Finally, here's an image that demonstrates the concepts I just spoke of. It'll give you an idea of where DLRs fit into your network architecture :
How does it all work?
Creating a Distributed Logical Router is very similar to creating an Edge Gateway; the same wizard is actually used for both. Once created, all you have is basically a "virtual instance" of a router that exists within the kernel of one or many ESXi hosts, within a given NSX Transport Zone. You can add logical interfaces (LIFs) to a DLR, connect those interfaces to networks and assign IPs to those LIFs, just like you would on any other router. Keep in mind, there are 2 types of LIFs : Uplink (for North-South traffic, exiting the DLR) and Internal (for East-West traffic, between components directly connected to the DLR).
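To make the LIF idea a bit more concrete, here is a small Python sketch of a router object with Uplink and Internal LIFs. Everything in it (the class names, the LIF names, the IP addresses) is an assumption made for illustration. It is a mental model of "interfaces with a type, a network and an IP", not how NSX stores or exposes LIFs.

```python
import ipaddress
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class LifType(Enum):
    UPLINK = "uplink"      # North-South traffic, exiting the DLR
    INTERNAL = "internal"  # East-West traffic, between connected networks


@dataclass(frozen=True)
class Lif:
    name: str
    lif_type: LifType
    network: str  # name of the connected Logical Switch or port group
    address: ipaddress.IPv4Interface


@dataclass
class DistributedRouter:
    name: str
    lifs: List[Lif] = field(default_factory=list)

    def add_lif(self, name: str, lif_type: LifType, network: str, cidr: str) -> Lif:
        address = ipaddress.IPv4Interface(cidr)
        # Like on any router, two interfaces cannot sit in overlapping subnets.
        for existing in self.lifs:
            if existing.address.network.overlaps(address.network):
                raise ValueError(f"{cidr} overlaps with LIF {existing.name}")
        lif = Lif(name, lif_type, network, address)
        self.lifs.append(lif)
        return lif

    def lif_for(self, destination: str) -> Optional[Lif]:
        """Return the LIF whose directly connected subnet contains the destination."""
        ip = ipaddress.IPv4Address(destination)
        for lif in self.lifs:
            if ip in lif.address.network:
                return lif
        return None


if __name__ == "__main__":
    dlr = DistributedRouter("dlr-01")
    dlr.add_lif("transit", LifType.UPLINK, "transit-ls", "192.168.10.2/29")
    dlr.add_lif("web", LifType.INTERNAL, "web-ls", "10.0.1.1/24")
    dlr.add_lif("app", LifType.INTERNAL, "app-ls", "10.0.2.1/24")
    print(dlr.lif_for("10.0.2.40"))
```

In this model, traffic between the web and app networks only ever touches Internal LIFs, while anything without a directly connected match would have to leave through the Uplink LIF.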
To ensure that the Distributed Router is seen as a single entity by all participating hosts and all VMs that use it, VMware assigns what they call a vMAC (Virtual MAC address) to every LIF (logical interface) of a DLR. That allows all ESXi hosts that participate in that DLR instance to advertise the same MAC address, so everyone who communicates with it believes that every instance of that DLR is actually a single entity. DLRs also use what is referred to as a pMAC (Physical MAC address) to advertise the MAC address of an actual ESXi host's physical NIC onto the physical network. That pMAC is how traffic flows through the physical network and makes its way back into the virtual network.
There is also a component called the Control VM that is automatically provisioned when you deploy a DLR. The Control VM basically communicates with certain daemons residing on the ESXi hosts (the UWA, User World Agent) to make sure that all the ESXi hosts and their DLR instances have up-to-date routing tables, and it takes care of updating them with information received from dynamic routing protocols. Keep in mind, the Control VM does NOT route packets; it only sends and receives the metadata that enables routing by the data plane. While we're on the subject of Control VMs, it's always better to deploy them redundantly. The VMs are tiny, so don't cheap out for such a small price.
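The vMAC versus pMAC distinction can be sketched in a few lines of Python. Note that the MAC addresses below are placeholders I picked for the example, and the two method names are hypothetical. The sketch only illustrates the idea described above: the same MAC is presented to VMs on every host, while each host uses its own MAC towards the physical network.

```python
from dataclasses import dataclass

# Placeholder value for illustration; not the vMAC NSX actually uses.
SHARED_VMAC = "02:00:00:00:00:01"


@dataclass(frozen=True)
class HostDlrInstance:
    """The local copy of one DLR instance on a single ESXi host."""

    host: str
    pmac: str  # differs from host to host

    def gateway_mac_seen_by_vms(self) -> str:
        # Every host presents the same vMAC, so a VM sees the same
        # default gateway MAC no matter which host it runs on.
        return SHARED_VMAC

    def source_mac_on_physical_network(self) -> str:
        # Towards the physical network, the host identifies itself with its pMAC.
        return self.pmac


if __name__ == "__main__":
    hosts = [
        HostDlrInstance("esxi-01", "00:50:56:00:00:0a"),
        HostDlrInstance("esxi-02", "00:50:56:00:00:0b"),
    ]
    for h in hosts:
        print(h.host, h.gateway_mac_seen_by_vms(), h.source_mac_on_physical_network())
```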
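The split between "the Control VM learns routes" and "the hosts forward packets" can be illustrated with a short Python sketch. This is a deliberately simplified model: the ControlVm and HostDataPlane classes are hypothetical, and the way routes reach the hosts is collapsed into a direct method call rather than the real distribution path. What it does show accurately is the division of labor: the control plane object never looks up a destination, and each host resolves next hops from its own copy of the table using a longest-prefix match.

```python
import ipaddress
from typing import Dict, List, Optional


class HostDataPlane:
    """Per-host forwarding table; this is where packets actually get routed."""

    def __init__(self, host: str) -> None:
        self.host = host
        self.routes: Dict[ipaddress.IPv4Network, str] = {}

    def install(self, routes: Dict[ipaddress.IPv4Network, str]) -> None:
        # Keep a private copy so every host owns its own table.
        self.routes = dict(routes)

    def next_hop(self, destination: str) -> Optional[str]:
        ip = ipaddress.IPv4Address(destination)
        matches = [net for net in self.routes if ip in net]
        if not matches:
            return None
        # Longest-prefix match: the most specific route wins.
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


class ControlVm:
    """Learns routes and keeps the hosts in sync; it never forwards a packet."""

    def __init__(self, hosts: List[HostDataPlane]) -> None:
        self.hosts = hosts
        self.rib: Dict[ipaddress.IPv4Network, str] = {}

    def learn(self, prefix: str, next_hop: str) -> None:
        self.rib[ipaddress.IPv4Network(prefix)] = next_hop
        for host in self.hosts:
            host.install(self.rib)


if __name__ == "__main__":
    hosts = [HostDataPlane("esxi-01"), HostDataPlane("esxi-02")]
    control_vm = ControlVm(hosts)
    control_vm.learn("0.0.0.0/0", "192.168.10.1")
    control_vm.learn("10.20.0.0/16", "192.168.10.3")
    for h in hosts:
        print(h.host, h.next_hop("10.20.5.5"), h.next_hop("8.8.8.8"))
```

Because each host holds its own copy of the table in this model, forwarding keeps working from the last table a host received, which is the practical meaning of "the Control VM does NOT route packets".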
Troubleshooting Distributed Logical Routers
You can do most of the operations for a DLR in the vSphere Web Client (The Management plane), but when troubleshooting, you might need to go a little deeper into the configuration to investigate. To do so, you can connect via SSH (or VM Console if necessary) to your DLR's Control VM (The Control plane). (Make sure SSH is activated on it and a password is set, via the vSphere Web Client!) When connected to the Control VM, you can use the list command to see all you can do to troubleshoot. You'll mostly be able to read log files and view configurations.
You might also need to connect to the ESXi hosts that participate in the Distributed Logical Router instance (The Data plane) and use the net-vdr command to look at the detailed resulting configuration. For example, you can use the net-vdr command to list all the Distributed Routing instances that exist on a given ESXi host (net-vdr --instance -l) and use that information to dig deeper into its configuration.
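If you have several hosts to check, you may want to script the instance listing instead of logging in to each host by hand. Here is a minimal Python sketch that only builds the ssh command line for running net-vdr --instance -l on a host. The helper name, the host name and the root user are assumptions for the example, it presumes SSH is enabled on the ESXi host, and it does not try to parse the output, since the format can differ between NSX versions.

```python
import shlex
from typing import List


def net_vdr_over_ssh(host: str, *args: str, user: str = "root") -> List[str]:
    """Build an argv list that runs net-vdr with the given arguments on an ESXi host."""
    # Quote each argument so the remote shell receives it unchanged.
    remote_command = " ".join(shlex.quote(arg) for arg in ("net-vdr",) + args)
    return ["ssh", f"{user}@{host}", remote_command]


if __name__ == "__main__":
    # Hypothetical host name; replace it with one of your own ESXi hosts.
    command = net_vdr_over_ssh("esxi-01.lab.local", "--instance", "-l")
    print(command)
```

You could then hand the resulting list to subprocess.run(command, capture_output=True, text=True) and read the instance names from the captured output yourself.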
Now that you've gotten through this Beginner's Guide, why not take it a step further and take a look at a few other interesting blog posts about the NSX Distributed Logical Router :
– Here's a deeper dive into the NSX DLR from Roie Ben Haim on routetocloud.com
– Here is a step-by-step guide by Giuliano Bertello, from blog.bertello.org, that explains a few concepts about DLRs and how to create them in the vSphere Web Client
– Here is a great resource to help you get started with troubleshooting a DLR : NSX vSphere troubleshooting (Logical Router section) on Yet.org
The Bottom Line
VMware is definitely on the right track with NSX's "distributed" services, like the Distributed Router and the Distributed Firewall. There's also a (sort of) hidden "Tech preview" version of Distributed Load Balancing baked into the latest versions of NSX. That will definitely be useful for NSX users that have vSphere stretched clusters and those that use NSX for DR scenarios.
With all this useful functionality baked into NSX and even more on the way, it looks like being an NSX customer will continue to be a rewarding experience for years to come!
P.S. : Keep an eye out for my upcoming article on NSX Universal Object, A.K.A. Cross-vCenter networking. Some interesting stuff on DLRs in there too, coming soon!