Cross-vCenter NSX – Part 3

Welcome back! In part two we deployed the data plane for East-West traffic in our cross-vCenter NSX environment. In this part we continue with setting up the components for North-South traffic.

Current state

A quick look at the current state of our cross-vCenter NSX lab environment.

The management plane and the control plane are operational and configured for cross-vCenter NSX. We have deployed the data plane components for East-West traffic; two universal logical switches and a universal distributed logical router.

NSX Edge

The NSX Edge is responsible for the centralized on-ramp/off-ramp routing between the NSX logical networks (VXLANs) and the physical network. It’s a centralized function and is deployed as virtual appliances (Edge Services Gateways) by NSX Manager.

Local egress

There are some design considerations surrounding the NSX edge. In a cross-vCenter NSX environment spanning multiple physical sites one consideration is of particular interest: should each site use a local edge for North-South egress (local egress), or should one site act as the central edge for the other sites?

You might remember from part two that we deployed the universal distributed logical router with local egress enabled. So what we’ll do is set up our fictional sites “DC-SE” and “DC-US” with active/active egress. This is actually a less common type of cross-vCenter NSX deployment, as it introduces asymmetric traffic flows (traffic egresses at one site and ingresses at another) which need to be dealt with somehow.

Deployment

When local egress is a requirement we need to deploy some additional components. We need a UDLR control VM at each site. Each site will also use a separate transit universal logical switch.

So, let’s create the transit logical switches first and come back to the control VMs a little later.

Log in at the DC-SE’s vCenter and navigate to Networking and Security > Logical Switches. Click the “+ Add” button. Type a name for the logical switch. In my case I’ll call it “ULS Transit SE”. This logical switch will use the universal transport zone “UTZ” that we created in part one. Click “Add“. Repeat these steps to add the second transit universal logical switch. I’m calling this one “ULS Transit US”.
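If you’d rather script this than click through the UI, the NSX REST API on the primary NSX Manager can create the same switches. A minimal sketch, assuming the NSX Manager is reachable at nsxmgr-se.lab.local and that the universal transport zone’s scope ID is universalvdnscope (both are placeholders from my lab; GET /api/2.0/vdn/scopes lists the actual scope IDs):

# Hypothetical host, credentials and scope ID - adjust to your environment
curl -k -u admin:password -X POST \
  -H "Content-Type: application/xml" \
  -d '<virtualWireCreateSpec><name>ULS Transit SE</name><tenantId>lab</tenantId><controlPlaneMode>UNICAST_MODE</controlPlaneMode></virtualWireCreateSpec>' \
  https://nsxmgr-se.lab.local/api/2.0/vdn/scopes/universalvdnscope/virtualwires

Repeat the call with “ULS Transit US” as the name for the second transit switch.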

With the transit switches created we continue with the deployment of the ESG appliances. For this lab we’ll deploy just one ESG per site. At DC-SE navigate to Networking and Security > NSX Edges. Click the “+Add” button and choose “Edge Services Gateway”.

Give the ESG a name. In my lab it’s called “esg-se”.

Configure a user name and password. Enable SSH.

Configure the appliance VM deployment. A compact sized appliance will do for this lab.

Next we configure two interfaces on the ESG: one uplink and one internal. We start with the uplink interface, which we connect to a VLAN-backed distributed port group (VLAN 70 at my DC-SE site). This interface connects the ESG to the pfSense router. I’m assigning IP address 10.0.70.2/24 to it (the pfSense router’s interface is configured with 10.0.70.1/24).

The internal interface connects the ESG with the “ULS Transit SE” logical switch. It is the link between the ESG and the UDLR. I’m assigning 192.168.100.1/29 to this interface.

We also configure a default gateway on “esg-se” which points to the next-hop router on the “physical” network. In my case this is 10.0.70.1 (pfSense router).

Review the configuration and click “Finish” to deploy the ESG appliance.

At DC-US’s vCenter we basically repeat these steps for the ESG deployment over there. The unique settings for the ESG at DC-US in my lab are:

Name: esg-us
Uplink interface IP (VLAN 700): 10.1.70.2/24
Internal interface connected to: ULS Transit US
Internal interface IP: 192.168.101.1/29
Default gateway: 10.1.70.1

Universal Distributed Logical Router

We need to revisit the UDLR to deploy the additional UDLR control VM at DC-US as well as configure connectivity between the UDLR and the new transit logical switches.

Log in at vCenter in DC-US and navigate to Networking and Security > NSX Edges and click on the universal distributed router. Select Configure > Appliance Settings. Click on “Add Edge Appliance VM“.

Configure the placement parameters and deploy the VM.

Once the control VM has been deployed we head over to vCenter at DC-SE, navigate to Networking and Security > NSX Edges and open the universal distributed logical router. Select Configure > Interfaces and add a new uplink interface that connects to the “ULS Transit SE” logical switch. I configure this interface with IP address 192.168.100.2/29.

Add a second uplink interface to the UDLR and connect it to the “ULS Transit US” logical switch. I configure this one with IP address 192.168.101.2/29.

With the UDLR uplinks configured we’ll do a quick connectivity test. Open an SSH session to each ESG and ping the IP address of the UDLR uplink interface connected to that ESG.
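In my lab that boils down to the following (run from the respective ESG’s CLI; the addresses are the UDLR uplink IPs configured above):

# From esg-se, ping the UDLR uplink on "ULS Transit SE"
ping 192.168.100.2

# From esg-us, ping the UDLR uplink on "ULS Transit US"
ping 192.168.101.2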

North-South routing

Now that we have a working L2 connection between the ESGs and the UDLR, we can focus on setting up the North-South routing in our cross-vCenter NSX environment.

In my lab I’ve configured an iBGP peering between the UDLR control VMs and their respective ESG appliance. I also configured an eBGP peering between the ESGs and the pfSense routers at each site.
I won’t go through setting up dynamic routing with BGP, but here’s a summary of the configuration that I used in my lab:

Property | DC-SE | DC-US
pfSense Local AS | 65502 | 65502
ESG default gateway IP | 10.0.70.1 | 10.1.70.1
ESG Local AS | 65510 | 65110
ESG BGP neighbors | 10.0.70.1, 192.168.100.3 | 10.1.70.1, 192.168.101.3
ESG redistribution | connected | connected
UDLR default gateway | 192.168.100.1 | 192.168.101.1
UDLR Local AS | 65510 | 65510
UDLR protocol address | 192.168.100.3 | 192.168.101.3
UDLR BGP neighbors | 192.168.100.1 | 192.168.101.1
UDLR redistribution | connected | connected

Setting up dynamic routing in cross-vCenter NSX with local egress can be quite a job. Even in a small lab like this we need to configure routing parameters in six places: pfSense x 2, ESG x 2, UDLR control VM x 2.
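Once the peerings are up, it’s worth a quick sanity check from the ESG and UDLR CLIs. Something along these lines (NSX edge CLI commands, run over SSH or the console; the exact output varies by version):

# The BGP sessions should be in the Established state
show ip bgp neighbors

# BGP-learned routes show up with a B flag in the routing table
show ip route

# And what is actually programmed in the forwarding table
show ip forwarding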

The big picture

Now that we have configured North-South routing let’s have another look at the environment from above.

Let’s do a simple test with traceroute and see if local egress works.
From AppServer-SE (192.168.0.20) we are egressing via “esg-se”:

From AppServer-US (192.168.0.22) we are egressing via “esg-us”:

A simple test, but local egress seems to work!
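For reference, the test itself is nothing fancier than a traceroute from each app server towards an address on the physical network. Something along these lines, where the second hop reveals which ESG the traffic leaves through (the target is a placeholder; the transit and pfSense IPs are the ones from my lab):

# From AppServer-SE: expect the UDLR first, then esg-se (192.168.100.1), then DC-SE's pfSense (10.0.70.1)
traceroute <address-on-the-physical-network>

# From AppServer-US: expect the UDLR first, then esg-us (192.168.101.1), then DC-US's pfSense (10.1.70.1)
traceroute <address-on-the-physical-network>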

Conclusion

This concludes the blog post series on cross-vCenter NSX. During these three posts we went through the process of setting up cross-vCenter NSX with local egress. We deployed and configured the different components and had a look at the data plane functions: universal East-West routing, the universal distributed firewall and finally North-South routing in a cross-vCenter NSX environment.

All in all, setting this up is a pretty straightforward process, especially in a lab where we have control over the entire environment. 😉

Cross-vCenter NSX – Part 1

One of the scenarios where the NSX platform really shines is multi-site environments. Here NSX, together with vSphere, is the infrastructure that delivers on business requirements like workload mobility, resource pooling, and consistent security.

Since NSX version 6.2 we can roll out NSX in a vSphere environment managed by more than one vCenter system. This type of deployment is called cross-vCenter NSX.
With cross-vCenter NSX we are able to centrally deploy and manage networking and security constructs regardless of the management domain architecture.

In preparation for some assignments involving cross-vCenter NSX, I’ve been busy with a cross-vCenter NSX lab. I thought I’d do a little writeup in three parts on setting this up.

In this first post we’ll prepare the management and control plane for cross-vCenter NSX. In part 2 we’ll have a closer look at how to deploy the data plane in a cross-vCenter NSX environment.

The lab environment

The following components are the building blocks I used for this simple cross-vCenter NSX lab:

  • 8 x nested ESXi 6.7 U1 hosts
  • 2 x vCenter 6.7 U1 systems
  • vSAN storage
  • NSX 6.4.4
  • 2 x pfSense routers

Just so we can spend our time focusing on the relevant stuff, I’ve done some preparation in advance.

I set up two fictional sites: DC-SE and DC-US. Each with its own, non-linked, vCenter server system, four ESXi hosts, vSAN storage, and a standalone NSX Manager.
The ESXi hosts are prepared for NSX (VIBs installed) and a segment ID pool and transport zone are configured. DC-SE is running a controller cluster.

Each site has a pfSense machine acting as the perimeter router. Static routing is set up so each site’s management, vMotion and VTEP subnets can reach each other. Both sites also have a VLAN for vSAN plus one for the ESG uplink, which will be used later in the series.

VLAN | DC-SE | DC-US
Management | 10.0.10.0/24 | 10.1.10.0/24
vMotion | 10.0.20.0/24 | 10.1.20.0/24
vSAN | 10.0.30.0/24 | 10.1.30.0/24
VXLAN transport | 10.0.50.0/24 | 10.1.50.0/24
Uplink | 10.0.70.0/24 | 10.1.70.0/24

High-level overview of the lab environment before cross-vCenter NSX is implemented.

Please keep in mind that this is a simple lab environment design and in no way a design for a production environment. Have a look at VMware Validated Designs if you want to learn more about SDDC designs including NSX for production environments.

Step 1 – Assign primary role to NSX Manager

There’s a 1:1 relationship between vCenter server and NSX Manager. This is true when setting up cross-vCenter NSX as well, but here the NSX managers involved are assigned roles.
The NSX manager that should be running the controller cluster is assigned the primary role. Additional NSX Managers participating in cross-vCenter NSX (up to 7) are assigned the secondary role.

So let’s start by assigning the NSX Manager in DC-SE the primary role. In vCenter, go to Networking and Security > Installation and Upgrade > Management > NSX Managers. Select the NSX Manager and click on Actions > Assign Primary Role.

As you can see the role has changed from Standalone to Primary.

When we assign the primary role to an NSX Manager, its controller cluster automatically becomes the universal controller cluster. It is the one and only controller cluster in cross-vCenter NSX and provides control plane functionality (MAC, ARP, VTEP tables) for both the primary and secondary NSX Managers.

Step 2 – Configure Logical Network settings

While we’re at DC-SE we continue with the configuration of the logical network settings for cross-vCenter NSX.

We begin by defining a universal segment ID pool. These segment IDs (VNIs) are assigned to universal logical switches, which are logical switches that are synced to the secondary NSX Managers. We will look more at this in part two.

Go to Networking and Security > Installation and Upgrade > Logical Network Settings > VXLAN Settings > Segment IDs and click Edit.

Configure a unique range for the universal segment ID pool.

Create a universal transport zone and add CL01-SE

Next to VXLAN Settings we find Transport Zones. Click it, then click Add to start adding a universal transport zone.

Give it a name like “UTZ” and switch Universal Synchronization to On and add the CL01-SE vSphere cluster to the transport zone.

Step 3 – Assign secondary role to NSX Manager

Assigning the secondary role to the NSX Manager located in DC-US is done from the primary NSX Manager in DC-SE.

In vCenter, navigate to Networking and Security > Installation and Upgrade > Management > NSX Managers. Select the NSX Manager and click on Actions > Add Secondary Manager.

Here you enter the information of the NSX Manager at DC-US and click Add.

The NSX Manager at DC-US now has the secondary role. We can verify this by logging in to vCenter at DC-US and navigating to Networking and Security > Installation and Upgrade > Management > NSX Managers.

As you can see the NSX Manager now has the Secondary role.

Add CL01-US to the universal transport zone

While still logged in to vCenter at DC-US, navigate to Networking and Security > Installation and Upgrade > Logical Network Settings > Transport Zones. The transport zone that we created over at the primary NSX Manager in DC-SE shows up here too. Mark the transport zone and click the “Connect Clusters” button. Add the CL01-US cluster to the transport zone.

Wrapping up

This completes the preparation of the management and control plane for our simple cross-vCenter NSX lab.

We started by assigning the primary role to the NSX Manager at DC-SE. By doing so we got ourselves a universal controller cluster. Next we configured the logical network settings necessary for cross-vCenter NSX. Finally we paired the primary NSX Manager at DC-SE with the standalone NSX Manager at DC-US by assigning it the secondary role. Along the way we also added the vSphere clusters at both sites to the same universal transport zone.

In part 2 we will set up the data plane in our cross-vCenter NSX lab. We’ll have a look at how logical switching, distributed logical routing, and distributed security work in a cross-vCenter NSX environment.

NSX: Bridging between VXLAN and VLAN

After you have prepared your vSphere clusters for VXLAN you’re eager to start building your SDDC network. You provision some logical switches, a distributed logical router and maybe even an edge services gateway. Before you know it you are doing full-fledged network virtualization. It’s simple enough, right?

But then you realize you still have other virtual workloads and possibly all kinds of physical equipment residing on that other network: the VLAN-based network.

Of course this is (hopefully) not the way you roll out VXLAN in your SDDC. Just like any other major change in your network (logical or physical), some kind of planning is required here. You should at least create a solid design for the VXLAN structure, the IP subnets, IP routing, and how it all connects and propagates to the physical network before you start implementing VXLAN. Things might get really complicated if you don’t.

But even with all the planning in the world you might still end up with workloads and equipment that for various reasons are stuck on VLANs. On top of that, some of these workloads need to be on the same L2 segment as the virtual workloads that you planned on migrating to VXLANs. This can be a short-term (transitioning, etc.) or a long-term requirement.

A helping hand

One component of NSX-V that comes in handy in a situation like this is the L2 bridge. The L2 bridge has a number of use cases including:

  • Migration: Physical to virtual, or virtual to virtual, without requiring re-IP.
  • Connectivity: Physical workloads not suitable for virtualization can maintain connectivity with virtual workloads inside of NSX.
  • Service insertion: Transparent integration of any physical appliance such as a router, load balancer or firewall into NSX.

There are some prerequisites and limitations:

  • L2 bridging requires a distributed logical router with a control VM
  • The VXLAN network and VLAN-backed port groups must be on the same distributed virtual switch and use the same physical NICs.
  • The VLAN-backed port group must be configured with a VLAN ID (between 1 and 4094).
  • Don’t use an L2 bridge to connect a logical switch to another logical switch, a VLAN network to another VLAN network, or to interconnect data centers.
  • You can’t use a universal distributed logical router to configure bridging, and you cannot add a bridge to a universal logical switch (cross-vCenter NSX objects).
  • A logical router can have multiple bridging instances; however, the routing and bridging instances cannot share the same VXLAN/VLAN network. Traffic to and from the bridged VLAN and bridged VXLAN cannot be routed to the bridged network and vice versa.
  • The recommended maximum is 500 bridging instances per distributed logical router. A number you’ll hopefully never need.

Configuring a L2 Bridge

The L2 bridge is configured with a couple of clicks over at the distributed logical router. A couple of API calls do the trick too (there’s a sketch of that below).

At the DLR click the Manage tab and then Bridging. Click the green plus sign to add a bridge:

Type a name for the bridge and select a logical switch (VXLAN) and a distributed virtual port group (VLAN-backed). Click OK and don’t forget to publish your changes.

That’s all there is to it. You’re now bridging between a VXLAN and a VLAN.
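As for the API route mentioned earlier: bridging is configured on the DLR’s edge object. A rough sketch, assuming the DLR has edge ID edge-X, the logical switch is virtualwire-Y and the VLAN-backed port group is dvportgroup-Z (all placeholders); double-check the exact payload schema against the NSX-V API guide for your version:

# Hypothetical IDs and schema - verify against the NSX-V API guide before using
curl -k -u admin:password -X PUT \
  -H "Content-Type: application/xml" \
  -d '<bridges><bridge><name>bridge-web</name><virtualWire>virtualwire-Y</virtualWire><dvportGroup>dvportgroup-Z</dvportGroup></bridge></bridges>' \
  https://nsxmgr.lab.local/api/4.0/edges/edge-X/bridging/config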

Conclusion

VXLAN-VLAN bridging is not necessarily something you want to do over a long period of time, as it adds some complexity to your environment. That being said, there are scenarios (mentioned above) where the L2 bridge is the right solution, and it’s good to know that setting this up in NSX-V is a breeze.

A simple NSX-V load balancer lab

Recently I had to do a quick demo on the NSX-V logical load balancer. Setting up the NSX components in a small lab for such a demo is a pretty straightforward process. I thought I’d walk you through setting this up in case you want to play around with the logical load balancer or any of the other NSX components yourself.

The NSX-V logical load balancer in a nutshell

The NSX-V logical load balancer enables high availability service and distributes network traffic (L4 and L7) among multiple servers. Incoming requests are evenly distributed between backend servers that are grouped together in a pool.

It comes with configurable service monitors that perform health checking on backend servers. If one of the backend servers becomes unavailable, the service monitor marks that server as “DOWN” and stops distributing requests to it. When the backend server is available again, the service monitor marks it as “UP” and incoming requests are once again distributed to the server.

The NSX-V logical load balancer has all the functionality you can expect from a modern load balancer. There’s a variety of features and two deployment models (inline vs. one-armed), but they are beyond the scope of this simple lab. I recommend having a look at the Logical Load Balancer chapter of the NSX Administration Guide for detailed documentation on the NSX-V logical load balancer.

The Lab Environment

Let’s have a look at a small diagram showing the environment we’re going to build:

Two virtual machines are connected to the “Web Tier” logical switch. The “Web Tier” uses IP subnet 172.16.1.0/24 as defined by the Distributed Logical Router (DLR). The DLR and the Edge Services Gateway (ESG) connect using the “Transit” logical switch. The “Transit” VXLAN uses the tiny 172.16.0.0/29 subnet. The ESG also acts as the gateway for the virtual network with its connection to the physical network.

The logical load balancer is a component of the ESG. Its job in this lab will be to distribute incoming web requests between the two web servers.

For this guide I assume NSX-V 6.4 is installed and configured and that your vSphere cluster is prepared for VXLAN.
We need to reserve four IP addresses on the physical network’s subnet to be used by the ESG. For this lab we also need Internet access from the physical network as well as access to a DNS server.

We don’t need a lot of resources for this lab. A single physical ESXi 6.5/6.7 host running a nested vSphere environment will do the job.

Building the NSX infrastructure

We start by creating the two logical switches “Web Tier” and “Transit”. The default settings for the switches are fine for this lab:


Next we deploy the distributed logical router:


At the “Configure interfaces” step we create and configure two interfaces: An uplink interface we’ll call “2ESG” which connects to the “Transit” logical switch and an internal interface that we’ll call “2WebTier” which connects to the “Web Tier” logical switch. We configure the interfaces using the IP addresses from the diagram above:


The DLR’s default gateway is the IP address of the ESG’s internal interface, accessed via the “2ESG” interface:


Once the DLR is deployed we go to its settings and configure DHCP relay. The DHCP relay server in this lab is the IP address of the ESG’s internal interface (we’ll configure a DHCP server on the ESG in a moment). We also need to enable a DHCP relay agent on the DLR’s “2WebTier” interface:


Now it’s time to deploy the Edge Services Gateway appliance:


We choose the compact appliance size and configure the ESG with two interfaces: an uplink interface we call “2Physical”, which connects to a VLAN-backed port group on the distributed switch, and an internal interface we call “2DLR”, which connects to the “Transit” logical switch.

We’re going to configure four IP addresses on the “2Physical” interface: One primary IP address and three secondaries. One of the secondaries will be used as the VIP address of the load balancer. The other two will be used for NAT:


The default gateway for the ESG resides on the physical network. In my environment it’s at 10.0.1.1 and is reached using the “2Physical” interface:


So let’s set up the DHCP server on the ESG. We do this under the “DHCP” tab of the ESG. We need to configure a DHCP pool and enable the DHCP service. Don’t forget to configure a primary name server; the “Web Tier” servers need one.


Finally we need to let the ESG know how to reach the “Web Tier” subnet. We do this by adding a static route under “Routing”:


Now the ESG knows that it should use the DLR to reach the “Web Tier” subnet.
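If you want to verify the route from the ESG’s CLI, a quick look at the routing table does it (the 172.16.1.0/24 entry should show up as a static route, flagged with an S, pointing at the DLR’s uplink address):

show ip route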

Deploying the Virtual Machines

Now that the NSX infrastructure is in place we’ll continue with the deployment of our two virtual machines in the “Web Tier” VXLAN.

In this lab we use VMware’s Photon OS which is an open source minimal Linux container host optimized for vSphere. Download the OVA for vSphere 6.5 or newer and deploy two virtual machines from it. Make sure to connect both to the “Web Tier” logical switch.


Once booted, the DHCP server on the ESG should have assigned IP addresses to the VMs. Take note of the IPv4 addresses as we need them for our NAT rules:
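If you’d rather not open the VM consoles, the ESG itself can tell you what its DHCP service handed out. As far as I recall the edge CLI command is the one below, but verify it on your NSX version:

# On the ESG CLI - list the active DHCP leases
show dhcp leaseinfo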


Configuring NAT

NAT rules are defined on the ESG under the “NAT” tab. First of all we want to provide Internet access to our “Web Tier” virtual machines so we can patch the OS and pull the NGINX container. For this we have to create a source NAT (SNAT) rule that looks like this:


The primary IP address of the ESG “2Physical” interface (10.0.1.240) is used to represent any IP address from the “Web Tier” subnet (172.16.1.0/24) on the physical network.
Don’t worry about protocol for this lab.

As we also want to SSH into our virtual machines from the physical network, we need to create two destination NAT (DNAT) rules. One for each server. For my web-1 virtual machine the DNAT rule looks like this:


IP address 10.0.1.242, one of the secondary IP addresses of the ESG’s “2Physical” interface, is translated to web-1’s IP address (172.16.1.50).
Don’t forget to create one more DNAT rule for the web-2 virtual machine using one of the other secondary IP addresses of the ESG’s “2Physical” interface (I chose 10.0.1.241 for example).

Installing the Web Servers

With the NAT rules in place we should be able to SSH into our virtual machines from the physical network:

ssh root@10.0.1.242

The default root password on the Photon OS OVA is “changeme” which we’ll have to change after first login.

We can now test if our SNAT rule is working by querying and installing available OS updates. Do this on both machines:

tdnf update

When it’s done updating reboot the servers:

reboot

After reboot, SSH back into the virtual machines to enable and start the Docker engine:

systemctl enable docker
systemctl start docker

Next pull the NGINX docker container:

docker pull nginx

We want the NGINX server to use an “index.html” that is stored outside of the Docker container. Create an “index.html” on each of the servers:

mkdir -p ~/docker-nginx/html
vi ~/docker-nginx/html/index.html

Copy and paste this into index.html on web-1:

<h1>This is web-1</h1>

And paste this into index.html on web-2:

<h1>This is web-2</h1>

Set the following permissions on the html directory (never do this in production):

chmod 755 ~/docker-nginx -R

We can now run the container. On each of the virtual machines issue the following command:

docker run --name docker-nginx -p 80:80 -d -v ~/docker-nginx/html:/usr/share/nginx/html nginx

With the containers up and running we should be able to access the web servers from the physical network. In my case I browse to http://10.0.1.242 and hit web-1’s “index.html”:


Make sure this works on both web servers before moving on.
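A quick check from a terminal on the physical network works just as well. With the DNAT addresses from my lab it looks like this:

curl http://10.0.1.242   # should return "This is web-1"
curl http://10.0.1.241   # should return "This is web-2"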

Configuring the Logical Load Balancer

The web servers are up and running. It’s time to enable high availability service and start load balancing.

We start by defining a new server pool we’ll call “web-pool”. On the ESG click the “Load Balancer” tab and choose “Pools”. Add a new pool:


Add our two virtual machines to the pool. Notice that you can specify vCenter/NSX objects (which require VMware Tools or ARP/DHCP Snooping) as well as IP addresses. The Photon OS virtual machines are running open-vm-tools meaning we can pick virtual machine objects for these servers.

The other thing we need to specify here is the service monitor. Choose “default_http_monitor”, which by default executes an HTTP GET against the servers in the pool every 5 seconds.

Head over to “Virtual Servers”. Here we configure our virtual IP (VIP) and tie it to our server pool:


Give the virtual server a name and enter the last available secondary IP address of the ESG’s “2Physical” interface (10.0.1.243 in my case). Select the pool that we just created as the default pool.

Finally enable the load balancer service under “Global Configuration”.

Testing the Logical Load Balancer

It’s time to test the load balancer. Open a web browser and navigate to http://VIP_IP. In my case http://10.0.1.243. You should now hit one of the web server’s “index.html”:


Now reload the page in your browser.


Voilà! The load balancer sent the request to the next available server in the pool which in my case was web-1.

Every time you reload the web page you will end up on the next available server in the pool. This is the intended behavior when using the Round Robin load balancing algorithm.
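You can watch the round robin behavior from a terminal as well by hitting the VIP a few times in a row. For example:

# Each new connection should land on the next pool member in turn
for i in 1 2 3 4; do curl -s http://10.0.1.243; echo; done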

Let’s stop the NGINX container on one of the virtual machines:

docker stop docker-nginx

Now reload the web page a couple of times. Instead of a timeout every other time, the load balancer keeps sending requests to the only healthy server left in the pool.
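You can also follow along from the ESG’s CLI and watch the service monitor mark the member as DOWN and, later, UP again. If memory serves the command is the one below, but check it against your NSX version:

# Shows the pool members and their status as seen by the service monitor
show service loadbalancer pool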

Start the NGINX container again:

docker start docker-nginx

Keep reloading the web page in your browser and you’ll notice that it doesn’t take many seconds before the load balancer picks up the server and starts sending requests to it again.

Conclusion

NSX-V, a key component of the VMware vSphere SDDC, delivers a very competent load balancer that is easy to install and configure.
In this short exercise we kept things really simple. There are more things to consider in a production environment, but building a lab like this hopefully gives you some idea of the concepts and components involved in configuring an NSX-V logical load balancer.