• Configuring DPU-Based Acceleration for NSX

    Offloading the NSX Distributed Firewall (DFW) to a Data Processing Unit (DPU) is an exciting new feature which is GA as of NSX version 4.1. Other NSX features that were already supported within DPU-based acceleration for NSX are:

    • L2 and L3 overlay networking
    • L2 VLAN networking
    • Observability features such as packet capture, IPFIX, TraceFlow, and port mirroring

    For NSX DFW, offloading and accelerating by a DPU means layer 4 traffic flows go through the following process:

    1. When the first packet of a flow arrives, it is treated as a flow miss and processed in software.
    2. The packet is forwarded to the software slow path for processing:
      • If the packet is not allowed by a rule, it is dropped and no flow entry is created.
      • If the packet is allowed, a flow entry is created.
    3. Once the software has successfully inserted the flow entry, it programs the flow into the DPU hardware for faster processing.
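
    To make the steps above a bit more tangible, here is a minimal Python sketch of the flow-miss logic. It is a toy model and not how NSX or the DPU actually implements this: the OffloadModel class, its allowed_dst_ports rule set, and the example addresses are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set, Tuple

# A flow is identified by its 5-tuple: (src_ip, dst_ip, protocol, src_port, dst_port).
FlowKey = Tuple[str, str, str, int, int]


@dataclass
class OffloadModel:
    """Toy model: a software slow path in front of a simulated hardware flow table."""

    # Hypothetical rule set: destination ports that the firewall allows.
    allowed_dst_ports: FrozenSet[int] = frozenset({443})
    # Flow entries created by the software slow path.
    software_flows: Set[FlowKey] = field(default_factory=set)
    # Flows programmed into the simulated DPU hardware.
    hardware_flows: Set[FlowKey] = field(default_factory=set)

    def process(self, key: FlowKey) -> str:
        # Fast path: the flow is already programmed in hardware.
        if key in self.hardware_flows:
            return "hardware-hit"
        # Flow miss: evaluate the packet in the software slow path.
        if key[4] not in self.allowed_dst_ports:
            # Not allowed by a rule: drop the packet and create no flow entry.
            return "dropped"
        # Allowed: create the flow entry, then program it into hardware.
        self.software_flows.add(key)
        self.hardware_flows.add(key)
        return "slow-path-allowed"


model = OffloadModel()
https_flow = ("10.0.0.1", "10.0.0.2", "tcp", 50000, 443)
telnet_flow = ("10.0.0.1", "10.0.0.2", "tcp", 50001, 23)

print(model.process(https_flow))   # first packet: flow miss, handled in software
print(model.process(https_flow))   # following packets: handled in hardware
print(model.process(telnet_flow))  # denied: dropped, no flow entry
```

    The point of the sketch is that only the first packet of an allowed flow pays the price of software processing. Denied flows never end up in the hardware table.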

    In this article I’ll walk through step 0, which is enabling DPU-based acceleration for NSX. It’s the only step that requires some manual configuration. The rest is taken care of for you by vSphere and NSX.

    The process is so easy that I don’t actually expect people will need to read an article like this. On the other hand, there’s some value in sharing the steps of a simple configuration procedure as well. So let’s just get started!

    Lab Environment

    The following are the hardware and software components relevant for this exercise:

    • 3 x Dell PowerEdge R750 with NVIDIA BlueField-2 DPU
    • vCenter 8.0 Update 1
    • ESXi 8.0 Update 1
    • NSX 4.1

    The components above are in place and now we are tasked with enabling DPU-based acceleration for NSX in this environment.

    Step 1 – Distributed Switch With Network Offload Compatibility

    The first thing we need to do is create a vSphere Distributed Switch (VDS) that supports network offloads to DPU.

    Create the VDS

    In vCenter under Networking create a new Distributed Switch using the normal procedure:

    Make sure to select version 8.0.0 in the next step as this is the version that supports DPU network offloading:

    In the next dialog we configure the Network Offloads compatibility. The servers in this lab are equipped with the NVIDIA BlueField-2 DPU so we’ll select NVIDIA BlueField here:

    Notice that the DPU comes with two SFP interfaces so we also configure the number of VDS uplinks to be 2:

    Add ESXi Hosts

    Next we add the ESXi hosts to the DPU compatible VDS.

    The two interfaces on the DPU are mapped to ESXi vmnic2 and vmnic3 which we in turn assign to Uplink 1 and Uplink 2 on the VDS:

    With the ESXi hosts and their respective DPUs added to the DPU compatible VDS, we continue with configuring the NSX side of things.

    Step 2 – Prepare Host Transport Nodes

    Preparing ESXi hosts as NSX Host Transport Nodes backed by DPU is more or less done by following standard procedure.

    Create Uplink Profile

    First we create an uplink profile. Navigate to System > Fabric > Profiles > Uplink Profiles and click + Add Profile. We’ll give the uplink profile some descriptive name like dpu-uplink-profile, configure teaming and optionally a transport VLAN when overlay networking is in scope:

    We define 2 active uplinks in the profile, one for each DPU interface.

    Create Transport Node Profile

    Next we create a transport node profile under System > Fabric > Hosts > Transport Node Profile. Let’s call this transport node profile dpu-tn-profile:

    Click Set in the Host Switch column and in the next dialog click on Add Host Switch.

    It’s here we select our vCenter, the DPU compatible VDS, relevant transport zones, the uplink profile, and map the uplinks which we defined in our uplink profile to the VDS uplinks:

    Note that under Advanced Configuration > Mode we must select Enhanced Datapath (either Standard or Performance) as the Mode when the selected VDS is a DPU compatible VDS.

    Configure ESXi Hosts

    The final step is to configure the ESXi hosts and for this we use our transport node profile.

    Navigate to System > Fabric > Hosts > Clusters. Select the vSphere cluster that contains the ESXi hosts with DPU hardware and click Configure NSX:

    In the dialog that pops up select the transport node profile that we created in the previous step (dpu-tn-profile) and press Save. The NSX installation and configuration on the ESXi hosts kicks off.

    Once completed it could be interesting to have a quick look at the details of one of the configured ESXi hosts. This should confirm that the DPU-backed interfaces are claimed as NSX uplink-1 and uplink-2 respectively:

    And this completes the configuration of DPU-based acceleration for NSX. From here any workload that is connected to an NSX segment (VLAN or overlay) will benefit from the offloading and acceleration capabilities offered by the NSX programmed DPU interfaces.

    Summary

    As simple as VMware has made it to set all of this up, I personally consider DPU-based acceleration for NSX to be a serious game changer that offers a variety of new design options to organizations and their private cloud initiatives. I’m excited to see what’s next.

  • NSX Application Platform – Installation Notes

    A while back I needed to deploy the NSX Application Platform (NAPP) in my lab environment to demonstrate features like NSX Intelligence and the ones within NSX Advanced Threat Prevention (ATP).

    In my experience, deploying NAPP can be more or less of an undertaking depending largely on whether the prerequisites are in place and the requirements are met. Thoroughly reading through the documentation as well as some level of comfort working with Kubernetes do come in handy too.

    Today’s article is not so much a guide on how to install the NSX Application Platform as it is my own documentation on deploying NAPP in my specific lab environment. My documentation might help you with your NAPP deployment as well (I hope it does), but it will not include much information on the “whys” and the “hows” and instead focus on getting the job done.

    Lab Overview

    The lab environment for this NAPP deployment consists of the following VMware components:

    • vCenter 8.0 Update 1c
    • ESXi 8.0 Update 1c
    • NSX 4.1.1.0

    The goal is to install the Evaluation Form Factor of NAPP using just one VLAN. As a result of that I ended up with the following configuration:

    Install & Configure Kubernetes

    Installing Kubernetes on Ubuntu 22.04 involves relatively many steps (when not automated) but is rather straightforward and well documented. In my case I deployed two Ubuntu virtual machines with the following specifications:

    Control Node (controller01)

    • 8 vCPU
    • 8 GB RAM
    • 1 x 500 GB disk
    • 1 x NIC connected to Management VLAN
    • FQDN: controller01.sddc.lab
    • IP address: 10.203.240.50/24

    Worker Node (worker01)

    • 16 vCPU
    • 96 GB RAM
    • 1 x 500 GB disk
    • 1 x NIC connected to Management VLAN
    • FQDN: worker01.sddc.lab
    • IP address: 10.203.240.51/24

    It’s important that the nodes can resolve their own and each other’s FQDNs, preferably through DNS records.
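
    A quick way to check name resolution is a few lines of Python. The sketch below is only an illustration: LAB_RECORDS and lab_resolver are hypothetical stand-ins for real DNS so that the example runs anywhere, and unresolved_names is a helper name I made up.

```python
import socket
from typing import Callable, Iterable, List

# Hypothetical stand-in for real DNS, filled with the lab's node records.
LAB_RECORDS = {
    "controller01.sddc.lab": "10.203.240.50",
    "worker01.sddc.lab": "10.203.240.51",
}


def lab_resolver(name: str) -> str:
    """Resolve a name from LAB_RECORDS and fail the way socket.gethostbyname does."""
    try:
        return LAB_RECORDS[name]
    except KeyError:
        raise socket.gaierror(f"cannot resolve {name}") from None


def unresolved_names(names: Iterable[str],
                     resolve: Callable[[str], str] = socket.gethostbyname) -> List[str]:
    """Return the names that the resolver could not resolve."""
    failed = []
    for name in names:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed


nodes = ["controller01.sddc.lab", "worker01.sddc.lab"]
print(unresolved_names(nodes, resolve=lab_resolver))
```

    On a real node you would leave out the resolve argument so that the system resolver is used. An empty list means every name resolves.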

    After the Ubuntu base OS installation we are ready to install and configure the Kubernetes cluster. Run the following commands on both nodes:

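    # Update the package index and upgrade the installed packages
    sudo apt update && sudo apt upgrade
    
    # Install the tools needed to add external APT repositories
    sudo apt-get install -y apt-transport-https ca-certificates curl gnupg lsb-release
    
    # Add the Kubernetes APT signing key (the trailing "-" makes apt-key read from stdin)
    curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
    
    # Add the Kubernetes APT repository
    echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" >> ~/kubernetes.list
    sudo mv ~/kubernetes.list /etc/apt/sources.list.d
    sudo apt update
    
    # Install a pinned Kubernetes version and hold it to prevent automatic upgrades
    export VERSION="1.24.16-00"
    sudo apt-get install -y kubelet=$VERSION kubeadm=$VERSION kubectl=$VERSION
    sudo apt-mark hold kubelet kubeadm kubectl
    
    # Load the kernel modules needed by containerd and Kubernetes networking at boot
    cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
    overlay
    br_netfilter
    EOF
    
    # Load the same modules right away
    sudo modprobe overlay
    sudo modprobe br_netfilter
    
    # Let iptables see bridged traffic and enable IPv4 forwarding
    sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward = 1
    EOF
    
    # Apply the sysctl settings
    sudo sysctl --system
    
    # Add the Docker APT signing key (the Docker repository provides containerd.io)
    sudo mkdir -p /etc/apt/keyrings
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    
    # Add the Docker APT repository
    echo \
    "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
    
    # Install containerd and switch it to the systemd cgroup driver
    sudo apt update
    sudo apt install containerd.io -y
    containerd config default | sudo tee /etc/containerd/config.toml >/dev/null 2>&1
    sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
    
    sudo systemctl enable containerd
    sudo systemctl start containerd
    
    # Disable swap, both permanently (fstab) and for the running system
    sudo sed -ri '/\sswap\s/s/^#?/#/' /etc/fstab
    sudo swapoff -a
    
    # Enable and start the kubelet
    sudo systemctl enable kubelet
    sudo systemctl start kubelet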

    Run the following commands on the Control node:

    sudo kubeadm init --control-plane-endpoint=controller01.sddc.lab --pod-network-cidr=10.244.0.0/16
    
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    kubeadm token create --print-join-command

    It’s important to add --pod-network-cidr to the kubeadm init command as this enables the NodeIpamController in Kubernetes, which hands out a Pod CIDR to each node. Antrea, which we’ll install in a moment, relies on these per-node Pod CIDRs.
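
    To illustrate what a Pod network CIDR means for the nodes, here is a small Python sketch that carves per-node subnets out of 10.244.0.0/16. It assumes a /24 per node and hands them out in list order, which is a simplification: the real allocation is done by Kubernetes and the order is not guaranteed. The node_pod_cidrs function is a hypothetical helper.

```python
import ipaddress


def node_pod_cidrs(cluster_cidr, node_names, node_mask=24):
    """Carve one Pod CIDR per node out of the cluster-wide Pod network."""
    cluster = ipaddress.ip_network(cluster_cidr)
    # subnets() yields the /24 networks inside the /16 in ascending order.
    subnets = cluster.subnets(new_prefix=node_mask)
    return {name: str(subnet) for name, subnet in zip(node_names, subnets)}


allocation = node_pod_cidrs("10.244.0.0/16", ["controller01", "worker01"])
for node, cidr in allocation.items():
    print(node, cidr)
```

    With a /16 Pod network and a /24 per node there is room for 256 nodes, which is more than enough for this two-node lab.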

    Copy the output of the last command and run it with root privileges on the Worker node. For example:

    sudo kubeadm join controller01.sddc.lab:6443 --token wo64hi.83y9s2jh9p3dt7m9 --discovery-token-ca-cert-hash sha256:4218627e8cf43588a0d1e849529cb144a718a8f2ffdd3e3fd0d1caa01d5afd84

    Install Antrea

    Next we need to install a CNI. In my lab this is Antrea. Run the following command on the Control node:

    kubectl apply -f https://raw.githubusercontent.com/antrea-io/antrea/main/build/yamls/antrea.yml

    With Antrea installed and running we should have a working Kubernetes cluster. From the Control node run:

    kubectl get nodes -o wide

    Both nodes should report the Ready status. This completes the installation of the Antrea CNI.

    Install & Configure MetalLB

    In this lab MetalLB will be responsible for load-balancing. Install the MetalLB Kubernetes load-balancer by running the following command on the Control node:

    kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.10/config/manifests/metallb-native.yaml

    Next, we need to create an IPAddressPool that MetalLB uses when assigning IP addresses to services. Create a file called mbpool.yaml with the following contents:

    apiVersion: metallb.io/v1beta1
    kind: IPAddressPool
    metadata:
      name: napp-pool
      namespace: metallb-system
    spec:
      addresses:
      - 10.203.240.60-10.203.240.70
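
    Before applying the pool it doesn’t hurt to sanity-check the range. The Python sketch below expands the range and checks it against the node subnet and node IP addresses of this lab. The expand_range helper and the constant names are hypothetical, and the /24 node subnet is taken from the node IP addresses listed earlier.

```python
import ipaddress

# Values from this lab: the Management VLAN subnet and the two node addresses.
NODE_SUBNET = ipaddress.ip_network("10.203.240.0/24")
NODE_IPS = {"10.203.240.50", "10.203.240.51"}


def expand_range(address_range: str):
    """Expand a 'start-end' address range into a list of IP addresses."""
    start_text, end_text = address_range.split("-")
    start = ipaddress.ip_address(start_text.strip())
    end = ipaddress.ip_address(end_text.strip())
    if end < start:
        raise ValueError(f"range end {end} is lower than range start {start}")
    return [ipaddress.ip_address(value) for value in range(int(start), int(end) + 1)]


pool = expand_range("10.203.240.60-10.203.240.70")
print("addresses in pool:", len(pool))
print("all on the node subnet:", all(address in NODE_SUBNET for address in pool))
print("collides with a node IP:", any(str(address) in NODE_IPS for address in pool))
```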

    Create the IPAddressPool in Kubernetes from the Control node:

    kubectl apply -f mbpool.yaml 

    An L2Advertisement is needed as well. Create a file called l2adv.yaml with the following contents:

    apiVersion: metallb.io/v1beta1
    kind: L2Advertisement
    metadata:
      name: napp-pool
      namespace: metallb-system

    Create the L2Advertisement in Kubernetes from the Control node:

    kubectl apply -f l2adv.yaml 

    This completes the installation and configuration of the MetalLB Kubernetes load-balancer.

    Install & Configure vSphere Container Storage Plug-in

    In order to provide the required storage to the different NAPP Kubernetes Pods we install and configure the vSphere Container Storage Plug-in.

    First we create the required Kubernetes namespace by running the following from the Control node:

    kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v3.0.0/manifests/vanilla/namespace.yaml

    Next we create a configuration file that we’ll use when creating the Kubernetes Secret in a moment. The contents of my csi-vsphere.conf:

    [VirtualCenter "netlab-vcenter.netlab.home"]
    insecure-flag = "true"
    user = "Administrator@vsphere.local"
    password = "VMware1!"
    port = "443"
    datacenters = "SDDC"

    Now we create the Kubernetes Secret from the Control node:

    kubectl create secret generic vsphere-config-secret --from-file=csi-vsphere.conf --namespace=vmware-system-csi

    Install the plug-in from the Control node:

    kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v3.0.0/manifests/vanilla/vsphere-csi-driver.yaml

    As we’re running a single control plane node (controller01) we should make the following adjustments to the CSI deployment. From the Control node:

    kubectl edit deployment vsphere-csi-controller -n vmware-system-csi

    We adjust the number of replicas to be just 1:

    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: vsphere-csi-controller
      namespace: vmware-system-csi
    spec:
      replicas: 1

    The last step is to create a StorageClass which makes vSphere storage available to the Kubernetes Pods. First we define a new Storage Policy in vCenter:

    Give it a name and enable host based rules (my vSphere environment uses VMFS-backed storage):

    To be able to create this policy at all we must define a host based service. Keeping it simple I define a Storage I/O Control service with Normal IO shares allocation:

    With the “k8s” vSphere Storage Policy in place we can continue with creation of a matching StorageClass in Kubernetes. On the Control node create a file called sc.yaml with the following contents:

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: vsphere-csi-storageclass
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: csi.vsphere.vmware.com
    parameters:
      storagepolicyname: "k8s"

    Now we can create the StorageClass using this file from the Control node:

    kubectl apply -f sc.yaml

    This completes the installation and configuration of the vSphere Container Storage Plug-in.

    Register DNS Records For NSX Application Platform

    We need to register two DNS records for NAPP to function properly; one for the NAPP Interface Service Name and one for the NAPP Messaging Service Name. The IP addresses for these records are taken from the Kubernetes IPAddressPool that we created earlier.

    The IP address claimed for the NAPP Interface Service Name is always the first available IP address in the defined IPAddressPool. In my case this is 10.203.240.60. For NAPP Messaging Service Name it can be any other available IP address from the IPAddressPool. The NAPP deployment will figure that one out by doing a DNS lookup.

    I ended up adding these two records to my DNS server:

    napp-service.sddc.lab.		A	10.203.240.60
    napp-messaging.sddc.lab.	A	10.203.240.69
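
    The rules for these two records are easy to get wrong, so here is a small Python sketch that checks them against the IPAddressPool. It is just an illustration: check_records is a hypothetical helper, and the check for the Interface Service Name assumes that nothing else has claimed the first address of the pool.

```python
import ipaddress

POOL_START = ipaddress.ip_address("10.203.240.60")
POOL_END = ipaddress.ip_address("10.203.240.70")

RECORDS = {
    "napp-service.sddc.lab": "10.203.240.60",
    "napp-messaging.sddc.lab": "10.203.240.69",
}


def check_records(records, interface_name, pool_start, pool_end):
    """Return a list of problems found in the DNS records (empty means OK)."""
    problems = []
    addresses = {name: ipaddress.ip_address(value) for name, value in records.items()}
    for name, address in addresses.items():
        if not pool_start <= address <= pool_end:
            problems.append(f"{name} ({address}) is outside the pool")
    # Assumption: the first pool address is still free for the interface service.
    if addresses[interface_name] != pool_start:
        problems.append(f"{interface_name} should point to {pool_start}")
    if len(set(addresses.values())) != len(addresses):
        problems.append("two records share the same IP address")
    return problems


print(check_records(RECORDS, "napp-service.sddc.lab", POOL_START, POOL_END))
```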

    Deploy NSX Application Platform

    NAPP is deployed from the NSX Manager UI. It’s a wizard-driven deployment process that is easy to go through.

    We navigate to System > NSX Application Platform and click on Deploy NSX Application Platform:

    We use the following URLs for Helm and Docker:

    • Helm Repository: oci://projects.registry.vmware.com/nsx_application_platform/helm-charts
    • Docker Registry: projects.registry.vmware.com/nsx_application_platform/clustering

    On the next step in the deployment wizard (Configuration) we need to upload our Kubernetes configuration file. In my case I simply copied that file from the Control node. It’s located here: ~/.kube/config.

    Other values entered during the Configuration step:

    • Storage Class: vsphere-csi-storageclass
    • Interface Service Name: napp-service.sddc.lab
    • Messaging Service Name: napp-messaging.sddc.lab
    • Form Factor: Evaluation

    In the next step we run the pre-checks:

    Note that the warning on the Time Synchronization pre-check is not critical and will not impact the NAPP deployment process.

    At the last step, Review & Deploy, we kick off the NAPP deployment:

    Deploying NAPP will take a while so now is a good time for some coffee and leg stretching:

    Once the deployment has finished you’ll hopefully be welcomed by a screen like the one below which confirms that the platform is up and running:

    Enable Features

    Now that NAPP is in place we can enable the different features that are powered by it:

    • Metrics
    • NSX Intelligence
    • NSX Network Detection And Response
    • NSX Malware Prevention

    To enable the NSX Intelligence feature we hover over the NSX Intelligence tile and click Activate. A number of pre-checks must be run before we can activate the feature:

    Once the pre-checks have completed successfully we can click Activate. This will activate the NSX Intelligence feature. Behind the scenes the NSX Intelligence-specific Kubernetes Pods are created and started. This process might take a while to complete and the result should look something like this:

    With the feature active we can start using NSX Intelligence for the automatic mapping and analytics of network communication flows. The module is found in the NSX Manager UI under Plan & Troubleshoot > Discover & Take Action:

    Summary

    This completes my installation documentation on deploying the NSX Application Platform in my lab environment. I hope you found it useful.

    Thanks for reading.

  • BGP EVPN Between NSX And VyOS – Part 2

    Welcome back! In Part 1 we configured and prepared NSX to participate in a BGP EVPN control and data plane. In this part we continue with configuration of the VyOS router. Once both NSX and VyOS are configured we’ll verify that everything is working as intended.

    Lab Overview

    The lab environment for this exercise consists of the following components:

    • vCenter 8.0 Update 1c
    • ESXi 8.0 Update 1c
    • NSX Manager 4.1.0.2
    • 2 x NSX Edge nodes (VM form factor, Large)
    • 1 x Tier-0 Gateway
    • 1 x VyOS 1.4 router (VM)

    The following table lists the configuration items that are relevant for this article.

    Item            | Value            | Description                             | Scope/Span             | Configured
    ----------------+------------------+-----------------------------------------+------------------------+-----------
    VLAN 244        | 10.203.244.0/24  | VLAN for Geneve transport               | Edge nodes, ESXi hosts | Yes
    VLAN 246        | 10.203.246.0/24  | VLAN for BGP Uplink 1                   | Edge nodes             | Yes
    VLAN 247        | 10.203.247.0/24  | VLAN for BGP Uplink 2                   | Edge nodes             | Yes
    VLAN 10         | 172.16.10.0/24   | Tenant Red VLAN                         | VyOS, VRF Red          | No
    VLAN 20         | 172.16.20.0/24   | Tenant Blue VLAN                        | VyOS, VRF Blue         | No
    Segment Red     | 10.204.245.0/24  | Tenant Red NSX overlay segment          | NSX, VRF Red           | No
    Segment Blue    | 10.204.246.0/24  | Tenant Blue NSX overlay segment         | NSX, VRF Blue          | No
    dummy/loopback  | 192.168.100.0/24 | IP CIDR for VXLAN TEPs                  | VyOS, Edge nodes       | n/a
    VyOS router ASN | 65240            | BGP ASN on the VyOS router              | VyOS                   | Yes
    NSX Tier-0 ASN  | 65241            | BGP ASN on the NSX Tier-0 gateway       | NSX                    | Yes
    RD VRF Red NSX  | 65241:1          | Route Distinguisher for Red VRF in NSX  | NSX                    | No
    RD VRF Blue NSX | 65241:2          | Route Distinguisher for Blue VRF in NSX | NSX                    | No
    VNI Pool        | 75001 – 75010    | EVPN/VXLAN VNI Pool                     | NSX                    | No
    VNI Red         | 75001            | VNI for Red VRF                         | NSX                    | No
    VNI Blue        | 75002            | VNI for Blue VRF                        | NSX                    | No

    Diagram

    Below is the high-level diagram showing what we’re about to build.

    We’re in the process of creating isolated network data paths for our tenants “Red” and “Blue”. By the end of this exercise each tenant’s VM will be able to communicate with that tenant’s physical server. The VMs are connected to NSX overlay segments and the physical servers to isolated VLANs.

    Preparing the VyOS Router

    Like NSX, the VyOS router needs to be configured for BGP EVPN. There are quite a few steps involved so we’d better get started!

    Step 1 – Configure BGP Settings

    eBGP is up and running between the NSX Tier-0 gateway and the VyOS router but we need to configure some additional settings in order to make the router ready for participation in BGP EVPN.

    Advertise L2VPN EVPN Capability

    VyOS needs to inform its NSX BGP neighbors that it’s capable of doing L2VPN EVPN. So for each neighbor entry we need to add the following configuration:

    set protocols bgp neighbor 10.203.246.2 address-family l2vpn-evpn
    set protocols bgp neighbor 10.203.246.3 address-family l2vpn-evpn
    set protocols bgp neighbor 10.203.247.2 address-family l2vpn-evpn
    set protocols bgp neighbor 10.203.247.3 address-family l2vpn-evpn

    Advertise VXLAN VNIs

    VXLAN VNIs need to be advertised back and forth between NSX and VyOS and the following command accomplishes this on the VyOS router:

    set protocols bgp address-family l2vpn-evpn advertise-all-vni

    Step 2 – Create Dummy Interfaces

    For the router-internal transport of the VXLAN encapsulated traffic we need to “front” the VXLAN interfaces (created in the next step) with a dummy interface. VyOS dummy interfaces are basically loopback interfaces:

    set interfaces dummy dum0 address 192.168.100.100/32

    The IP address here is taken from the “dummy/loopback CIDR” item in the table above. As you might remember we used “192.168.100.102” and “192.168.100.103” as EVPN Tunnel Endpoints on the NSX Tier-0 gateway.

    These IP addresses can be anything really as long as they don’t overlap with something existing of course. The important thing is that they’re being advertised to the EVPN counterpart.
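
    The “no overlap” requirement is easy to verify with a few lines of Python. The sketch below checks the three tunnel endpoint addresses against the other subnets from the table above. The validate_teps helper and the labels nsx-tep-1 and nsx-tep-2 are hypothetical; the article doesn’t say which edge node owns which address.

```python
import ipaddress

TEP_CIDR = "192.168.100.0/24"
TEPS = {
    "vyos-dum0": "192.168.100.100",
    "nsx-tep-1": "192.168.100.102",
    "nsx-tep-2": "192.168.100.103",
}
# The other subnets listed in the lab's configuration table.
EXISTING_SUBNETS = [
    "10.203.244.0/24", "10.203.246.0/24", "10.203.247.0/24",
    "172.16.10.0/24", "172.16.20.0/24",
    "10.204.245.0/24", "10.204.246.0/24",
]


def validate_teps(teps, tep_cidr, existing_subnets):
    """Return a list of problems with the TEP addressing (empty means OK)."""
    problems = []
    tep_network = ipaddress.ip_network(tep_cidr)
    for subnet in existing_subnets:
        if tep_network.overlaps(ipaddress.ip_network(subnet)):
            problems.append(f"{tep_cidr} overlaps with {subnet}")
    seen = {}
    for name, value in teps.items():
        address = ipaddress.ip_address(value)
        if address not in tep_network:
            problems.append(f"{name} ({address}) is outside {tep_cidr}")
        if address in seen:
            problems.append(f"{name} and {seen[address]} share {address}")
        seen[address] = name
    return problems


print(validate_teps(TEPS, TEP_CIDR, EXISTING_SUBNETS))
```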

    Advertise dum0 Interface IP Address Through eBGP

    One way to accomplish this is by simply adding the IP address of the dum0 interface to the existing eBGP dynamic routing process that’s already running between the NSX Tier-0 gateway and the VyOS router:

    set protocols bgp address-family ipv4-unicast network 192.168.100.100/32

    Step 3 – Validate

    Now is a good time to verify that our Tier-0 gateway understands that the VyOS router is capable of doing L2VPN EVPN and that dum0’s IP address is in the route table.

    The easiest way to do this is by logging in to an NSX edge node through SSH and using the NSXCLI.

    First we check in which VRF our Tier-0 SR is living:

    get gateway

    The Tier-0 SR is in VRF #3 so let’s enter that context:

    vrf 3

    Now we can check what it knows about its neighbor the VyOS router:

    get bgp neighbor 

    And we’re interested in the capabilities that are being advertised by the neighbor:

    “Address Family L2VPN EVPN: Advertised and received” looks good to me. Next we inspect the route table within the same VRF (Tier-0 SR):

    get route bgp

    “192.168.100.100/32” ended up in the route table. Twice because we are double peered with the VyOS router over two VLANs so that’s what we expected. Thumbs up!

    Step 4 – Create VXLAN Interfaces

    The VyOS VXLAN interfaces are responsible for encapsulation and decapsulation of L2 frames. These are essentially the TEPs on the VyOS side. We create one VXLAN interface per tenant:

    Tenant Red:

    set interfaces vxlan vxlan75001 vni 75001
    set interfaces vxlan vxlan75001 port 4789
    set interfaces vxlan vxlan75001 mtu 1600
    set interfaces vxlan vxlan75001 parameters nolearning
    set interfaces vxlan vxlan75001 source-address 192.168.100.100

    Tenant Blue:

    set interfaces vxlan vxlan75002 vni 75002
    set interfaces vxlan vxlan75002 port 4789
    set interfaces vxlan vxlan75002 mtu 1600
    set interfaces vxlan vxlan75002 parameters nolearning
    set interfaces vxlan vxlan75002 source-address 192.168.100.100

    Step 5 – Create VRFs

    Each tenant will have its own VRF within the VyOS router as well. The VRFs contain the tenant-specific settings for BGP and EVPN like route distinguisher, route-targets as well as VNI.

    Tenant Red:

    set vrf name red protocols bgp address-family ipv4-unicast redistribute connected
    set vrf name red protocols bgp address-family l2vpn-evpn advertise ipv4 unicast
    set vrf name red protocols bgp address-family l2vpn-evpn rd 65240:1
    set vrf name red protocols bgp address-family l2vpn-evpn route-target import 65241:1
    set vrf name red protocols bgp address-family l2vpn-evpn route-target export 65240:1
    set vrf name red table 1002
    set vrf name red vni 75001

    Tenant Blue:

    set vrf name blue protocols bgp address-family ipv4-unicast redistribute connected
    set vrf name blue protocols bgp address-family l2vpn-evpn advertise ipv4 unicast
    set vrf name blue protocols bgp address-family l2vpn-evpn rd 65240:2
    set vrf name blue protocols bgp address-family l2vpn-evpn route-target import 65241:2
    set vrf name blue protocols bgp address-family l2vpn-evpn route-target export 65240:2
    set vrf name blue vni 75002
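
    The route-target settings are what keep the tenants apart, so a small illustration might help. The Python sketch below models the general rule that a VRF imports a route when one of the route’s targets matches one of the VRF’s import targets. It assumes that NSX exports the Red and Blue prefixes with route targets 65241:1 and 65241:2, which matches the import statements above. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class EvpnRoute:
    prefix: str
    route_targets: FrozenSet[str]


@dataclass(frozen=True)
class Vrf:
    name: str
    import_targets: FrozenSet[str]
    export_targets: FrozenSet[str]


def importing_vrfs(route: EvpnRoute, vrfs: List[Vrf]) -> List[str]:
    """Return the names of the VRFs that import the route."""
    return [vrf.name for vrf in vrfs if route.route_targets & vrf.import_targets]


# The VyOS VRFs with the import and export route targets configured above.
VYOS_VRFS = [
    Vrf("red", frozenset({"65241:1"}), frozenset({"65240:1"})),
    Vrf("blue", frozenset({"65241:2"}), frozenset({"65240:2"})),
]

# Assumption: routes as exported by NSX for the two overlay segments.
nsx_red = EvpnRoute("10.204.245.0/24", frozenset({"65241:1"}))
nsx_blue = EvpnRoute("10.204.246.0/24", frozenset({"65241:2"}))

print(nsx_red.prefix, "imported by", importing_vrfs(nsx_red, VYOS_VRFS))
print(nsx_blue.prefix, "imported by", importing_vrfs(nsx_blue, VYOS_VRFS))
```

    Each prefix ends up in exactly one VRF, which is the isolation we’re after.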

    Step 6 – Create VIFs

    Following our diagram, each tenant should receive its own VLAN in the data center where the tenant’s physical server is to be connected. Let’s instantiate the VIFs for these:

    set interfaces ethernet eth1 vif 10 description "Tenant Red VLAN"
    set interfaces ethernet eth1 vif 20 description "Tenant Blue VLAN"

    Note that we do not assign IP addresses to the VIF interfaces. Not directly at least (hint: check Step 7).

    FYI: my VyOS router has two physical interfaces, eth0 and eth1. The eth0 interface is used as the uplink to an upstream router and eth1 is an 802.1q trunk on which the different VIFs reside. The tenant VIFs are therefore backed by eth1 and become sub-interfaces of eth1.

    Step 7 – Create Bridge Interfaces

    So far, all the configuration has been around logical constructs. At some point we need to “hit the road” and that point is here and now.

    In VyOS, in the case of VXLAN, we bring logical and physical together in a bridge interface. We create one bridge interface per tenant.

    Tenant Red:

    set interfaces bridge br75001 vrf red
    set interfaces bridge br75001 address 172.16.10.1/24
    set interfaces bridge br75001 member interface vxlan75001
    set interfaces bridge br75001 member interface eth1.10

    Tenant Blue:

    set interfaces bridge br75002 vrf blue
    set interfaces bridge br75002 address 172.16.20.1/24
    set interfaces bridge br75002 member interface vxlan75002
    set interfaces bridge br75002 member interface eth1.20

    Through the bridge interface we, indirectly, assign an IP address to the tenant VIFs (eth1.10 and eth1.20) which are member interfaces of the bridge interfaces.
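
    The VXLAN and bridge configuration follows the same pattern for every tenant, so it lends itself to templating. Below is a minimal Python sketch that generates the commands from Step 4 and Step 7 for one tenant. The tenant_commands function and its parameter names are hypothetical; the values are the ones used in this lab.

```python
def tenant_commands(name, vni, vlan_id, gateway,
                    source_address="192.168.100.100", trunk="eth1"):
    """Generate the VyOS VXLAN and bridge commands for a single tenant."""
    vxlan = f"vxlan{vni}"
    bridge = f"br{vni}"
    return [
        # Step 4: the VXLAN interface, sourced from the dum0 address.
        f"set interfaces vxlan {vxlan} vni {vni}",
        f"set interfaces vxlan {vxlan} port 4789",
        f"set interfaces vxlan {vxlan} mtu 1600",
        f"set interfaces vxlan {vxlan} parameters nolearning",
        f"set interfaces vxlan {vxlan} source-address {source_address}",
        # Step 7: the bridge that ties the VXLAN interface and the VIF together.
        f"set interfaces bridge {bridge} vrf {name}",
        f"set interfaces bridge {bridge} address {gateway}",
        f"set interfaces bridge {bridge} member interface {vxlan}",
        f"set interfaces bridge {bridge} member interface {trunk}.{vlan_id}",
    ]


for command in tenant_commands("red", 75001, 10, "172.16.10.1/24"):
    print(command)
```

    Calling the function with "blue", 75002, 20, and "172.16.20.1/24" produces the commands for tenant Blue.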

    This completes the configuration of the VyOS router. Quite a few steps and explaining every line would take a lot of space but I hope most of it is rather self-explanatory.

    For your reference I have published my VyOS config for this lab on GitHub in case you want to compare or see the big picture (or find mistakes and want to suggest improvements).

    Validation

    NSX configured and VyOS configured. It’s about time to verify that we have a working BGP EVPN control and data plane.

    The VyOS Side

    On the VyOS side of things we use a couple of commands to check the control plane status:

    show bgp l2vpn evpn

    As we can see in the screenshot above, EVPN type-5 prefixes for 10.204.245.0/24 and 10.204.246.0/24 (tenant Red’s and tenant Blue’s NSX overlay segment IP subnets) have been received through BGP EVPN. Both VXLAN TEPs on the Tier-0 gateway (192.168.100.102 and 192.168.100.103) have sent the prefixes.

    show ip route vrf red

    The above command and screenshot show us the route table for VRF Red. We can see that it contains a route to tenant Red’s NSX overlay IP subnet, learned through BGP and distributed by the Tier-0 VXLAN TEPs.

    By now it’s pretty clear that we have a working BGP EVPN control plane. To test the functioning of the data plane on the VyOS router side we can run a simple ping from VRF Blue to tenant Blue’s virtual machine connected to tenant Blue’s NSX overlay segment:

    ping 10.204.246.20 vrf blue count 4

    We have a functional data plane! Can we ping tenant Red’s virtual machine from VRF Blue?

    ping 10.204.245.10 vrf blue count 4

    Nope, there is no route in VRF Blue’s route table that leads to the 10.204.245.0/24 network.
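
    The reason the second ping fails can be shown with a simple per-VRF route lookup. The Python sketch below is a simplified model: the VRF_ROUTES contents are my assumption of what each VRF’s route table holds (the connected tenant VLAN plus the tenant’s NSX overlay subnet, and no default route), and lookup is a hypothetical helper doing a longest prefix match.

```python
import ipaddress

# Assumed route table contents per VRF: connected VLAN plus learned overlay subnet.
VRF_ROUTES = {
    "red": ["172.16.10.0/24", "10.204.245.0/24"],
    "blue": ["172.16.20.0/24", "10.204.246.0/24"],
}


def lookup(vrf, destination):
    """Return the most specific matching prefix in the VRF, or None."""
    address = ipaddress.ip_address(destination)
    networks = [ipaddress.ip_network(prefix) for prefix in VRF_ROUTES[vrf]]
    matches = [network for network in networks if address in network]
    if not matches:
        return None
    # Longest prefix match: the network with the highest prefix length wins.
    return str(max(matches, key=lambda network: network.prefixlen))


print("blue -> 10.204.246.20:", lookup("blue", "10.204.246.20"))
print("blue -> 10.204.245.10:", lookup("blue", "10.204.245.10"))
```

    The lookup for tenant Red’s virtual machine returns nothing in VRF Blue, so the router has nowhere to send the packet.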

    The NSX Side

    We have strong indications that our BGP EVPN configuration is working, but let’s also have a look at how to verify things from the NSX side.

    Beginning on one of the NSX edge nodes we enter the VRF for “SR-VRF-VRF Red” and inspect the route table:

    get route bgp

    We can see that the IP subnet assigned to tenant Red’s VLAN (172.16.10.0/24) ended up in VRF Red’s route table in NSX.

    Just like on the VyOS side we can check the status for EVPN from the edge node. This is done from the Tier-0 SR VRF:

    get bgp evpn

    The output is very similar to what we saw when running the “show bgp l2vpn evpn” on the VyOS router. Type-5 EVPN prefixes are being received.

    The Workload Side

    Now that we’ve validated functionality on the router level, it’s perhaps a good time to move up a couple of layers and make sure that workloads also can leverage this brand new and shiny network data path.

    In the screenshot below we’ve logged in to tenant Red’s VM:

    After verifying the VM’s IP address (10.204.245.10/24) we run a ping to the tenant’s physical server (172.16.10.10/24) connected to the tenant’s VLAN which is successful. We also try to ping tenant Blue’s physical server (172.16.20.20/24) which is not successful as expected.

    In the screenshot below we’ve logged in to tenant Blue’s physical server:

    After verifying the server’s IP address (172.16.20.20/24) we run a ping to the tenant’s virtual machine (10.204.246.20/24) that is connected to the tenant’s NSX overlay segment, which is successful. We also try to ping tenant Red’s virtual machine (10.204.245.10), which is not successful, as expected.

    The tenant workloads can use their respective data path, isolated from NIC to NIC, traversing the NSX overlay into an isolated VLAN in the data center. Mission accomplished!

    Summary

    It does not get more exciting than this I’m afraid. Or maybe it’s exciting enough. 🙂

    Anyhow, in this second and last part we configured the VyOS router to play BGP EVPN ball with NSX. Configuring the VyOS side was a bit more work compared to the NSX side, but by no means difficult. Once configuration on both sides was in place we verified that we had a functional EVPN control and data plane using VyOS and NSX CLI commands as well as ping tests directly from the involved tenant workloads.

    In these articles we looked specifically at VyOS because that’s what I have in my lab, but you probably understand that this technology will most likely work with the physical network equipment you have in your data center today.

    Don’t hesitate to reach out if you have any questions. Thanks for reading.

  • BGP EVPN Between NSX And VyOS – Part 1

    Recently I’ve been looking into setting up BGP EVPN between VMware NSX and a VyOS router. I’m using VyOS quite a lot in labs and demos, often as the counterpart to a Tier-0 gateway, and wanted to find out if it was capable of a somewhat more advanced feature like BGP EVPN.

    It took some research as well as some good ol’ trial and error, but I’m happy to report that I was successful in my endeavor. And to be honest, it is a pretty straightforward process, but things usually are once you know how to do it. 🙂

    Sharing is caring and that’s why in this and the next article I will walk through setting up BGP EVPN between NSX and VyOS. In part 1 we will deal with configuring and preparing the NSX environment and in part 2 we’ll configure the VyOS router and make sure everything comes together.

    Before we begin let’s have a quick look at some background around what BGP EVPN is and how it’s used in data centers and within NSX.

    BGP EVPN

    Ethernet VPN (EVPN) is a BGP distributed control plane for Network Virtualization Overlay (NVO). It provides Layer 2 and Layer 3 connectivity over underlay networks. Initially it was designed for use with MPLS in service provider networks but EVPN has been widely adopted in data centers as a control plane mechanism for VXLAN overlay networking due to advantages in BGP scalability and flexibility.

    The Use Case For BGP EVPN In NSX

    Within NSX, BGP EVPN technology is used to interconnect and extend NSX-managed overlay networks to other data center environments that are not managed by NSX. VXLAN encapsulation is used between NSX TEPs (edge nodes and hypervisors) and external network devices to ensure data plane compatibility.

    In NSX you can choose between two connectivity modes for the EVPN implementation: Inline mode and Route Server mode.

    Inline mode

    In this mode the Tier-0 Gateway joins the BGP EVPN control plane together with external routers to exchange routing information. The data plane consists of NSX edge nodes which forward traffic to and from the hypervisors. TEPs used for the data plane VXLAN encapsulation are configured on each edge node.

    Route Server mode

    As with Inline mode, the Tier-0 Gateway establishes a BGP EVPN control plane to exchange routing information with the external routers, but in the data plane it is the ESXi hypervisor that forwards the traffic. The same TEPs that are used for the GENEVE encapsulation (east-west traffic) are used for the BGP EVPN data plane VXLAN encapsulation.

    In these articles we will focus on configuring BGP EVPN in Inline mode.

    Lab Overview

    The lab environment for this exercise consists of the following components:

    • vCenter 8.0 Update 1c
    • ESXi 8.0 Update 1c
    • NSX Manager 4.1.0.2
    • 2 x NSX Edge nodes (VM form factor, Large)
    • 1 x Tier-0 Gateway
    • 1 x VyOS 1.4 router (VM)

    The following table lists configuration items that are relevant for this article.

    Item | Value | Description | Scope/Span | Configured
    VLAN 244 | 10.203.244.0/24 | VLAN for Geneve transport | Edge nodes, ESXi hosts | Yes
    VLAN 246 | 10.203.246.0/24 | VLAN for BGP Uplink 1 | Edge nodes | Yes
    VLAN 247 | 10.203.247.0/24 | VLAN for BGP Uplink 2 | Edge nodes | Yes
    VLAN 10 | 172.16.10.0/24 | Tenant Red VLAN | VyOS, VRF Red | No
    VLAN 20 | 172.16.20.0/24 | Tenant Blue VLAN | VyOS, VRF Blue | No
    Segment Red | 10.204.245.0/24 | Tenant Red NSX overlay segment | NSX, VRF Red | No
    Segment Blue | 10.204.246.0/24 | Tenant Blue NSX overlay segment | NSX, VRF Blue | No
    dummy/loopback | 192.168.100.0/24 | IP CIDR for VXLAN TEPs | VyOS, Edge nodes | n/a
    VyOS router ASN | 65240 | BGP ASN on the VyOS router | VyOS | Yes
    NSX Tier-0 ASN | 65241 | BGP ASN on the NSX Tier-0 gateway | NSX | Yes
    RD VRF Red NSX | 65241:1 | Route Distinguisher for Red VRF in NSX | NSX | No
    RD VRF Blue NSX | 65241:2 | Route Distinguisher for Blue VRF in NSX | NSX | No
    VNI Pool | 75001 – 75010 | EVPN/VXLAN VNI Pool | NSX | No
    VNI Red | 75001 | VNI for Red VRF | NSX | No
    VNI Blue | 75002 | VNI for Blue VRF | NSX | No

    Diagram

    Let’s have a look at a high-level diagram showing what we’re about to build.

    Diagrams showing BGP EVPN networking can become very “busy” and therefore I intentionally left out a lot of details right now just to keep the focus on what it is we’re trying to achieve.

    The business requirement that we’re going to look into here is separation and isolation of tenant network traffic. This separation and isolation begins at the tenant’s NSX overlay segment and extends into the physical data center (and beyond). In this specific scenario our tenants “Red” and “Blue” will each end up with their own isolated data path spanning from the vNIC of their respective virtual machine(s) to a tenant-dedicated VLAN out in the data center. The data path extension is facilitated by BGP EVPN VXLAN tunnels that are established between the NSX edge nodes and the VyOS router.

    Preparing The NSX Environment

    The assumption here is that eBGP is already configured and functional between the Tier-0 gateway and the VyOS router. Some VLANs are also in place but other than that not much has been prepared so let’s get started!

    Step 1 – Configure Tier-0 Gateway Settings

    We have eBGP up and running between the Tier-0 gateway and the VyOS router but we need to configure some additional items in order to make the gateway ready for BGP EVPN.

    Route Filter

    The Tier-0 needs to announce to VyOS that it is capable of doing L2VPN EVPN. To configure this we navigate to Networking > Tier-0 Gateways and expand the Tier-0 gateway. Click on or expand BGP and click the number to the right of BGP Neighbors.

    In the Set BGP Neighbors dialog you’ll see the BGP neighbor entries. For each entry click on the number in the Route Filter column.

    This will bring up a new dialog where we can edit the route filter once we’ve clicked on Edit.

    We can now click on Add Route Filter and add L2VPN EVPN to the filter. We leave all other settings as they are.

    Repeat this configuration for the other neighbor entry.
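    If you would rather script this than click through the UI, the change could probably be expressed as a PATCH against the BGP neighbor object in the NSX Policy API. The sketch below only builds the request path and body and does not talk to NSX. It assumes the neighbor exposes a `route_filtering` list with an `address_family` field and that `L2VPN_EVPN` is the value for this address family. The locale-services ID and the neighbor IDs are hypothetical placeholders, and a real PATCH may also need required fields such as the neighbor address and remote AS number.

```python
import json


def evpn_route_filter_patch(tier0_id, locale_services_id, neighbor_id, existing_filters):
    """Return (path, body) for a PATCH that adds an L2VPN EVPN route filter.

    existing_filters is the neighbor's current route_filtering list. It is
    copied as-is so that the existing IPv4 filter keeps its settings.
    """
    path = (
        f"/policy/api/v1/infra/tier-0s/{tier0_id}"
        f"/locale-services/{locale_services_id}/bgp/neighbors/{neighbor_id}"
    )
    filters = [dict(f) for f in existing_filters]
    # Only add the EVPN address family if it is not present yet (idempotent).
    if not any(f.get("address_family") == "L2VPN_EVPN" for f in filters):
        filters.append({"address_family": "L2VPN_EVPN", "enabled": True})
    return path, {"route_filtering": filters}


if __name__ == "__main__":
    current = [{"address_family": "IPV4", "enabled": True}]
    # Hypothetical IDs: look up the real ones in your own environment.
    for neighbor in ("neighbor-1", "neighbor-2"):
        path, body = evpn_route_filter_patch("T0-Gateway-01", "default", neighbor, current)
        print("PATCH", path)
        print(json.dumps(body, indent=2))
```

    Running the helper twice against the same neighbor yields the same body, which is handy when the same script is used for both neighbor entries.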

    EVPN Settings

    Some specific EVPN settings are required and these settings are found under EVPN Settings.

    Click Edit on the Tier-0 gateway and change the EVPN Mode to Inline. Next create a new EVPN/VXLAN VNI Pool. As per the table above the VNI range will be from 75001 to 75010.

    The last thing we need to configure under EVPN Settings is EVPN Tunnel Endpoint. These are the IP addresses for the VXLAN TEP interfaces that will be instantiated on the edge nodes. Each edge node will have its own TEP interface.

    The IP addresses for these TEPs are taken from the “dummy/loopback” CIDR documented in the table above. We configure 192.168.100.102 for edge node 1 and 192.168.100.103 for edge node 2. These IP addresses don’t belong to any existing VLAN or overlay segment and need to be advertised to the VyOS router.
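    A quick sanity check of the chosen TEP addresses can be done with nothing more than Python’s standard library. The sketch below uses the values from the table and the text above and verifies that each TEP address sits inside the “dummy/loopback” CIDR and does not fall into any of the existing VLANs or segments. The dictionary names are my own and only serve the illustration.

```python
import ipaddress

# Values copied from the lab table and the text above.
TEP_CIDR = ipaddress.ip_network("192.168.100.0/24")
EXISTING_NETWORKS = {
    "VLAN 244": "10.203.244.0/24",
    "VLAN 246": "10.203.246.0/24",
    "VLAN 247": "10.203.247.0/24",
    "VLAN 10": "172.16.10.0/24",
    "VLAN 20": "172.16.20.0/24",
    "Segment Red": "10.204.245.0/24",
    "Segment Blue": "10.204.246.0/24",
}
EDGE_TEPS = {
    "edge node 1": "192.168.100.102",
    "edge node 2": "192.168.100.103",
}


def check_tep(address):
    """Return a list of problems found for one EVPN TEP address."""
    ip = ipaddress.ip_address(address)
    problems = []
    if ip not in TEP_CIDR:
        problems.append(f"{ip} is outside {TEP_CIDR}")
    for name, cidr in EXISTING_NETWORKS.items():
        if ip in ipaddress.ip_network(cidr):
            problems.append(f"{ip} overlaps with {name} ({cidr})")
    return problems


if __name__ == "__main__":
    for node, address in EDGE_TEPS.items():
        problems = check_tep(address)
        print(node, address, "OK" if not problems else problems)
```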

    Route Re-distribution

    We use the existing eBGP process between the Tier-0 and the VyOS router to get the VXLAN TEP IP address out there. This is configured on the Tier-0 under Route Re-distribution.

    Create a new entry or update an existing one so that it includes route re-distribution for EVPN TEP IP.

    Step 2 – Validate

    Now that the Tier-0 gateway has been prepared for BGP EVPN, it’s a good time to verify that the VyOS router knows about the new capability and the VXLAN TEP IP addresses.

    Log in to the VyOS router and run the following command:

    show bgp neighbors

    This command will give us details about each BGP neighbor configured. We’re specifically interested in what is listed under Neighbor capabilities:

    As we can see in the screenshot above the L2VPN EVPN capability is advertised and received. Now let’s have a quick look at the routing table:

    show ip route bgp

    We can see that the configured VXLAN TEP IP addresses on our edge nodes are in the table.

    Step 3 – Create VRF Gateways

    Each tenant gets its own NSX VRF gateway and now is the time to create them.

    Navigate to Networking > Tier-0 Gateways and click on Add Gateway. Select VRF.

    The following settings are configured for the VRF for tenant Red:

    Item | Value | Description
    Name | VRF Red | What’s in a name?
    Connect to Tier-0 Gateway | T0-Gateway-01 | The parent Tier-0 gateway
    VRF Settings > Route Distinguisher | 65241:1 | Distinguishes routes coming from this VRF
    VRF Settings > EVPN Transit VNI | 75001 | The VXLAN VNI this VRF will use
    VRF Settings > Route Target > Import Route Targets | 65240:1 | Import routes carrying this route target from VyOS
    VRF Settings > Route Target > Export Route Targets | 65241:1 | Export routes with this route target to VyOS

    The following settings are configured for the VRF for tenant Blue:

    Item | Value | Description
    Name | VRF Blue |
    Connect to Tier-0 Gateway | T0-Gateway-01 | The parent Tier-0 gateway
    VRF Settings > Route Distinguisher | 65241:2 | Distinguishes routes coming from this VRF
    VRF Settings > EVPN Transit VNI | 75002 | The VXLAN VNI this VRF will use
    VRF Settings > Route Target > Import Route Targets | 65240:2 | Import routes carrying this route target from VyOS
    VRF Settings > Route Target > Export Route Targets | 65241:2 | Export routes with this route target to VyOS

    Besides this we also need to make sure that we re-distribute Tier-1 gateway connected segments into BGP. For this we create a Route Re-distribution entry that contains Advertised Tier-1 Subnets > Connected Interfaces & Segments on each of the VRFs.

    This completes the creation and configuration of the VRF gateways for our tenants.
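    For those who prefer the API over the UI, the VRF settings from the two tables above might translate into Policy API request bodies along the following lines. This is a sketch under assumptions: the field names (`vrf_config`, `route_distinguisher`, `evpn_transit_vni`, `route_targets`) and the gateway IDs `vrf-red` and `vrf-blue` are not taken from this lab and should be checked against the NSX API reference for your version. The code only builds and prints the bodies, and it refuses a VNI that is not part of the configured VNI pool.

```python
import json

VNI_POOL = range(75001, 75011)  # 75001 to 75010 inclusive
PARENT_TIER0 = "T0-Gateway-01"

# Values from the two VRF tables; the "id" values are hypothetical.
VRFS = [
    {"id": "vrf-red", "name": "VRF Red", "rd": "65241:1", "vni": 75001,
     "rt_import": "65240:1", "rt_export": "65241:1"},
    {"id": "vrf-blue", "name": "VRF Blue", "rd": "65241:2", "vni": 75002,
     "rt_import": "65240:2", "rt_export": "65241:2"},
]


def vrf_body(vrf):
    """Build an (assumed) Policy API body for one VRF gateway."""
    if vrf["vni"] not in VNI_POOL:
        raise ValueError(f"VNI {vrf['vni']} is not part of the EVPN/VXLAN VNI pool")
    return {
        "display_name": vrf["name"],
        "vrf_config": {
            "tier0_path": f"/infra/tier-0s/{PARENT_TIER0}",
            "route_distinguisher": vrf["rd"],
            "evpn_transit_vni": vrf["vni"],
            "route_targets": [
                {
                    "address_family": "L2VPN_EVPN",
                    "import_route_targets": [vrf["rt_import"]],
                    "export_route_targets": [vrf["rt_export"]],
                }
            ],
        },
    }


if __name__ == "__main__":
    for vrf in VRFS:
        print("PATCH", f"/policy/api/v1/infra/tier-0s/{vrf['id']}")
        print(json.dumps(vrf_body(vrf), indent=2))
```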

    Step 4 – Create Tier-1 Gateways

    To make use of the native data plane multi-tenancy offered within NSX, each tenant receives a Tier-1 gateway with an uplink to its VRF and downlink(s) to the tenant’s overlay segment(s).

    The table below shows the settings that are configured for the Tier-1 for tenant Red:

    Item | Value | Description
    Name | Tier-1 Red |
    HA Mode | Distributed Only | This Tier-1 has only a distributed router (DR) component; no service router runs on the edge nodes
    Linked Tier-0 Gateway | VRF Red | The tenant’s VRF gateway
    Route Advertisement | All Connected Segments & Service Ports | The tenant’s segments are advertised toward the VRF

    And below the settings configured for the Tier-1 for tenant Blue:

    Item | Value | Description
    Name | Tier-1 Blue |
    HA Mode | Distributed Only | This Tier-1 has only a distributed router (DR) component; no service router runs on the edge nodes
    Linked Tier-0 Gateway | VRF Blue | The tenant’s VRF gateway
    Route Advertisement | All Connected Segments & Service Ports | The tenant’s segments are advertised toward the VRF

    Step 5 – Create Segments

    Lastly, each tenant receives a logical layer 2 segment to which the tenant’s workloads can be connected.

    The table below shows the settings configured for tenant Red’s segment:

    Item | Value | Description
    Name | Segment Red |
    Connected Gateway | Tier-1 Red | Downlink from the tenant’s Tier-1
    Transport Zone | TZ-Overlay | The overlay transport zone
    Subnets | 10.204.245.1/24 | The CIDR and IP gateway for this segment

    The table below shows the settings configured for tenant Blue’s segment:

    Item | Value | Description
    Name | Segment Blue |
    Connected Gateway | Tier-1 Blue | Downlink from the tenant’s Tier-1
    Transport Zone | TZ-Overlay | The overlay transport zone
    Subnets | 10.204.246.1/24 | The CIDR and IP gateway for this segment
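    Note that the Subnets field holds the gateway address in CIDR notation, not the network address. The small sketch below, which uses only Python’s standard library, shows how the gateway values from the two tables map to the segment networks listed in the lab overview. The dictionary and function names are mine.

```python
import ipaddress

# Gateway addresses as entered in the Subnets field of each segment.
SEGMENTS = {
    "Segment Red": "10.204.245.1/24",
    "Segment Blue": "10.204.246.1/24",
}


def describe(gateway_cidr):
    """Split a gateway address in CIDR notation into gateway and network."""
    iface = ipaddress.ip_interface(gateway_cidr)
    return {
        "gateway": str(iface.ip),
        "network": str(iface.network),
        # Subtract the network, broadcast, and gateway addresses.
        "usable_hosts": iface.network.num_addresses - 3,
    }


if __name__ == "__main__":
    for name, gateway in SEGMENTS.items():
        print(name, describe(gateway))
```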

    With the segments in place let’s have a look at the Network Topology in NSX.

    Nothing unexpected here but it’s always nice to get some visual feedback that things are connected the way they should.

    Summary

    At this point our NSX environment is prepared to participate in a BGP EVPN control and data plane. Configuring this has been relatively straightforward if you ask me.

    In part 2 we will configure the VyOS router, establish a BGP EVPN control plane between NSX and VyOS, and validate that we have accomplished our task of separating and isolating tenant network traffic from NSX overlay to data center VLAN.

    Thanks for reading.

  • One of the great benefits of the NSX Distributed Firewall (DFW) is the flexibility it offers when it comes to developing security policy models. Implementation of the application intrinsic NSX DFW always begins with looking at the business needs and then continues with development of a security policy model aligned with those needs.

    On the other hand, the enormous flexibility offered by the NSX DFW can also become quite intimidating. Teams or individuals that are tasked with securing an organization’s applications using the NSX Distributed Firewall might sometimes wonder where to begin.

    In today’s article I want to share three examples on how one could get started with securing applications using the NSX DFW without having to allocate much time and resources.

    These examples might show you that it’s possible to implement robust application security with relatively little effort. By aligning the security policy model with existing constructs you should be able to grab some of the “low hanging-fruit” and kickstart your NSX Security project. Let’s have a look.

    Example 1 – Security Policy Model Based On Environments

    This security policy model is based on the logical environments and/or security zones in your software-defined data center. The assumption here is that these are already defined; otherwise this approach becomes very time consuming instead. 😉 For example, you might be looking at the following environments:

    • Development
    • Staging
    • Production

    Implementation

    1. Assign NSX tags to the workloads that are in scope. If a workload belongs to the Development environment you assign a tag called “development” to the workload. Similarly, production workloads receive the “production” tag. If possible make use of tools like the NSX API or PowerCLI to automate the assigning of tags to workloads. It will save you a lot of time.
    2. Create NSX groups matching the environments. Membership should be dynamic and based on tag. For example the “Staging” group will have a membership criterion stating that workloads tagged with “staging” should become members.
    3. Create a security policy for each environment in the DFW. Policies for this particular model fit well under the “Environment” category within the Distributed Firewall.
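    Steps 1 and 2 lend themselves well to automation. As a rough sketch, the request bodies for tagging a workload and for creating a tag-based group could be generated like this. The shape of the bodies reflects my understanding of the NSX API (a tag update on a virtual machine and a group with a `Condition` expression) and should be treated as an assumption to verify against the API reference. The VM external ID is a made-up placeholder, and nothing is sent to NSX.

```python
import json

ENVIRONMENTS = ["Development", "Staging", "Production"]


def tag_body(vm_external_id, environment):
    """Body for tagging one VM with its environment (lower-case tag)."""
    return {
        "external_id": vm_external_id,
        "tags": [{"tag": environment.lower()}],
    }


def group_body(environment):
    """Body for a group whose members are all VMs carrying the tag."""
    return {
        "display_name": environment,
        "expression": [
            {
                "resource_type": "Condition",
                "member_type": "VirtualMachine",
                "key": "Tag",
                "operator": "EQUALS",
                "value": environment.lower(),
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical VM ID, for illustration only.
    print(json.dumps(tag_body("vm-external-id-placeholder", "Development"), indent=2))
    for env in ENVIRONMENTS:
        print(json.dumps(group_body(env), indent=2))
```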

    The last step is to populate the policies with rules. For example each policy could have two rules to ensure that lateral movement between the environments is prohibited:

    Policy Development

    Sources | Destinations | Services | Action
    Development | Development | Any | Allow
    Development | Staging, Production | Any | Drop

    Policy Staging

    Sources | Destinations | Services | Action
    Staging | Staging | Any | Allow
    Staging | Development, Production | Any | Drop

    Policy Production

    Sources | Destinations | Services | Action
    Production | Production | Any | Allow
    Production | Staging, Development | Any | Drop

    Example of what “Policy Staging” could look like from the DFW UI

    With environments/zones predefined, implementing the NSX DFW using the above approach could be a relatively quick exercise.
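    The two-rule pattern used in the policies above is regular enough to generate. The sketch below is plain Python with no NSX dependency: for every group it produces one rule that allows traffic within the group and one that drops traffic to all other groups. The function name and the dictionary layout are my own choices.

```python
def build_policies(groups):
    """Return {policy name: [allow rule, drop rule]} for the given groups."""
    policies = {}
    for group in groups:
        others = [g for g in groups if g != group]
        policies[f"Policy {group}"] = [
            {"sources": [group], "destinations": [group], "services": "Any", "action": "Allow"},
            {"sources": [group], "destinations": others, "services": "Any", "action": "Drop"},
        ]
    return policies


if __name__ == "__main__":
    for name, rules in build_policies(["Development", "Staging", "Production"]).items():
        print(name)
        for rule in rules:
            print(
                "  ",
                ", ".join(rule["sources"]), "|",
                ", ".join(rule["destinations"]), "|",
                rule["services"], "|",
                rule["action"],
            )
```

    The helper works for any list of group names, so adding a fourth environment later only means extending the list.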

    Example 2 – Security Policy Model Based On Organization Structure

    The organization structure consists of business units or departments which are used as the basis of this security policy model. This approach makes sense when business units/departments are treated as tenants and each have their own set of independent applications. Your list could look like this:

    • HR
    • Finance
    • Sales

    Implementation

    1. Assign NSX tags to the workloads in scope. If a workload is owned by HR you assign a tag called “hr” to that workload. Similarly, workloads owned by the Sales department get the “sales” tag assigned to them. Again, use tools to automate tagging of workloads if possible.
    2. Create an NSX group for each business unit/department. Membership should be dynamic and based on tag. For example the “Finance” group will have a membership criterion stating that workloads tagged with “finance” should become members.
    3. Create a security policy for each department in the DFW. Departments are like “environments” from our perspective, so these policies too are created under the “Environment” category in the Distributed Firewall.

    The last step is to populate the policies with rules. For example each policy could have two rules to ensure that lateral movement between departments is prohibited:

    Policy HR

    Sources | Destinations | Services | Action
    HR | HR | Any | Allow
    HR | Finance, Sales | Any | Drop

    Policy Finance

    Sources | Destinations | Services | Action
    Finance | Finance | Any | Allow
    Finance | HR, Sales | Any | Drop

    Policy Sales

    Sources | Destinations | Services | Action
    Sales | Sales | Any | Allow
    Sales | Finance, HR | Any | Drop

    Example of what “Policy HR” could look like from the DFW UI

    When business units/departments run their own independent applications, implementing the NSX DFW using a security policy model that is aligned with the organization structure could be a quick way to increase the level of security for applications.

    Example 3 – Security Policy Model Based On Applications

    Here the applications themselves are the basis of the security policy model. This one works well as a “getting started” approach in environments where applications are well documented and run on dedicated application workloads. In other words, an application maps to workloads that are only serving the application. Here’s a list with some (fictional) applications we can work with:

    • MorphMind
    • EcoEff
    • HabitHive

    Implementation

    1. Assign NSX tags to the application workloads in scope. If a workload is used by the MorphMind app you assign a tag called “morphmind” to that workload. Similarly, workloads used by the HabitHive app get the “habithive” tag assigned.
    2. Create an NSX group for each application. Membership should be dynamic and based on tag. For example the “EcoEff” group will have a membership criterion saying that workloads tagged with “ecoeff” should become members.
    3. Create a security policy for each application in the DFW. We’re working with applications now so these policies fit nicely under the “Application” category within the DFW.

    The last step is to populate the policies with rules. For example each policy could have two rules to ensure that lateral movement between these applications is prohibited:

    Policy MorphMind

    Sources | Destinations | Services | Action
    MorphMind | MorphMind | Any | Allow
    MorphMind | EcoEff, HabitHive | Any | Drop

    Policy EcoEff

    Sources | Destinations | Services | Action
    EcoEff | EcoEff | Any | Allow
    EcoEff | MorphMind, HabitHive | Any | Drop

    Policy HabitHive

    Sources | Destinations | Services | Action
    HabitHive | HabitHive | Any | Allow
    HabitHive | EcoEff, MorphMind | Any | Drop

    Example of what “Policy HabitHive” could look like from the DFW UI

    With this approach lateral movement between applications is prohibited, which gives a high level of application security. For many customers this comes close to the desired approach. Just keep in mind that a certain level of “hygiene” among the application workloads is required in order to be successful.

    Summary

    When it comes to securing applications using the NSX Distributed Firewall I always advise customers to start with the low-hanging fruit. It allows for some easy (and critical) wins early on in the battle against all those threats out there. Hopefully the examples in this article give you an idea of what that low-hanging fruit might look like for you in your next NSX Security project.

    Thanks for reading.

  • The finishing touches and testing are complete. We’re proud to announce that we’ve just released SDDC.Lab Version 5!

    For those of you that are not familiar with the SDDC.Lab project, it’s a collection of Ansible Playbooks that perform fully automated deployments of nested VMware Software Defined Data Center Pods including solutions like vSphere, vSAN, and NSX.

    The project is maintained at a public GitHub repository and available to anybody who’s interested in speedy and consistent provisioning of nested VMware SDDC lab environments.

    What’s New?

    Product Versions

    Version 5 supports deploying SDDC.Lab Pods with the latest and greatest VMware technology while also maintaining backward compatibility for deploying earlier product versions. The “bleeding edge” bill of materials that SDDC.Lab v5 supports consists of the following VMware product versions:

    • vCenter Server version 8.0
    • ESXi version 8.0
    • NSX version 4.0.1.1
    • vSphere with Tanzu version 8.0
    • vRealize Log Insight version 8.8.2

    New Features and Improvements

    We, Luis Chanu and I, recommend that you have a look at the project’s CHANGELOG.md for a comprehensive list of all the new features and improvements that were added in version 5. The list below highlights some of the main features and improvements:

    • NSX-T overlay segments are automatically configured with Pod-unique IP subnets. This makes it possible to route IP traffic originating from these segments between Pods as well as between Pods and the physical environment.
    • vSphere Content Libraries can be created in the nested vCenter as part of a Pod deployment. The content libraries can then be consumed by other project features like Workload VMs and vSphere with Tanzu.
    • Pod configuration generation is much faster: down from 1.5 hours to 7-10 minutes.
    • We’ve made sure that every single Ansible task that is taking place as part of a Pod deployment can be successfully carried out using standard Linux user privileges. The use of “sudo” is no longer required nor recommended when running Pod deployments.
    • Nearly all the project’s Ansible code has undergone Ansible Linting to ensure that the project is following Ansible’s proven practices, patterns, and behaviours as much as possible.

    Besides these main items we’ve been working on many smaller things like code optimization, stability, and performance.

    How to Get Started?

    Getting started with SDDC.Lab v5 is quite easy. You head over to the GitHub repository and read through the README.md which contains all the information you need to successfully deploy your SDDC.Lab Pods. For completeness here are the high-level steps required to deploy a Pod:

    1. Install an Ubuntu Linux machine with Ansible and required modules
    2. Prepare a Pod configuration
    3. Deploy a Pod

    Detailed steps are available in the Preparations section of the README.md.

    Summary

    SDDC.Lab version 5 literally is a major release with many great improvements such as support for new product versions, new project features, and code improvements. We hope you will appreciate it.

    We have many plans and ideas for the next release and a new development branch is already in place. Check it out if you want to follow the developments in the project.

  • Last week we released version 3 of the SDDC.Lab project. For those of you who aren’t familiar with the project, it’s a set of Ansible scripts (Playbooks) that perform automated deployments of nested VMware SDDCs. An hour after you issue the deploy command, a fully-fledged vSphere-NSX-T environment is at your disposal. Pretty cool.

    The diagram below illustrates a high level overview of a typical SDDC.Lab Pod deployment:

    For more details we highly encourage you to check out the updated README.md which contains all the details.

    What’s New In Version 3?

    Speaking of updated, let’s have a look at what we think are some of the highlights in SDDC.Lab version 3.

    Simultaneous Pod Deployments

    My friend and co-developer Luis Chanu tweeted about this a while back:

    Indeed, sometimes one SDDC.Lab Pod is not enough. Given that it takes some time to deploy a Pod (about an hour) we’re very happy that with version 3 we can deploy multiple Pods simultaneously. And this is done without any impact on deployment time. Pure magic!

    BGP Between Pod Router And Physical Network

    Many have asked for this and we’re happy to announce that version 3 adds support for BGP dynamic routing between a Pod router and your physical Layer-3 switch/router. Of course you can still choose to configure a Pod router for OSPF v2/v3 or static routing. You can even do BGP and OSPF simultaneously. It’s entirely up to you.

    Support For New Releases

    For some reason VMware and others keep releasing new software versions all the time. It’s kind of hard to keep up! Anyway, SDDC.Lab v3 supports deploying the following software versions:

    • vCenter 7.0 U2a
    • ESXi 7.0 U2
    • NSX-T 3.1.2
    • VyOS 1.4 (Rolling release)
    • vRealize Log Insight 8.4.0
    • Ubuntu Server 20.04.2

    It’s likely that deploying newer (and older) releases of the above software will work without any problems, but SDDC.Lab v3 has been tested with the BOM above.

    Miscellaneous Updates And Changes

    We brought back the CHANGELOG.md where you can read pretty much everything we did while working on version 3. Some of the smaller changes worth mentioning are:

    • Updated Ansible modules for NSX-T
    • Improved vSAN disk claiming thanks to an Ansible Python module written by Luis
    • Scripts are now using Ansible Fully Qualified Collection Names (FQCN) in tasks
    • Updated documentation now contains detailed information on the network configuration

    Summary

    Version 3 comes with some really cool improvements and optimizations. We hope that you’ll be able to give version 3 a spin and find that it makes your life easier on those days you want to try out something in a clean vSphere-NSX-T environment.

    With v3 now being the project’s new stable/default branch, a new dev-v4 development branch has been created. Both NSX-T and vSphere have come with interesting new features within their respective platforms and we look forward to incorporating some of these into v4 of the project.

    Stay tuned!

  • There are Ansible modules for configuring most of the NSX-T platform components, but for certain configuration tasks it might be quicker (or even necessary) to GET/POST/PUT/PATCH/DELETE to the NSX-T REST API directly.

    Now, in those situations you could use curl or Postman or any of the other REST API clients out there, but if you would actually prefer to stay within your Ansible system instead of doing “quick and dirties” that aren’t documented, traceable, or reusable, the “nsxt_rest” module could be an interesting alternative.

    The “nsxt_rest” module is part of the official Ansible NSX-T modules that are maintained by VMware. The module gives you direct access to the NSX-T REST API and basically lets you configure anything you can configure through the NSX-T REST API. In other words, this is the only module you will ever need. 😉

    Example

    I recently had to test and re-test Tier-0 route filtering settings in different environments. I used the opportunity to create a simple Ansible Playbook with some tasks using the “nsxt_rest” module. This Playbook is maintained on GitHub, but I will post a static version of it here just for reference:

    ---
    - hosts: localhost
      name: ConfigureEgress.yml
      vars:
        NsxManagerAddress:     pod-220-nsxt-lm-1.sddc.lab                  # FQDN or IP address of your NSX Manager
        NsxManagerUser:        admin                                       # NSX Manager username
        NsxManagerPassword:    VMware1!VMware1!                            # NSX Manager password
        Tier0:                 T0-Gateway-01                               # Name of the Tier-0 Gateway
        LocalAs:               65001                                       # ASN on the NSX side
        RemoteAs:              65000                                       # ASN on the physical router side
        Prefix1:               any                                         # Name of the "Any" prefix
        Prefix2:               default-route                               # Name of the "Default Route" prefix                  
        RouteMapIn:            rm-in                                       # Name of the route map that is applied to the "In" filter
        RouteMapOut:           rm-out                                      # Name of the route map that is applied to the "Out" filter
        NeighborID1:           101eeb51-c0e7-41b3-b56a-5d8df4c29226        # ID of BGP neighbor #1 entry that should be configured with the filters
        NeighborIP1:           10.203.236.1                                # IP address of BGP neighbor #1 that should be configured with the filters
        NeighborID2:           d56e1a6f-d125-448a-8753-ca4b53bbf4bc        # ID of BGP neighbor #2 entry that should be configured with the filters
        NeighborIP2:           10.203.237.1                                # IP address of BGP neighbor #2 that should be configured with the filters
      tasks:
    
      
        - name: Create prefix lists for "Any" and "Default Route"
          nsxt_rest:
            hostname: "{{ NsxManagerAddress }}"
            username: "{{ NsxManagerUser }}"
            password: "{{ NsxManagerPassword }}"
            validate_certs: false
            method: patch
            path: "/policy/api/v1/infra/tier-0s/{{ Tier0 }}/prefix-lists/{{ item.name }}"
            content:
              {
                "prefixes": [
                    {
                        "network": "{{ item.network }}",
                        "action": "{{ item.action }}"
                    }
                ]
              }
          loop:
            - { name: "{{ Prefix1 }}", network: "ANY", action: "PERMIT" }
            - { name: "{{ Prefix2 }}", network: "0.0.0.0/0", action: "PERMIT" }
    
    
        - name: Create route map for the "In" filter
          nsxt_rest:
            hostname: "{{ NsxManagerAddress }}"
            username: "{{ NsxManagerUser }}"
            password: "{{ NsxManagerPassword }}"
            validate_certs: false
            method: patch
            path: "/policy/api/v1/infra/tier-0s/{{ Tier0 }}/route-maps/{{ RouteMapIn }}"
            content:
              {
                 "entries":[
                    {
                       "prefix_list_matches":[
                          "/infra/tier-0s/{{ Tier0 }}/prefix-lists/{{ Prefix1 }}"
                       ],
                       "set":{
                          "local_preference":90
                       },
                       "action":"PERMIT"
                    },
                    {
                       "prefix_list_matches":[
                          "/infra/tier-0s/{{ Tier0 }}/prefix-lists/{{ Prefix2 }}"
                       ],
                       "set":{
                          "local_preference":80
                       },
                       "action":"PERMIT"
                    }
                 ]
              }
    
    
        - name: Create route map for the "Out" filter 
          nsxt_rest:
            hostname: "{{ NsxManagerAddress }}"
            username: "{{ NsxManagerUser }}"
            password: "{{ NsxManagerPassword }}"
            validate_certs: false
            method: patch
            path: "/policy/api/v1/infra/tier-0s/{{ Tier0 }}/route-maps/{{ RouteMapOut }}"
            content:
              {
                 "entries":[
                    {
                       "prefix_list_matches":[
                          "/infra/tier-0s/{{ Tier0 }}/prefix-lists/{{ Prefix1 }}"
                       ],
                       "set":{
                          "as_path_prepend":"{{ LocalAs }}",
                          "local_preference":100
                       },
                       "action":"PERMIT"
                    }
                 ]
              }
    
    
        - name: Add the filters to the BGP neighbor entries
          nsxt_rest:
            hostname: "{{ NsxManagerAddress }}"
            username: "{{ NsxManagerUser }}"
            password: "{{ NsxManagerPassword }}"
            validate_certs: false
            method: patch
            path: "/policy/api/v1/infra/tier-0s/{{ Tier0 }}/locale-services/{{ Tier0 }}_Locale_Services/bgp/neighbors/{{ item.neighbor }}"
            content:
              {
                 "neighbor_address" : "{{ item.ip }}",
                 "remote_as_num" : "{{ item.as }}",
                 "in_route_filters":[
                    "/infra/tier-0s/{{ Tier0 }}/route-maps/{{ RouteMapIn }}"
                 ],
                 "out_route_filters":[
                    "/infra/tier-0s/{{ Tier0 }}/route-maps/{{ RouteMapOut }}"
                 ],
                 "route_filtering":[
                    {
                       "enabled":true,
                       "address_family":"IPV4",
                       "in_route_filters":[
                          "/infra/tier-0s/{{ Tier0 }}/route-maps/{{ RouteMapIn }}"
                       ],
                       "out_route_filters":[
                          "/infra/tier-0s/{{ Tier0 }}/route-maps/{{ RouteMapOut }}"
                       ]
                    }
                 ]
              }
          loop:
            - { neighbor: "{{ NeighborID1 }}", ip: "{{ NeighborIP1 }}", as: "{{ RemoteAs }}" }
            - { neighbor: "{{ NeighborID2 }}", ip: "{{ NeighborIP2 }}", as: "{{ RemoteAs }}" }
    

    This looks very similar to using curl commands, except now it’s wrapped in an Ansible Playbook that I can easily check in to some version control system, share, and re-use. I think those are some pretty nice benefits.
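    To make it a bit more tangible what the first task’s loop expands to, here is a small Python sketch that resolves the path and body for each loop item, using the same values as the Playbook variables. It is purely illustrative: it prints the method, path, and JSON body and does not send anything to NSX Manager. The function name is my own.

```python
import json

TIER0 = "T0-Gateway-01"

# The two loop items of the "Create prefix lists" task, with variables resolved.
ITEMS = [
    {"name": "any", "network": "ANY", "action": "PERMIT"},
    {"name": "default-route", "network": "0.0.0.0/0", "action": "PERMIT"},
]


def prefix_list_request(tier0, item):
    """Return the (method, path, body) that one loop iteration produces."""
    path = f"/policy/api/v1/infra/tier-0s/{tier0}/prefix-lists/{item['name']}"
    body = {"prefixes": [{"network": item["network"], "action": item["action"]}]}
    return "PATCH", path, body


if __name__ == "__main__":
    for item in ITEMS:
        method, path, body = prefix_list_request(TIER0, item)
        print(method, path)
        print(json.dumps(body, indent=2))
```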

    Thanks for reading.

  • During some research I did for a customer on how to trigger an action based on an error event in the SDDC, I built myself a lab and ended up with a concept that seems interesting enough to write some lines about on the blog.

    High-Level

    The diagram below illustrates the “solution” at a high level:

    No rocket science here. A system logs an event to Log Insight which generates an alert that triggers a Jenkins pipeline which remediates the system.

    So what does setting this up look like? Must be pretty difficult? I thought so too, but let’s have a look at an example in this article.

    Remediate The NSX-T Distributed Firewall

    In this simple example the “system” is the NSX-T Distributed Firewall (DFW) Default Layer 3 Rule. This is the last rule in the DFW table which determines what to do with traffic that is not matching any other rules (Drop or Allow).

    In our example we want traffic not being picked up by other DFW rules to be dropped and therefore the Default Layer 3 Rule is configured with a “Drop” action.

    If for some reason the action is changed to “Allow”, we want it to automatically revert back to “Drop” as that is our desired/required state.

    So there we have the use case for some event-driven automation.

    Step 1 – Identify The Event And Construct A Log Insight Query

    Before we can do anything meaningful we need to find the event that is logged when we change the firewall rule action to “Allow”. In this case the event in Log Insight looks like this:

    Using this information we can build a reliable Log Insight query that will show us this event and nothing else. Reliable and consistent are the keywords here, as we’re about to connect this query to automation and the last thing we want is trigger-happy false positives.

    I’m fairly confident that the following Log Insight query is reliable enough for our example use case:

    text contains rule_id:2
    text contains action:allow
    event_type is v4_931714a6
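
    To make the matching logic explicit, here is a minimal Python sketch of the three constraints above applied to a single event. The `event` dictionary layout and the sample texts are assumptions for illustration only. It uses plain substring tests and does not model how Log Insight itself tokenizes or matches text.

```python
def matches_query(event: dict) -> bool:
    """Return True only if all three query constraints hold (logical AND)."""
    text = event.get("text", "")
    return (
        "rule_id:2" in text
        and "action:allow" in text
        and event.get("event_type") == "v4_931714a6"
    )


# Hypothetical events; the text is illustrative, not a real NSX-T log line.
changed_to_allow = {
    "text": "... rule_id:2 ... action:allow ...",
    "event_type": "v4_931714a6",
}
changed_to_drop = {
    "text": "... rule_id:2 ... action:drop ...",
    "event_type": "v4_931714a6",
}

print(matches_query(changed_to_allow))
print(matches_query(changed_to_drop))
```

    The point of combining all three constraints is that an event only counts as a hit when every one of them holds, which is what keeps the alert from firing on unrelated rule changes.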

    Step 2 – Create Alert From Query

    If the query comes back with a match, i.e., the DFW rule’s action has been changed to “Allow”, an alert should be activated. This alert is configured directly from the Log Insight Interactive Analytics interface where we also constructed our query.

    The alert I’m creating looks like this:

    As you can see I’m using a webhook to notify a Jenkins pipeline. We will look more at Jenkins in the coming steps. For now it’s good to understand that Log Insight will execute an HTTP POST request each time the defined query comes back with a match.

    Step 3 – Configure Jenkins Pipeline Build Trigger

    I decided to use the Generic Webhook Trigger plugin on Jenkins which extends the build triggers of a pipeline to allow easy triggering through HTTP requests (e.g. webhooks).

    In our simple example very little configuration is required for the Generic Webhook Trigger configuration. Besides enabling it I’m adding a token to distinguish this build trigger from any others I might be creating:

    This trigger URL (http://jenkins.sddc.lab:8080/generic-webhook-trigger/invoke?token=vrli_v4_931714a6) is used as the webhook URL when configuring the alert in Log Insight.
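
    For testing the trigger without Log Insight, the same kind of request can be put together by hand. The following Python sketch builds (but does not send) an HTTP POST to the trigger URL above. The payload is a made-up placeholder, not what Log Insight actually sends.

```python
import json
import urllib.request

JENKINS_URL = "http://jenkins.sddc.lab:8080/generic-webhook-trigger/invoke"
TOKEN = "vrli_v4_931714a6"


def build_trigger_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST that would trigger the pipeline."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{JENKINS_URL}?token={TOKEN}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload, used only to have a JSON body in the request.
request = build_trigger_request({"test": True})
print(request.get_method(), request.full_url)

# To actually send it from a machine that can reach Jenkins:
# urllib.request.urlopen(request)
```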

    Back in Log Insight we can actually send a test alert to the webhook. This should result in the following message which indicates that Jenkins and specifically the webhook trigger are working:

    Step 4 – Configure Jenkins Pipeline Script

    The pipeline script contains the code that is executed to remediate our NSX-T DFW undesired state. NSX-T of course has a REST API which makes things relatively easy to configure.

    Using the Jenkins Pipeline Syntax and Snippet Generator for a httpRequest step, it was easy to put together a pipeline script that performs the HTTP PATCH request to the NSX-T API:

    For reference, the complete pipeline script including the JSON payload that’s sent to the NSX-T API looks as follows:

    pipeline {
        agent any
    
        stages {
            stage('Hello') {
                steps {
                    httpRequest authentication: 'nsx-t', consoleLogResponseBody: true, contentType: 'APPLICATION_JSON', httpMode: 'PATCH', ignoreSslErrors: true, requestBody: '''{
                        "action": "DROP",
                        "resource_type": "Rule",
                        "id": "default-layer3-rule",
                        "display_name": "Default Layer3 Rule",
                        "path": "/infra/domains/default/security-policies/default-layer3-section/rules/default-layer3-rule",
                        "relative_path": "default-layer3-rule",
                        "parent_path": "/infra/domains/default/security-policies/default-layer3-section",
                        "unique_id": "a6c492ad-bf22-4d35-8cf3-ec09f6beeb66",
                        "marked_for_delete": false,
                        "overridden": false,
                        "rule_id": 2,
                        "sequence_number": 2147483647,
                        "sources_excluded": false,
                        "destinations_excluded": false,
                        "source_groups": [
                            "ANY"
                        ],
                        "destination_groups": [
                            "ANY"
                        ],
                        "services": [
                            "ANY"
                        ],
                        "profiles": [
                            "ANY"
                        ],
                        "logged": false,
                        "scope": [
                            "ANY"
                        ],
                        "disabled": false,
                        "direction": "IN_OUT",
                        "ip_protocol": "IPV4_IPV6",
                        "is_default": true,
                        "_create_user": "system",
                        "_create_time": 1616953731606,
                        "_last_modified_user": "admin",
                        "_last_modified_time": 1616961662317,
                        "_system_owned": false,
                        "_protection": "NOT_PROTECTED",
                        "_revision": 9
                    }''', responseHandle: 'NONE', url: 'https://pod-230-nsxt-lm-1.sddc.lab/policy/api/v1/infra/domains/default/security-policies/default-layer3-section/rules/default-layer3-rule', wrapAsMultipart: false
                }
            }
        }
    }

    This piece of code will change the default DFW rule action to “Drop”.
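
    The payload in the pipeline script is the complete rule object, including system-managed fields that start with an underscore (`_create_time`, `_revision`, and so on). As a sketch of how such a payload could be prepared in code instead of pasted by hand, the Python function below takes a rule dictionary, drops the underscore-prefixed fields, and sets the desired action. The function name and the shortened sample rule are mine, and whether the API needs those fields removed for a PATCH is an assumption I have not verified here.

```python
def build_remediation_payload(rule: dict, desired_action: str = "DROP") -> dict:
    """Copy the rule without underscore-prefixed fields and set the action."""
    payload = {key: value for key, value in rule.items() if not key.startswith("_")}
    payload["action"] = desired_action
    return payload


# Shortened, illustrative version of the rule object from the pipeline script.
current_rule = {
    "action": "ALLOW",
    "resource_type": "Rule",
    "id": "default-layer3-rule",
    "rule_id": 2,
    "_create_user": "system",
    "_revision": 9,
}

payload = build_remediation_payload(current_rule)
print(payload)
```

    The original dictionary is left untouched, so it can still be logged or compared against the remediated version.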

    Step 5 – Test

    Now that the alert definition, trigger, and remediation script are in place the waiting begins. When will somebody accidentally change the DFW rule’s action to “Allow”? Maybe soon?

    There you have it! I knew this was going to happen sooner or later 😉 Alright let’s see what happened.

    In Log Insight we can see that our event of interest was detected several times and alerts were sent to the Jenkins webhook:

    In the Jenkins UI we can see that the pipeline was built several times:

    Let’s have a closer look at build #9. The Console Output is pretty useful to have a look at:

    Here we can more or less follow what the pipeline script has been doing, and things are looking good. Note especially the “Response Code: HTTP/1.1 200”, which is the NSX-T API’s way of saying it accepted the call and the payload.

    Now let’s have a look at the DFW to see what happened with that firewall rule:

    It’s back at dropping traffic. Seems like our event-driven desired state enforcement automation is working!

    Summary

    Not that difficult, right? We went through setting up a simple event-driven workflow using a Log Insight – Jenkins webhook integration. This example can easily be expanded upon. Both on the Log Insight and the Jenkins side we can of course do much more sophisticated stuff where the only limit is our imagination.

    In today’s example the use case was to remediate. It might just as well be to create something. For example, when a new tenant’s virtual machine folders are created in vCenter, Jenkins executes an Ansible or Terraform script that builds the entire NSX-T logical network infrastructure for that tenant.

    One last thing worth mentioning is that the JSON payload sent by Log Insight to Jenkins contains all the event data. This data can be interpreted and used (as variables) in the pipeline script so that we can run very granular/targeted actions.
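
    To illustrate the idea of pulling values out of the webhook body, here is a small Python sketch. The payload shape (`AlertName`, `messages`, `text`) and the sample content are assumptions made up for this example, so check an actual request from your Log Insight instance before relying on any key name. In Jenkins itself the Generic Webhook Trigger plugin can extract values from the request body with JSONPath expressions, so this only shows the principle.

```python
import json
import re

# Hypothetical webhook body; key names and content are assumptions.
sample_body = json.dumps({
    "AlertName": "DFW default rule set to Allow",
    "messages": [
        {"text": "... rule_id:2 ... action:allow ..."},
    ],
})

RULE_ID = re.compile(r"rule_id:(\d+)")


def extract_rule_ids(body: str) -> list:
    """Return the rule IDs found in the text of each message in the body."""
    data = json.loads(body)
    rule_ids = []
    for message in data.get("messages", []):
        match = RULE_ID.search(message.get("text", ""))
        if match:
            rule_ids.append(int(match.group(1)))
    return rule_ids


rule_ids = extract_rule_ids(sample_body)
print(rule_ids)
```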

    Thanks for reading.

  • The NSX-T Central Control Plane (CCP) is building and maintaining a central repository for some tables that make NSX-T the unique network virtualization solution it is. More specifically I’m talking about:

    • The Global MAC address table
    • The Global ARP table

    In today’s article I’ll have a closer look at these two tables.

    MAC Address Table

    As soon as a virtual machine’s vNIC is connected to the NSX-T Data Plane, its MAC address as well as the Tunnel End Point (TEP) used to reach that MAC address are registered with the CCP. Now, when the Data Plane receives a frame destined to an unknown MAC address, besides flooding the frame, it will also query the CCP’s MAC address table to see if it can find a matching entry there. The CCP’s MAC address table is also used to pre-populate the local MAC address tables on Transport Nodes before they receive any traffic.

    There are two exceptions where MAC addresses of connected vNICs are not registered with the CCP. The first exception is when a vNIC is allowed to send traffic from several source MAC addresses. The second exception is when MAC addresses are learned from an Edge bridge connected to a physical layer 2 network. This is by design and protects the CCP from injection of an arbitrarily large number of MAC addresses into the network.

    So, that’s a pretty cool table, right? One that you might want to have a look at yourself now and then perhaps.

    Querying the MAC Address Table

    There’s more than one way to retrieve entries from the CCP’s MAC address table. In this article I will show you how it’s done using the Manager CLI. Another option would be to leverage the NSX-T API using curl for example.

    We query the MAC address table on a per NSX-T segment basis. To see the learned MAC addresses and their associated TEPs for a segment we first need to know that segment’s Virtual Network Identifier (VNI). From the Manager CLI we run the following command to list segments and their VNIs:

    get logical-switches

    This gives the following result:

    VNI     UUID                                  Name   Type      
    65542   e1b15ca9-4c04-4692-8926-a4cd769b4776  Web    DEFAULT
    65538   0058ae01-04cd-4992-9c2c-60fb764bbad1  App    DEFAULT

    In this case we’re interested in the “App” segment which has VNI 65538. We can now run the following command to see the learned MAC address entries for the “App” segment:

    get logical-switch 65538 mac-table

    The output of this command in my tiny lab:

    VNI     MAC               VTEP-IP        TransportNode-ID                            
    65538   00:50:56:a4:b6:e5 10.81.234.16   4173bca0-4e6e-4ffb-8300-f2f4bed88b29
    65538   00:50:56:a2:bd:ec 10.81.234.17   78eeca52-69a8-44f6-a795-f1a1ecc7bdf7

    As we can see, the table contains two MAC addresses that belong to vNICs connected to the “App” segment. The MAC addresses are reachable via two different TEP IP addresses which are shown in the “VTEP-IP” column.

    Each entry also contains a value for the Transport Node ID. This tells us on which transport node the MAC address is connected and basically discloses on which host the virtual machine is running. To translate a Transport Node ID to an ESXi Management IP address we would run:

    get transport-node <TransportNode-ID> status 
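
    If the table needs to be processed in a script, the whitespace-separated output is easy to parse. The Python sketch below turns the mac-table output shown earlier into a list of dictionaries keyed by the column headers. It assumes the output has already been captured as text (for example copied from an SSH session) and that no column value contains spaces.

```python
from typing import Dict, List

# Output of "get logical-switch 65538 mac-table" as shown in this article.
MAC_TABLE_OUTPUT = """\
VNI     MAC               VTEP-IP        TransportNode-ID
65538   00:50:56:a4:b6:e5 10.81.234.16   4173bca0-4e6e-4ffb-8300-f2f4bed88b29
65538   00:50:56:a2:bd:ec 10.81.234.17   78eeca52-69a8-44f6-a795-f1a1ecc7bdf7
"""


def parse_table(output: str) -> List[Dict[str, str]]:
    """Split a header line plus data lines into one dict per data line."""
    lines = [line for line in output.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


entries = parse_table(MAC_TABLE_OUTPUT)
tep_by_mac = {entry["MAC"]: entry["VTEP-IP"] for entry in entries}
print(tep_by_mac)
```

    The same function works for the arp-table output further down, since it has the same header-plus-rows layout.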

    ARP Table

    The Central Control Plane also maintains a global ARP table. Thanks to this table we enjoy things like ARP suppression on our NSX-T Data Plane. It’s populated by snooping DHCP and ARP traffic. The snooping itself happens on the individual transport nodes and results are reported back to the CCP.

    Querying the ARP Table

    Retrieving information from the CCP’s ARP table can be done in different ways as well, but we’ll stick to the Manager CLI today.

    As with the MAC address table, querying the ARP table is done on a per segment basis. If, for example, we would like to see the ARP entries for the “Web” segment, we first need to know that segment’s VNI. In this case we already know that the “Web” segment’s VNI is 65542, so we continue by running:

    get logical-switch 65542 arp-table

    ARP entries are displayed:

    VNI     IP          MAC                TransportNode-ID            
    65542   10.80.1.20  00:50:56:a2:bd:ec  78eeca52-69a8-44f6-a795-f1a1ecc7bdf7  
    65542   10.80.1.21  00:50:56:a4:b6:e5  4173bca0-4e6e-4ffb-8300-f2f4bed88b29

    The two ARP entries show the MAC address and IP address of the connected vNICs. The Transport Node ID is included as well and can be used to find out on which Transport Node the IP/MAC (virtual machine) is connected.
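
    As a small illustration of using the Transport Node ID, the Python sketch below groups the ARP entries from the output above by transport node, which answers the question “which IP addresses live on which node?”. The tuple layout is just a convenient representation chosen for this example.

```python
from collections import defaultdict

# (IP, MAC, TransportNode-ID) rows taken from the arp-table output above.
arp_entries = [
    ("10.80.1.20", "00:50:56:a2:bd:ec", "78eeca52-69a8-44f6-a795-f1a1ecc7bdf7"),
    ("10.80.1.21", "00:50:56:a4:b6:e5", "4173bca0-4e6e-4ffb-8300-f2f4bed88b29"),
]


def ips_by_transport_node(entries):
    """Map each Transport Node ID to the list of IP addresses seen on it."""
    grouped = defaultdict(list)
    for ip, _mac, node_id in entries:
        grouped[node_id].append(ip)
    return dict(grouped)


grouped = ips_by_transport_node(arp_entries)
for node_id, ips in grouped.items():
    print(node_id, ips)
```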

    Modifying Output

    With just a couple of entries in these tables, finding relevant information is easy. It becomes a whole different story when hundreds or thousands of virtual machines are connected to a single segment. Luckily we have the option to modify the output:

    get logical-switch 65542 arp-table | 
       count     Count number of entities
       find      Only show lines that contain regex pattern
       first     Show first N lines of output
       ignore    Ignore lines that contain regex pattern
       json      Show output in JSON format
       last      Show last N lines of output
       more      Show output one page at a time
       sort      Sort command output

    To use a very simple example: if we just want to know the number of ARP entries for the “Web” segment that has a 10.80.1.0/24 CIDR, something like this would do the trick:

    get logical-switch 65542 arp-table | count 10.80.1.

    Output is modified and now looks like this:

    Number of lines that match pattern '10.80.1.': 2

    Regex patterns are used to filter the command output. Depending on your regex skills (I suck at it) you could construct a pretty advanced query and extract exactly the information that you are looking for.
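
    One thing to keep in mind with a pattern like `10.80.1.` is that an unescaped dot in a regex matches any character. The Python sketch below shows the difference between that loose pattern and a stricter one with escaped dots. It uses Python’s `re` module, so the exact regex dialect of the NSX-T CLI may differ, and the third row is a made-up entry added only to show the effect.

```python
import re

ARP_TABLE_LINES = [
    "65542   10.80.1.20  00:50:56:a2:bd:ec  78eeca52-69a8-44f6-a795-f1a1ecc7bdf7",
    "65542   10.80.1.21  00:50:56:a4:b6:e5  4173bca0-4e6e-4ffb-8300-f2f4bed88b29",
    # Hypothetical extra row outside 10.80.1.0/24, not from the lab output.
    "65542   10.80.100.5 00:50:56:a1:00:01  00000000-0000-0000-0000-000000000000",
]


def count_matches(pattern: str, lines: list) -> int:
    """Count the lines in which the regex pattern matches somewhere."""
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))


# Unescaped dots match any character, so "10.80.10" in 10.80.100.5 also matches.
loose = count_matches("10.80.1.", ARP_TABLE_LINES)

# Escaped dots only match a literal ".", which limits hits to 10.80.1.x.
strict = count_matches(r"\b10\.80\.1\.\d+\b", ARP_TABLE_LINES)

print(loose, strict)
```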

    Summary

    In today’s article I went back to some NSX-T basics and looked closer at two important tables that are living within the NSX-T Central Control Plane. The information in these tables is likely available in other systems around your SDDC (vCenter, vROps, vRLI, vRNI, physical network, etc), but personally I think it’s important to know how to extract this information at the source (NSX-T). Sooner or later you will end up in a situation where you depend on that knowledge. 😉

    Thanks for reading.

    References and resources: