• Packet Capture On Tier-0 Uplinks

    With NSX-T logical networking the Tier-0 uplinks become the central passage for all of the North-South traffic—i.e., traffic between the NSX-T logical networks and the physical network.

    A critical point in the NSX-T data plane and one that we might want to place under a magnifying glass from time to time.

    In this short article I’ll walk through setting up and managing packet captures on Tier-0 uplinks.

    1 – Identify active SR location

    This step is relevant when the Tier-0 gateway is running in Active-Passive HA mode. Most of the time the interesting packets will be on the active uplinks and we need to figure out where these are situated.

    With Active-Active HA mode all of the Tier-0 uplinks are involved in forwarding traffic and therefore points of interest when it comes to capturing packets

    In the NSX Manager UI, navigate to Advanced Networking & Security > Networking > Routers. Click the Active-Standby link for the Tier-0 gateway:

    Here the active Tier-0 SR is located on edgevm01.

    2 – Identify interface ID

    Also under Advanced Networking & Security > Networking > Routers we click the name link of the Tier-0 gateway. This opens up the details pane where we choose Configuration > Router Ports:

    Copy the ID of the uplink interfaces that use the Edge node with the active Tier-0 SR:

    3 – Start capture session

    SSH into the Edge node with the active Tier-0 SR. To capture 50 outgoing/northbound packets run the following command:

    start capture interface <ID> direction output count 50 file capture.pcap

    For example:

    4 – Copy capture file

    The resulting capture.pcap file can now be copied to an SFTP server. For example:

    copy file capture.pcap url scp://root@sftp.demo.local/captures 

    After a successful copy you might want to delete the capture.pcap file from the Edge node’s file store:

    del file capture.pcap

    5 – Open capture file

    Open the capture file in a packet analyzer like Wireshark to start investigating the captured packets:

    Summary

    And that’s how easy it is to capture traffic on Tier-0 uplinks.

    It’s not uncommon that you need to capture network traffic as part of investigating some kind of application issue. For that reason I recommend that you document the IDs of the Tier-0 uplink interfaces in advance and have an SFTP server ready to go so that you don’t have to waste valuable time on preparing the packet capture itself.

    Thanks for reading.

  • NSX-T Meets FRRouting – Part 2

    Welcome back! We’re in the process of building an NSX-T Edge – FRRouting environment.

    In part 1 we prepared the FRR routers by doing he following:

    • Installed two Debian Linux servers
      • Installed VLAN support
      • Enabled packet forwarding
      • Configured network interfaces
    • Installed and configured VRRP
    • Installed FRRouting

    In this second part we will first deploy the NSX-T Edge components and then set up BGP routing. There’s a lot to do so let’s get started!

    Target topology

    As a refresher here is the big picture once more:

    We’ll use this diagram as our blueprint. Scroll back up here any time you wonder what the heck it is we’re doing down there.

    Deploy NSX-T Edge

    Let’s begin by getting the NSX-T Edge on par with the FRR routers.

    Create NSX-T segments

    The FRR routers, frr-01 and frr-02, were configured with local “peering” VLANs 1657 and 1658 respectively. Corresponding VLAN-backed segments are needed for L2 adjacency with the FRR routers.

    Creating the “vlan-1658” segment:

    Both segments in place:

    Uplink profile

    Create an uplink profile for the edge transport nodes containing settings for teamings, transport VLAN, and MTU:

    The transport VLAN has id 1659 and MTU size is 9000.

    Deploy Edge VMs

    Instead of walking through the Edge node deployment, the table below summarizes the settings I used during the deployment. Have a look at the Single N-VDS per Edge VM article for a detailed Edge node deployment walkthrough.

    SettingEdge Node 1Edge Node 2
    Nameen01en02
    FQDNen01.lab.localen02.lab.local
    Form FactorSmallSmall
    Mgmt IP172.16.11.61/24172.16.11.62/24
    Mgmt InterfacePG-MGMT (VDS)PG-MGMT (VDS)
    Default Gateway172.16.11.1172.16.11.1
    Transport ZoneTZ-VLAN, TZ-OVERLAYTZ-VLAN, TZ-OVERLAY
    Static IP List172.16.59.71, 172.16.59.81172.16.59.72, 172.16.59.82
    Gateway172.16.59.1172.16.59.1
    Mask255.255.255.0255.255.255.0
    DPDK InterdaceUplink1 > Trunk1 (VDS)
    Uplink2 > Trunk2 (VDS)
    Uplink1 > Trunk1 (VDS)
    Uplink2 > Trunk2 (VDS)

    The two Edge nodes are up and running:

    We add both Edge nodes to an Edge cluster:

    Create Tier-0 gateway

    With the Edge nodes in place we can create a Tier-0 gateway. I’m configuring it with Active-Standby HA Mode:

    We add four external interfaces to the Tier-0:

    NameIP addressSegmentEdge Node
    en1-uplink1172.16.57.2/29vlan-1657en1
    en1-uplink2172.16.58.2/29vlan-1658en1
    en2-uplink1172.16.57.3/29vlan-1657en2
    en2-uplink2172.16.58.3/29vlan-1658en2

    The four Tier-0 interfaces are in place:

    Test connectivity

    Now is a good time to verify the L2 adjacency between the FRR routers and the Tier-0 interfaces.

    A ping from frr-01 to the Tier-0 interfaces in VLAN 1657:

    And a ping from frr-02 to the Tier-0 interfaces in VLAN 1658:

    Successful pings. We’re good!

    Configure BGP

    Moving up an OSI layer, we continue with setting up BGP.

    Tier-0 gateway

    The Tier-0 is configured with the following BGP settings:

    SettingValue
    Local AS65000
    BGPOn
    Graceful RestartDisable
    ECMPOn

    The settings in NSX Manager:

    We add two BGP neighbors to the Tier-0: 172.16.57.1 (frr-01) and 172.16.58.1 (frr-02). Make sure to enable BFD for these neighbors too:

    The neighbor status will be “Down” at this point which is expected as we didn’t configure BGP on the FRR routers yet.

    For route re-distribution I choose to re-distribute from all the available sources into the BGP process:

    FRR routers

    Configuration of BGP in FRRouting can be done by editing configuration files directly or through VTY shell which is FRRouting’s CLI frontend. We’ll use VTY shell today.

    frr-01

    Run the vtysh command to start VTY shell:

    After changing to the configuration mode with conf t, we enable the BGP process with:

    router bgp 65001

    Next, we configure the router ID and the BGP/BFD neighbors which are the Tier-0’s interfaces in VLAN 1657 on frr-01:

    bgp router-id 172.16.57.1
    neighbor 172.16.57.2 remote-as 65000
    neighbor 172.16.57.2 bfd
    neighbor 172.16.57.3 remote-as 65000
    neighbor 172.16.57.3 bfd

    We want frr-01 to advertise itself as the default gateway to its BGP neighbors which is accomplished with:

    address-family ipv4 unicast
    neighbor 172.16.57.2 default-originate
    neighbor 172.16.57.3 default-originate

    Run end followed by wr to save the configuration:

    If all went well we should now see active BGP and BFD sessions between frr-01 and the Tier-0 interfaces in VLAN 1657. Let’s verify this with:

    show bgp summary

    BGP neighbor sessions are looking good. How about BFD?

    show bfd peers

    BFD sessions are up.

    frr-02

    We repeat the exact same configuration steps on frr-02. The configuration for frr-02 looks like this:

    router bgp 65001
    bgp router-id 172.16.58.1
    neighbor 172.16.58.2 remote-as 65000
    neighbor 172.16.58.2 bfd
    neighbor 172.16.58.3 remote-as 65000
    neighbor 172.16.58.3 bfd
    !
    address-family ipv4 unicast
    neighbor 172.16.58.2 default-originate
    neighbor 172.16.58.3 default-originate
    exit-address-family

    Let’s check the BGP/BFD status at frr-02:

    show bgp summary
    show bfd peers

    BGP and BFD sessions are looking good.

    Routing

    After a lot of deploying and configuring it’s finally time to see if we can actually route any traffic.

    FRR routing tables

    We begin by having a look at the FRR routing tables. Run the following command in VTY shell on the FRR routers:

    show ip route bgp

    frr-01:

    frr-02:

    The FRR routers have learned about each other’s /29 subnets via the NSX-T Tier-0. More specifically, they were learned from neighbor 172.16.57.2 and 172.16.58.2. This tells us that the active Tier-0 SR is hosted on Edge node 1.

    Is the standby Tier-0 SR completely out of the picture then? Let’s see:

    show bgp detail

    The standby Tier-0 SR on Edge node 2 also advertises routes for the same /29 subnets, but as you can see the ASN (65000) is added to the path three more times and packets won’t be routed over these longer paths.

    Tier-0 routing table

    Run the following command on the Edge node hosting the active Tier-0 SR:

    get route bgp

    Here we see two equal cost routes for 0.0.0.0/0, one to each FRR router. This tells us that “default-originate” did its job. Both routes also ended up in the FIB which means ECMP is working.

    From overlay to physical

    It’s now time for the ultimate test. We create an overlay segment, 192.168.10.0/24, connected to the Tier-0 gateway:

    The BGP process on the Tier-0 advertises the 192.168.10.0/24 network to its neighbors. Let’s check if they ended up there:

    show ip route bgp

    frr-01:

    frr-02:

    A route to the overlay network is indeed present in both of the FRR routers routing table.

    Now we connect a VM to the overlay segment and run a traceroute from this VM to an IP address north of the FRR routers:

    traceroute 10.2.129.10 -n -q 2

    The VM on the overlay segment can reach the physical network. By doing two probes per hop we also see that the Tier-0 offers two paths to the destination: one via frr-01 (172.16.57.1) and one via frr-02 (172.16.58.1).

    It’s a wrap

    It’s been quite a project, but we got ourselves a working NSX-T Edge – FRRouting environment and it wasn’t that hard to set up, right?

    This all started with me looking for a more enterprise like virtual top-of-rack solution for my NSX-T lab. Having these FRR routers north of the Tier-0 certainly feels like a big step towards that goal. Perhaps not fully showcased in these articles, but FRRouting’s feature set is pretty much on par with today’s data center leaf-spine switches. As a matter of fact it’s already being used there. Have a look at Cumulus Networks for example.

    For more information about features and possibilities surrounding BGP have a look at the official NSX-T and FRRouting documentation. Most of all I recommend that you set this up yourself. Hopefully these two articles will help you get started with that.

    Thanks for reading!

  • NSX-T Meets FRRouting – Part 1

    Until recently I always used pfSense with the OpenBGPD package as the NSX-T Edge counterpart in my lab environment. It’s quick and easy to set up and works well enough. But pfSense is not what I typically find in a customer’s production environment.

    I started to investigate other virtualized “top-of-rack solutions” for the lab that would be a bit more similar to what I see in the enterprise. Right now I’m testing out FRRouting and I must say that I’m pretty impressed with this solution so far. At least it’s good enough to be the subject of a blog post or two 😉

    I’m going to walk through deploying and configuring a pair of FRRouting instances, the NSX-T Edge, and BGP routing in a lab environment. Follow along if you want.

    Target topology

    The diagram below shows a logical L3 design for the NSX-T Edge – FRRouting solution that we’ll be building:

    There’s nothing much out of the ordinary here. We have a Tier-0 gateway backed by two Edge nodes, and BGP routing. At the top of the diagram things look a bit less familiar with two Linux routers powered by FRRouting.

    That’s a nice sketch. Now let’s see if we can make it work too.

    Bill of materials

    The following software is used to build this environment:

    • NSX-T 2.5.1
    • vSphere 6.7 U3
    • Debian Linux 10.2
    • FRRouting 7.2

    Deploy FRRouting

    This first part is about getting the FRR instances up and running which begins with installing two Linux servers. Let’s get right to it.

    Install Linux servers

    Debian Linux is a good fit here as there is an official FRR Debian repository which makes installing FRR a lot easier.

    Each server is configured with two NICs.

    The ens192 interface is configured as the primary interface and will be the “north-facing” port. The ens224 interface is the “SDDC-facing” port.

    At this point we only assign a static IP address to the ens192 interface.

    The only additional components we need to install are the SSH server and standard system utilities:

    Complete the Debian installation on both servers.

    Install VLAN support

    The servers will soon be configured with some VLAN interfaces. To add support for this we install the VLAN package:

    apt install vlan -y

    Add the following line to /etc/modules so that VLAN (802.1Q) support is loaded during boot:

    8021q

    Enable IPv4 packet forwarding

    We want the Linux servers to become Linux routers and as a part of that we need to enable IPv4 packet forwarding in /etc/sysctl.conf:

    net.ipv4.ip_forward=1

    Reboot the servers after making this change.

    Configure network interfaces

    Time to configure the network interfaces on the Linux routers. The following shows the interface configuration per Linux router:

    frr-01:

    InterfaceIP addressComment
    ens19210.2.129.101/24Primary interface, north-facing
    ens224Secondary interface, SDDC-facing
    ens224.1611172.16.11.253/24Management VLAN
    ens224.1657172.16.57.1/29BGP peering VLAN
    ens224.1659172.16.59.253/24Overlay transport VLAN

    Which results in the following /etc/network/interfaces for frr-01:

    source /etc/network/interfaces.d/*
    
    # The loopback network interface
    auto lo
    iface lo inet loopback
    
    # The primary network interface - north-facing port
    auto ens192
    allow-hotplug ens192
    iface ens192 inet static
    address 10.2.129.101/24
    gateway 10.2.129.1
    dns-nameservers 10.2.129.10
    dns-search demo.local
    
    # The secondary network interface - SDDC-facing port
    auto ens224
    allow-hotplug ens224
    iface ens224 inet manual
    mtu 9000
    
    # The VLAN 1611 interface - Management
    auto ens224.1611
    iface ens224.1611 inet static
    address 172.16.11.253/24
    
    # The VLAN 1657 interface - BGP peering
    auto ens224.1657
    iface ens224.1657 inet static
    address 172.16.57.1/29
    
    # The VLAN 1659 interface - Overlay transport
    auto ens224.1659
    iface ens224.1659 inet static
    address 172.16.59.253/24

    frr-02:

    InterfaceIP addressComment
    ens19210.2.129.102/24Primary interface, north-facing
    ens224Secondary interface, SDDC-facing
    ens224.1611172.16.11.254/24Management VLAN
    ens224.1658172.16.58.1/29BGP peering VLAN
    ens224.1659172.16.59.254/24Overlay transport VLAN

    The corresponding /etc/network/interfaces for frr-02:

    source /etc/network/interfaces.d/*
    
    # The loopback network interface
    auto lo
    iface lo inet loopback
    
    # The primary network interface - north-facing
    auto ens192
    allow-hotplug ens192
    iface ens192 inet static
    address 10.2.129.102/24
    gateway 10.2.129.1
    dns-nameservers 10.2.129.10
    dns-search demo.local
    
    # The secondary network interface - SDDC-facing
    auto ens224
    allow-hotplug ens224
    iface ens224 inet manual
    mtu 9000
    
    # The VLAN 1611 interface - Management
    auto ens224.1611
    iface ens224.1611 inet static
    address 172.16.11.254/24
    
    # The VLAN 1658 interface - BGP peering
    auto ens224.1658
    iface ens224.1658 inet static
    address 172.16.58.1/29
    
    # The VLAN 1659 interface - Overlay transport
    auto ens224.1659
    iface ens224.1659 inet static
    address 172.16.59.254/24

    Restart the network to activate the new network interface configuration:

    systemctl restart networking

    Run the ip address command to verify that the new interface configuration is active:

    Install VRRP

    As you noticed we are “stretching” the management VLAN (1611) and the overlay transport VLAN (1659) between the Linux routers. Both routers can act as the default gateway for these VLANs at any given time. To make use of this capability we’ll set up VRRP with Keepalived.

    Install the package:

    apt install keepalived -y

    Create the Keepalived configuration file: /etc/keepalived/keepalived.conf. Below the Keepalived configuration per server:

    frr-01 (VRRP master):

    global_defs {
    # Email Alert Configuration
    notification_email {
    # Email To Address
    admin@demo.local
    }
    # Email From Address
    notification_email_from noreply@demo.local
    # SMTP Server Address / IP
    smtp_server 127.0.0.1
    # SMTP Timeout Configuration
    smtp_connect_timeout 60
    router_id frr-01
    }
    
    vrrp_sync_group VG1 {
    group {
    1611
    1659
    }
    }
    
    vrrp_instance 1611 {
    # State = Master or Backup
    state MASTER
    # Interface ID for VRRP to run on
    interface ens224.1611
    # VRRP Router ID
    virtual_router_id 10
    # Highest Priority Wins
    priority 250
    # VRRP Advert Intaval 1 Second
    advert_int 1
    # Basic Inter Router VRRP Authentication
    authentication {
    auth_type PASS
    auth_pass VMware1!VMware1!
    }
    # VRRP Virtual IP Address Config
    virtual_ipaddress {
    172.16.11.1/24 dev ens224.1611
    }
    }
    
    vrrp_instance 1659 {
    # State = Master or Backup
    state MASTER
    # Interface ID for VRRP to run on
    interface ens224.1659
    # VRRP Router ID
    virtual_router_id 11
    # Highest Priority Wins
    priority 250
    # VRRP Advert Intaval 1 Second
    advert_int 1
    # Basic Inter Router VRRP Authentication
    authentication {
    auth_type PASS
    auth_pass VMware1!VMware1!
    }
    # VRRP Virtual IP Address Config
    virtual_ipaddress {
    172.16.59.1/24 dev ens224.1659
    }
    }

    frr-02 (VRRP backup):

    global_defs {
    # Email Alert Configuration
    notification_email {
    # Email To Address
    admin@demo.local
    }
    # Email From Address
    notification_email_from noreply@demo.local
    # SMTP Server Address / IP
    smtp_server 127.0.0.1
    # SMTP Timeout Configuration
    smtp_connect_timeout 60
    router_id frr-02
    }
    
    vrrp_sync_group VG1 {
    group {
    1611
    1659
    }
    }
    
    vrrp_instance 1611 {
    # State = Master or Backup
    state BACKUP
    # Interface ID for VRRP to run on
    interface ens224.1611
    # VRRP Router ID
    virtual_router_id 10
    # Highest Priority Wins
    priority 150
    # VRRP Advert Intaval 1 Second
    advert_int 1
    # Basic Inter Router VRRP Authentication
    authentication {
    auth_type PASS
    auth_pass VMware1!VMware1!
    }
    # VRRP Virtual IP Address Config
    virtual_ipaddress {
    172.16.11.1/24 dev ens224.1611
    }
    }
    
    vrrp_instance 1659 {
    # State = Master or Backup
    state BACKUP
    # Interface ID for VRRP to run on
    interface ens224.1659
    # VRRP Router ID
    virtual_router_id 11
    # Highest Priority Wins
    priority 150
    # VRRP Advert Intaval 1 Second
    advert_int 1
    # Basic Inter Router VRRP Authentication
    authentication {
    auth_type PASS
    auth_pass VMware1!VMware1!
    }
    # VRRP Virtual IP Address Config
    virtual_ipaddress {
    172.16.59.1/24 dev ens224.1659
    }
    }

    Restart the Keepalived service on both routers to activate the new configuration:

    systemctl restart keepalived

    We can now verify VRRP operation by running systemctl status keepalived:

    Running the ip address command will hopefully show the virtual IP address on the two VLAN interfaces:

    And a ping to the virtual IP address from the VRRP backup node (frr-02 in this case) should be successful:

    Install FRRouting

    With Linux installed and configured we continue with the FRRouting installation.

    Begin by adding the FRR Debian repository:

    curl -s https://deb.frrouting.org/frr/keys.asc | apt-key add -
    FRRVER="frr-stable"
    echo deb https://deb.frrouting.org/frr $(lsb_release -s -c) $FRRVER | tee -a /etc/apt/sources.list.d/frr.list
    apt update && apt install frr frr-pythontools -y

    FRRouting is now installed.

    Configure FRRouting capabilities

    We only enable the routing protocols that are needed. To make FRR a good match for the NSX-T Edge we would like the instances to be capable of doing BGP and BFD. So we simply enable these daemons in /etc/frr/daemons.

    bgpd=yes
    bfdd=yes

    Restart the FRR service and verify that the BGP and BFD daemons are active:

    systemctl restart frr
    systemctl status frr

    This is looking good. The FRR instances are now ready for control plane configuration.

    Summary

    This completes part 1 of the series on NSX-T and FRRouting. We’ve been quite productive:

    • Installed two Debian Linux servers
      • Installed VLAN support
      • Enabled packet forwarding
      • Configured network interfaces
    • Installed and configured VRRP
    • Installed FRRouting

    In the next part we’ll continue with deploying the NSX-T Edge and setting up BGP routing between NSX-T and the FRR instances. Thanks for reading and stay tuned!

  • Site-to-Site VPN Between NSX-T Tier-1 And AWS VPC

    Now that I started studying for the AWS Certified Advanced Networking – Specialty I have to learn pretty much everything about AWS networking. Naturally VPN is a part of that.

    When it comes to AWS VPN the most common use case is establishing secure Site-to-Site connections between the customer’s data center and a Virtual Private Cloud (VPC).

    As an exercise for myself I decided to configure a Site-to-Site VPN connection between an NSX-T Tier-1 gateway and an AWS VPC and in today’s article I’m walking through the process of setting this up.

    Target topology

    Let’s begin with a simple diagram showing the logical environment and the scenario that I cooked up for myself:

    A database server (db01) in the data center needs to regularly dump its database to an EC2 instance (db-backups). The database server is connected to a segment which is attached to a Tier-1 gateway. The EC2 instance is connected to a private subnet in a Virtual Private Cloud (VPC).

    The requirement is that only “segment-db” in the data center should be allowed to access the VPC private subnet using an encrypted connection over the Internet.

    This sounds like a reasonable realistic scenario at least, right? Let’s set this up.

    Configuring AWS

    I begin by configuring the AWS side using the AWS Management Console.

    Virtual Private Gateway

    First of all I need a Virtual Private Gateway (VGW) and attach it to my VPC. This is a matter of some clicks and requires minimal configuration:

    Customer Gateway

    Next up is a Customer Gateway. This is a logical construct representing the VPN “device” on the data center side:

    I’m selecting static routing and enter the public IP address of the VPN endpoint on the data center side. This can be a NAT-ed IP address.

    VPN Connection

    The third and last component is a VPN Connection:

    Here I choose the newly created VGW and Customer Gateway. I also select static routing and add the 10.10.1.0/24 prefix which matches the “segment-db” subnet.

    At this point Site-to-Site VPN on the AWS side is configured and ready:

    Configuration file

    I download the vendor generic configuration file which contains all the settings for the VPN connection. This file might come in handy when configuring NSX-T:

    Route propagation

    One last thing before heading over to the data center is a small modification to the private subnet’s route table so that route propagation from the VGW is enabled. In my case this effectively adds a route for 10.10.1.0/24 with the VGW as target.

    Configuring NSX-T

    With the AWS side ready to rock n’ roll I will continue with the configuration of the NSX-T side.

    VPN Profiles

    I begin by creating profiles for IKE, IPSec, and DPD. This is done under Networking > Network Services > VPN > Profiles. The profiles contain settings that match the ones in the downloaded configuration file.

    For the IKE profile I use the following settings:

    SettingValue
    Nameaws-ike
    IKE versionIKE V1
    Encryption AlgorithmAES 128
    Digest AlgorithmSHA1
    Diffie-HellmanGroup 2
    SA Lifetime (sec)28800

    The IPSec profile looks like this:

    SettingValue
    Nameaws-ipsec
    Encryption AlgorithmAES 128
    Digest AlgorithmSHA1
    PFSEnabled
    Diffie-HellmanGroup 2
    SA Lifetime (sec)3600

    And finally the DPD profile:

    SettingValue
    Nameaws-dpd
    DPD Probe Interval (sec)10

    Note 1: I actually don’t have to create these profiles as the built-in “Foundation” (IKE and IPSec) and “nsx-default-l3vpn-dpd-profile” profiles work fine with AWS IPSec VPN, but I prefer having my own profiles.

    Note 2: The IKE/IPSec/DPD settings in my profiles are fine for my little lab exercise. In the real world you want to consider other, possibly more secure settings.

    VPN Service

    With the VPN profiles in place I move over to VPN Services and add a new IPSec service:

    Very little needs to be configured here. A name (nsx-aws), a description perhaps, but most importantly a gateway. Starting with NSX-T 2.5 IPSec VPN can be configured on a Tier-1 gateway which is exactly what I want to do here:

    IPSec Session

    The last piece of configuration that I need to add is a Policy Based IPSec session which is done under IPSec Sessions:

    The IPSec session is configured with the following settings:

    SettingValue
    Nameipsec-session-01
    VPN Servicensx-aws
    Local Endpointendpoint-01
    Remote IPpublic IP of the AWS VGW
    Authentication ModePSK
    Pre-shared KeyPSK is in the downloaded configuration file
    IKE Profilesaws-ike
    IPSec Profilesaws-ipsec
    DPD Profilesaws-dpd
    Local Networks10.10.1.0/24
    Remote Networks172.16.1.0/24
    Connection Initiation ModeInitiator

    Quite a few settings here but most of them are self explanatory.

    Local Endpoint

    Except for “Local Endpoint” perhaps, which I created while configuring the IPSec session.

    The local endpoint serves as the VPN tunnel’s endpoint on the NSX-T side. Its IP address is assigned to the loopback interface within the Tier-1 Service Router (SR) component. This IP address needs to be unique and reachable throughout the network meaning it needs to be advertised by the Tier-1 and then distributed by the Tier-0.

    On the Tier-1 gateway I make sure that “All IPSec Local Endpoints” is enabled under “Route Advertisement”:

    On the Tier-0 gateway I select “IPSec Local Endpoint” under “Advertised Tier-1 Subnets”:

    A quick peek at the physical router’s route table shows me there’s now a host route to the local endpoint (192.168.10.10/32):

    Verify VPN connection

    Now that the AWS and the NSX-T sides are configured, I should have a functional Site-to-Site VPN connection between the two environments. Let’s verify.

    The IPSec tunnel status is looking good in the NSX Manager UI:

    Some more details about IPSec sessions can be fetched on the Edge node CLI using the “get ipsecvpn” command. For example:

    get ipsecvpn session

    On the AWS side the tunnel status for the first tunnel has changed to “Up”:

    I seem to have an operational Site-to-Site VPN connection. Time for the ultimate test: Can “db01” connect to the EC2 instance?

    It can indeed. Mission accomplished!

    API

    For faster provisioning of the NSX-T VPN configuration I can use the API instead.

    Have a look at this piece of JSON code for some inspiration:

    This creates a Tier-1 with segment and IPSec VPN configuration. It can be send as the body of a PATCH request to the hierarchical policy API:

    PATCH https://<nsx-mgr>/policy/api/v1/infra

    If you want to give it a try, replace the values of the following keys so they match your environment:

    • tier0_path
    • transport_zone_path
    • edge_cluster_path
    • peer_address
    • peer_id
    • psk

    Besides this it should be pretty environment agnostic.

    Summary

    No issues configuring a Site-to-Site IPSec VPN here. For more information about VPN on these platforms see the AWS Site-to-Site VPN documentation and the NSX-T documentation.

    Thanks for reading.

  • Kubernetes – NSX-T Lab

    A while back Dumlu Timuralp published an excellent guide on integrating NSX-T 2.5 with K8s. If you haven’t read it already I strongly recommend that you have a look at it. The guide goes through every step of configuring the integration and does a great job explaining the architecture and components that make up this solution.

    Today’s article is a quick walkthrough of my NSX-T integrated K8s lab which is based on Dumlu’s guide.

    Bill of materials

    The following components are used in my NSX-T – K8s lab:

    1. vSphere 6.7 U3
    2. NSX-T 2.5.1
    3. Ubuntu 18.04
    4. Docker CE 18.06
    5. Kubernetes 1.16

    The lab environment

    The starting point before setting up the K8s integration:

    A standard vSphere platform consisting of a couple of ESXi hosts and a vCenter server. NSX-T has been deployed and an overlay transport zone has been configured.

    On the logical network side of things I have a very basic setup with just a Tier-0 gateway for the North-South connectivity.

    The above infrastructure is pretty much always in place and mostly left untouched. The components for the NSX-T – K8s integration are connected to this existing infrastructure. Let’s have a look at how that’s done.

    NSX-T constructs

    A couple of NSX-T constructs are needed for the K8s integration:

    • Tier-1 gateway for K8s node management
    • Segment for K8s node management
    • Segment for K8s node data plane
    • IP block for K8s namespaces
    • IP block for K8s namespaces not doing source NAT
    • IP pool for K8s Ingress or LoadBalancer service type
    • IP pool for source NATing K8s Pods in the namespaces
    • Two distributed firewall policies

    Placing the components on the diagram for some clarity:

    Nothing too complex, but creating and configuring this by hand takes some time. Especially when doing this many times, which is not uncommon in my lab, it gets boring.

    Luckily, the NSX-T hierarchical policy API helps me out here. I simply specify the desired topology and its configuration as a piece of code and then tell the API to create it for me.

    So here’s the JSON-code for the topology and components above. If you want to use it yourself make sure that you change the values for:

    • tier0_path – the path to your Tier-0 gateway
    • transport_zone_path – the path to your overlay transport zone

    I send this code as the body of a PATCH request to:

    PATCH https://<nsx-mgr>/policy/api/v1/infra

    And in a matter of seconds the components are in place.

    Ubuntu VMs

    On the compute side my K8s cluster consists of three Ubuntu VMs: A master and two worker nodes. Each VM is configured with two NICs where one connects to the “k8s-nodetransport” segment and the other to the “k8s-nodemanagement” segment:

    To get these three VMs up and running as quick as possible I built a vApp and stored it as a template in a vSphere content library:

    Each of the VMs in this vApp template is pre-configured as follows:

    • Hostname
    • IP stack on the mgt NIC
    • Persistent storage directories
    • Python
    • Docker
    • Kubernetes (installed not initialized)
    • NSX Container Plug-in installation files
    • NSX Container Plug-in container image loaded to the local Docker repository

    K8s cluster

    Once the vApp is deployed the first thing I do is to initialize the K8s cluster:

    k8s-master:~$ sudo kubeadm init

    The two worker nodes are joined to the cluster. For example:

    k8s-worker1:~$ sudo kubeadm join 10.190.22.10:6443 --token 8xlrqd.uuvi16c7bgacxihe --discovery-token-ca-cert-hash sha256:5ef8bae3ea509e9605bef2a931f0eeccce40da8ae857174df35fa9fd17d54371

    At this point “kubectl get nodes” shows me:

    kubectl get nodes
    NAME          STATUS     ROLES    AGE     VERSION
    k8s-master    NotReady   master   2m26s   v1.16.4
    k8s-worker1   NotReady            67s     v1.16.4
    k8s-worker2   NotReady            15s     v1.16.4

    Without a CNI plug-in installed the “NotReady” status is expected.

    NSX container plug-in

    Before installing NCP I need to tag the three segment ports of the “k8s-nodetransport” segment as follows:

    ScopeTag (k8s-master)Tag (k8s-worker1)Tag (k8s-worker2)
    ncp/node_namek8s-masterk8s-worker1k8s-worker2
    ncp/clusterk8s-clusterk8s-clusterk8s-cluster

    The ubuntu-ncp.yaml manifest that deploys NCP is already prepared for my lab environment. If you want to use it make sure you change the values for the following settings so that they match your environment:

    • nsx_api_managers
    • nsx_api_user
    • nsx_api_password
    • overlay_tz
    • tier0_gateway

    The manifest is aligned with the JSON that I use to create the NSX-T components.

    Installing the NSX container plugin from the master node by running:

    kubectl apply -f ncp-ubuntu.yaml

    After a minute or two the pods are running in their own “nsx-system” namespace:

    kubectl get pods -n nsx-system
    NAME                       READY   STATUS    RESTARTS   AGE
    nsx-ncp-6978b9cb69-899q8   1/1     Running   0          2m8s
    nsx-ncp-bootstrap-8879t    1/1     Running   0          2m8s
    nsx-ncp-bootstrap-xlnqh    1/1     Running   0          2m8s
    nsx-ncp-bootstrap-zqxh6    1/1     Running   0          2m8s
    nsx-node-agent-7twld       3/3     Running   0          2m8s
    nsx-node-agent-9n64w       3/3     Running   0          2m8s
    nsx-node-agent-jww7g       3/3     Running   0          2m8s

    The node status has changed to “Ready” now that NCP is installed:

    NAME          STATUS   ROLES    AGE   VERSION
    k8s-master    Ready    master   1h   v1.16.4
    k8s-worker1   Ready             1h   v1.16.4
    k8s-worker2   Ready             1h   v1.16.4

    Step 5 – Deploy a workload

    To have something to play around with I deploy a containerized WordPress in my K8s cluster. Here are the yaml files that I use to deploy WordPress in case you want to set this up yourself.

    First I create a separate namespace for the workload:

    kubectl create -f namespace.yaml

    Next, I deploy WordPress in this namespace:

    kubectl apply -k ./ -n wp

    Running a “kubectl get pods -n wp” shows me something like this:

    kubectl get pods -n wp
    NAME                               READY   STATUS    RESTARTS   AGE
    wordpress-55ddbf6d75-7zjc8         1/1     Running   1          109s
    wordpress-mysql-78dddb6bf7-n8pvn   1/1     Running   0          109s

    Running “kubectl get service -n wp” shows the external IP that is assigned by NSX-T from the “k8s-lb-pool”:

    kubectl get service -n wp
     NAME              TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)        AGE
     wordpress         LoadBalancer   10.101.4.78   10.190.10.51   80:30008/TCP   3m38s
     wordpress-mysql   ClusterIP      None                   3306/TCP       3m38s

    And browsing to “10.190.10.51” brings up a familiar page:

    NSX-T container networking is operational. Happy blogging! 🙂

    Summary

    No rocket science here, but using the NSX-T hierarchical policy API is a time saver and so are vApp templates and yaml manifests. Put something like Ansible on top of this and you’re looking at a fully automated K8s with NSX-T deployment.

    Hopefully this post inspires or maybe even helps you setting up your own NSX-T – K8s integration. It’s a pretty awesome solution and one I plan on covering in future posts as I learn more about it myself.

    Stay tuned!

  • NSX-T Distributed Firewall Threshold Monitoring

    Like any other firewall the NSX-T Distributed Firewall (DFW) consumes memory and CPU. Unlike other firewalls the DFW’s resource consumption is distributed, taking place on the transport nodes where the workloads it protects reside.

    Memory allocation

    An ESXi transport node allocates a fixed amount of memory for the different DFW components. The amount of memory allocated depends on the total amount of RAM installed. For an ESXi host with 128GB RAM or more the allocation looks like this (NSX-T version 2.5):

    DFW ComponentDescriptionMemory Max Size (MB)
    vsip-attr Stores additional attributes used by the L7 context engine 1024
    vsip-flow Stores flow monitoring data 768
    vsip-fqdn Stores resolved FQDN addresses 512
    vsip-module Memory allocated to the vsip kernel process 2560
    vsip-rules Stores DFW rules, address sets and containers 3070
    vsip-si Memory allocated to the service insertion architecture 128
    vsip-state Stores DFW state (existing connections/connection table) 512

    Thresholds

    For both DFW memory and CPU usage the default threshold is set at 90%. You can see thresholds and current resource usage by running the “nsxcli -c get firewall thresholds” command on an ESXi transport node:

    A similar command can be used from an NSX Manager node: “on <transport-node-id> exec get firewall thresholds“.

    It’s nice that we can monitor the DFW resource usage on a per transport node basis, but in most environments this method isn’t very practical.

    In today’s article I want to have a look at two things concerning DFW resource monitoring. Firstly, at how to configure custom thresholds for memory and CPU usage. Secondly, at how to set up central threshold monitoring with alerting.

    Configuring custom DFW thresholds

    Below are the steps at a high level for configuring custom DFW thresholds:

    1. Create an NSGroup containing transport nodes
    2. Create a threshold profile
    3. Apply threshold profile
    4. Verify

    Time to get our hands dirty!

    Step 1 – Create an NSGroup containing transport nodes

    We need to group our transport nodes. Currently only NSGroups, the ones managed by the MP API, support having transport nodes as members.

    NSGroups are managed under Advanced Networking & Security > Inventory > Groups. I’m creating an NSGroup called “esxi-tn” with a membership criteria that will add all the host transport nodes as members:

    Copy the NSGroup ID to a text file as we need it at step 3:

    Step 2 – Create a threshold profile

    Using a REST API client we’re going to make a POST request to the NSX MP API:

    POST https://{{nsx-manager-fqdn}}/api/v1/firewall/profiles

    The request body contains the following piece of JSON code:

    {
         "cpu_threshold_percentage" : 75,
         "display_name" : "ESXi DFW Threshold Profile"
         "mem_threshold_percentage" : 75,
         "resource_type" : "FirewallCpuMemThresholdsProfile"
     }

    The values for “cpu_threshold_percentage” and “mem_threshold_percentage” will depend on your requirements. For this exercise I’m configuring a threshold at 75% for both memory and CPU usage.

    The POST request body and the result:

    profile post result

    Copy the threshold profile’s ID from the result to a text file as we need it in the next step.

    Step 3 – Apply threshold profile

    The second API call configures a service-config that links the threshold profile to the NSGroup:

    POST https://{{nsx-manager-fqdn}}/api/v1/service-configs

    With the following JSON code as the request body:

    {
         "display_name":"DfwCpuMemServiceConfig",
         "profiles":[
                 {
                     "profile_type":"FirewallCpuMemThresholdsProfile",
                     "target_id":"c4a003e0-468d-4582-a24e-ada96742f0ca"
                 }
             ],
         "precedence": 10,
         "applied_to": [
             {
                 "target_id":"744db991-e504-4e4b-83eb-c60b94a7f785",
                 "target_type" : "NSGroup"
             }
         ]
     }

    The threshold profile and the NSGroup IDs that we copied to a text file earlier are used as the values for the two target_ids.

    The POST request body and the result:

    Step 4 – Verify

    An easy way to verify that the new DFW thresholds have been applied is to run the “get firewall thresholds” NSXCLI command. This time I’ll run it from an NSX Manager node:

    As we see the new threshold value of 75% has been applied.

    Setting up alerting

    You might wonder what actually happens when a threshold is crossed? Currently there’s no alarming framework in NSX-T so the only thing that happens is that a threshold event is logged to syslog.

    Luckily there’s always vRealize Log Insight. Configured as a syslog target for the NSX-T platform, DFW threshold events end up there too:

    A quick look at a DFW threshold event. We see things like the transport node, the DFW component that crossed the threshold, as well as the configured threshold and the current usage.

    Now that we know what a threshold event looks like, it’s easy to configure an alert based on the query in vRLI:

    text contains “threshold event is raised”

    Click on “Create Alert from Query”:

    Fill out the details for the new alert:

    And that’s it. From now on you’ll be notified each time a DFW threshold is crossed.

    Summary

    Configuring custom DFW thresholds and monitoring these with Log Insight isn’t too hard to set up.
    It’s true that with a proper DFW design and by sticking to good practices for implementation, problems related to DFW memory or CPU usage are rare. That being said, it’s not a bad idea to keep an eye on the DFW’s resource utilization. Just in case.

  • Locking NSX-T Firewall Policies

    After receiving a couple questions about the NSX-T firewall policy locking feature, I decided to write a short blog post about it.

    The purpose of locking a firewall policy

    The easy part first. As explained in the official NSX-T documentation we lock a firewall policy to prevent multiple users from editing the same section.

    Locks could be short term like when a team is working in the NSX Manager firewall UI at the same time and want to avoid configuration collisions. Locks could also be long(er) term. For example when somebody is tasked with building a more complex firewall ruleset or when policies are subject to change management.

    Let’s start locking then!

    Here’s where it can get a bit confusing. While the option to lock a policy is always available, it won’t have any effect until you implement and use Role Based Access Control (RBAC) for NSX-T management. Why?

    The default “admin” account, which is the only account you can work with in the NSX Manager UI without RBAC, has the “Enterprise Admin” role assigned to it. This superuser role has permission to make changes to firewall policies even when they are locked.

    adding new rule in locked section as ea

    So, if your team is using this default account (very bad practice) or individual accounts with the “Enterprise Admin” role assigned, you can lock firewall policies all you want, but these locks won’t have any actual effect.

    Let’s fix this then!

    Yes. As said this requires that we implement and use RBAC for NSX-T management first. There’s documentation available that will help you set this up so I won’t go through that in this article. On a high level the process looks like this:

    1. Deploy vIDM
    2. Connect vIDM to Active Directory
    3. Configure remote app access for the NSX Manager
    4. Configure NSX Manager to use vIDM for AAA

    When that’s done we can start assigning NSX-T roles to Active Directory users:

    role assignment

    Example

    In this example I’m assigning the “Security Engineer” role to two AD users:

    The two security engineers have been configured:

    security engineers

    Let’s pretend “jsmith@demo.local” logs in to the NSX Manager UI and starts working on a new DFW policy:

    john's dfw policy

    When called into a meeting jsmith locks the policy he’s working on to prevent anybody from making changes:

    Next, the other security engineer “pgroot@demo.local” logs in to the NSX Manager UI. She has a look at the new policy and decides to make a minor change to it. When she tries to publish the change the following message appears:

    The change can’t be realized with her account. This is the expected and desired behaviour. The policy lock is enforced with RBAC implemented.

    Summary

    While most organizations have the RBAC components for NSX-T management in place (vIDM, AD, etc), actually leveraging NSX-T management roles so that things like locking firewall policies work is perhaps another thing. Hopefully this short article gave you some better understanding of how to get started.

  • Tier-1 Failure Domain

    With every new release of NSX-T interesting features are added to the platform. Take failure domain for example.

    Introduced in version 2.5, failure domain adds another layer of protection for the centralized services running on Tier-1 Gateways. It basically facilitates a rack aware placement mechanism for the Tier-1 service router (SR) components.

    In today’s article I’m going to do a simple failure domain proof of concept. I’ll walk through the configuration steps for setting up failure domain and verify its functionality.

    The lab environment

    For this exercise I installed a vSphere cluster consisting of four ESXi hosts divided over two racks. I’m calling these the Edge racks and made this very advanced diagram:

    rack diagram with esxi

    I then deployed four NSX-T Edge nodes (EN1 – EN4), one on each host, and added these to NSX-T Edge Cluster “T0 Cluster ECMP”:

    rack diagram with tier-0 cluster

    I threw in a Tier-0 Gateway called “T0-01” which is running in Active-Active HA mode with ECMP enabled. The Tier-0’s 8 uplinks are all taking part in forwarding North-South traffic, simultaneously:

    rack diagram with tier-1

    Finally, I deployed four more Edge nodes (EN5 – EN8), one on each host, and added these to Edge cluster “T1 Cluster”:

    rack diagram with tier-1 cluster

    The eight Edge nodes in the NSX Manager UI:

    edge nodes

    Next step – Create Tier-1s

    I will create Tier-1 Routers (Manager API) as opposed to Tier-1 Gateways (Policy API). This because the API call to trigger a Tier-1 SR reallocate I want to run later on only works on Tier-1 Routers. This has nothing to do with the failure domain feature itself which is compatible with both Tier-1 Routers and Tier-1 Gateways of course.

    Configuring the first Tier-1 called “T1-01”:

    t1-01

    I’m selecting the “T1 Cluster” Edge Cluster and no specific Edge Cluster Members.

    Both of the Tier-1s and the Tier-0 listed in the NSX Manager UI:

    tier-1 created

    Tier-1 service routers

    Selecting an Edge cluster for a Tier-1 indicates that you intend to run one or more centralized services on that Tier-1. This means that one active and one standby service router (SR) are instantiated on two different Edge nodes in that cluster (a Tier-1 SR always runs in Active-Standby HA mode).

    By the way, you should not select an Edge Cluster for a Tier-1 if you don’t intend to run centralized services on it as this can lead to unintended hairpinning of traffic over the Edge nodes.

    You noticed that I didn’t specify any Edge Cluster Members for the SRs. This results in the management plane picking them for me. So where did they end up?

    Clicking the Active-Standby link for each of the Tier-1 Routers reveals the SRs location. “T1-01” has its active SR on Edge node EN7 and its standby SR on Edge node EN5:

    t1-01 sr location

    “T1-02” has its active SR on Edge node EN8 and its standby SR on Edge node EN6:

    t1-02 sr location

    Fine. Let’s have a look at the Edge rack again now that we have introduced these Tier-1 SRs to the environment:

    rack diagram with tier-1

    My two Tier-1s are in separate racks. Great! Or is it?
    With the current Tier-1 SR placement a single Edge rack failure will result in one of the Tier-1 Routers losing both its active and the standby SR. That’s pretty bad.

    Failure Domain

    Failure domain prevents this silly SR placement from happening. Correctly configured, failure domain ensures that the active and standby SRs of a Tier-1 are always placed in different racks.

    Sounds great. Time set this up.

    Step 1 – Create two failure domains

    Failure domains are created using a POST request to the NSX API at:

    POST https://{{nsx-manager-fqdn}}/api/v1/failure-domains/

    The request body for my first failure domain contains the following piece of JSON code:

    {   
    "display_name": "Rack-1"
    } 

    The JSON code for my second failure domain:

    {   
    "display_name": "Rack-2"
    } 

    Creating the first failure domain using Postman:

    create failure domain postman

    Copy the value for “id” from the request result for each of the failure domains as we need these in the next step.

    Step 2 – Assign Edge nodes to failure domains

    The Edge nodes in the “T1 Cluster” need to be assigned to their respective failure domains. This too is done through an API call to the Manager API.

    For each Edge node we first retrieve its current configuration using the following GET request:

    GET https://{{nsx-manager-fqdn}}/api/v1/transport-nodes/{{edge-node-id}}

    You can find the ID of an Edge node in the NSX Manager UI (or via API):

    edge node id

    The GET request for Edge node EN5:

    get en5

    Copy the request result to the body of a new PUT request and change the value for “failure_domain_id” to match the ID of one of the newly created failure domains.

    PUT https://{{nsx-manager-fqdn}}/api/v1/transport-nodes/{{edge-node-id}}

    Which failure domain ID to use depends on the rack location of the Edge node. The following table lists the failure domain plan for my Tier-1 Edge nodes:

    Edge NodeFailure DomainFailure Domain ID
    EN5Rack-17e1af661-8e2c-43f7-924f-68eabce0f40b
    EN6Rack-2d78707df-2f7f-48a9-9e3e-98a5523901c7
    EN7Rack-17e1af661-8e2c-43f7-924f-68eabce0f40b
    EN8Rack-2d78707df-2f7f-48a9-9e3e-98a5523901c7

    Four GET/PUT requests later the Edge nodes have been assigned to the correct failure domains.

    Step 3 – Configure the Edge Cluster

    The “T1 Cluster” Edge Cluster needs to be configured for failure domain based placement. This is also done via the API.

    First a GET request to retrieve the current configuration of the Edge Cluster:

    GET https://{{nsx-manager-fqdn}}/api/v1/edge-clusters/{{edge-cluster-id}}

    The “edge-cluster-id” can be found in the NSX Manager UI (or via API):

    edge cluster id

    The GET request’s result in JSON:

    get edge cluster

    Again, you copy the request result to the body of a new PUT request. The only thing that we need to change here is the value for “allocation_rules

    from:

    "allocation_rules": [],

    to:

    "allocation_rules": [ {"action": {"enabled": true,"action_type": "AllocationBasedOnFailureDomain" } } ],

    Send the PUT request to:

    PUT https://{{nsx-manager-fqdn}}/api/v1/edge-clusters/{{edge-cluster-id}}

    And we’re done. From now on this Edge Cluster will perform failure domain based placement for new Tier-1 SRs.

    A new Tier-1

    Let’s put this to the test immediately by creating a new Tier-1.

    Here comes “T1-03”:

    t1-03

    Once again I’m selecting the “T1 Cluster” Edge cluster and no specific Edge nodes (= Auto Allocated). So where did the management plane decide to place the SRs this time?

    t1-03 placement

    The Active SR is on EN5 and the standby SR on EN6. They indeed ended up in separate racks!

    Existing Tier-1s

    What about the Tier-1 SRs that were deployed before we configured failure domains? Can we trigger a reallocation so that they too are placed in accordance to the new failure domain configuration?

    It turns out that we can, but it’s a data plane disruptive operation and, as far as I know, only works for Tier-1s created through the manager API (or in the UI under Advanced Networking & Security). Thank you Gary Hills for letting us know that the reallocate API call works for Tier-1s created in Policy UI/API as well by adding a header to the below request with key “X-Allow-Overwrite” and value of “true”:

    X-Allow-Overwrite

    A POST request on each of the existing Tier-1s will do the trick:

    POST https://{{nsx-manager-fqdn}}/api/v1/logical-routers/{{logical-router-id}}?action=reallocate

    The request body should contain the following JSON:

    {
       "edge_cluster_id": "{{edge-cluster-id}}"
    }

    The values for “logical-router-id” and “edge-cluster-id” can be found in the NSX Manager UI (or via API).

    Request accepted by the API:

    reallocate in postman

    A reallocation process now takes places behind the scenes. A few moments later we see that the active and standby SRs of the existing Tier-1s are now in separate racks:

    after tier-1 reallocation
    after tier-1 reallocation

    Let’s have a last look at the Edge rack after implementation and enforcement of the failure domains:

    tier-1 failure domain implemented

    Looks so much better now!

    Summary

    Today we had a look at how to set up Tier-1 failure domain in NSX-T 2.5. The goal was to ensure that active and standby Tier-1 SRs ended up in separate racks.

    Failure domain is a pretty cool and useful new feature adding extra protection for the Tier-1 SRs. Currently configurable via the API only, but that process was straight forward. With just a couple of request we got failure domains up and running.

    Whether Tier-1 failure domain makes sense in your environment will depend on your NSX Edge design, number of Edge nodes, and things like future growth.

    Good luck!

  • Bulk Create NSX-T Segments Using A Postman Data File

    Imagine this, you’ve been tasked with implementing micro-segmentation in your vSphere environment. You just deployed and configured NSX-T and the next step is to migrate VMs from their VDS port groups to N-VDS segments.

    You fire up the vSphere Client and expand the VDS to have a look at the current situation:

    vds port groups

    It’s pretty bad.

    Turns out your VMs are connected to no less than 784 different port groups! Overlay networking/consolidation and re-IP are currently not part of the plan so you’re stuck with these 784 VLANs. You now realize that you need to create 784 VLAN backed segments in NSX-T. Life sucks.

    Postman to the rescue

    In today’s short post I want to share an easy way that can help you out in a scenario like the one above. It involves the NSX-T Policy API, Postman, and a text file. Let’s go!

    Step 1 – Prepare the CSV file

    First we need to create a simple text file that contains values for the NSX-T segments and their corresponding VLAN IDs. The format of the comma separated text file is as follows:

    segment_name, vlan_id
    vlan-1000, 1000
    vlan-1001, 1001
    vlan-1002, 1002
    vlan-1003, 1003
    ....
    ....
    vlan-1783, 1783

    For “segment_name” you use whatever fits your naming convention. I’m saving this file as “segments.csv”:

    segments.csv

    Step 2 – Prepare the Postman request

    We’re going to leverage the NSX-T Hierarchical API to create these segments by making a PATCH request to:

    https://{{nsx-manager-fqdn}}/policy/api/v1/infra

    Only a small piece of JSON code is needed in the request body:

    {
     "resource_type": "Infra", "children": [{
     "resource_type": "ChildSegment", "marked_for_delete": "false", "Segment": {
     "resource_type": "Segment",
     "type": "DISCONNECTED",
     "id": "{{segment_name}}",
     "display_name": "{{segment_name}}",
      "vlan_ids": [
             "{{vlan_id}}"
           ],
     "path": "/infra/segments/{{segment_name}}",
     "relative_path": "{{segment_name}}",
     "parent_path": "/infra/segments/{{segment_name}}",
     "transport_zone_path": "/infra/sites/default/enforcement-points/default/transport-zones/e82afbae-c811-48e1-8946-6e1f62b67871"
     } }]
     }

    As you can see the variables “{{segment_name}}” and “{{vlan_id}}” are used a couple of times in this piece of code. Their values will be fetched from the matching columns in the ”segments.csv”.

    The value for “transport_zone_path” is unique in every NSX-T deployment. You can easily find the ID of your VLAN transport zone in the NSX Manager UI under System > Configuration > Fabric > Transport Zones:

    tz id

    Putting it all together the Postman request will look like this:

    postman request

    I’m saving this request as “Create NSX-T Segments with data file” in a new collection folder called “NSX-T”.

    Step 3 – Start the Postman Runner

    Click the Runner button to start the collection runner:

    collection runner

    In the next screen you select the saved request:

    select request to run

    We need to configure a couple of things for this run. The table below lists my settings:

    SettingValueComment
    EnvironmentYour NSX-T environment Have a look at this post for more information about working with Postman environments.
    Iterations784We have 784 segments in our data file.
    Datasegments.csvThe data file.
    runner settings

    After selecting your data file you can click the Preview button just to verify that Postman is interpreting the data correctly:

    preview

    Looks pretty good to me. Time to press the big button:

    run nsx-t

    Running these 784 iterations will take a couple of minutes. You can monitor the progress in the “Run Results” screen:

    runner progress

    Notice the “200 OK” status for each iteration which is the NSX-T API’s response to the requests and means it was processed successfully.

    Once the Runner is finished it’s time to have a look in the NSX Manager UI under Networking > Connectivity > Segments to see if new segments have been created:

    segments

    That certainly seems to be the case. All of the 784 VLAN backed segments are there and configured with the correct transport zone and VLAN ID:

    segment detail

    Summary

    Bulk creating or modifying NSX-T objects can be done in a number of different ways. If coding is your thing you’ll probably have little trouble putting together a tool for this using your preferred language. If you’re more into scripting you can use something like PowerShell. And if you like to work really slow you can always turn to the NSX Manager UI.

    For everybody else there’s Postman. Using this tool in combination with data files offers an easy and quick way for creating or modifying large amounts of NSX-T objects.

    You can read more about the NSX-T Policy API in the NSX Policy API: Getting Started Guide. To learn more about working with Postman data files check out this tutorial.

    Have fun!

  • Welcome back! I’m in the process of setting up NSX-T in a stretched cluster environment.

    In part 1 I deployed the NSX manager cluster and configured the ESXi hosts as NSX transport nodes. The N-VDS was installed on the ESXi hosts and their vmkernel adapters migrated from the VDS to the N-VDS.

    In this second part I will configure the NSX data plane for north-south and east-west networking. Again, there’s a lot to do so let’s begin!

    The lab environment

    A couple of things happened since the last time I had a look at the lab environment’s diagram:

    The vSphere management cluster is now also hosting an NSX manager cluster and the ESXi hosts turned into NSX-T transport nodes.

    Speaking of ESXi hosts, here’s a little closer look at one of them:

    There’s now an N-VDS instead of a VDS with the three vmkernel adapters Management, vMotion, and vSAN. There are also two new vmkernel adapters which are acting as tunnel endpoints (TEPs) for the NSX overlay networking (geneve encapsulation/decapsulation).

    The infrastructure for east-west networking is largely in place, but without a north-south network path this cluster is pretty isolated.

    NSX Edge

    The NSX Edge provides a central entrance/exit point for network traffic entering and exiting the SDDC and is exactly what this environment needs.

    Deploy edge VMs

    I’m deploying a total of four edge VMs (two at each site). I’ll deploy them using the Edge VM OVA package so that I can connect the edge node’s management interface to the NSX-T segment at the time of deployment.

    The table below contains the deployment details for the edge VMs:

    Settingen01-aen01-ben02-aen02-b
    Nameen01-aen01-ben02-aen02-b
    Network 0site-a-nvds01-managementsite-b-nvds01-managementsite-a-nvds01-managementsite-b-nvds01-management
    Network 1edge-uplink1edge-uplink1edge-uplink1edge-uplink1
    Network 2edge-uplink2edge-uplink2edge-uplink2edge-uplink2
    Network 3not usednot usednot usednot used
    Mgmt IP172.16.41.21/24172.16.51.21/24172.16.41.22/24172.16.51.22/24

    Deploying the edge VM using the OVA package:

    ovf edge vm deployment

    Configure edge nodes

    After deployment the edge nodes need to join the management plane. For this I use the “join management-plane” NSX CLI command:

    cli join

    Once he edge nodes have joined the management plane, I can pick them up in the NSX Manager UI to configure each of them as Edge Transport Nodes. I’m using the following configuration details for this :

    Settingen01-aen01-ben02-aen02-b
    Transport Zonestz-vlan, tz-overlaytz-vlan, tz-overlaytz-vlan, tz-overlaytz-vlan, tz-overlay
    N-VDS Namenvds01nvds01nvds01nvds01
    Uplink Profileup-site-a-edgeup-site-b-edgeup-site-a-edgeup-site-b-edge
    IP AssignmentUse Static IP ListUse Static IP ListUse Static IP ListUse Static IP List
    Static IP List172.16.49.30,172.16.49.31172.16.59.30,172.16.59.31172.16.49.32,172.16.49.33172.16.59.32,172.16.59.33
    Virtual NICsfp-eth0 – uplink-1,
    fp-eth1 – uplink-2
    fp-eth0 – uplink-1,
    fp-eth1 – uplink-2
    fp-eth0 – uplink-1,
    fp-eth1 – uplink-2
    fp-eth0 – uplink-1,
    fp-eth1 – uplink-2

    Edge transport nodes are managed under System > Fabric > Nodes > Edge Transport Nodes.

    en01-a transport node configuration

    Like the ESXi hosts, all four edge nodes are now fully configured transport nodes:

    edge transport nodes

    Edge cluster

    The edge transport nodes need to be part of an edge cluster. I will create an edge cluster called edge-cluster01 and add all four nodes to this cluster.

    Edge clusters are managed under System > Fabric > Nodes > Edge Clusters:

    Anti-affinity rules

    The edge VMs shouldn’t be running on the same ESXi host. To prevent this from happening I create two anti-affinity rules on the vSphere cluster; one for the edge VMs at Site A and another for the edge VMs at Site B:

    vm/host rule

    Groups and rules

    The edge VMs should also stick to their site. For this I create two host and a two VM groups. A “virtual machine to host” rule will then make sure that the edge VMs stay pinned to their respective site.

    The host group for Site A:

    host group

    The VM group for the edge VMs at Site B:

    vm group

    The “virtual machine to host” rule keeping edge VMs belonging to Site A on the ESXi hosts of Site A:

    vm to host rule

    The result of having these groups and rules in place becomes visible after some seconds. Edge VMs are running at the correct site and on seperate ESXi hosts within a site:

    correctly placed VMs

    That pretty much completes the NSX Edge infrastructure deployment in my stretched cluster.

    Routing

    Now that the NSX-T Edge is in place, it’s time to set up a connection with the physical network so that packets can actually get in and out of the environment.

    Tier-0 gateway

    A Tier-0 gateway provides the gateway service between the logical and the physical network and is just what I need.

    I’m creating my Tier-0 gateway with the following configuration details:

    SettingValue
    Nametier0-01
    High Availability ModeActive-Active
    Edge Clusteredge-cluster01
    Route Re-Distributionall

    Tier-0 gateways are managed under Networking > Connectivity > Tier-0 Gateways.

    tier-0 gateway

    Interfaces

    This Tier-0 will have eight external interfaces mapped to the different edge transport nodes at the two sites. The table below shows the interfaces and their configuration details:

    NameIP Address / MaskConnected ToEdge NodeMTU
    en01-a-uplink01172.16.47.2/24site-a-edge-transit01en01-a9000
    en01-a-uplink02172.16.48.2/24site-a-edge-transit02en01-a9000
    en02-a-uplink01172.16.47.3/24site-a-edge-transit01en02-a9000
    en02-a-uplink02172.16.48.3/24site-a-edge-transit02en02-a9000
    en01-b-uplink01172.16.57.2/24site-b-edge-transit01en01-b9000
    en01-b-uplink02172.16.58.2/24site-b-edge-transit02en01-b9000
    en02-b-uplink01172.16.57.3/24site-b-edge-transit01en02-b9000
    en02-b-uplink02172.16.58.3/24site-b-edge-transit02en02-b9000

    The Tier-0 external interfaces are now configured and active:

    tier-0 interfaces

    BGP

    The TORs have been configured for BGP already and now I need to set up BGP at the Tier-0 gateway too.

    The BGP settings that I will use on the Tier-0 gateway are:

    SettingValue
    Local AS65000
    BGPOn
    Graceful RestartOff
    Inter SR iBGPOn
    ECMPOn
    Multipath RelaxOn

    Configuring BGP details on the Tier-0 gateway:

    I’m adding each TOR as a BGP neighbor to the Tier-0 gateway. The following table shows the configuration details for the four BGP neighbor entries:

    IP addressBFDRemote ASHold DownKeep Alive
    172.16.47.1Enabled65001124
    172.16.48.1Enabled65001124
    172.16.57.1Enabled65002124
    172.16.58.1Enabled65002124

    The BGP neighbor status after the four TORs are added:

    bgp nrighbors

    Route map

    To prevent asymmetric traffic flows, the NSX Edge infrastructure at Site A should be the preferred ingress/egress point for the north-south traffic.

    I achieve this by AS path prepending on the BGP paths to Site B. This is configured in a route map on the Tier-0 gateway.

    First I need to create an IP prefix list. Both IP prefix lists and route maps are managed on the Tier-0 gateways under Routing:

    route maps

    The details of the IP prefix list:

    SettingValue
    Nameany-prefix
    Networkany
    ActionPermit

    The details of the route map:

    SettingValue
    Route Map Namesiteb-route-map
    TypeIP Prefix
    Membersany-prefix
    AS path prepend65000 65000

    The route map needs to be attached to the BGP neighbor entries belonging to Site B. I configure the route map as Out Filter and In Filter:

    route map out filter

    The Site B neighbors now have filters configured:

    filters configured for site b

    This completes the Tier-0 gateway deployment.

    Diagram

    I’m just taking a step back to have a look at what it is I actually did here.

    The diagram below shows the Tier-0 gateway’s L3 connectivity with the physical network:

    tier-0 bgp

    It’s a pretty wild diagram I’m aware, but hopefully it makes some sense.

    East-West

    The Tier-1 gateway is where the NSX-T segments for virtual machine networking will be connected. The Tier-1 gateway is linked to the Tier-0 gateway too, of course.

    I’m creating a Tier-1 gateway with the following configuration details:

    SettingValue
    Nametier1-01
    Linked Tier-0 Gatewaytier0-01
    Fail OverNon Preemptive
    Edge Clusteredge-cluster01
    Route Advertisementall

    Tier-1 gateways are managed under Networking > Connectivity > Tier-1 Gateways.

    tier-1 gateway

    Workload segments

    With the Tier-1 gateway in place I can now attach some NSX-T segments for the workloads (VMs).

    I’m creating three segments Web, App, and DB with the following configuration details:

    SettingValue
    Connected Gateway & Typetier1-01, flexible
    Transport Zonetz-overlay
    Subnets (gateway)10.0.1.1/24 (Web), 10.0.2.1 (App), 10.0.3.1 (DB)

    Creating the segments:

    segments

    I notice that downlink ports have been created on the Tier-1 gateway:

    downlink ports

    Provision VMs

    It’s all about the VMs of course. So I deploy three VMs web01, app01, and db01. They are connected to the segments.

    VM web01 connected to segment Web as seen at the N-VDS Visualization in the NSX Manager UI:

    web01

    Connectivity test

    Time to test connectivity.

    East-west

    First between the VMs which I place on different ESXi hosts and at different sites.

    web01 (10.0.1.10) at Site B pinging db01 (10.0.3.10) at Site A:

    web01 pings db01

    Visualized by the Port Connection tool in the NSX Manager UI:

    port connection

    app01 (10.0.2.10) at Site A pinging web01 at Site B:

    app01 pings web01

    Once again visualized by the Port Connection tool:

    port connection

    East-west and cross-site logical networking seems to be working!

    North-south

    How about north-south? Let’s see.

    db01 at Site A pinging a host on the physical network (10.2.129.86):

    db01 pings physical

    The Traceflow tool in the NSX Manager UI tells me a bit more about the network path. I can see that the traffic exits the SDDC through Site A (en02-a):

    traceflow

    The other way around a traceroute from the physical network to web01 at Site B:

    traceroute from physical

    Traffic entering the SDDC through Site A (en01-a). Perfect!

    Summary

    Wow! This has been quite an exercise. Are you still there? 😉

    It all started with deploying the NSX Edge (virtual) infrastructure. On top of that infrastructure I deployed a Tier-0 gateway and configured dynamic routing between the Tier-0 and the TORs.

    To facilitate for east-west distributed logical networking, I deployed a Tier-1 gateway and linked it to the Tier-0. I connected some NSX-T segments to the Tier-1 gateway and some virtual machines to the segments.

    Some simple connectivity testing showed that north-south and east-west networking were working as intended. Site A is consistently used for the north-south ingress/egress traffic flows thanks to the BGP AS prepending.

    Thanks for staying tuned this long. I hope this and the previous article about deploying NSX-T in a stretched cluster environment have been interesting reads. I might return to this environment for some more NSX-T multisite scenarios in future articles.

    Cheers!