• Kubernetes – NSX-T Lab

    A while back Dumlu Timuralp published an excellent guide on integrating NSX-T 2.5 with K8s. If you haven’t read it already I strongly recommend that you have a look at it. The guide goes through every step of configuring the integration and does a great job explaining the architecture and components that make up this solution.

    Today’s article is a quick walkthrough of my NSX-T integrated K8s lab which is based on Dumlu’s guide.

    Bill of materials

    The following components are used in my NSX-T – K8s lab:

    1. vSphere 6.7 U3
    2. NSX-T 2.5.1
    3. Ubuntu 18.04
    4. Docker CE 18.06
    5. Kubernetes 1.16

    The lab environment

    The starting point before setting up the K8s integration:

    A standard vSphere platform consisting of a couple of ESXi hosts and a vCenter server. NSX-T has been deployed and an overlay transport zone has been configured.

    On the logical network side of things I have a very basic setup with just a Tier-0 gateway for the North-South connectivity.

    The above infrastructure is pretty much always in place and mostly left untouched. The components for the NSX-T – K8s integration are connected to this existing infrastructure. Let’s have a look at how that’s done.

    NSX-T constructs

    A couple of NSX-T constructs are needed for the K8s integration:

    • Tier-1 gateway for K8s node management
    • Segment for K8s node management
    • Segment for K8s node data plane
    • IP block for K8s namespaces
    • IP block for K8s namespaces not doing source NAT
    • IP pool for K8s Ingress or LoadBalancer service type
    • IP pool for source NATing K8s Pods in the namespaces
    • Two distributed firewall policies

    Placing the components on the diagram for some clarity:

    Nothing too complex, but creating and configuring this by hand takes some time. Especially when doing this many times, which is not uncommon in my lab, it gets boring.

    Luckily, the NSX-T hierarchical policy API helps me out here. I simply specify the desired topology and its configuration as a piece of code and then tell the API to create it for me.

    So here’s the JSON code for the topology and components above. If you want to use it yourself, make sure that you change the values for:

    • tier0_path – the path to your Tier-0 gateway
    • transport_zone_path – the path to your overlay transport zone

    I send this code as the body of a PATCH request to:

    PATCH https://<nsx-mgr>/policy/api/v1/infra

    And in a matter of seconds the components are in place.
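
    To give an idea of what such a hierarchical body looks like, here is a minimal Python sketch that builds a declarative tree with one Tier-1 gateway and one segment and prepares the PATCH request. It is not my actual lab file: the Tier-1 ID, the gateway address, the manager hostname and the placeholder paths are assumptions for illustration, and the IP blocks, IP pools and firewall policies are left out. Authentication is omitted as well.

```python
import json
import urllib.request


def build_infra_body(tier0_path, transport_zone_path):
    """Build a hierarchical Policy API body with one Tier-1 and one segment."""
    tier1_id = "k8s-tier1"  # hypothetical ID, pick your own
    tier1 = {
        "resource_type": "ChildTier1",
        "Tier1": {
            "resource_type": "Tier1",
            "id": tier1_id,
            "tier0_path": tier0_path,
        },
    }
    segment = {
        "resource_type": "ChildSegment",
        "Segment": {
            "resource_type": "Segment",
            "id": "k8s-nodemanagement",
            "connectivity_path": "/infra/tier-1s/" + tier1_id,
            "transport_zone_path": transport_zone_path,
            # Assumed gateway address, adjust to your own addressing plan.
            "subnets": [{"gateway_address": "10.190.22.1/24"}],
        },
    }
    return {"resource_type": "Infra", "children": [tier1, segment]}


def build_patch_request(nsx_manager, body):
    """Prepare (but do not send) the PATCH request."""
    return urllib.request.Request(
        url="https://{}/policy/api/v1/infra".format(nsx_manager),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


body = build_infra_body(
    tier0_path="/infra/tier-0s/<your-tier0-id>",
    transport_zone_path="<path-to-your-overlay-transport-zone>",
)
request = build_patch_request("nsx-mgr.example.local", body)
print(request.get_method(), request.full_url)
print(json.dumps(body, indent=2))
```

    Because the whole tree is declared in one body, adding another segment or an IP block is a matter of appending another child object to the list.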

    Ubuntu VMs

    On the compute side my K8s cluster consists of three Ubuntu VMs: A master and two worker nodes. Each VM is configured with two NICs where one connects to the “k8s-nodetransport” segment and the other to the “k8s-nodemanagement” segment:

    To get these three VMs up and running as quickly as possible I built a vApp and stored it as a template in a vSphere content library:

    Each of the VMs in this vApp template is pre-configured as follows:

    • Hostname
    • IP stack on the mgt NIC
    • Persistent storage directories
    • Python
    • Docker
    • Kubernetes (installed, not initialized)
    • NSX Container Plug-in installation files
    • NSX Container Plug-in container image loaded to the local Docker repository

    K8s cluster

    Once the vApp is deployed the first thing I do is to initialize the K8s cluster:

    k8s-master:~$ sudo kubeadm init

    The two worker nodes are joined to the cluster. For example:

    k8s-worker1:~$ sudo kubeadm join 10.190.22.10:6443 --token 8xlrqd.uuvi16c7bgacxihe --discovery-token-ca-cert-hash sha256:5ef8bae3ea509e9605bef2a931f0eeccce40da8ae857174df35fa9fd17d54371

    At this point “kubectl get nodes” shows me:

    kubectl get nodes
    NAME          STATUS     ROLES    AGE     VERSION
    k8s-master    NotReady   master   2m26s   v1.16.4
    k8s-worker1   NotReady   <none>   67s     v1.16.4
    k8s-worker2   NotReady   <none>   15s     v1.16.4

    Without a CNI plug-in installed the “NotReady” status is expected.

    NSX container plug-in

    Before installing NCP I need to tag the three segment ports of the “k8s-nodetransport” segment as follows:

    Scope           Tag (k8s-master)   Tag (k8s-worker1)   Tag (k8s-worker2)
    ncp/node_name   k8s-master         k8s-worker1         k8s-worker2
    ncp/cluster     k8s-cluster        k8s-cluster         k8s-cluster
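
    The tagging scheme is regular enough to generate. The small Python sketch below builds the scope/tag pairs for each node's segment port; it only produces the data and does not call the NSX-T API. The helper name is my own.

```python
def node_port_tags(node_name, cluster_name):
    """Return the two scope/tag pairs for a node's segment port."""
    return [
        {"scope": "ncp/node_name", "tag": node_name},
        {"scope": "ncp/cluster", "tag": cluster_name},
    ]


nodes = ["k8s-master", "k8s-worker1", "k8s-worker2"]
tags_per_node = {node: node_port_tags(node, "k8s-cluster") for node in nodes}

for node, tags in tags_per_node.items():
    print(node, tags)
```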

    The ncp-ubuntu.yaml manifest that deploys NCP is already prepared for my lab environment. If you want to use it, make sure you change the values of the following settings so that they match your environment:

    • nsx_api_managers
    • nsx_api_user
    • nsx_api_password
    • overlay_tz
    • tier0_gateway

    The manifest is aligned with the JSON that I use to create the NSX-T components.

    I install the NSX Container Plug-in from the master node by running:

    kubectl apply -f ncp-ubuntu.yaml

    After a minute or two the pods are running in their own “nsx-system” namespace:

    kubectl get pods -n nsx-system
    NAME                       READY   STATUS    RESTARTS   AGE
    nsx-ncp-6978b9cb69-899q8   1/1     Running   0          2m8s
    nsx-ncp-bootstrap-8879t    1/1     Running   0          2m8s
    nsx-ncp-bootstrap-xlnqh    1/1     Running   0          2m8s
    nsx-ncp-bootstrap-zqxh6    1/1     Running   0          2m8s
    nsx-node-agent-7twld       3/3     Running   0          2m8s
    nsx-node-agent-9n64w       3/3     Running   0          2m8s
    nsx-node-agent-jww7g       3/3     Running   0          2m8s

    The node status has changed to “Ready” now that NCP is installed:

    NAME          STATUS   ROLES    AGE   VERSION
    k8s-master    Ready    master   1h    v1.16.4
    k8s-worker1   Ready    <none>   1h    v1.16.4
    k8s-worker2   Ready    <none>   1h    v1.16.4

    Deploy a workload

    To have something to play around with I deploy a containerized WordPress in my K8s cluster. Here are the yaml files that I use to deploy WordPress in case you want to set this up yourself.

    First I create a separate namespace for the workload:

    kubectl create -f namespace.yaml

    Next, I deploy WordPress in this namespace:

    kubectl apply -k ./ -n wp

    Running a “kubectl get pods -n wp” shows me something like this:

    kubectl get pods -n wp
    NAME                               READY   STATUS    RESTARTS   AGE
    wordpress-55ddbf6d75-7zjc8         1/1     Running   1          109s
    wordpress-mysql-78dddb6bf7-n8pvn   1/1     Running   0          109s

    Running “kubectl get service -n wp” shows the external IP that is assigned by NSX-T from the “k8s-lb-pool”:

    kubectl get service -n wp
    NAME              TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)        AGE
    wordpress         LoadBalancer   10.101.4.78   10.190.10.51   80:30008/TCP   3m38s
    wordpress-mysql   ClusterIP      None          <none>         3306/TCP       3m38s

    And browsing to “10.190.10.51” brings up a familiar page:

    NSX-T container networking is operational. Happy blogging! 🙂

    Summary

    No rocket science here, but using the NSX-T hierarchical policy API is a time saver and so are vApp templates and yaml manifests. Put something like Ansible on top of this and you’re looking at a fully automated K8s with NSX-T deployment.

    Hopefully this post inspires you, or maybe even helps you, to set up your own NSX-T – K8s integration. It’s a pretty awesome solution and one I plan on covering in future posts as I learn more about it myself.

    Stay tuned!

  • NSX-T Distributed Firewall Threshold Monitoring

    Like any other firewall the NSX-T Distributed Firewall (DFW) consumes memory and CPU. Unlike other firewalls the DFW’s resource consumption is distributed, taking place on the transport nodes where the workloads it protects reside.

    Memory allocation

    An ESXi transport node allocates a fixed amount of memory for the different DFW components. The amount of memory allocated depends on the total amount of RAM installed. For an ESXi host with 128GB RAM or more the allocation looks like this (NSX-T version 2.5):

    DFW Component   Description                                                  Memory Max Size (MB)
    vsip-attr       Stores additional attributes used by the L7 context engine   1024
    vsip-flow       Stores flow monitoring data                                  768
    vsip-fqdn       Stores resolved FQDN addresses                               512
    vsip-module     Memory allocated to the vsip kernel process                  2560
    vsip-rules      Stores DFW rules, address sets and containers                3070
    vsip-si         Memory allocated to the service insertion architecture       128
    vsip-state      Stores DFW state (existing connections/connection table)     512

    Thresholds

    For both DFW memory and CPU usage the default threshold is set at 90%. You can see thresholds and current resource usage by running the “nsxcli -c get firewall thresholds” command on an ESXi transport node:

    A similar command can be used from an NSX Manager node: “on <transport-node-id> exec get firewall thresholds”.
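
    To put the default threshold in perspective, the Python sketch below works out what 90% means in megabytes for each DFW component. It assumes the memory threshold is evaluated against each component's maximum size from the table above; that per-component reading is my interpretation, so treat the numbers as an illustration.

```python
# Maximum memory per DFW component in MB
# (ESXi host with 128 GB RAM or more, NSX-T 2.5).
MAX_SIZE_MB = {
    "vsip-attr": 1024,
    "vsip-flow": 768,
    "vsip-fqdn": 512,
    "vsip-module": 2560,
    "vsip-rules": 3070,
    "vsip-si": 128,
    "vsip-state": 512,
}


def threshold_mb(max_size_mb, threshold_percentage):
    """Memory usage in MB at which the given percentage is reached."""
    return max_size_mb * threshold_percentage / 100


default_thresholds = {
    name: threshold_mb(size, 90) for name, size in MAX_SIZE_MB.items()
}

for name, limit in default_thresholds.items():
    print("{:<12} {:>8.1f} MB".format(name, limit))
```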

    It’s nice that we can monitor the DFW resource usage on a per transport node basis, but in most environments this method isn’t very practical.

    In today’s article I want to have a look at two things concerning DFW resource monitoring. Firstly, at how to configure custom thresholds for memory and CPU usage. Secondly, at how to set up central threshold monitoring with alerting.

    Configuring custom DFW thresholds

    Below are the steps at a high level for configuring custom DFW thresholds:

    1. Create an NSGroup containing transport nodes
    2. Create a threshold profile
    3. Apply threshold profile
    4. Verify

    Time to get our hands dirty!

    Step 1 – Create an NSGroup containing transport nodes

    We need to group our transport nodes. Currently only NSGroups, the ones managed by the MP API, support having transport nodes as members.

    NSGroups are managed under Advanced Networking & Security > Inventory > Groups. I’m creating an NSGroup called “esxi-tn” with membership criteria that add all the host transport nodes as members:

    Copy the NSGroup ID to a text file as we need it at step 3:

    Step 2 – Create a threshold profile

    Using a REST API client we’re going to make a POST request to the NSX MP API:

    POST https://{{nsx-manager-fqdn}}/api/v1/firewall/profiles

    The request body contains the following piece of JSON code:

    {
         "cpu_threshold_percentage" : 75,
         "display_name" : "ESXi DFW Threshold Profile",
         "mem_threshold_percentage" : 75,
         "resource_type" : "FirewallCpuMemThresholdsProfile"
    }

    The values for “cpu_threshold_percentage” and “mem_threshold_percentage” will depend on your requirements. For this exercise I’m configuring a threshold at 75% for both memory and CPU usage.
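
    If you prefer to generate the request body rather than type it, a minimal Python sketch could look like this. The function name and the sanity check on the percentages are my own additions; the range the API itself accepts may be narrower.

```python
import json


def threshold_profile(display_name, cpu_percentage, mem_percentage):
    """Build the request body for a DFW CPU/memory threshold profile."""
    for value in (cpu_percentage, mem_percentage):
        # Basic sanity check only, not the API's own validation rule.
        if not 0 < value <= 100:
            raise ValueError("threshold must be a percentage between 1 and 100")
    return {
        "cpu_threshold_percentage": cpu_percentage,
        "display_name": display_name,
        "mem_threshold_percentage": mem_percentage,
        "resource_type": "FirewallCpuMemThresholdsProfile",
    }


profile = threshold_profile("ESXi DFW Threshold Profile", 75, 75)
payload = json.dumps(profile, indent=4)
print(payload)
```

    Building the body from a dictionary has the side benefit that the JSON is always syntactically valid.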

    The POST request body and the result:

    profile post result

    Copy the threshold profile’s ID from the result to a text file as we need it in the next step.

    Step 3 – Apply threshold profile

    The second API call configures a service-config that links the threshold profile to the NSGroup:

    POST https://{{nsx-manager-fqdn}}/api/v1/service-configs

    With the following JSON code as the request body:

    {
         "display_name":"DfwCpuMemServiceConfig",
         "profiles":[
                 {
                     "profile_type":"FirewallCpuMemThresholdsProfile",
                     "target_id":"c4a003e0-468d-4582-a24e-ada96742f0ca"
                 }
             ],
         "precedence": 10,
         "applied_to": [
             {
                 "target_id":"744db991-e504-4e4b-83eb-c60b94a7f785",
                 "target_type" : "NSGroup"
             }
         ]
     }

    The threshold profile and the NSGroup IDs that we copied to a text file earlier are used as the values for the two target_ids.
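
    Since it’s easy to mix up the two IDs, here is a small Python sketch that builds the service-config body from named arguments. The function and parameter names are mine; the IDs are the ones from my lab shown above.

```python
import json


def service_config(profile_id, nsgroup_id,
                   display_name="DfwCpuMemServiceConfig", precedence=10):
    """Build a service-config body linking a threshold profile to an NSGroup."""
    return {
        "display_name": display_name,
        "profiles": [
            {
                "profile_type": "FirewallCpuMemThresholdsProfile",
                "target_id": profile_id,  # ID of the threshold profile (step 2)
            }
        ],
        "precedence": precedence,
        "applied_to": [
            {
                "target_id": nsgroup_id,  # ID of the NSGroup (step 1)
                "target_type": "NSGroup",
            }
        ],
    }


config = service_config(
    profile_id="c4a003e0-468d-4582-a24e-ada96742f0ca",
    nsgroup_id="744db991-e504-4e4b-83eb-c60b94a7f785",
)
print(json.dumps(config, indent=4))
```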

    The POST request body and the result:

    Step 4 – Verify

    An easy way to verify that the new DFW thresholds have been applied is to run the “get firewall thresholds” NSXCLI command. This time I’ll run it from an NSX Manager node:

    As we see the new threshold value of 75% has been applied.

    Setting up alerting

    You might wonder what actually happens when a threshold is crossed? Currently there’s no alarming framework in NSX-T so the only thing that happens is that a threshold event is logged to syslog.

    Luckily there’s always vRealize Log Insight. Configured as a syslog target for the NSX-T platform, DFW threshold events end up there too:

    A quick look at a DFW threshold event. We see things like the transport node, the DFW component that crossed the threshold, as well as the configured threshold and the current usage.

    Now that we know what a threshold event looks like, it’s easy to configure an alert based on the query in vRLI:

    text contains “threshold event is raised”

    Click on “Create Alert from Query”:

    Fill out the details for the new alert:

    And that’s it. From now on you’ll be notified each time a DFW threshold is crossed.

    Summary

    Configuring custom DFW thresholds and monitoring these with Log Insight isn’t too hard to set up.
    It’s true that with a proper DFW design and by sticking to good practices for implementation, problems related to DFW memory or CPU usage are rare. That being said, it’s not a bad idea to keep an eye on the DFW’s resource utilization. Just in case.

  • Locking NSX-T Firewall Policies

    After receiving a couple of questions about the NSX-T firewall policy locking feature, I decided to write a short blog post about it.

    The purpose of locking a firewall policy

    The easy part first. As explained in the official NSX-T documentation we lock a firewall policy to prevent multiple users from editing the same section.

    Locks can be short term, for instance when several team members are working in the NSX Manager firewall UI at the same time and want to avoid configuration collisions. Locks can also be long(er) term, for example when somebody is tasked with building a more complex firewall ruleset or when policies are subject to change management.

    Let’s start locking then!

    Here’s where it can get a bit confusing. While the option to lock a policy is always available, it won’t have any effect until you implement and use Role Based Access Control (RBAC) for NSX-T management. Why?

    The default “admin” account, which is the only account you can work with in the NSX Manager UI without RBAC, has the “Enterprise Admin” role assigned to it. This superuser role has permission to make changes to firewall policies even when they are locked.

    adding new rule in locked section as ea

    So, if your team is using this default account (very bad practice) or individual accounts with the “Enterprise Admin” role assigned, you can lock firewall policies all you want, but these locks won’t have any actual effect.

    Let’s fix this then!

    Yes. As said this requires that we implement and use RBAC for NSX-T management first. There’s documentation available that will help you set this up so I won’t go through that in this article. On a high level the process looks like this:

    1. Deploy vIDM
    2. Connect vIDM to Active Directory
    3. Configure remote app access for the NSX Manager
    4. Configure NSX Manager to use vIDM for AAA

    When that’s done we can start assigning NSX-T roles to Active Directory users:

    role assignment

    Example

    In this example I’m assigning the “Security Engineer” role to two AD users:

    The two security engineers have been configured:

    security engineers

    Let’s pretend “jsmith@demo.local” logs in to the NSX Manager UI and starts working on a new DFW policy:

    john's dfw policy

    When called into a meeting jsmith locks the policy he’s working on to prevent anybody from making changes:

    Next, the other security engineer “pgroot@demo.local” logs in to the NSX Manager UI. She has a look at the new policy and decides to make a minor change to it. When she tries to publish the change the following message appears:

    The change can’t be realized with her account. This is the expected and desired behaviour. The policy lock is enforced with RBAC implemented.
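
    The behaviour described in this post can be summed up in a few lines of Python. This is a toy model of the rules as I observed them, not NSX-T code: the class names and the policy name are made up, and the assumption that the user who placed the lock can still edit the policy is mine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str
    role: str


@dataclass
class Policy:
    name: str
    locked_by: Optional[str] = None  # name of the user holding the lock


def can_modify(user, policy):
    """Return True if the user may change the policy in this toy model."""
    if policy.locked_by is None:
        return True  # unlocked policies are editable
    if user.role == "Enterprise Admin":
        return True  # the superuser role ignores locks
    # Assumption: the lock owner keeps edit rights.
    return user.name == policy.locked_by


jsmith = User("jsmith@demo.local", "Security Engineer")
pgroot = User("pgroot@demo.local", "Security Engineer")
admin = User("admin", "Enterprise Admin")

policy = Policy("new-dfw-policy", locked_by=jsmith.name)  # hypothetical name

for user in (jsmith, pgroot, admin):
    print(user.name, can_modify(user, policy))
```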

    Summary

    While most organizations have the RBAC components for NSX-T management in place (vIDM, AD, etc), actually leveraging NSX-T management roles so that things like locking firewall policies work is perhaps another thing. Hopefully this short article gave you some better understanding of how to get started.

  • Tier-1 Failure Domain

    With every new release of NSX-T interesting features are added to the platform. Take failure domain for example.

    Introduced in version 2.5, failure domain adds another layer of protection for the centralized services running on Tier-1 Gateways. It basically facilitates a rack aware placement mechanism for the Tier-1 service router (SR) components.

    In today’s article I’m going to do a simple failure domain proof of concept. I’ll walk through the configuration steps for setting up failure domain and verify its functionality.

    The lab environment

    For this exercise I installed a vSphere cluster consisting of four ESXi hosts divided over two racks. I’m calling these the Edge racks and made this very advanced diagram:

    rack diagram with esxi

    I then deployed four NSX-T Edge nodes (EN1 – EN4), one on each host, and added these to NSX-T Edge Cluster “T0 Cluster ECMP”:

    rack diagram with tier-0 cluster

    I threw in a Tier-0 Gateway called “T0-01” which is running in Active-Active HA mode with ECMP enabled. The Tier-0’s 8 uplinks are all taking part in forwarding North-South traffic, simultaneously:

    rack diagram with tier-1

    Finally, I deployed four more Edge nodes (EN5 – EN8), one on each host, and added these to Edge cluster “T1 Cluster”:

    rack diagram with tier-1 cluster

    The eight Edge nodes in the NSX Manager UI:

    edge nodes

    Next step – Create Tier-1s

    I will create Tier-1 Routers (Manager API) as opposed to Tier-1 Gateways (Policy API). This is because the API call that triggers a Tier-1 SR reallocation, which I want to run later on, only works on Tier-1 Routers. This has nothing to do with the failure domain feature itself, which is of course compatible with both Tier-1 Routers and Tier-1 Gateways.

    Configuring the first Tier-1 called “T1-01”:

    t1-01

    I’m selecting the “T1 Cluster” Edge Cluster and no specific Edge Cluster Members.

    Both of the Tier-1s and the Tier-0 listed in the NSX Manager UI:

    tier-1 created

    Tier-1 service routers

    Selecting an Edge cluster for a Tier-1 indicates that you intend to run one or more centralized services on that Tier-1. This means that one active and one standby service router (SR) are instantiated on two different Edge nodes in that cluster (a Tier-1 SR always runs in Active-Standby HA mode).

    By the way, you should not select an Edge Cluster for a Tier-1 if you don’t intend to run centralized services on it as this can lead to unintended hairpinning of traffic over the Edge nodes.

    You may have noticed that I didn’t specify any Edge Cluster Members for the SRs. This results in the management plane picking them for me. So where did they end up?

    Clicking the Active-Standby link for each of the Tier-1 Routers reveals the SRs’ location. “T1-01” has its active SR on Edge node EN7 and its standby SR on Edge node EN5:

    t1-01 sr location

    “T1-02” has its active SR on Edge node EN8 and its standby SR on Edge node EN6:

    t1-02 sr location

    Fine. Let’s have a look at the Edge rack again now that we have introduced these Tier-1 SRs to the environment:

    rack diagram with tier-1

    My two Tier-1s are in separate racks. Great! Or is it?
    With the current Tier-1 SR placement a single Edge rack failure will result in one of the Tier-1 Routers losing both its active and its standby SR. That’s pretty bad.

    Failure Domain

    Failure domain prevents this silly SR placement from happening. Correctly configured, failure domain ensures that the active and standby SRs of a Tier-1 are always placed in different racks.

    Sounds great. Time to set this up.

    Step 1 – Create two failure domains

    Failure domains are created using a POST request to the NSX API at:

    POST https://{{nsx-manager-fqdn}}/api/v1/failure-domains/

    The request body for my first failure domain contains the following piece of JSON code:

    {   
    "display_name": "Rack-1"
    } 

    The JSON code for my second failure domain:

    {   
    "display_name": "Rack-2"
    } 

    Creating the first failure domain using Postman:

    create failure domain postman

    Copy the value for “id” from the request result for each of the failure domains as we need these in the next step.

    Step 2 – Assign Edge nodes to failure domains

    The Edge nodes in the “T1 Cluster” need to be assigned to their respective failure domains. This too is done through an API call to the Manager API.

    For each Edge node we first retrieve its current configuration using the following GET request:

    GET https://{{nsx-manager-fqdn}}/api/v1/transport-nodes/{{edge-node-id}}

    You can find the ID of an Edge node in the NSX Manager UI (or via API):

    edge node id

    The GET request for Edge node EN5:

    get en5

    Copy the request result to the body of a new PUT request and change the value for “failure_domain_id” to match the ID of one of the newly created failure domains.

    PUT https://{{nsx-manager-fqdn}}/api/v1/transport-nodes/{{edge-node-id}}

    Which failure domain ID to use depends on the rack location of the Edge node. The following table lists the failure domain plan for my Tier-1 Edge nodes:

    Edge NodeFailure DomainFailure Domain ID
    EN5Rack-17e1af661-8e2c-43f7-924f-68eabce0f40b
    EN6Rack-2d78707df-2f7f-48a9-9e3e-98a5523901c7
    EN7Rack-17e1af661-8e2c-43f7-924f-68eabce0f40b
    EN8Rack-2d78707df-2f7f-48a9-9e3e-98a5523901c7

    Four GET/PUT requests later the Edge nodes have been assigned to the correct failure domains.
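
    The GET/edit/PUT cycle lends itself to a small helper. The Python sketch below takes the JSON returned by the GET request (as a dictionary) and returns a copy with “failure_domain_id” replaced, ready to be used as the PUT body. Because I don’t want to hard-code where the key sits in the structure, the helper searches for it recursively; the trimmed-down sample configuration is hypothetical and only there to demonstrate the function.

```python
import copy

# Failure domain plan from the table above.
RACK_1 = "7e1af661-8e2c-43f7-924f-68eabce0f40b"
RACK_2 = "d78707df-2f7f-48a9-9e3e-98a5523901c7"
FAILURE_DOMAIN_PLAN = {"EN5": RACK_1, "EN6": RACK_2, "EN7": RACK_1, "EN8": RACK_2}


def _replace_key(obj, key, value):
    """Replace every occurrence of key in a nested structure; return the count."""
    count = 0
    if isinstance(obj, dict):
        for k in obj:
            if k == key:
                obj[k] = value
                count += 1
            else:
                count += _replace_key(obj[k], key, value)
    elif isinstance(obj, list):
        for item in obj:
            count += _replace_key(item, key, value)
    return count


def with_failure_domain(node_config, failure_domain_id):
    """Return a copy of the GET result with failure_domain_id replaced."""
    updated = copy.deepcopy(node_config)
    if _replace_key(updated, "failure_domain_id", failure_domain_id) == 0:
        raise KeyError("failure_domain_id not found in node configuration")
    return updated


# Hypothetical, heavily trimmed GET result for EN5.
en5 = {
    "display_name": "EN5",
    "_revision": 3,
    "node_deployment_info": {"failure_domain_id": "<current-id>"},
}
put_body = with_failure_domain(en5, FAILURE_DOMAIN_PLAN["EN5"])
print(put_body)
```

    Everything else in the GET result, including “_revision”, is passed through untouched.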

    Step 3 – Configure the Edge Cluster

    The “T1 Cluster” Edge Cluster needs to be configured for failure domain based placement. This is also done via the API.

    First a GET request to retrieve the current configuration of the Edge Cluster:

    GET https://{{nsx-manager-fqdn}}/api/v1/edge-clusters/{{edge-cluster-id}}

    The “edge-cluster-id” can be found in the NSX Manager UI (or via API):

    edge cluster id

    The GET request’s result in JSON:

    get edge cluster

    Again, you copy the request result to the body of a new PUT request. The only thing that we need to change here is the value for “allocation_rules”

    from:

    "allocation_rules": [],

    to:

    "allocation_rules": [ {"action": {"enabled": true,"action_type": "AllocationBasedOnFailureDomain" } } ],

    Send the PUT request to:

    PUT https://{{nsx-manager-fqdn}}/api/v1/edge-clusters/{{edge-cluster-id}}

    And we’re done. From now on this Edge Cluster will perform failure domain based placement for new Tier-1 SRs.
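
    The same edit expressed in Python, as a sketch: the function takes the Edge Cluster configuration from the GET request and returns a copy with the failure domain allocation rule added. The function name and the trimmed-down sample configuration are my own; only the rule itself comes from the step above.

```python
import copy

FAILURE_DOMAIN_RULE = {
    "action": {
        "enabled": True,
        "action_type": "AllocationBasedOnFailureDomain",
    }
}


def enable_failure_domain_allocation(cluster_config):
    """Return a copy of the Edge Cluster config with the allocation rule set."""
    updated = copy.deepcopy(cluster_config)
    rules = updated.setdefault("allocation_rules", [])
    if FAILURE_DOMAIN_RULE not in rules:  # avoid adding the rule twice
        rules.append(copy.deepcopy(FAILURE_DOMAIN_RULE))
    return updated


# Hypothetical, heavily trimmed GET result.
cluster = {"display_name": "T1 Cluster", "allocation_rules": []}
put_body = enable_failure_domain_allocation(cluster)
print(put_body["allocation_rules"])
```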

    A new Tier-1

    Let’s put this to the test immediately by creating a new Tier-1.

    Here comes “T1-03”:

    t1-03

    Once again I’m selecting the “T1 Cluster” Edge cluster and no specific Edge nodes (= Auto Allocated). So where did the management plane decide to place the SRs this time?

    t1-03 placement

    The Active SR is on EN5 and the standby SR on EN6. They indeed ended up in separate racks!
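
    The difference between the old and the new placement is easy to express in code. The Python sketch below uses the rack layout of my “T1 Cluster” to check whether an active/standby pair survives a rack failure, and includes a naive placement function that picks two Edge nodes in different failure domains. The placement loop is a simple stand-in of my own and says nothing about how the management plane actually selects Edge nodes.

```python
# Failure domain (rack) of each Edge node in the "T1 Cluster".
RACK = {"EN5": "Rack-1", "EN6": "Rack-2", "EN7": "Rack-1", "EN8": "Rack-2"}


def survives_rack_failure(active, standby, rack=RACK):
    """True if active and standby SR sit in different failure domains."""
    return rack[active] != rack[standby]


def place_sr_pair(rack=RACK):
    """Naively pick the first (active, standby) pair in different domains."""
    names = sorted(rack)
    for active in names:
        for standby in names:
            if rack[active] != rack[standby]:
                return active, standby
    raise ValueError("need Edge nodes in at least two failure domains")


print("T1-01 (EN7/EN5):", survives_rack_failure("EN7", "EN5"))
print("T1-02 (EN8/EN6):", survives_rack_failure("EN8", "EN6"))
print("T1-03 (EN5/EN6):", survives_rack_failure("EN5", "EN6"))
print("Naive placement:", place_sr_pair())
```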

    Existing Tier-1s

    What about the Tier-1 SRs that were deployed before we configured failure domains? Can we trigger a reallocation so that they too are placed in accordance with the new failure domain configuration?

    It turns out that we can, but it’s a data plane disruptive operation and, as far as I know, only works out of the box for Tier-1s created through the Manager API (or in the UI under Advanced Networking & Security). Thank you Gary Hills for letting us know that the reallocate API call works for Tier-1s created in the Policy UI/API as well, by adding a header with key “X-Allow-Overwrite” and value “true” to the request below:

    X-Allow-Overwrite

    A POST request on each of the existing Tier-1s will do the trick:

    POST https://{{nsx-manager-fqdn}}/api/v1/logical-routers/{{logical-router-id}}?action=reallocate

    The request body should contain the following JSON:

    {
       "edge_cluster_id": "{{edge-cluster-id}}"
    }

    The values for “logical-router-id” and “edge-cluster-id” can be found in the NSX Manager UI (or via API).
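
    Put together, the reallocate request could be described like this in Python. The sketch only assembles the method, URL, headers and body as plain data so that you can feed them to the HTTP client of your choice; the function name, the “created_via_policy” switch and the example hostname are my own.

```python
import json


def reallocate_request(nsx_manager, logical_router_id, edge_cluster_id,
                       created_via_policy=False):
    """Describe the POST request that triggers a Tier-1 SR reallocation."""
    headers = {"Content-Type": "application/json"}
    if created_via_policy:
        # Header needed for Tier-1s created in the Policy UI/API.
        headers["X-Allow-Overwrite"] = "true"
    return {
        "method": "POST",
        "url": "https://{}/api/v1/logical-routers/{}?action=reallocate".format(
            nsx_manager, logical_router_id
        ),
        "headers": headers,
        "body": json.dumps({"edge_cluster_id": edge_cluster_id}),
    }


request = reallocate_request(
    "nsx-manager.example.local",  # placeholder hostname
    "<logical-router-id>",
    "<edge-cluster-id>",
    created_via_policy=True,
)
print(request["method"], request["url"])
print(request["headers"])
print(request["body"])
```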

    Request accepted by the API:

    reallocate in postman

    A reallocation process now takes place behind the scenes. A few moments later we see that the active and standby SRs of the existing Tier-1s are now in separate racks:

    after tier-1 reallocation

    Let’s have a last look at the Edge rack after implementation and enforcement of the failure domains:

    tier-1 failure domain implemented

    Looks so much better now!

    Summary

    Today we had a look at how to set up Tier-1 failure domain in NSX-T 2.5. The goal was to ensure that active and standby Tier-1 SRs ended up in separate racks.

    Failure domain is a pretty cool and useful new feature adding extra protection for the Tier-1 SRs. It’s currently configurable via the API only, but that process was straightforward. With just a couple of requests we got failure domains up and running.

    Whether Tier-1 failure domain makes sense in your environment will depend on your NSX Edge design, number of Edge nodes, and things like future growth.

    Good luck!

  • Bulk Create NSX-T Segments Using A Postman Data File

    Imagine this: you’ve been tasked with implementing micro-segmentation in your vSphere environment. You just deployed and configured NSX-T and the next step is to migrate VMs from their VDS port groups to N-VDS segments.

    You fire up the vSphere Client and expand the VDS to have a look at the current situation:

    vds port groups

    It’s pretty bad.

    Turns out your VMs are connected to no less than 784 different port groups! Overlay networking/consolidation and re-IP are currently not part of the plan so you’re stuck with these 784 VLANs. You now realize that you need to create 784 VLAN backed segments in NSX-T. Life sucks.

    Postman to the rescue

    In today’s short post I want to share an easy way that can help you out in a scenario like the one above. It involves the NSX-T Policy API, Postman, and a text file. Let’s go!

    Step 1 – Prepare the CSV file

    First we need to create a simple text file that contains values for the NSX-T segments and their corresponding VLAN IDs. The format of the comma separated text file is as follows:

    segment_name, vlan_id
    vlan-1000, 1000
    vlan-1001, 1001
    vlan-1002, 1002
    vlan-1003, 1003
    ....
    ....
    vlan-1783, 1783

    For “segment_name” you use whatever fits your naming convention. I’m saving this file as “segments.csv”:

    segments.csv
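
    Nobody wants to type 784 lines by hand, so the data file itself is a good candidate for generation. Here is a short Python sketch that produces the content for the VLAN range in this example. The “vlan-<id>” naming follows the sample above; the values are written without the space after the comma, and I have not verified how Postman treats such padding.

```python
import csv
import io


def build_segments_csv(first_vlan, count):
    """Return CSV text with a header and one segment_name/vlan_id row per VLAN."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["segment_name", "vlan_id"])
    for vlan_id in range(first_vlan, first_vlan + count):
        writer.writerow(["vlan-{}".format(vlan_id), vlan_id])
    return buffer.getvalue()


csv_text = build_segments_csv(first_vlan=1000, count=784)
lines = csv_text.splitlines()
print(lines[0], lines[1], lines[-1], sep="\n")

# To save the file:
# with open("segments.csv", "w", newline="") as handle:
#     handle.write(csv_text)
```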

    Step 2 – Prepare the Postman request

    We’re going to leverage the NSX-T Hierarchical API to create these segments by making a PATCH request to:

    https://{{nsx-manager-fqdn}}/policy/api/v1/infra

    Only a small piece of JSON code is needed in the request body:

    {
      "resource_type": "Infra",
      "children": [
        {
          "resource_type": "ChildSegment",
          "marked_for_delete": false,
          "Segment": {
            "resource_type": "Segment",
            "type": "DISCONNECTED",
            "id": "{{segment_name}}",
            "display_name": "{{segment_name}}",
            "vlan_ids": ["{{vlan_id}}"],
            "path": "/infra/segments/{{segment_name}}",
            "relative_path": "{{segment_name}}",
            "parent_path": "/infra/segments/{{segment_name}}",
            "transport_zone_path": "/infra/sites/default/enforcement-points/default/transport-zones/e82afbae-c811-48e1-8946-6e1f62b67871"
          }
        }
      ]
    }

    As you can see, the variables “{{segment_name}}” and “{{vlan_id}}” are used a couple of times in this piece of code. Their values will be fetched from the matching columns in “segments.csv”.
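
    To illustrate what happens on each iteration, the Python sketch below mimics the variable substitution on a shortened version of the request body, using one row of the data file. It is a simplified imitation written for this post, not Postman’s actual implementation.

```python
import json
import re

# Shortened template with the same {{variable}} placeholders.
TEMPLATE = (
    '{"id": "{{segment_name}}", '
    '"display_name": "{{segment_name}}", '
    '"vlan_ids": ["{{vlan_id}}"]}'
)


def render(template, row):
    """Replace every {{name}} placeholder with the matching value from row."""
    def lookup(match):
        return str(row[match.group(1)])

    return re.sub(r"\{\{(\w+)\}\}", lookup, template)


row = {"segment_name": "vlan-1000", "vlan_id": "1000"}  # first data row
segment = json.loads(render(TEMPLATE, row))
print(segment)
```

    Each row in the data file yields one rendered body and therefore one PATCH request.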

    The value for “transport_zone_path” is unique in every NSX-T deployment. You can easily find the ID of your VLAN transport zone in the NSX Manager UI under System > Configuration > Fabric > Transport Zones:

    tz id

    Putting it all together the Postman request will look like this:

    postman request

    I’m saving this request as “Create NSX-T Segments with data file” in a new collection folder called “NSX-T”.

    Step 3 – Start the Postman Runner

    Click the Runner button to start the collection runner:

    collection runner

    In the next screen you select the saved request:

    select request to run

    We need to configure a couple of things for this run. The table below lists my settings:

    Setting       Value                    Comment
    Environment   Your NSX-T environment   Have a look at this post for more information about working with Postman environments.
    Iterations    784                      We have 784 segments in our data file.
    Data          segments.csv             The data file.
    runner settings

    After selecting your data file you can click the Preview button just to verify that Postman is interpreting the data correctly:

    preview

    Looks pretty good to me. Time to press the big button:

    run nsx-t

    Running these 784 iterations will take a couple of minutes. You can monitor the progress in the “Run Results” screen:

    runner progress

    Notice the “200 OK” status for each iteration. This is the NSX-T API’s response to the request and means that it was processed successfully.

    Once the Runner is finished it’s time to have a look in the NSX Manager UI under Networking > Connectivity > Segments to see if new segments have been created:

    segments

    That certainly seems to be the case. All of the 784 VLAN backed segments are there and configured with the correct transport zone and VLAN ID:

    segment detail

    Summary

    Bulk creating or modifying NSX-T objects can be done in a number of different ways. If coding is your thing you’ll probably have little trouble putting together a tool for this using your preferred language. If you’re more into scripting you can use something like PowerShell. And if you like to work really slowly you can always turn to the NSX Manager UI.

    For everybody else there’s Postman. Using this tool in combination with data files offers an easy and quick way for creating or modifying large amounts of NSX-T objects.

    You can read more about the NSX-T Policy API in the NSX Policy API: Getting Started Guide. To learn more about working with Postman data files check out this tutorial.

    Have fun!

  • Welcome back! I’m in the process of setting up NSX-T in a stretched cluster environment.

    In part 1 I deployed the NSX manager cluster and configured the ESXi hosts as NSX transport nodes. The N-VDS was installed on the ESXi hosts and their vmkernel adapters migrated from the VDS to the N-VDS.

    In this second part I will configure the NSX data plane for north-south and east-west networking. Again, there’s a lot to do so let’s begin!

    The lab environment

    A couple of things happened since the last time I had a look at the lab environment’s diagram:

    The vSphere management cluster is now also hosting an NSX manager cluster and the ESXi hosts turned into NSX-T transport nodes.

    Speaking of ESXi hosts, here’s a little closer look at one of them:

    There’s now an N-VDS instead of a VDS with the three vmkernel adapters Management, vMotion, and vSAN. There are also two new vmkernel adapters which are acting as tunnel endpoints (TEPs) for the NSX overlay networking (geneve encapsulation/decapsulation).

    The infrastructure for east-west networking is largely in place, but without a north-south network path this cluster is pretty isolated.

    NSX Edge

    The NSX Edge provides a central entrance/exit point for network traffic entering and exiting the SDDC and is exactly what this environment needs.

    Deploy edge VMs

    I’m deploying a total of four edge VMs (two at each site). I’ll deploy them using the Edge VM OVA package so that I can connect the edge node’s management interface to the NSX-T segment at the time of deployment.

    The table below contains the deployment details for the edge VMs:

    | Setting | en01-a | en01-b | en02-a | en02-b |
    |---|---|---|---|---|
    | Name | en01-a | en01-b | en02-a | en02-b |
    | Network 0 | site-a-nvds01-management | site-b-nvds01-management | site-a-nvds01-management | site-b-nvds01-management |
    | Network 1 | edge-uplink1 | edge-uplink1 | edge-uplink1 | edge-uplink1 |
    | Network 2 | edge-uplink2 | edge-uplink2 | edge-uplink2 | edge-uplink2 |
    | Network 3 | not used | not used | not used | not used |
    | Mgmt IP | 172.16.41.21/24 | 172.16.51.21/24 | 172.16.41.22/24 | 172.16.51.22/24 |

    Deploying the edge VM using the OVA package:

    ovf edge vm deployment

    Configure edge nodes

    After deployment the edge nodes need to join the management plane. For this I use the “join management-plane” NSX CLI command:

    cli join

    Once the edge nodes have joined the management plane, I can pick them up in the NSX Manager UI to configure each of them as Edge Transport Nodes. I’m using the following configuration details for this:

    | Setting | en01-a | en01-b | en02-a | en02-b |
    |---|---|---|---|---|
    | Transport Zones | tz-vlan, tz-overlay | tz-vlan, tz-overlay | tz-vlan, tz-overlay | tz-vlan, tz-overlay |
    | N-VDS Name | nvds01 | nvds01 | nvds01 | nvds01 |
    | Uplink Profile | up-site-a-edge | up-site-b-edge | up-site-a-edge | up-site-b-edge |
    | IP Assignment | Use Static IP List | Use Static IP List | Use Static IP List | Use Static IP List |
    | Static IP List | 172.16.49.30, 172.16.49.31 | 172.16.59.30, 172.16.59.31 | 172.16.49.32, 172.16.49.33 | 172.16.59.32, 172.16.59.33 |
    | Virtual NICs | fp-eth0 – uplink-1, fp-eth1 – uplink-2 | fp-eth0 – uplink-1, fp-eth1 – uplink-2 | fp-eth0 – uplink-1, fp-eth1 – uplink-2 | fp-eth0 – uplink-1, fp-eth1 – uplink-2 |

    Edge transport nodes are managed under System > Fabric > Nodes > Edge Transport Nodes.

    en01-a transport node configuration

    Like the ESXi hosts, all four edge nodes are now fully configured transport nodes:

    edge transport nodes

    Edge cluster

    The edge transport nodes need to be part of an edge cluster. I will create an edge cluster called edge-cluster01 and add all four nodes to this cluster.

    Edge clusters are managed under System > Fabric > Nodes > Edge Clusters:

    Anti-affinity rules

    The edge VMs shouldn’t be running on the same ESXi host. To prevent this from happening I create two anti-affinity rules on the vSphere cluster; one for the edge VMs at Site A and another for the edge VMs at Site B:

    vm/host rule

    Groups and rules

    The edge VMs should also stick to their site. For this I create two host groups and two VM groups. A “virtual machine to host” rule will then make sure that the edge VMs stay pinned to their respective site.

    The host group for Site A:

    host group

    The VM group for the edge VMs at Site B:

    vm group

    The “virtual machine to host” rule keeping edge VMs belonging to Site A on the ESXi hosts of Site A:

    vm to host rule

    The result of having these groups and rules in place becomes visible after some seconds. Edge VMs are running at the correct site and on separate ESXi hosts within a site:

    correctly placed VMs
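
    The combined effect of the anti-affinity rules and the “virtual machine to host” rules can be expressed as a small check. The sketch below is illustrative only: the ESXi host names and the placements are hypothetical, and it models the intent of the rules rather than how vSphere DRS evaluates them.

```python
from collections import Counter

# Hypothetical inventory: the ESXi host names are made up for illustration.
HOST_SITE = {
    "esxi01-a": "A", "esxi02-a": "A",
    "esxi01-b": "B", "esxi02-b": "B",
}
# Edge VMs and the site each one belongs to (names from the article).
VM_SITE = {"en01-a": "A", "en02-a": "A", "en01-b": "B", "en02-b": "B"}


def placement_violations(placement: dict) -> list:
    """Return a list of rule violations for a VM -> host placement."""
    violations = []
    # VM-to-host rule: every edge VM must run on a host at its own site.
    for vm, host in placement.items():
        if HOST_SITE[host] != VM_SITE[vm]:
            violations.append(f"{vm} runs at site {HOST_SITE[host]}, expected {VM_SITE[vm]}")
    # Anti-affinity rule (one per site): edge VMs of a site use separate hosts.
    for site in sorted(set(VM_SITE.values())):
        hosts = [placement[vm] for vm in placement if VM_SITE[vm] == site]
        for host, count in Counter(hosts).items():
            if count > 1:
                violations.append(f"{count} site {site} edge VMs share host {host}")
    return violations


if __name__ == "__main__":
    placement = {
        "en01-a": "esxi01-a", "en02-a": "esxi02-a",
        "en01-b": "esxi01-b", "en02-b": "esxi02-b",
    }
    print(placement_violations(placement) or "placement satisfies all rules")
```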

    That pretty much completes the NSX Edge infrastructure deployment in my stretched cluster.

    Routing

    Now that the NSX-T Edge is in place, it’s time to set up a connection with the physical network so that packets can actually get in and out of the environment.

    Tier-0 gateway

    A Tier-0 gateway provides the gateway service between the logical and the physical network and is just what I need.

    I’m creating my Tier-0 gateway with the following configuration details:

    | Setting | Value |
    |---|---|
    | Name | tier0-01 |
    | High Availability Mode | Active-Active |
    | Edge Cluster | edge-cluster01 |
    | Route Re-Distribution | all |

    Tier-0 gateways are managed under Networking > Connectivity > Tier-0 Gateways.

    tier-0 gateway

    Interfaces

    This Tier-0 will have eight external interfaces mapped to the different edge transport nodes at the two sites. The table below shows the interfaces and their configuration details:

    | Name | IP Address / Mask | Connected To | Edge Node | MTU |
    |---|---|---|---|---|
    | en01-a-uplink01 | 172.16.47.2/24 | site-a-edge-transit01 | en01-a | 9000 |
    | en01-a-uplink02 | 172.16.48.2/24 | site-a-edge-transit02 | en01-a | 9000 |
    | en02-a-uplink01 | 172.16.47.3/24 | site-a-edge-transit01 | en02-a | 9000 |
    | en02-a-uplink02 | 172.16.48.3/24 | site-a-edge-transit02 | en02-a | 9000 |
    | en01-b-uplink01 | 172.16.57.2/24 | site-b-edge-transit01 | en01-b | 9000 |
    | en01-b-uplink02 | 172.16.58.2/24 | site-b-edge-transit02 | en01-b | 9000 |
    | en02-b-uplink01 | 172.16.57.3/24 | site-b-edge-transit01 | en02-b | 9000 |
    | en02-b-uplink02 | 172.16.58.3/24 | site-b-edge-transit02 | en02-b | 9000 |
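
    The interface plan follows a regular pattern (site, edge node, uplink), so it can be generated instead of typed. The following is a minimal sketch that reproduces the table above as plain Python dictionaries. The dictionary keys are my own naming and not an NSX-T API schema.

```python
import ipaddress

# Transit subnets per (site, uplink), as used in the interface table.
TRANSIT = {
    ("a", 1): "172.16.47.0/24",
    ("a", 2): "172.16.48.0/24",
    ("b", 1): "172.16.57.0/24",
    ("b", 2): "172.16.58.0/24",
}


def tier0_interfaces() -> list:
    """Build the eight external interface definitions as plain dicts."""
    interfaces = []
    for site in ("a", "b"):
        for node_index in (1, 2):
            for uplink in (1, 2):
                network = ipaddress.ip_network(TRANSIT[(site, uplink)])
                # en01 uses host address .2 and en02 uses .3, as in the table.
                address = network.network_address + node_index + 1
                interfaces.append({
                    "name": f"en0{node_index}-{site}-uplink0{uplink}",
                    "ip": f"{address}/{network.prefixlen}",
                    "segment": f"site-{site}-edge-transit0{uplink}",
                    "edge_node": f"en0{node_index}-{site}",
                    "mtu": 9000,
                })
    return interfaces


if __name__ == "__main__":
    for interface in tier0_interfaces():
        print(interface)
```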

    The Tier-0 external interfaces are now configured and active:

    tier-0 interfaces

    BGP

    The TORs have been configured for BGP already and now I need to set up BGP at the Tier-0 gateway too.

    The BGP settings that I will use on the Tier-0 gateway are:

    | Setting | Value |
    |---|---|
    | Local AS | 65000 |
    | BGP | On |
    | Graceful Restart | Off |
    | Inter SR iBGP | On |
    | ECMP | On |
    | Multipath Relax | On |

    Configuring BGP details on the Tier-0 gateway:

    I’m adding each TOR as a BGP neighbor to the Tier-0 gateway. The following table shows the configuration details for the four BGP neighbor entries:

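    | IP address | BFD | Remote AS | Hold Down | Keep Alive |
    |---|---|---|---|---|
    | 172.16.47.1 | Enabled | 65001 | 12 | 4 |
    | 172.16.48.1 | Enabled | 65001 | 12 | 4 |
    | 172.16.57.1 | Enabled | 65002 | 12 | 4 |
    | 172.16.58.1 | Enabled | 65002 | 12 | 4 |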
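
    The four neighbor entries differ only in address and remote AS, which makes them easy to describe as data. The sketch below builds one JSON body per neighbor. The field names are my assumption of the Policy API's BGP neighbor object and should be checked against the API reference for your NSX-T version; nothing is sent to an NSX Manager.

```python
import json

# Neighbor table from the article: (TOR address, remote AS).
NEIGHBORS = [
    ("172.16.47.1", 65001),
    ("172.16.48.1", 65001),
    ("172.16.57.1", 65002),
    ("172.16.58.1", 65002),
]

HOLD_DOWN = 12   # seconds
KEEP_ALIVE = 4   # seconds; the hold down timer is three keep-alive intervals


def neighbor_body(address: str, remote_as: int) -> dict:
    """Build one neighbor definition (field names are assumptions)."""
    return {
        "neighbor_address": address,
        "remote_as_num": str(remote_as),
        "hold_down_time": HOLD_DOWN,
        "keep_alive_time": KEEP_ALIVE,
        "bfd": {"enabled": True},
    }


if __name__ == "__main__":
    for address, remote_as in NEIGHBORS:
        print(json.dumps(neighbor_body(address, remote_as)))
```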

    The BGP neighbor status after the four TORs are added:

    bgp neighbors

    Route map

    To prevent asymmetric traffic flows, the NSX Edge infrastructure at Site A should be the preferred ingress/egress point for the north-south traffic.

    I achieve this by AS path prepending on the BGP paths to Site B. This is configured in a route map on the Tier-0 gateway.

    First I need to create an IP prefix list. Both IP prefix lists and route maps are managed on the Tier-0 gateways under Routing:

    route maps

    The details of the IP prefix list:

    | Setting | Value |
    |---|---|
    | Name | any-prefix |
    | Network | any |
    | Action | Permit |

    The details of the route map:

    | Setting | Value |
    |---|---|
    | Route Map Name | siteb-route-map |
    | Type | IP Prefix |
    | Members | any-prefix |
    | AS path prepend | 65000 65000 |
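
    Why prepending steers traffic to Site A can be shown with a toy model. All else being equal, BGP prefers the route with the shortest AS path, and the route map makes every path through Site B two hops longer. The sketch below is a deliberate simplification: real best-path selection evaluates several attributes (for example local preference) before AS path length, and the function names are my own.

```python
LOCAL_AS = 65000
PREPEND = [65000, 65000]  # the "AS path prepend" value of siteb-route-map


def advertised_as_path(site: str) -> list:
    """AS path a TOR sees for a prefix advertised by the Tier-0 gateway."""
    path = [LOCAL_AS]
    if site == "B":
        # The route map on the Site B neighbors prepends the local AS twice.
        path = PREPEND + path
    return path


def preferred_site(sites=("A", "B")) -> str:
    """Pick the site with the shortest AS path (ties keep the first site)."""
    return min(sites, key=lambda site: len(advertised_as_path(site)))


if __name__ == "__main__":
    for site in ("A", "B"):
        print(site, advertised_as_path(site))
    print("preferred:", preferred_site())
```

    Applying the same route map as In Filter has the mirror effect: routes learned from the Site B TORs look longer to the Tier-0 gateway, so egress traffic prefers Site A as well.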

    The route map needs to be attached to the BGP neighbor entries belonging to Site B. I configure the route map as Out Filter and In Filter:

    route map out filter

    The Site B neighbors now have filters configured:

    filters configured for site b

    This completes the Tier-0 gateway deployment.

    Diagram

    I’m just taking a step back to have a look at what it is I actually did here.

    The diagram below shows the Tier-0 gateway’s L3 connectivity with the physical network:

    tier-0 bgp

    It’s a pretty wild diagram, I’m aware, but hopefully it makes some sense.

    East-West

    The Tier-1 gateway is where the NSX-T segments for virtual machine networking will be connected. The Tier-1 gateway is linked to the Tier-0 gateway too, of course.

    I’m creating a Tier-1 gateway with the following configuration details:

    | Setting | Value |
    |---|---|
    | Name | tier1-01 |
    | Linked Tier-0 Gateway | tier0-01 |
    | Fail Over | Non Preemptive |
    | Edge Cluster | edge-cluster01 |
    | Route Advertisement | all |

    Tier-1 gateways are managed under Networking > Connectivity > Tier-1 Gateways.

    tier-1 gateway

    Workload segments

    With the Tier-1 gateway in place I can now attach some NSX-T segments for the workloads (VMs).

    I’m creating three segments Web, App, and DB with the following configuration details:

    | Setting | Value |
    |---|---|
    | Connected Gateway & Type | tier1-01, flexible |
    | Transport Zone | tz-overlay |
    | Subnets (gateway) | 10.0.1.1/24 (Web), 10.0.2.1/24 (App), 10.0.3.1/24 (DB) |

    Creating the segments:

    segments

    I notice that downlink ports have been created on the Tier-1 gateway:

    downlink ports

    Provision VMs

    It’s all about the VMs of course. So I deploy three VMs web01, app01, and db01. They are connected to the segments.

    VM web01 connected to segment Web as seen at the N-VDS Visualization in the NSX Manager UI:

    web01

    Connectivity test

    Time to test connectivity.

    East-west

    First between the VMs which I place on different ESXi hosts and at different sites.

    web01 (10.0.1.10) at Site B pinging db01 (10.0.3.10) at Site A:

    web01 pings db01

    Visualized by the Port Connection tool in the NSX Manager UI:

    port connection

    app01 (10.0.2.10) at Site A pinging web01 at Site B:

    app01 pings web01

    Once again visualized by the Port Connection tool:

    port connection

    East-west and cross-site logical networking seems to be working!
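
    Both pings cross segment boundaries, so they exercise routing on the Tier-1 gateway and not just layer 2 forwarding within a segment. A quick way to reason about which flows are routed is sketched below. It assumes that all three segments use a /24 mask; the article only states the mask explicitly for the Web segment.

```python
import ipaddress

# Segment gateways; the /24 mask for App and DB is an assumption.
SEGMENTS = {
    "Web": ipaddress.ip_interface("10.0.1.1/24"),
    "App": ipaddress.ip_interface("10.0.2.1/24"),
    "DB": ipaddress.ip_interface("10.0.3.1/24"),
}


def segment_of(ip: str):
    """Return the name of the segment whose subnet contains ip, or None."""
    address = ipaddress.ip_address(ip)
    for name, gateway in SEGMENTS.items():
        if address in gateway.network:
            return name
    return None


def is_routed(source: str, destination: str) -> bool:
    """True when the two addresses sit in different segments."""
    return segment_of(source) != segment_of(destination)


if __name__ == "__main__":
    # web01 -> db01 and app01 -> web01, the two tests from the article.
    print(segment_of("10.0.1.10"), "->", segment_of("10.0.3.10"), is_routed("10.0.1.10", "10.0.3.10"))
    print(segment_of("10.0.2.10"), "->", segment_of("10.0.1.10"), is_routed("10.0.2.10", "10.0.1.10"))
```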

    North-south

    How about north-south? Let’s see.

    db01 at Site A pinging a host on the physical network (10.2.129.86):

    db01 pings physical

    The Traceflow tool in the NSX Manager UI tells me a bit more about the network path. I can see that the traffic exits the SDDC through Site A (en02-a):

    traceflow

    The other way around, a traceroute from the physical network to web01 at Site B:

    traceroute from physical

    Traffic enters the SDDC through Site A (en01-a). Perfect!

    Summary

    Wow! This has been quite an exercise. Are you still there? 😉

    It all started with deploying the NSX Edge (virtual) infrastructure. On top of that infrastructure I deployed a Tier-0 gateway and configured dynamic routing between the Tier-0 and the TORs.

    To facilitate east-west distributed logical networking, I deployed a Tier-1 gateway and linked it to the Tier-0. I connected some NSX-T segments to the Tier-1 gateway and some virtual machines to the segments.

    Some simple connectivity testing showed that north-south and east-west networking were working as intended. Site A is consistently used for the north-south ingress/egress traffic flows thanks to the BGP AS prepending.

    Thanks for staying tuned this long. I hope this and the previous article about deploying NSX-T in a stretched cluster environment have been interesting reads. I might return to this environment for some more NSX-T multisite scenarios in future articles.

    Cheers!

    A stretched cluster architecture facilitates higher levels of availability and things like inter-site load balancing. It’s a common multisite solution and also part of VMware’s Validated Design for SDDCs with multiple availability zones.

    Traditionally compute networking in an active-active multisite setup has had its challenges, but with vSAN storage and NSX networking technologies that’s a thing of the past.

    In the coming two articles I want to have a closer look at NSX-T in an active-active multisite environment. Specifically I want to learn more about how the different NSX-T components are deployed and how the data plane is configured in a stretched cluster.

    In this first part I will deploy the NSX-T 2.5 platform and perform the necessary configurations and preparations so that in part two I can focus solely on the data plane (north-south and east-west).

    This is going to be quite an exercise so let’s get right to it!

    The lab environment

    Below is a high-level overview of the lab environment as it looks right now:

    lab environment

    A vSAN cluster consisting of eight ESXi hosts stretched to a second site. A third site is hosting the vSAN witness appliance. A completely separate vSphere management cluster is only hosting the vCenter server right now.

    A quick look at the vSphere environment then. I’m running vSphere 6.7 U3:

    vcenter

    The hosts have two physical 10Gbit NICs:

    physical nics

    Three vmkernel adapters have been configured: Management, vMotion, and vSAN:

    vmkernel adapters

    As mentioned, this is a vSAN stretched cluster:

    vsan stretched cluster

    The following tables list the VLANs and the associated IP subnets that are currently configured per site:

    Site A:

    | VLAN Function | VLAN ID | Subnet | Gateway |
    |---|---|---|---|
    | ESXi Management | 1641 | 172.16.41.0/24 | 172.16.41.253 |
    | vMotion | 1642 | 172.16.42.0/24 | 172.16.42.253 |
    | vSAN | 1643 | 172.16.43.0/24 | 172.16.43.253 |

    Site B:

    | VLAN Function | VLAN ID | Subnet | Gateway |
    |---|---|---|---|
    | ESXi Management | 1651 | 172.16.51.0/24 | 172.16.51.253 |
    | vMotion | 1652 | 172.16.52.0/24 | 172.16.52.253 |
    | vSAN | 1653 | 172.16.53.0/24 | 172.16.53.253 |

    Witness Site:

    | VLAN Function | VLAN ID | Subnet | Gateway |
    |---|---|---|---|
    | ESXi Management | 1711 | 172.17.11.0/24 | 172.17.11.253 |
    | vSAN | 1713 | 172.17.13.0/24 | 172.17.13.253 |

    Management Cluster:

    | VLAN Function | VLAN ID | Subnet | Gateway |
    |---|---|---|---|
    | SDDC Management | 1611 | 172.16.11.0/24 | 172.16.11.253 |

    NSX-T is not deployed yet, but that’s about to change pretty soon 😉

    Deploying the NSX-T manager cluster

    Installing NSX-T 2.5 always starts with deploying the manager cluster. It consists of three manager nodes and an optional virtual IP (VIP).

    I will deploy the NSX manager cluster nodes in the vSphere management cluster and connect them to the SDDC Management VLAN (1611).

    The IP plan for the NSX manager cluster looks like this:

    | Hostname | IP Address |
    |---|---|
    | nsxmanager01 | 172.16.11.82 |
    | nsxmanager02 | 172.16.11.83 |
    | nsxmanager03 | 172.16.11.84 |
    | nsxmanager | 172.16.11.81 (virtual IP) |

    First manager node

    I deploy the first manager node from the OVA package:

    first nsx manager node

    Filling out the configuration details and then kicking off the deployment.

    When the first manager node is up and running I’m logging in to the NSX Manager UI:

    first nsx manager ui login

    Second and third manager nodes

    The second and third manager nodes can be deployed from the NSX Manager UI. Before I can do that I need to add my vCenter server under System > Fabric > Compute Manager:

    compute manager added

    Now I’m able to deploy the second and third manager nodes via System > Appliances > Add Nodes.

    Once done the three nodes are shown in the UI and the cluster connectivity is up:

    three nodes deployed

    Assign virtual IP address

    I finalize the manager cluster deployment by configuring a virtual IP address. This is done under System > Appliances > Virtual IP:

    change vip

    A couple of minutes later the virtual IP is active:

    vip configured

    Configuring the NSX-T data plane

    Now that the NSX-T management plane is fully operational I will continue with the data plane preparations and configurations.

    More VLANs

    First I need to provision some more VLANs in the TORs at the data sites. At each site I need two VLANs for overlay and another two for connecting NSX with the physical network later on:

    Site A:

    | VLAN Function | VLAN ID | Subnet | Gateway |
    |---|---|---|---|
    | Host overlay | 1644 | 172.16.44.0/24 | 172.16.44.253 |
    | Uplink01 | 1647 | 172.16.47.0/24 | 172.16.47.253 |
    | Uplink02 | 1648 | 172.16.48.0/24 | 172.16.48.253 |
    | Edge overlay | 1649 | 172.16.49.0/24 | 172.16.49.253 |

    Site B:

    | VLAN Function | VLAN ID | Subnet | Gateway |
    |---|---|---|---|
    | Host overlay | 1654 | 172.16.54.0/24 | 172.16.54.253 |
    | Uplink01 | 1657 | 172.16.57.0/24 | 172.16.57.253 |
    | Uplink02 | 1658 | 172.16.58.0/24 | 172.16.58.253 |
    | Edge overlay | 1659 | 172.16.59.0/24 | 172.16.59.253 |
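
    The VLAN tables in this lab appear to follow a simple convention: a four-digit VLAN ID "ABCD" maps to subnet 172.AB.CD.0/24 with the gateway at .253. That is my reading of the tables rather than something stated explicitly, but under that assumption the addressing can be derived from the VLAN ID alone, which is handy when provisioning additional VLANs.

```python
import ipaddress


def vlan_to_network(vlan_id: int):
    """Map a four-digit VLAN ID 'ABCD' to (172.AB.CD.0/24, gateway at .253)."""
    second_octet, third_octet = divmod(vlan_id, 100)
    network = ipaddress.ip_network(f"172.{second_octet}.{third_octet}.0/24")
    gateway = network.network_address + 253
    return network, gateway


if __name__ == "__main__":
    # The new VLANs at both data sites.
    for vlan_id in (1644, 1647, 1648, 1649, 1654, 1657, 1658, 1659):
        network, gateway = vlan_to_network(vlan_id)
        print(vlan_id, network, gateway)
```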

    Transport zones

    Two transport zones should do it, I believe. I create them using the following details:

    | Name | N-VDS Name | Traffic Type |
    |---|---|---|
    | tz-vlan | nvds01 | VLAN |
    | tz-overlay | nvds01 | Overlay |

    Transport zones are managed under System > Fabric > Transport Zones:

    transport zones

    Uplink profiles

    Next, I need to create four uplink profiles. The table below shows the configuration details for each of them:

    | Name | Teaming Policy | Active Uplinks | Transport VLAN | MTU |
    |---|---|---|---|---|
    | up-site-a-esxi | Load Balance Source | uplink-1, uplink-2 | 1644 | 9000 |
    | up-site-a-edge | Load Balance Source | uplink-1, uplink-2 | 1649 | 9000 |
    | up-site-b-esxi | Load Balance Source | uplink-1, uplink-2 | 1654 | 9000 |
    | up-site-b-edge | Load Balance Source | uplink-1, uplink-2 | 1659 | 9000 |

    Uplink profiles are managed under System > Fabric > Profiles > Uplink Profiles:

    uplink profiles

    In order to achieve VLAN pinning, deterministic routing, and ECMP I need to add two named teaming policies to the uplink profiles that I just created:

    | Name | Teaming Policy | Active Uplinks |
    |---|---|---|
    | Uplink01 | Failover Order | uplink-1 |
    | Uplink02 | Failover Order | uplink-2 |
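
    Put together, each uplink profile carries a default teaming plus the two named teaming policies. The sketch below shows what such a profile could look like as a JSON body for the NSX-T management API. The field names reflect my understanding of the uplink host switch profile object and are assumptions to verify against the API reference; the article itself configures everything through the UI.

```python
import json

# Profile name -> transport VLAN, from the uplink profile table.
PROFILES = {
    "up-site-a-esxi": 1644,
    "up-site-a-edge": 1649,
    "up-site-b-esxi": 1654,
    "up-site-b-edge": 1659,
}


def uplink_profile(name: str, transport_vlan: int) -> dict:
    """Build one uplink profile body (field names are assumptions)."""
    uplinks = [
        {"uplink_name": "uplink-1", "uplink_type": "PNIC"},
        {"uplink_name": "uplink-2", "uplink_type": "PNIC"},
    ]
    return {
        "resource_type": "UplinkHostSwitchProfile",
        "display_name": name,
        "mtu": 9000,
        "transport_vlan": transport_vlan,
        # Default teaming: load balance source across both uplinks.
        "teaming": {"policy": "LOADBALANCE_SRCID", "active_list": uplinks},
        # Named teamings pin traffic to exactly one uplink each.
        "named_teamings": [
            {"name": "Uplink01", "policy": "FAILOVER_ORDER", "active_list": [uplinks[0]]},
            {"name": "Uplink02", "policy": "FAILOVER_ORDER", "active_list": [uplinks[1]]},
        ],
    }


if __name__ == "__main__":
    for name, vlan in PROFILES.items():
        print(json.dumps(uplink_profile(name, vlan), indent=2))
```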

    Adding the named teaming policies to the uplink profiles:

    add named teaming policies

    I also need to add the Uplink01 and Uplink02 named teaming policies to transport zone tz-vlan. This is so that they can be selected on segments belonging to that transport zone later on:

    named teaming policies to transport zone.

    Network I/O Control profile

    To allocate bandwidth to different types of network traffic I create a network I/O control profile. After long and hard thinking I decided to call it nioc-profile and it has the following settings:

    | Traffic Type / Traffic Name | Shares |
    |---|---|
    | Fault Tolerance (FT) Traffic | 25 |
    | vSphere Replication (VR) Traffic | 25 |
    | iSCSI Traffic | 25 |
    | Management Traffic | 50 |
    | NFS Traffic | 25 |
    | vSphere Data Protection Backup Traffic | 25 |
    | Virtual Machine Traffic | 100 |
    | vMotion Traffic | 25 |
    | vSAN Traffic | 100 |
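
    Shares are relative weights, so what they mean in practice depends on which traffic types are competing for a saturated uplink. As a worked example, the sketch below computes the worst case where all nine traffic types contend for one 10 Gbit NIC at the same time. This is a simplification for illustration: when fewer types are active, each active type gets a proportionally larger slice.

```python
# Shares from the nioc-profile table.
SHARES = {
    "Fault Tolerance (FT) Traffic": 25,
    "vSphere Replication (VR) Traffic": 25,
    "iSCSI Traffic": 25,
    "Management Traffic": 50,
    "NFS Traffic": 25,
    "vSphere Data Protection Backup Traffic": 25,
    "Virtual Machine Traffic": 100,
    "vMotion Traffic": 25,
    "vSAN Traffic": 100,
}
LINK_GBIT = 10  # one physical 10 Gbit NIC


def bandwidth_under_contention(shares: dict, link_gbit: float) -> dict:
    """Split the link by share weight, assuming every traffic type is active."""
    total = sum(shares.values())
    return {name: link_gbit * value / total for name, value in shares.items()}


if __name__ == "__main__":
    for name, gbit in bandwidth_under_contention(SHARES, LINK_GBIT).items():
        print(f"{name}: {gbit} Gbit/s")
```

    With a total of 400 shares, vSAN and virtual machine traffic are each guaranteed a quarter of the link under full contention, and management traffic an eighth.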

    Network I/O control profiles are managed under System > Fabric > Profiles > NIOC Profiles:

    nioc profile

    Segments

    VLAN-backed segments are needed for system, uplink/transit, and overlay traffic. The table below lists the segments with their settings that I will create:

    | Segment Name | Uplink & Type | Transport Zone | VLAN |
    |---|---|---|---|
    | site-a-nvds01-management | none | tz-vlan | 1641 |
    | site-a-nvds01-vmotion | none | tz-vlan | 1642 |
    | site-a-nvds01-vsan | none | tz-vlan | 1643 |
    | site-a-edge-transit01 | none | tz-vlan | 1647 |
    | site-a-edge-transit02 | none | tz-vlan | 1648 |
    | site-b-nvds01-management | none | tz-vlan | 1651 |
    | site-b-nvds01-vmotion | none | tz-vlan | 1652 |
    | site-b-nvds01-vsan | none | tz-vlan | 1653 |
    | site-b-edge-transit01 | none | tz-vlan | 1657 |
    | site-b-edge-transit02 | none | tz-vlan | 1658 |
    | edge-uplink1 | none | tz-vlan | 0-4094 |
    | edge-uplink2 | none | tz-vlan | 0-4094 |

    Segments are managed under Networking > Connectivity > Segments:

    vlan-backed segments

    Uplink teaming policy

    The uplink teaming policy for segments edge-uplink1 and edge-uplink2 needs to be modified so that the named teaming policies Uplink01 and Uplink02 are used instead of the default.

    For this I have to edit these segments under Advanced Networking & Security > Networking > Switching:

    change segment teaming

    Configure ESXi hosts

    Now the time has come to configure the ESXi hosts and turn them into NSX-T transport nodes!

    In the NSX Manager UI I navigate to System > Fabric > Nodes and change the “Managed by” to my vCenter server. The ESXi hosts are listed:

    unconfigured hosts

    Unfortunately, I can’t make use of transport node profiles here as these are assigned at the vSphere cluster level. I will therefore configure my hosts one at a time.

    The ESXi transport nodes in Site A will be configured with the following settings:

    | Setting | Values |
    |---|---|
    | Transport Zone | tz-vlan, tz-overlay |
    | N-VDS Name | nvds01 |
    | NIOC Profile | nioc-profile |
    | Uplink Profile | up-site-a-esxi |
    | LLDP Profile | LLDP [Send Packet Disabled] |
    | IP Assignment | Use DHCP |
    | Physical NICs | vmnic0 – uplink-1, vmnic1 – uplink-2 |
    | vmk0 | site-a-nvds01-management |
    | vmk1 | site-a-nvds01-vmotion |
    | vmk2 | site-a-nvds01-vsan |

    ESXi transport nodes in Site B use slightly different settings:

    | Setting | Values |
    |---|---|
    | Transport Zone | tz-vlan, tz-overlay |
    | N-VDS Name | nvds01 |
    | NIOC Profile | nioc-profile |
    | Uplink Profile | up-site-b-esxi |
    | LLDP Profile | LLDP [Send Packet Disabled] |
    | IP Assignment | Use DHCP |
    | Physical NICs | vmnic0 – uplink-1, vmnic1 – uplink-2 |
    | vmk0 | site-b-nvds01-management |
    | vmk1 | site-b-nvds01-vmotion |
    | vmk2 | site-b-nvds01-vsan |

    Selecting one host at a time and clicking Configure NSX:

    configure nsx

    The “Network Mappings for Install” used for the vmkernel adapter migration:

    network mappings for install

    When I click Finish the NSX installation and configuration process starts on the selected ESXi host. NSX bits are installed, the host receives the N-VDS, and the vmkernel adapters are migrated from VDS port groups to the N-VDS segments.

    When all hosts have been configured I quickly check the status of the transport nodes:

    transport node status

    And in vCenter I notice there’s now an N-VDS with a bunch of opaque port groups:

    n-vds installed

    Summary

    Most of the NSX-T platform is in place now and I think this is a good point to take a small break.

    I started by deploying and configuring the NSX manager cluster (aka the central management plane). Next, I prepared the environment for the NSX data plane by provisioning some VLANs, profiles, and segments. Lastly, I prepared the ESXi hosts in the stretched cluster by installing the NSX VIBs and configuring them as NSX transport nodes. vSphere system networking (vmkernel adapters) was migrated to the N-VDS.

    In the next part I will continue with the installation of the data plane and more specifically deployment and configuration of the NSX Edge as well as the logical networking components.

    Stay tuned!

  • Recently a new version of the NSX-T Reference Design Guide was released. This guide, which now covers NSX-T versions 2.0 – 2.5, is a must read for anyone interested in the NSX-T solutions and their recommended design.

    One of the things you’ll find in the updated guide is a new recommended deployment mode for the edge VM for NSX-T 2.5 and onwards. The new recommended design for the Edge VM looks like this:

    one n-vds edge vm

    This new design has a couple of advantages:

    • One N-VDS carrying both overlay and VLAN traffic.
    • Multi-TEP configuration for load balancing of overlay traffic.
    • Distribution of VLAN traffic to specific TORs for deterministic point-to-point routing adjacencies.
    • No change required in the vSphere distributed port group configuration when new workload VLAN segments are added.

    This “single N-VDS per Edge VM” design is only supported with NSX-T version 2.5 and above. For NSX-T version 2.4 and lower you stick with the “three N-VDS per Edge VM” design that looks like this:

    three n-vds edge vm

    Getting to the 2.5 Edge VM design

    The “three N-VDS per Edge VM” design is still perfectly valid and fully supported with NSX-T 2.5.

    Upgrading NSX-T from 2.x to 2.5 won’t touch your Edge VM configuration so you automatically end up with the “three N-VDS per Edge VM” design in version 2.5.

    And in most cases there’s no immediate reason to start messing around with the Edge VM design in a production environment just to have it aligned with the recommended design for version 2.5.

    That being said, I wanted to go through the process just to see if it could be done with acceptable data plane disruption and of course to learn a thing or two in the process. Maybe you want to follow along and perhaps learn something too. Let’s have a look at what I did.

    Step 1 – Create VLAN trunking port groups

    I’m using my 2.5 Edge VM design diagram above as a blueprint and the first thing that I need to do is create two new port groups on the vSphere VDS. The Edge VM design requires two port groups configured as trunks. I will call these port groups Trunk1 and Trunk2.

    Starting with Trunk1:

    trunk 1

    Setting the VLAN type to VLAN trunking:

    VLAN trunking

    For Teaming and failover I configure Uplink 1 as the active uplink and Uplink 2 as the standby uplink:

    uplink 1 active

    I then create the Trunk2 port group and configure it the same way except for the Failover order which is set the other way around:

    uplink 2 active

    The following port groups are now available on the VDS:

    The idea here is that Trunk1 and Trunk2 will replace PG-OVERLAY, PG-UPLINK1, and PG-UPLINK2.

    Step 2 – Create new Tier 0 transit segments

    The current “three N-VDS per Edge VM” deployment in my lab environment is using Tier 0 transit segments with VLAN ID “0”. This means that they are backed by whatever VLAN ID is specified in the PG-UPLINK1 and PG-UPLINK2 VDS port groups.

    An improvement upon this is to configure the VLAN ID at the NSX-T segment level instead. In this way we keep the VLAN configuration and control of it within the NSX platform which is a good thing.

    I create two new segments called vlan1613 and vlan1614 and configure them with VLAN ID 1613 and 1614 respectively:

    new transit segments

    Step 3 – Create a new NSX-T uplink profile

    The way the Edge VMs connect to the physical network is different with the 2.5 Edge VM design. I need to configure a new uplink profile that contains the required configuration.

    Uplink profiles are managed under System > Fabric > Profiles > Uplink Profiles:

    The new uplink profile called EdgeVM-Uplink-Profile contains three teaming configurations.

    edge uplink profile

    The [Default Teaming] is load balancing traffic between Uplink1 and Uplink2 and facilitates the multi-TEP capability of the 2.5 Edge VM design. The two other teaming configurations, VLAN-1613-Policy and VLAN-1614-Policy, are used for the point-to-point routing adjacencies.

    Step 4 – Deploy new Edge VMs

    As far as I know there is no easy way to reconfigure an N-VDS setup on existing edge transport nodes. I simply deploy two new Edge VMs that eventually will replace the existing Edge VMs:

    deploy new edge vms

    It’s at the Configure NSX step I configure the Edge VM according to the version 2.5 Edge VM design. So what does that look like? Something like this:

    one n-vds config

    A single N-VDS that is associated with both an overlay and a VLAN transport zone. The EdgeVM-Uplink-Profile gives me two DPDK Fastpath interfaces, each of which I assign to its own VDS trunk port group.

    When deployment of the two new Edge VMs is finished I have the following situation under System > Fabric > Nodes > Edge Transport Nodes:

    edge transport nodes

    Edge nodes en03 and en04 are the new Edge transport nodes.

    I add the new Edge transport nodes to the existing Edge cluster where they join en01 and en02:

    edge cluster 4 nodes

    Step 5 – Transition

    At this point en01 and en02 are the only Edge transport nodes with logical network configuration linked to them. While en03 and en04 are members of the same Edge cluster, they are not doing much in terms of data plane services.

    A diagram of the L3 topology in my lab from an NSX Edge perspective:

    Transitioning to the new Edge transport nodes won’t and shouldn’t alter anything in the L3 topology above. Otherwise I would consider it a bad transition.

    I’m ready to replace the current Edge transport nodes with the new ones. Unfortunately, the Replace Edge Cluster Member action won’t work here as the nodes have different configurations.

    Instead I’m going to do a manual transition and in my simple lab environment that’s a pretty straightforward process. The only service hosted in the NSX Edge besides north-south routing is a DHCP server. So this should be easy.

    Starting by placing the en01 transport node in maintenance mode:

    en01 maintenance mode

    Now en01 is not involved in any data plane operations anymore. With that in mind I’m feeling comfortable going ahead with the next step which is the removal of the Tier 0 interfaces that are linked to en01.

    My Tier 0 gateway has an active-standby HA mode which means it can’t have its configuration mapped to more than two Edge transport nodes at a time. By deleting the configuration linked to one Edge transport node I’m making room for a new Edge transport node. One at a time.

    tier 0 interfaces
    confirm deletion

    Deleting the interfaces will break the Tier 0 gateway’s en01 connection with the TORs, but this is acceptable as en01 has been placed in maintenance mode and the data plane won’t experience any disruptions.

    Once the two interfaces linked to en01 have been removed we can add them again with the same name and the same IP configuration as before, but this time I link them to en03 and select the newly created transit segments:

    new transit interfaces

    Once done with deleting and adding interfaces there’s a kind of hybrid situation where two Edge transport nodes (en02 and en03) each with a different deployment mode are serving the same Tier 0 gateway:

    en02 and en03

    And it works!

    Now I repeat the same process to replace en02 with en04:

    1. Place en02 in maintenance mode (en03 takes over its duties).
    2. Delete Tier 0 interfaces linked to en02.
    3. Add Tier 0 interfaces, link them to en04 and select the new segments.
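
    The three steps above, applied once per old edge node, can be modelled in a few lines to show why the order matters. This is a toy model under stated assumptions, not NSX-T automation: the class, its method, and the interface names are hypothetical, and the two-node limit is the active-standby constraint described earlier in this article.

```python
MAX_NODES_ACTIVE_STANDBY = 2  # an active-standby Tier 0 spans at most two edge nodes


class Tier0Model:
    """Toy model: which edge node each Tier 0 interface is linked to."""

    def __init__(self, interfaces: dict):
        self.interfaces = dict(interfaces)
        self.in_maintenance = set()

    def edge_nodes(self) -> set:
        return set(self.interfaces.values())

    def replace_node(self, old_node: str, new_node: str) -> None:
        # 1. Place the old node in maintenance mode.
        self.in_maintenance.add(old_node)
        # 2. Delete the Tier 0 interfaces linked to the old node.
        names = [name for name, node in self.interfaces.items() if node == old_node]
        for name in names:
            del self.interfaces[name]
        # 3. Add them again with the same names, linked to the new node.
        for name in names:
            self.interfaces[name] = new_node
        if len(self.edge_nodes()) > MAX_NODES_ACTIVE_STANDBY:
            raise RuntimeError("active-standby Tier 0 mapped to more than two edge nodes")


# Hypothetical interface names; two interfaces per edge node as in the lab.
tier0 = Tier0Model({
    "uplink1-first": "en01", "uplink2-first": "en01",
    "uplink1-second": "en02", "uplink2-second": "en02",
})
for old, new in [("en01", "en03"), ("en02", "en04")]:
    tier0.replace_node(old, new)
    print(sorted(tier0.edge_nodes()))
```

    Because the old node's interfaces are deleted before the new node's interfaces are added, the gateway never spans more than two edge nodes during the transition.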

    The final result is four Tier 0 gateway interfaces with the same name and IP as before, but linked to the new Edge transport nodes:

    four interfaces in new ens

    Just the DHCP service left which is pretty easy.

    I have to re-configure the DHCP service so that it uses the new Edge transport nodes. This is done under Advanced Networking & Security > Networking > DHCP > Server Profiles.

    I edit the profile so that it only contains en03 and en04 as its members.

    Step 6 – Clean up

    After verifying that everything is working as it should the time has come to say goodbye to the old Edge transport nodes.

    I first remove en01 and en02 from the Edge Cluster:

    remove en01 and en02 from cluster

    And then simply delete them from the fabric:

    delete edge node

    I can also delete the PG-OVERLAY, PG-UPLINK1, and PG-UPLINK2 port groups in vSphere as they are no longer needed.

    This leaves the environment with the new en03 and en04 Edge transport nodes and the new NSX-T 2.5 recommended Edge VM design!

    Summary

    A summary of the steps I took to go from a “three N-VDS Edge VM” design to a “single N-VDS Edge VM” design:

    1. Create trunking port groups in vSphere.
    2. Create new transit segments configured with VLAN ID.
    3. Create new uplink profile for the Edge transport node.
    4. Deploy two new Edge VMs and configure them with the “single N-VDS” design.
    5. Replace the existing Edge transport nodes by doing a manual transition.
    6. Verify and clean up.

    Quite an operation but certainly doable. It might or might not be worth the effort. It comes down to whether the advantages that this new Edge VM design offers are important enough to you.

    Keep in mind that placing Edge transport nodes in maintenance mode as I did in this article will trigger a fail-over between the nodes (with active-standby mode) which in turn causes short data plane disruptions. That’s not an issue in a lab, but something to consider in a production environment. For a Tier 0 gateway with an active-active HA mode and ECMP enabled this would be less of an issue.

  • Welcome back! In part 1 we had a look at some NSX-T management plane failure scenarios and how to recover from them. In this part we continue to investigate NSX-T recoverability at the data plane and more specifically the NSX Edge.

    Quick note

    If you ever experience an issue in your NSX-T production environment, the first and only thing you should do is open a VMware support request. Highly skilled experts who are dealing with all kinds of NSX-T issues on a daily basis will help you in the best possible way with your specific issue.

    NSX data plane failure & recovery

    Most will agree that failures at the data plane are more critical than for instance failures at the management plane. After all, the data plane is where the network packets that really matter are flowing around. Failures at the data plane can potentially impact service availability.

    Luckily, the NSX data plane is robust by design. Largely distributed and where it’s centralized it’s also clustered. Combine this with a proper design for the physical and logical components and you’re looking at a pretty solid solution.

    But sure, things can break down and when they do it’s important to understand how to get back on track again.

    The lab environment

    We’re still using the same small lab environment as in part 1. I just added a VM in the compute cluster for today’s article. Below is a diagram showing the main components from a high level perspective.

    The NSX Edge

    The NSX Edge is a centralized, often clustered, component. It provides a range of gateway services, but one of its main responsibilities is routing traffic between NSX logical networks and the physical network.

    The worker bees of the NSX Edge are the edge nodes. They are available in two form factors (virtual machine and bare metal) and are organized in one or more edge clusters.

    In my lab environment the NSX Edge consists of two edge node VMs and one edge node cluster.
    Let’s have a quick look at the deployment details of one of these edge node VMs.

    A pretty common NSX-T 2.4 edge node deployment configuration for the VM form factor.

    Below is the layer 3 topology running on top of the NSX Edge.

    As you can see the layer 3 network is making good use of the lower layer’s redundant paths.

    Lastly, the Tier 0 gateway in this lab has been set up with an Active-Standby HA mode.

    Current state of the NSX Edge

    Life is good at the edge. The edge node VMs are up and running.

    The edge transport nodes configuration state and node status are looking good.

    The Tier 0 gateway’s BGP summary shows that BGP connections are established with both of the TORs.

    The Tier 0 gateway’s routing table contains IP routes advertised by the TORs through BGP.

    And last but not least the VM in the compute cluster can access the physical network. A “traceroute” to the PING host on the physical network shows that traffic is routed to TOR-Right (172.16.14.253) at the moment:

    North-south networking is running beautifully! What can possibly go wrong on a day like this?

    TOR down

    Not exactly an NSX Edge failure, but definitely a failure scenario that concerns the NSX Edge.

    TOR-Right broke down. What’s the impact? Let’s have a look.

    The BGP summary indeed shows us that we’ve lost connection with TOR-Right. BGP connections with TOR-Left are still intact though.

    The Tier 0’s routing table now only contains BGP routes advertised by TOR-Left.

    All of this is expected, but how is the data plane affected by this TOR failure?

    It seems to be working fine. Sure, the “traceroute” reveals that traffic is now passing through TOR-Left (172.16.13.254), but that’s about it.

    The redundant infrastructure and BGP making use of that ensured that this TOR failure had minimal impact on the NSX data plane.

    TOR down recovery

    Basically we would just rack and stack a new TOR, configure it, and restore redundancy. The only thing we need to do within NSX is verify that the BGP connections are restored.
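    That verification can be sketched in a few lines of Python. This is only an illustration: the record layout (`name`, `state`) is hypothetical, and how the BGP summary gets from the Tier 0 into these records (CLI output, API call) is left out on purpose.

    ```python
    def sessions_not_established(neighbors):
        """Return the names of BGP neighbors whose session is not Established."""
        return [n["name"] for n in neighbors if n["state"].lower() != "established"]


    # Hypothetical records, e.g. transcribed from the Tier 0's BGP summary.
    during_outage = [
        {"name": "TOR-Left", "state": "Established"},
        {"name": "TOR-Right", "state": "Active"},  # Active = still trying to connect
    ]
    after_repair = [
        {"name": "TOR-Left", "state": "Established"},
        {"name": "TOR-Right", "state": "Established"},
    ]

    print(sessions_not_established(during_outage))
    print(sessions_not_established(after_repair))
    ```

    An empty list means every neighbor is back in the Established state and redundancy is restored.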

    Edge node down

    Last time I checked there were two edge node VMs in that Edge cluster. en01 is gone!

    What’s the impact? How do we recover?

    Let’s first investigate the impact this failure has on the north-south traffic.

    Alright, none whatsoever.

    The VM can still reach the physical network. The surviving edge transport node must have taken over the duties of the failed node.

    But of course, the NSX Edge is now running on a single edge transport node and NSX Manager clearly shows us that we are dealing with a degraded state.

    Without a standby node we’re living on the edge (pun intended). We need that second transport node up and running again.

    Edge node down recovery

    In a situation like this it’s good to remember that there’s nothing unique about an edge node. During its lifetime it is much like a container receiving and executing configuration from the management plane. In other words, losing the edge node is in itself nothing traumatic. We just need to get a new one.

    The first step when recovering from a permanent edge node failure is to deploy a new edge node. Once it’s deployed three edge transport nodes are listed in the NSX Manager UI.

    • en01 with status “Unknown” is the node that is missing.
    • en02 with status “Degraded” because it can’t find its HA buddy.
    • en03 with status “Up” is alive and happy but not doing much.
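    The reasoning behind these three statuses can be written down as a small Python sketch. It assumes a simplified model that I made up for illustration: a dictionary of node names and the status strings shown in the UI, plus a list of the current edge cluster members. The function name is hypothetical as well.

    ```python
    def plan_replacement(nodes, cluster_members):
        """Pick the failed cluster member and a healthy node that can replace it.

        nodes: dict mapping node name -> status string as shown in the UI.
        cluster_members: names of the nodes currently in the edge cluster.
        Returns (failed_node, replacement_node) or None if no plan is possible.
        """
        # A cluster member that the management plane can no longer reach.
        failed = [n for n in cluster_members if nodes.get(n) == "Unknown"]
        # A healthy transport node that is not part of the cluster yet.
        spares = [n for n, status in nodes.items()
                  if status == "Up" and n not in cluster_members]
        if not failed or not spares:
            return None
        return failed[0], spares[0]


    nodes = {"en01": "Unknown", "en02": "Degraded", "en03": "Up"}
    print(plan_replacement(nodes, cluster_members=["en01", "en02"]))
    ```

    For the lab state above this yields en01 as the member to replace and en03 as its replacement. en02 is left alone: it is degraded only because its HA peer is missing.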

    The second step is to tell the management plane that we want to replace the missing edge transport node with the one we just deployed.

    This is done under Edge Clusters in the NSX Manager UI (or via the API).

    After clicking the small gear icon we select Replace Edge Cluster Member. This starts the process of re-mapping logical network configuration from one edge transport node to another.

    In our scenario we want to re-map from en01 to en03.

    If the edge transport node were still operational, we would put it in maintenance mode here to minimize data plane disruptions. In our failure scenario the node is already gone so maintenance mode is not relevant.

    After clicking Save the management plane springs into action and links configurations and other related logical network constructs to the new edge transport node.
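    For those who prefer the API over the UI, here is a rough Python sketch that only builds the request and does not send it. Treat the endpoint (`?action=replace_transport_node` on the edge cluster) and the body fields (`member_index`, `transport_node_id`) as an assumption based on my reading of the NSX-T manager API, and verify them against the API guide for your version. The manager name and IDs are placeholders, and authentication is left out.

    ```python
    import json
    import urllib.request


    def build_replace_request(manager, edge_cluster_id, member_index, new_node_id):
        """Build (but do not send) the request that swaps an edge cluster member.

        ASSUMPTION: endpoint and body fields are taken from my reading of the
        NSX-T manager API; check the API guide for your version before use.
        """
        url = (f"https://{manager}/api/v1/edge-clusters/{edge_cluster_id}"
               "?action=replace_transport_node")
        body = {"member_index": member_index, "transport_node_id": new_node_id}
        return urllib.request.Request(
            url,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )


    # All identifiers below are placeholders, not real IDs.
    req = build_replace_request("nsx.example.local", "EDGE-CLUSTER-ID", 0, "EN03-NODE-ID")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    ```

    Building the request separately from sending it makes it easy to review exactly what would be posted before touching the edge cluster.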

    Once the process is done we can delete the orphaned edge transport node and after a minute or so we’re seeing two healthy edge transport nodes again.

    A look at the Tier 0’s logical router ports shows us what happened.

    Two of the logical router ports previously mapped to en01 have been relocated to en03.

    BGP connections are established again.

    Replacement successful! The fabric’s state is restored to normal operations.

    Summary

    Today we looked at two failure scenarios concerning the NSX Edge:

    • Failure of a top-of-rack switch
      • limited impact on the data plane
    • Failure of an NSX edge node
      1. deploy new edge node
      2. run edge transport node replace process
      3. remove orphaned edge transport node from fabric

    Not too bad. This is a small environment, but the recovery procedures will be largely the same regardless of environment size.

    Sure, more things can break. An ESXi host hosting an edge node, a physical NIC, cables, and so on. The bottom line is that unless we’re dealing with a complete meltdown, a properly designed NSX Edge will minimize the impact of component failure and make recovery a piece of cake.

  • With NSX-T 2.5 comes NSX Intelligence 1.0. This component, which is part of NSX Data Center Enterprise Plus, is something I’ve been looking forward to since it was announced.

    NSX Intelligence adds a powerful analytics engine to the NSX-T platform. It provides workload and network context that is unique to NSX. Application owners and operations people can use the NSX Intelligence interface for configuration and monitoring.

    Besides the NSX Intelligence data platform itself, this 1.0 release provides visualization and security rule and grouping recommendations.

    Cool stuff. Let’s have a look at how to get it up and running.

    Installation preparations

    The preparation and installation steps are explained in detail in the official installation documentation. I strongly recommend you follow these guides when installing NSX Intelligence. Some things to point out:

    • NSX Intelligence 1.0 requires NSX-T version 2.5. The first thing I had to do was upgrade my NSX-T lab to version 2.5. In a production environment the 2.5 upgrade requires its own planning and preparations of course.
    • The NSX Intelligence installation comes as a tar-file. Its contents need to be extracted and placed on a web server somewhere that can be accessed by your NSX Manager cluster.
    • The NSX Intelligence appliance must be deployed on ESXi managed by vCenter.
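    If there is no web server at hand, the tar-file step can be handled with nothing more than the Python standard library. The sketch below is one possible way to do it in a lab, not the documented procedure. The function names are my own, and the paths and port in the usage note are placeholders.

    ```python
    import os
    import tarfile
    from functools import partial
    from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


    def extract_bundle(tar_path, web_root):
        """Extract the installation tar-file into web_root and list its members."""
        os.makedirs(web_root, exist_ok=True)
        with tarfile.open(tar_path) as tar:
            names = tar.getnames()
            # Use the safer "data" extraction filter where Python provides it.
            if hasattr(tarfile, "data_filter"):
                tar.extractall(web_root, filter="data")
            else:
                tar.extractall(web_root)
        return names


    def make_server(web_root, port=8080):
        """Create (but do not start) an HTTP server that serves web_root."""
        handler = partial(SimpleHTTPRequestHandler, directory=web_root)
        return ThreadingHTTPServer(("", port), handler)
    ```

    Calling `extract_bundle("bundle.tar", "/srv/www")` followed by `make_server("/srv/www").serve_forever()` serves the extracted files over plain HTTP. That is fine for a lab, provided the NSX Manager cluster can reach the host and port.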

    Installation

    Once the environment is prepared we can start the NSX Intelligence installation.

    In NSX Manager navigate to Plan & Troubleshoot > Discover & Take Action:

    Click on Go to system, scroll down on the Appliances page and click Add NSX Intelligence Appliance. This starts the appliance deployment wizard:

    Enter the URL to the OVF file and the appliance network configuration:

    I’m deploying the small NSX Intelligence appliance which is suitable for labs or PoCs. For a production environment you would select the large form factor.

    In the next step we configure the vSphere details for the virtual appliance:

    Configure the appliance credentials at the third and final step:

    Click on Install Appliance to start the deployment:

    Deployment took about 5 minutes to complete in my lab environment.

    First look

    Although it’s a separate virtual appliance, the NSX Intelligence UI seamlessly integrates with the NSX Manager UI. It can be found under Plan & Troubleshoot > Discover & Take Action.

    The two objects we can work with here are virtual machines and groups:

    We can choose to display only certain VMs/groups or all:

    And apply a filter based on tags, flows, and rules:

    After powering on two Windows VMs it took about 20 seconds before the NSX Intelligence engine started to draw the communication paths of these VMs. Impressive!

    In full screen mode you can switch to dark mode. Much appreciated.

    To get actual firewall rule recommendations you need to start a new recommendation process:

    After clicking the Start Recommendation button you can configure some parameters. Time range being the most important:

    Click on Start Discovery to kick off the recommendation process. This process can be monitored under Recommendations:

    Once the analysis is done, the recommended rules, groups, and services can be reviewed and modified:

    At step 2 we choose placement for the new recommendation-based security policy:

    Clicking on Publish will create the objects and enforce the security policy:

    The recommended rules are in place:

    Summary

    Installing NSX Intelligence is a straightforward process (apart from its web-based OVF installation requiring a web server).

    We took the NSX Intelligence engine for a really quick test drive and deployed some recommended firewall rules, including service and group objects, minutes after deployment (a longer analysis period is strongly recommended).

    Even as version “1.0” NSX Intelligence is going to make micro segmentation much easier and much faster. It’s a big step towards self-driving micro segmentation operations. Not to mention the slick visualization and visibility it gives us for our VMs’ communication paths.