
With VMware releasing new major versions of vSphere and NSX-T last week, it’s high season for nested lab deployments. My Norwegian Proact colleague Rudi Martinsen just published a great two part series on how to deploy a nested lab using vRealize Automation. My Dutch buddy at VMware Iwan Hoogendoorn is doing something very exciting with Terraform. And William Lam, who has been building nested labs since the day he was born I believe, has been busy with something too.

Ansible Playbooks
About a week ago I stumbled upon Yasen Simeonov’s GitHub repository. It contains a collection of Ansible Playbooks that automate the deployment of a nested vSphere environment. After trying it out a couple of times I decided to adopt Yasen’s somewhat neglected pet and give it some new love and attention (with Yasen’s blessing of course).
GitHub repository
Today I’m presenting my own GitHub repository vsphere-nsxt-lab-deploy which is largely based on Yasen’s, but with a couple of updates and additions.
First of all, I updated the code so that it can deploy the brand new vSphere 7. Then I took things one step further and added Playbooks for a complete NSX-T 2.5/3.0 deployment leveraging VMware’s new Ansible NSX-T 3.0 modules.
Runbook
I won’t go through the deployment process in detail. The repository’s README.md, its single source of truth, is hopefully informative enough. Right now the runbook looks like this:
- Create a vSwitch and port groups on the physical ESXi host.
- Deploy and configure a vCenter Server Appliance (via automated CLI install).
- Deploy 5 ESXi virtual machines (via ISO install and KS.cfg).
- Configure the nested vSphere environment:
- Configure the ESXi hosts.
- Create and configure a VDS.
- Create Compute and Edge vSphere clusters and add the ESXi hosts.
- Deploy NSX-T:
- Deploy NSX Manager.
- Register vCenter as a Compute Manager in NSX Manager.
- Create NSX-T Transport Zones (VLAN, Overlay, Edge).
- Create IP pool (TEP pool).
- Create Uplink Profiles.
- Create NSX-T Transport Node Profile.
- Deploy two NSX-T Edge Transport Nodes.
- Create and configure NSX-T Edge Cluster.
- Attach the NSX-T Transport Node Profile to the “Compute” vSphere cluster (this effectively installs the NSX-T bits and configuration on the ESXi hosts in that cluster).
The deployment time is around 1.5 hours on my hardware. Without NSX-T it takes about 45 minutes.

The deployment is easy to modify. Change the settings in answerfile.yml to fit your needs and edit deploy.yml to control which components are being deployed. For example, if you’re not interested in deploying NSX-T you can simply comment out those Playbooks in deploy.yml.
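As an illustration, a trimmed deploy.yml could look something like the sketch below. Only deployVC.yml, deployNestedESXi.yml, and configureNestedESXi.yml are Playbook names that actually appear on this page; the NSX-T file names are hypothetical placeholders.

```yaml
# deploy.yml (sketch, not the actual file): each component is its own
# Playbook that can be commented out independently.
- import_playbook: playbooks/deployVC.yml
- import_playbook: playbooks/deployNestedESXi.yml
- import_playbook: playbooks/configureNestedESXi.yml
# Skipping NSX-T? Comment out its Playbooks (names below are assumptions):
# - import_playbook: playbooks/deployNsxManager.yml
# - import_playbook: playbooks/configureNsx.yml
```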
Work in progress
The repository and its code are a work in progress and changes are committed on a regular basis. Although I’m pretty happy with what it currently does, there’s certainly room for improvement. I’m thinking about adding some optional Playbooks that set up NSX-T logical networking constructs like Tier-0/Tier-1 Gateways, segments, and so on. I’ll keep you posted via the README.md and social media.
Summary
Feel free to use the repository as it is or let it inspire you and create something better. Just don’t forget to thank Yasen who laid the groundwork here.
What is the minimum HW for the host in this setup?
It’s hard to say as it depends on your deployment, for example the number of nested ESXi VMs and their configuration. Everything can be customized in answerfile.yml. I’m running the default deployment on a 2-CPU/256GB RAM physical server without any problems.
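As a rough sketch, a single nested host entry in answerfile.yml looks like this. The field names come from the Playbook’s own item output quoted in a comment below; the top-level key name and exact nesting are assumptions.

```yaml
# One nested ESXi host definition (sketch). Shrink cpu/ram to fit
# smaller physical hosts.
nested_hosts:                  # top-level key name is an assumption
  esxi01:
    ip: 172.16.1.11
    mask: 255.255.255.0
    gw: 172.16.1.1
    fqdn: esxi01.lab.local
    vmname: nested-esxi01
    cluster: Compute
    vlan: "1611"
    cpu: "8"                   # vCPUs; reduce for smaller hosts
    ram: "65536"               # MB; reduce for smaller hosts
    boot_disk_size: "8"        # GB
    vsan_cache_size: "90"      # GB
    vsan_capacity_size: "180"  # GB
```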
Hi,
Great blog!
What CPUs do you have in the server where you’re labbing this? I thought I had it all good and ready… I have 2 x HP DL380 G8s with dual 8-core CPUs, 192GB RAM in each, 10G networking between them, and almost 2TB of NVMe in each on PCIe cards, and am really struggling with the reliability of the manager VMs.
Any insight much appreciated.
Hi Richard,
Thanks for trying this out. Your hardware specifications look OK to me, although I would recommend 256GB RAM.
BTW, have a look at our upcoming v2 if you like at https://github.com/rutgerblom/SDDC.Lab/tree/dev-v2. It has new features and improved code. HW requirements will be roughly the same though.
For nested environments you need promiscuous mode and forged transmits enabled. Do these Ansible Playbooks take care of that during deployment?
Hi
Yes, this is configured during the deployment.
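For the curious, the task that does this is roughly equivalent to the following sketch using the community.vmware collection. This is not the repository’s exact task, and the variable names are placeholders.

```yaml
# Sketch: create the lab trunk port group with the security settings
# nested ESXi needs (promiscuous mode, forged transmits, MAC changes).
- name: Create trunk port group for the nested lab
  community.vmware.vmware_portgroup:
    hostname: "{{ physical_esxi_host }}"      # placeholder variable names
    username: "{{ physical_esxi_username }}"
    password: "{{ physical_esxi_password }}"
    esxi_hostname: "{{ physical_esxi_host }}"
    switch: vSwitch-Lab
    portgroup: Trunk
    vlan_id: 4095                             # 4095 = trunk all VLANs
    security:
      promiscuous_mode: true
      forged_transmits: true
      mac_changes: true
    validate_certs: false
  delegate_to: localhost
```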
Hi,
This looks great. I am in the process of testing this in my home lab.
I have a question:
How does it resolve the nested ESXi and vCenter hostnames? Do I need to deploy a domain controller?
If yes, on which physical ESXi port group do I need this DC01?
Hi
Good question.
A domain controller is not required but certainly nice to have. DNS is needed as the ESXi hosts are added to vCenter by DNS name. You could change this behavior in configureNestedESXi.yml if you like, so that they are added by IP address.
My DNS server is on the physical network and queries from the nested environment are routed to this DNS server. This of course involves setting host routes on the DNS server pointing back to the nested environment’s management subnet.
I’m looking at ways to include name resolution in the deployment so that we get rid of this dependency.
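As a sketch of that change: adding the hosts by IP instead of DNS name comes down to swapping which field feeds esxi_hostname. The module parameters follow community.vmware.vmware_host; the loop structure mirrors the host definitions in answerfile.yml, and the other variable names are placeholders.

```yaml
# Sketch (not the repository's exact task): add a nested ESXi host to
# vCenter by IP address instead of FQDN.
- name: Add nested ESXi host to vCenter
  community.vmware.vmware_host:
    hostname: "{{ vcenter_fqdn }}"           # placeholder variable names
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    datacenter: "{{ datacenter_name }}"
    cluster: "{{ item.value.cluster }}"
    esxi_hostname: "{{ item.value.ip }}"     # was: "{{ item.value.fqdn }}"
    esxi_username: "{{ item.value.username }}"
    esxi_password: "{{ item.value.password }}"
    state: present
    validate_certs: false
  loop: "{{ nested_hosts | dict2items }}"    # assumed inventory structure
  delegate_to: localhost
```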
Thanks for the response.
If I deploy DC01 and NFS on the nested network (VLAN 1611) and change the DNS IP in answerfile.yml, would that work?
Yes, that should work fine.
Thanks, Rutger, for the prompt response.
Hi Rutger,
Sorry to bug you again. I’m stuck on an issue. After running ansible-playbook deploy.yml, it only deploys the router and vCenter and then gets stuck; it never deploys the ESXi hosts. I’m deploying ESXi 6.7. I’ve checked that vCenter is successfully deployed and working, and I can log in.
Is there any log I can check?
No problem.
The ESXi VMs are deployed directly on your physical ESXi host, so vCenter is not used for that part of the deployment.
Is it just stuck or do you receive any message? One option is to run the ansible-playbook command with the -vvv flag for more verbosity. You could do something like “ansible-playbook playbooks/deployNestedESXi.yml -vvv” to run just that part of the deployment with more output.
Hi rutgerblom,
I am getting this message when running deployNestedESXi.yml with the verbose switch:
},
"results_file": "/root/.ansible_async/485041661882.10013",
"started": 1
},
"msg": "Unable to find host \"192.168.0.21\""
}
PLAY RECAP *********************************************************************
127.0.0.1 : ok=3 changed=1 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
Hi Lalit,
Did you resolve this problem? "msg": "Unable to find host \"192.168.0.21\""
What mask is your physical network? Different from /24?
BR
When I run deploy.yml, it gets stuck here for a long time:
TASK [Perform vCenter CLI-based installation] **********************************
task path: /root/vsphere-nsxt-lab-deploy/playbooks/deployVC.yml:32
Wednesday 29 April 2020 11:20:42 +0000 (0:00:00.165) 0:03:06.096 *******
ESTABLISH LOCAL CONNECTION FOR USER: root
EXEC /bin/sh -c 'echo ~root && sleep 0'
EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir /root/.ansible/tmp/ansible-tmp-1588159243.1004767-4152-106655190239341 && echo ansible-tmp-1588159243.1004767-4152-106655190239341="` echo /root/.ansible/tmp/ansible-tmp-1588159243.1004767-4152-106655190239341 `" ) && sleep 0'
Using module file /usr/local/lib/python3.6/dist-packages/ansible/modules/commands/command.py
PUT /root/.ansible/tmp/ansible-local-356458vwl3fy/tmpnop2q_ty TO /root/.ansible/tmp/ansible-tmp-1588159243.1004767-4152-106655190239341/AnsiballZ_command.py
EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1588159243.1004767-4152-106655190239341/ /root/.ansible/tmp/ansible-tmp-1588159243.1004767-4152-106655190239341/AnsiballZ_command.py && sleep 0'
EXEC /bin/sh -c '/usr/bin/python3 /root/.ansible/tmp/ansible-tmp-1588159243.1004767-4152-106655190239341/AnsiballZ_command.py && sleep 0'
Deploying vCenter can take a while (20 mins or more). The output doesn’t say much.
Hello Rutger,
Several steps further…
I have a similar problem to Lalit’s. During the nested ESXi installation my physical host is “missing”:
"msg": "Unable to find host \"192.168.1.50\""
}
127.0.0.1 : ok=20 changed=10 unreachable=0 failed=1 skipped=14 rescued=0 ignored=0
Do you have any additional hints for this? 🙂
Thank you in advance.
BR
Darek
Hi Darek,
Is 192.168.1.50 your physical ESXi host? Can you reach it from your Ansible control node?
Yes / Yes 🙂
Does it fail immediately when deploying the first ESXi VM?
Yes, and actually before it starts.
Always in the same place:
Friday 01 May 2020 16:57:58 +0200 (0:00:01.887) 0:25:39.626 ************
===============================================================================
Perform vCenter CLI-based installation ——————————- 1380.31s
Copy ISO contents —————————————————— 69.28s
Wait 30 seconds before we start checking whether the ESXi hosts are ready — 30.06s (extended for test )
Upload the ESXi ISO to the datastore ———————————– 26.84s
Wait 5 seconds for the port groups to become available —————— 5.04s
Check if VCSA is already installed ————————————– 3.79s
Deploy ESXi VMs ——————————————————— 3.38s
Create Clusters ——————————————————— 2.46s
Create custom ESXi ISO ————————————————– 2.29s
Create a management port group for the lab environment —————— 1.89s
Result check for deployment ——————————————— 1.89s
Create trunk port group for the lab environment ————————- 1.48s
Create Datacenter ——————————————————- 1.34s
Create a VMware vSwitch on the ESXi host for the lab environment ——– 1.21s
Create JSON template file for VCSA with embeded PSC ——————— 1.11s
Check if the VyOS router is already depoyed —————————– 1.04s
Unmount vCenter ISO —————————————————– 0.77s
Mount vCenter ISO ——————————————————- 0.71s
Delete the temporary template file for VyOS router ———————- 0.64s
Edit boot.cfg ———————————————————– 0.64s
I’m suspecting an issue with name resolution. Let me debug a bit on my side and see if I can come up with a workaround.
The host is declared with IP only. Pinging from the Ansible VM to the physical host during installation works 100%.
One more remark: vlan-301 for VyOS is not created if missing. It stopped me for more than an hour 🙂
Darek, that VLAN ID should correspond to a VLAN ID in your physical network environment. It’s the public interface of the VyOS router. It should actually be on the same network as your Ansible control node. The same goes for “router_public_ip”.
Change these so they match your environment.
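For reference, the relevant answerfile.yml values might look something like the sketch below. The VLAN key name is an assumption; “router_public_ip” is the setting mentioned above.

```yaml
# Sketch of the VyOS router's public-side settings in answerfile.yml.
# Both values must match the physical network the Ansible control node is on.
router_public_vlan: "301"           # key name assumed; use an existing VLAN
router_public_ip: 192.168.1.254     # example; same subnet as the control node
```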
Rutger, thank you for your reply. I have checked several times. My basic configuration seems to be OK. Today I ran the updated script. I stopped at:
TASK [Deploy ESXi VMs]
changed: [localhost] => (item={'key': 'esxi01', 'value': {'ip': '172.16.1.11', 'mask': '255.255.255.0', 'gw': '172.16.1.1', 'fqdn': 'esxi01.lab.local', 'vmname': 'nested-esxi01', 'cluster': 'Compute', 'vlan': '1611', 'vmotion_ip': '172.16.12.11', 'vmotion_mask': '255.255.255.0', 'vsan_ip': '172.16.13.11', 'vsan_mask': '255.255.255.0', 'username': 'root', 'password': 'VMware1!', 'cpu': '8', 'ram': '65536', 'boot_disk_size': '8', 'vsan_cache_size': '90', 'vsan_capacity_size': '180'}})
TASK [Wait 3 seconds before we start checking whether the ESXi hosts are ready]
TASK [Result check for deployment]
failed: [localhost] (item={'started': 1, 'finished': 0, 'ansible_job_id………………results_file': '/root/.ansible_async/322638819862.21606', 'started': 1}, "msg": "Unable to find host \"192.168.1.50\""}
At this moment I have full communication (ping in both directions) between the “external” network (192.168.1.0/16, VyOS, Ansible-VM) and the nested VMs (172.16.1.0/24 – DC, vCenter). vCenter has no networks or datastores defined at this stage of the installation. Is that correct?
Any advices are very welcomed 🙂
Thank you in advance.
Darek,
That seems to be correct. The VDS will be set up in the next step. Are your ESXi VMs deployed?
Rutger,
Unfortunately, no.
After the tasks [Create Clusters], Enable DRS, Mount ESXi ISO, Copy ISO contents, Mount ESXi ISO, Unmount ESXi ISO, Edit boot.cfg, insert customks.tgz in boot.cfg modules section, copy customks.tgz, Create custom ESXi ISO, Upload the ESXi ISO to the datastore, Delete temporary directory,
and finally…
Deploy ESXi VMs (output from 5 VMs with customized parameters, AD is working) – changed: [localhost] => (item={'key': 'esxi01', 'value': {'ip': '172.16.1.11', 'mask': '255.255.255.0', 'gw': '172.16.1.1', 'fqdn': 'esxi01.lab.local', 'vmname': 'nested-esxi01', 'cluster': 'Compute', 'vlan': '1611', 'vmotion_ip': '172.16.12.11', 'vmotion_mask': '255.255.255.0', 'vsan_ip': '172.16.13.11', 'vsan_mask': '255.255.255.0', 'username': 'root', 'password': 'VMware1!', 'cpu': '8', 'ram': '65536', 'boot_disk_size': '8', 'vsan_cache_size': '90', 'vsan_capacity_size': '180'}})
then
TASK [Wait 3 seconds before we start checking whether the ESXi hosts are ready]
TASK [Result check for deployment]
failed: [localhost] (item={'started': 1, 'finished': 0, 'ansible_job_id': '768402499946.6246', 'results_file': '/root………
"msg": "Unable to find host \"192.168.1.50\""} – for all 5 hosts.
I ran the script with the IPs included in the script – same issue.
All VMs are reachable (ping in both directions) between the “external” network (192.168.1.0/16, VyOS, Ansible-VM) and the nested VMs (172.16.1.0/24 – DC, vCenter).
Thank you.
Darek,
We’ve been debugging this and it turns out that {{ PhysicalESX.host }} MUST be a resolvable DNS name. You cannot use an IP address here.
Can you verify that you are using an FQDN as the value for {{ PhysicalESX.host }} in answerfile.yml?
Thank you
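In other words, the answerfile.yml entry should look roughly like this. Only PhysicalESX.host is confirmed; the surrounding keys are assumptions.

```yaml
# answerfile.yml (sketch): the physical ESXi host must be an FQDN that the
# Ansible control node can resolve. Subkeys other than host are assumptions.
PhysicalESX:
  host: esxi01.lab.local     # OK: resolvable FQDN
  # host: 192.168.1.50       # fails with: Unable to find host "192.168.1.50"
  username: root
  password: VMware1!         # example value only
```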
Rutger,
Bingo! Much better 🙂 I’m a few steps ahead 🙂
Thank you for your help and patience 🙂
As a big fan of nested labs, I find this very useful.
Thanks for sharing, Rutger.
Hello Rutger,
I’m trying to implement version 1.2.8: vCenter 7 + NSX-T 3.
After the edge node deployment (up and working on vSAN) I’m stuck here:
…{'httpStatus': 'BAD_REQUEST', 'error_code': 9543, 'module_name': 'NsxSwitching service', 'error_message': 'Cannot create HostSwitch n-vds01 without HostSwitchMode and TransportZoneEndpoints.
I noticed that esxi-tnp is configured with vds, not n-vds01.
Any advices are very welcomed 🙂
Thank you in advance.
Hi Darek,
Good catch. I will look into this ASAP.
Hello Rutger,
Since this script may not have been fully tested by other users yet, I can add another remark.
Claiming disks for vSAN storage doesn’t work correctly for both clusters, at least for me. I have to manually claim the (unused SSD) disks for vSAN and restart the script from the edge node deployment task.
I can’t exclude my hardware issue – of course.
Best regards
I can’t reproduce that issue in my environment. The issue regarding the edge N-VDS is fixed now. I just pushed the fix to master, tagged 1.2.83. Please pull/clone and try again.
Cheers!
Hello Rutger,
All is working now.
Thank you for support.
BR
Did anyone manage to optimize resources and deploy a minimal setup to explore and play with NSX-T 3.0 using the provided Ansible Playbooks? Basically, is it possible to install it on a single host that has only 32 or 64GB of memory? I currently run NSX-T 2.4 on a single ESXi host that has 32GB of RAM. I followed the instructions on Virten:
https://www.virten.net/2016/05/deploy-vmware-nsx-in-homelabs-with-limited-resources/
https://blogs.vmware.com/services-education-insights/2019/02/why-everyone-needs-an-nsx-nested-lab-sandbox.html
If the playbooks support a minimal setup, which files need to be modified? I guess that in addition to each ESXi host’s resources, the playbooks for the NSX-T components should be modified as well.
Alex,
At least for the Edge nodes you could add “reservation_info” to the “nsxt_vars” (in the sections for the edge transport nodes) and instruct the deployment not to reserve any resources on the ESXi host. Have a look from line #325 in this module for more details: https://github.com/rutgerblom/vsphere-nsxt-lab-deploy/blob/master/library/nsxt_transport_nodes.py
Cheers
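As a sketch, that addition could look something like the following. The field names follow the NSX-T API’s ReservationInfo structure and may need adjusting to match the module, and the exact placement inside nsxt_vars is an assumption, so treat it as a starting point only.

```yaml
# Hypothetical snippet for an edge transport node entry in nsxt_vars:
# turn off CPU/memory reservations so the Edge VMs fit on small hosts.
deployment_config:
  vm_deployment_config:
    reservation_info:
      cpu_reservation:
        reservation_in_shares: NORMAL_PRIORITY
        reservation_in_mhz: 0        # no CPU reservation
      memory_reservation:
        reservation_percentage: 0    # no memory reservation
```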
I have a double-nested environment. I am running a Dell T5600 with 128GB memory, 24 processors, and 5TB of SSD storage, using FreeNAS to create three 1TB datastores.
On this I have Windows 7 and then VMware Workstation on top of that. I’m using the internal virtual network adapter for my networking and isolating my VDS and DPGs (static vmkernel IPs).
In VMware Workstation I have three ESXi 6.7 host VMs and a FreeNAS VM.
The ESXi VMs are Compute, Network, and Edge.
On the Compute host (192.168.101.101) I have a VCSA 6.7 appliance (192.168.101.125).
On the Network host (192.168.101.100) I have NSX Manager (192.168.101.126) and NSX Edge-INT (192.168.101.127).
On the Edge host (192.168.101.100) I have Edge-EXT (192.168.101.127).
My question is: how can I use your process to create your environment? (I can of course change the VMware Workstation VMnet adapters to reflect your IP schema.) I have the VMware ISOs for 6.7/7.0 and will modify your scripts to use just the 3 hosts (not the 5 in your setup).
The short answer is you can’t. This script comes with its own specific requirements and won’t work in an environment like yours without making substantial changes to the code.
Hello, great work!
I tested the prepareISO Playbook and it finished successfully, but when testing the ISO, the ESXi setup does not start when UEFI boot mode is selected.
Normal boot works and ESXi installs without problems.