I was hanging out in the manager’s office the other day with our senior network guy. Just three guys in Dockers kickin’ around life, as IT guys tend to do. The conversation turned to VDI and network security. Our VPN was about to change, which would force our remote access contractors to either be walked through installing the new client or be moved to a machine in the VDI environment. Moving them to VDI is pretty simple; walking each contractor through a client install, not so much. If VDI is the best option, what is the best way to secure the machines they’ll be accessing from the other parts of the network?
The current VDI machine pools are in a separate VLAN from everything else. Applying ACLs to that VLAN was going to be a pretty major deal on the physical network. Manpower and time are at a premium. Locking down subnets is easy, but what if we wanted to get more granular with application ports? Asking for ACL changes on physical production network equipment in what could be a trial-and-error scenario wasn’t practical. I suggested we isolate the vendor machines and use VMware’s vShield Edge to test out port security. Done. So, here we go.
For detailed instructions on setting up vShield, reference the admin guide: http://www.vmware.com/pdf/vshield_55_admin.pdf
For this to work, I needed to satisfy the following specifications:
- No changes on the physical network
- Create isolation using VXLAN technology
- Isolate the vendor VDI machines logically in VMware
- Create a virtual firewall
- Verify the vendor VDI machines could still contact the VMware Horizon View infrastructure
- Verify I could lock down ports from the isolated network to the production VLANs
- Guarantee DNS, DHCP and NAT services to the isolated VMs
- Guarantee Anti-Virus/Malware protection
- Verify there is some sort of High Availability
I set this up in the production VDI infrastructure, consisting of:
- vCenter 5.5
- Horizon View 5.3
- vCloud Networking and Security 5.5.3 (vCNS).
- ESXi 5.5 VMware hosts installed on B200 M3 UCS blades
- The port group isolation (PGI) concept was used as the design baseline
- A separate vDS with no physical uplinks was created to logically isolate the VMs
Features I Won’t Be Using:
- To hit specification one (no physical network changes), I can’t set up the hosts in the VXLAN to communicate with each other via private VLAN. I won’t be making any MTU changes (read: jumbo frames) on the UCS infrastructure either. This decision negates the HA feature in vShield, so I have to come up with another HA solution. It also prevents the VMs and the protecting Edge appliance from being on different ESXi hosts.
Getting Started:
- Download and install the vCNS appliance. This installs the vShield software and the web interface to control it.
- Make sure you have a dVS (Distributed Virtual Switch) stretched across all the VMware hosts that will have the isolated guest VMs living on them.
Once vCNS is installed, go to the vShield Manager (VSM) web GUI. You’ll need the vCenter web address and login credentials, plus IP information for two accessible DNS servers and one NTP server. When the connections are made, vShield will see the datacenters, clusters and hosts associated with the vCenter.
1. Prepare the hosts that will participate in the VXLAN
From the vShield web interface, change the view to Networks in the top left corner. In the same column, choose the datacenter that will participate in the VXLAN. Highlight the datacenter, then click the Network Virtualization tab in the center pane; four new links appear underneath. Choose Preparation and click the EDIT button to choose the host cluster that will participate in the VXLAN.
Specifying the transport attributes requires the dVS that all the hosts share. I chose the dVS that was being used for vMotion because there wouldn’t be a ton of added traffic. During the preparation phase, vShield creates a port group on the selected dVS and adds a kernel NIC (vmknic) to every host. The new kernel NICs land in the newly created port group. By default, the new vmknic on each host will look to a DHCP server for an address.
Pick a private network ID (ex. 172.16.10.x/24) that hasn’t been used before and give all the new kernel NICs a static IP address. Once that is done, the status of all the VMware hosts in the Connectivity section will change to normal with a green check mark.
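If you’d rather script the static addressing than click through the vSphere Client, the same change can be made from the ESXi shell. This is a sketch — vmk1 and the 172.16.10.x addresses are examples only; check which vmknic the preparation step actually created on each of your hosts first:

```shell
# List the VMkernel interfaces and spot the one vShield created
esxcli network ip interface ipv4 get

# Pin it to a static address on the transport network (example values)
esxcli network ip interface ipv4 set -i vmk1 -I 172.16.10.11 -N 255.255.255.0 -t static
```

Repeat on each host in the cluster, incrementing the address, and the Connectivity section in the VSM GUI should go green.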
- Verifying host connectivity on the VXLAN private network can be tricky. There are a couple of vmkping commands to try when remoted into the ESXi host console. Since the VXLAN protocol is used, normal ICMP pings will go unanswered.
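The usual check is an interface-bound vmkping between the new vmknics. A sketch — the vmk number and peer IP are examples, and exact options vary by ESXi version:

```shell
# From one host's console, ping the peer host's VXLAN vmknic via vmk1
vmkping -I vmk1 172.16.10.12

# If the transport MTU was raised to 1600, verify the path end to end:
# -d sets don't-fragment, -s 1572 = 1600 minus IP/ICMP headers (20 + 8)
vmkping -I vmk1 -d -s 1572 172.16.10.12
```

If the small ping works but the 1572-byte one doesn’t, something between the hosts is still dropping frames larger than the standard MTU.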
VMware wants a private VLAN for VXLAN communications. By default the VLAN number is 2000, but you can choose whatever you want. VLAN 2000 is the internal VLAN used by the hosts to communicate VXLAN information to one another. Also notice the MTU after the preparation is done: it will have changed to 1600. You’d have to enable jumbo frames on any network equipment sitting between the hosts for them to talk to each other at that size. This is the communication channel the VMware hosts use to pass virtual Layer 2 information back and forth via multicast.
Without jumbo frames and multicasting configured, the ability to have the Edge appliance and the VMs it’s isolating on different hosts is lost. In this case, it isn’t a big deal. It would be a huge deal if the goal were to protect VMs from each other on separate VLANs spanning hardware.
- Add the Segment ID
Add the Segment IDs and the Multicast IP Address allocation pool. These are used for creating virtual wires and for host-to-host communication. I won’t be creating any vWires in this project, but they’re good for separating application VMs. I kept the default settings.
- Add a Network Scope
Click the Network Scopes link and add the host cluster that was just prepared.
- Add a Network (Virtual Wire) OPTIONAL
This isn’t necessary for this particular project, but it does have some advantages. Click the Networks link to open the interface. Add a VXLAN Network by clicking the green plus sign. Name the vWire and click OK. VMware constructs a “loose” wire by adding a port group to the dVS designated when preparing the hosts.
Above: The blue port group is the virtual wire, while the highlighted port group holds all the kernel NICs created during host preparation.
To get a better idea of how VXLAN works, open the properties of the new port group created for the vWire. It has been assigned a VLAN (2000) and the number of ports on the port group is set to 0. The VLAN number is the same one the hosts will use to pass VXLAN info to one another and allow the VM attached to the vWire to move around the cluster. The zero-ports setting turns this port “group” into a wire. At this point the wire is loose because it isn’t connected to anything, but with a new port group created I can now assign it to a VM. The other end of the wire will connect to an Edge appliance, forming a logical separation between the VM and the rest of the network.
At this point you may be thinking that the dVS chosen in the preparation stage now has more of an impact. In my scenario, I chose the same dVS that handles vMotion because my design will not produce enough traffic to hinder it. If this were a bigger implementation I would consider building another dVS or using one I already had. Network connectivity for the VM will not be affected because the Edge appliance is the gateway for the VM and will be connected on its “external” interface to a VLAN that has network access:
VM—>(vWire)—> Edge Appliance —>(External Interface)–>Network access
2. Add An Edge and Appliances to Support It
Click the last link on the right, Edges. Add an Edge by clicking the green plus sign. Right on the first page there is an option to “Enable HA”. In my scenario, this won’t work. If VXLAN is set up all the way through (VLANs, multicasting and jumbo frames) then this is an easy decision: just check the box and keep moving. I’m leaving it blank and using a different form of HA.
In the Edge Appliances section add one edge appliance. I left the default settings alone (Compact size, Enable auto rule generation and high rule priority). Click the green plus sign and choose the cluster, datastore, host and folder for the appliance. The next section, Interfaces, is where it gets interesting.
The Edge appliance has 10 vNICs by default. These can be attached to port groups (real PGs and/or vWires that look like PGs).
The first interface (vNIC0) should be an Uplink connected to the port group that has external access. External could be the internet or whatever is “external” to the isolated network being built. The Connectivity Status will default to Connected. Next, configure the IPs and subnet for the external interface.
Planning ahead for this part helps. One external IP is needed to connect the Edge to the port group with external access. If one-to-one NAT rules are needed, this interface will need multiple IPs to support those. An example is below.
The starred IP is assigned to the logical uplink connected to the VDI port group. The VDI port group is on the 10.11.17.0 network, so the Edge appliance can be pinged on 10.11.17.151. Each additional IP will have a SNAT rule assigned to a VM on the internal network.
The next Edge vNIC (1) will be of type Internal and connect to the port group on the isolated dVS created to hold the VMs.
The connections would look like this:
VM–>PG Port on isolated dVS–>(Internal Edge Interface vNIC1)–>Edge Appliance–>(External Interface vNIC0)–>PG on dVS with external access
The IP for the Edge vNIC connected to the isolated port group can be anything on the private side. I’m going to use NAT rules anyway, so this private network could overlap with others.
In the Default Gateway section, click the Configure Default Gateway box and choose the name of the Edge vNIC that has the external connection. Add the default gateway IP for that VLAN. Following the example above for the external interface, the gateway for my Edge appliance would be 10.11.17.1.
Next, configure the Firewall & HA settings. Check the Configure Firewall default policy checkbox. I want all incoming traffic to be allowed, so in the Default Traffic Policy the Accept setting was used. I also disabled logging. The HA section is greyed out for me since I didn’t enable it in the first section (Name & Description).
The next window allows for review of the settings. Clicking Finish will build the appliance and power it on.
3. Manage the Edge Appliance (DNS, NAT, DHCP, Firewall Rules)
Now that I have a virtual firewall in between two port groups, it’s time to configure the services offered by the Edge to the VMs behind it. In the Edges section of Network Virtualization in the vShield web GUI, choose the new Edge device and click the Actions icon. Scroll down to Manage.
First up is DNS Configuration. I want this Edge to answer all DNS queries from the VMs behind it. Click the Change link, check the Enable button and add the external DNS servers that the Edge has access to. Click OK to commit the changes.
Near the top of the screen there are other buttons that lead to the direct configuration of services. I’m going to configure NAT next. NAT (Network Address Translation) will allow an un-routable private IP to receive external traffic by having its IP translated to one that is routable. I’ll need to map those specifically though because I can’t have multiple private IPs be translated to one (the IP address of the Edge’s vNIC0 connected to the external port group). That is to say I could set it up that way, but I shouldn’t because I need all the VMs on the isolated port group to be uniquely identifiable by an IP on the external VLAN. Sometimes it’s all about “shoulda” not “coulda”.
This is where SNAT rules come into play:
These are examples of SNAT (S=Source) rules. I have 3 machines on the private network behind the Edge appliance. Each of those VMs has a private IP on the 172.17.1.0 network. As those VMs make external connections, their “external address” becomes what is assigned in the Translated section. They are applied on the external interface of the Edge appliance, since that is the last stop before leaving the privatized network. So, any transmission from 172.17.1.10 to the external network will receive replies on 10.11.17.152, which in turn get forwarded to 172.17.1.10 via the Edge appliance.
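With the rules published, the SNAT mappings can be sanity-checked from a machine on the production VLAN. A sketch, assuming the example addressing above — 10.11.17.152 maps to 172.17.1.10 per the text, the other addresses and the PCoIP port are illustrative, and the commands assume a Linux admin box:

```shell
# Each translated IP should answer if its SNAT rule and the
# backing VM behind the Edge are healthy
ping -c 2 10.11.17.152

# Confirm a specific service is reachable through the Edge firewall
# once rules allowing it are published (4172 = PCoIP, as an example)
nc -zv 10.11.17.152 4172
```

A reply on the translated address proves the full round trip: production VLAN, in the Edge’s external interface, out the internal interface to the VM, and back.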
This configuration is imperative because the View Administrator must be able to ping the VMs in the pool. Without routable IPs, no connection to these VMs can be made.
- DNAT Rules (D= Destination) are needed if one of the VMs in the private network has a service that external machines need to get to. The service would be advertised in DNS with an external IP (assigned to the external vNIC0 on the Edge appliance) and requests forwarded to the internal, private IP.
- Ensure correct DNS resolution when using SNAT on a private network: SNAT is great, but what IP addresses do the VMs on the private network register with external DNS when they connect to a Windows domain? Unfortunately, it’s the private IP. I think the best way to combat this problem is to turn off the auto-registration process on the NIC in the Windows VM. When this change is made (unchecking the “Register this connection’s addresses in DNS” box in the Advanced TCP/IPv4 Properties), the existing DNS entry is removed on the Domain Controllers.
In View, set the parent VM or snapshot to not register the IP address in DNS and recompose the pool. If the VMs are full clones, it’s a manual process.
Next, add “A” records to the domain DNS matching the VM’s hostname with the configured SNAT external IP.
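Those “A” records can be added by hand in the DNS console, or scripted with dnscmd from an elevated prompt. A sketch — the server, zone, hostnames and IPs below are examples only; substitute your DNS server, domain and the SNAT addresses you configured:

```shell
rem Run against a Windows DNS server (dc01 here) for zone corp.local.
rem Each isolated VM's hostname maps to its SNAT-translated external IP.
dnscmd dc01 /RecordAdd corp.local vdi-vendor01 A 10.11.17.152
dnscmd dc01 /RecordAdd corp.local vdi-vendor02 A 10.11.17.153
dnscmd dc01 /RecordAdd corp.local vdi-vendor03 A 10.11.17.154
```

After that, anything resolving the VM’s hostname gets the translated IP, which the Edge forwards to the private address behind it.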
Note: These adjustments don’t need to be made to accommodate the View Administrator. The View agents can talk to the admin just fine with the SNAT rules alone. What may not work is third party vendor stuff like Trend Micro Security Suite or anything else that needs to ping or name resolve the VMs on the isolated network.
- DHCP Configuration: Having the Edge appliance deliver private IP addresses is an option to consider, but it isn’t an automatic “Yes!” I chose not to configure DHCP because I need all the VMs to have static IPs and DNS entries.
- Firewall Rules: The firewall service provided in the Edge appliance is very straightforward. Click the green plus sign to add a rule. A new row will appear in the rule list. Name the rule and add the Source.
There are a lot of choices for the source. The two sections are IP Addresses and Vnic Group. If it’s a rule that will apply to all the VMs behind the Edge, use the Internal Vnic as the Source. If the rule applies to a single VM, use IP Addresses; if you’re using a vWire, assign the rule to the Vnic it’s connected to.
You can add IP Address, Port and Services groups to the firewall. Click the green plus sign to start a new rule. Each section (Source, Destination, and Service) has a small plus sign in the right corner of its associated block. Click that sign to open a new window, at the bottom of which there is another link to add groups. This makes things much easier when you’re blocking multiple ports, services and IP addresses.
Remember to publish the rule after creation. The change is immediate, with no reboots of the Edge needed.
4. High Availability
With the VXLAN implementation done in a way that avoids any Layer 2 stretching across hosts (no VLAN, multicasting or jumbo frames added), the best way I could come up with to ensure some HA is the vCenter DRS Rules feature.
In the cluster settings, click the Rules section under vSphere DRS. Add a “Keep Virtual Machines Together” rule and add all the isolated VMs and their Edge appliance. This way, the group will survive an HA event and still participate in the DRS schedule to keep the hosts’ resources balanced.
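The same rule can be scripted with PowerCLI if you’re rebuilding pools often. A sketch only — the cluster and VM names are examples, and you’d connect with Connect-VIServer first:

```powershell
# Keep the isolated VMs and their Edge appliance on the same host
# so the loose-wire isolation survives a DRS move or HA restart.
$cluster = Get-Cluster "VDI-Cluster"          # example cluster name
$vms     = Get-VM "vdi-vendor*"               # the isolated vendor VMs
$edge    = Get-VM "vendor-edge-0"             # the Edge appliance VM
New-DrsRule -Cluster $cluster -Name "Keep vendor VMs with Edge" `
    -KeepTogether $true -VM ($vms + $edge)
```

Because it’s a KeepTogether rule rather than a host pin, DRS is still free to balance the whole group across the cluster.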
All the specifications were met once the configuration was complete:
- No changes on the physical network: VXLAN was not implemented completely (no VLAN, multicasting or jumbo frames were added)
- Create isolation using VXLAN technology: The Edge appliance provides logical isolation for the VMs.
- Isolate the vendor VDI machines logically in VMware: A separate dVS was created with no host adapters attached, and a separate port group was created to connect the isolated VMs.
- Create a virtual firewall: The Edge’s firewall service was used to create traffic rules between the isolated VMs and external networks
- Verify the vendor VDI machines could still contact the VMware Horizon View infrastructure: Using the Edge’s NAT service, individual SNAT rules allowed consistent communication to the View Administrator.
- Verify I could lock down ports from the isolated network to the production VLANs: The Edge’s firewall service was used to create traffic rules between the isolated VMs and external networks
- Guarantee DNS, DHCP and NAT services to the isolated VMs: The DNS service was configured on the Edge appliance, allowing DNS forwarding. DHCP could be used but wasn’t needed, and NAT services were configured for each VM, allowing external communication.
- Guarantee Anti-Virus/Malware protection: Using static IPs, SNAT rules and manual DNS entries, the isolated VMs could communicate with the Trend Micro Deep Security Suite, which provides Anti-Virus/Malware protection.
- Verify there is some sort of High Availability: Setting up DRS rules and grouping the VMs and their Edge appliance together, the group can withstand a host failure.