These notes are for the VMware admin only. I don’t cover SQL, SQL clustering, or Server 2012 setup in this article. My role in these projects is to make sure the virtual servers have storage and networking and are managed properly through vCenter. The SQL 2012 version used here is the basic Standard Edition, and we will be using the older MSCS clustering with shared disks.
VMware admins are bound by some rules when it comes to SQL 2012 and clustering in general. The best practices are well documented around the web, and I even picked up this book for a quick read. Ironically, if the only clustering solution available is the old MSCS, then you have very few decisions to make.
The basic architecture includes a Dell Compellent SAN connected via Fibre Channel to Cisco UCS B-Series blade servers. The blade servers run VMware ESXi 5.5 and are controlled through vCenter 5.5 (Build 1945274). The Microsoft cluster will have two Server 2012 R2 nodes, each with SQL 2012 installed, using MSCS with shared disks.
The Microsoft Server 2012 R2 VMs will have a mix of VMDKs and RDMs for their disks. The operating system and installed programs will be on Thick Eager-Zeroed VMDK disks. All disks MUST be in Thick Eager-Zeroed format; any other VMDK type will cause boot problems on the secondary node. The vCenter default format is Lazy-Zeroed, so pay attention as you create the disks. Each server will get two vNICs: one for the main access IP, the other for a private heartbeat network.
If MSCS is used for clustering services, then we are relegated to using RDM disks for the servers.
1. Create the necessary volumes on the Dell Compellent
2. Choose the Replay interval
3. Map the volumes to the ESXi servers or server group
In vCenter, have the hosts scan the storage for new volumes. When the scanning has completed, map the RDM disks to the primary node in the cluster. If another team built the servers, this will take some communication, because they know which node they want to be the primary. For the VMware admin, the node that gets the original RDM mapping is the primary, because the secondary server uses the RDM pointer files in the primary server’s folder to simultaneously connect to the disk.
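The vCenter rescan is the same operation you can run from an SSH session on each host. As a hedged illustration (the device identifiers will differ in your environment):

```shell
# Rescan all storage adapters so the newly mapped Compellent LUNs appear.
esxcli storage core adapter rescan --all

# List the devices the host can now see; the new LUNs show up by their
# naa.* identifiers (the exact IDs depend on your SAN).
esxcli storage core device list | grep -i "naa."
```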
Map the RDM disks to the Primary Node (Turn the VM off before following this list)
1. Edit the primary node’s VM settings. Choose Add, Hard Disk, Next
2. Choose Raw Device Mapping, select the correct LUN (If the Raw Device Mapping option is greyed out, the ESXi hosts don’t see the disks created on the SAN. Check the disk to server mappings on the SAN then re-scan the host’s storage adapters.)
3. Keep the LUN map in the same datastore as the primary node.
4. Choose Physical for the compatibility mode.
5. Choose a new SCSI controller for the RDM. For the first disk, choose 1:0, for the second, 2:0. Finish the wizard and let vCenter create the disk.
6. Back in the VM settings, click the SCSI controller to activate the options. Change the new SCSI controller type to LSI Logic SAS if it isn’t already.
7. Choose Physical for the SCSI Bus Sharing option and click OK.
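If you prefer the ESXi shell, the physical-compatibility RDM pointer file itself can also be created with vmkfstools. This is only a sketch; the naa ID, datastore, and file names below are placeholders for your environment:

```shell
# Create a pass-through (physical compatibility) RDM pointer file in the
# primary node's folder, backed by the raw Compellent LUN.
vmkfstools -z /vmfs/devices/disks/naa.60000000000000000000000000000001 \
  "/vmfs/volumes/Datastore1/SQL-Node1/SQL-Node1_rdm1.vmdk"
```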
Follow the above process for all RDM disks on the primary node. In the book I mentioned above, the use of Paravirtual SCSI (PVSCSI) controllers is highly recommended, but they are not supported in an MSCS cluster.
Now, the fun part (sarcasm). Once the RDMs are connected to the VM, you can see them as VMDK files in the VM’s folder. What you can’t see is which zeroed format they are in. Since the default for vCenter is Thick Lazy-Zeroed, they will need to be re-provisioned with the vCLI. You can confirm this with VMware’s guide on determining the zeroed level, or by mapping the RDMs to the secondary node (later in this article) and watching it fail to boot with this error: Thin/TBZ disks cannot be opened in multiwriter mode
Change the Thick Zeroed format for RDM Disks (The VM is still off, right?)
1. Enable the SSH service on the VM’s host and log in with PuTTY.
2. Navigate to the datastore where the primary node VM resides and open the VM’s folder so you can see the disks.
3. Run vmkfstools -k against the VMDK’s absolute path. This converts the disk from lazy- to eager-zeroed. The time this process takes depends on the size of the disk; it took about 45 minutes for a 100GB drive in my case.
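Put together, the conversion looks roughly like this (the datastore and file names are placeholders for your environment):

```shell
# Move into the primary node's folder on the datastore.
cd /vmfs/volumes/Datastore1/SQL-Node1

# -k (--eagerzero) writes zeros to the unallocated blocks, converting the
# lazy-zeroed disk to eager-zeroed in place. Expect this to take a while
# on large disks.
vmkfstools -k SQL-Node1_rdm1.vmdk
```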
Here’s a good blog post with some pictures on changing the Zeroed level. Once the RDM disks are all changed to Eager-Zeroed, the secondary cluster node can be mapped to these RDM disks. Mapping the secondary node has the same steps as the first one, listed above, with this one exception:
– When mapping the RDM for the secondary node, choose an existing disk and browse to the primary node’s VM folder. Choose the RDM disk that is now listed as a VMDK. If you have multiple RDMs, make sure to use the same SCSI controller number for the same disk on each node: whatever disk is mapped to SCSI Controller 1 on the primary node must be mapped to SCSI Controller 1 on the secondary.
The SQL cluster nodes will need a heartbeat network to keep track of each other. There are lots of ways to set this up, and it depends on the virtual and physical network architecture. In this scenario, the VMs are physically connected by the UCS backplane and network hardware, so heartbeat traffic will never leave the UCS even though the VMs are on different chassis. I created a portgroup named SQL-Heartbeat on one of the distributed virtual switches and connected both nodes’ secondary vNIC to it. On each server, assign the secondary vNIC a unique host address on a private, non-routed network (172.16.x.x). Test with a ping command to verify connectivity.
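On the Windows side, the heartbeat addressing and ping test might look like this, run in an elevated command prompt (the adapter name and addresses are made up for the example):

```shell
:: Assign a static, non-routed address to the heartbeat vNIC on node 1.
netsh interface ip set address name="Heartbeat" static 172.16.10.1 255.255.255.0

:: From node 1, verify node 2 answers on the private network.
ping -n 4 172.16.10.2
```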
The only threshold to adhere to is the MSCS heartbeat timeout. By default, this is set at 5 pings, which is 5 seconds. After the 5th ping failure, MSCS will fail the cluster over to the other node, which kills all current connections to SQL. Keep this in mind when you choose the dVS for the heartbeat portgroup, as network saturation could put MSCS in a panic state.
** Make sure the SQL heartbeat network card does not register itself with DNS. Also, verify the TCP/IPv4 bindings have the main LAN card as the primary. To verify the binding order, go to Network and Sharing Center, then Change adapter settings. In the settings window, use ALT-V to open the hidden menu bar. Choose Advanced, Advanced Settings. Move the non-heartbeat NIC to the top of the order.
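Disabling DNS registration on the heartbeat NIC can also be scripted. A hedged example (the adapter name is a placeholder; double-check the syntax against your Windows build):

```shell
:: Clear any DNS servers on the heartbeat NIC and stop it from
:: registering its address in DNS (register=none).
netsh interface ip set dns name="Heartbeat" source=static addr=none register=none
```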
With MSCS and shared disks, we can’t use Storage vMotion or Automatic DRS. We can take advantage of vSphere HA, but only in the broadest sense: each node of the MSCS cluster sits on a different ESXi host. A vMotion event will trigger an MSCS cluster failover, so we can’t move these VMs around at all. Here is the breakdown of the vCenter settings, assuming four or more ESXi hosts are available:

DRS Groups:
1. Create one group for each of the cluster nodes. Each node will be the only member in its group.
2. Create two groups of ESXi hosts. Each VM will be restricted to run only on hosts in its assigned group.
DRS Rules:

1. VM-VM Separation: Create an anti-affinity rule that keeps the two VMs apart.
2. VM-Host Rule: Assign each node in the cluster to a different ESXi host group.
3. ForceAffinePoweron Rule: This is a manual entry made in the Advanced section of the main DRS Rules section of the vSphere or web client. In an HA event, affinity rules will be ignored and the VMs could end up in the same group of hosts, or even on the same one. This rule orders vCenter to power on the VMs in their specified groups only.
4. Change the DRS automation level for the cluster node VMs to “Partially Automated”. Change this setting in the Virtual Machine Options section of DRS. This is a VMware requirement for SQL clusters.
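For reference, the manual entry from rule 3 is a simple key/value pair in the DRS Advanced Options dialog; it should look like this:

```
ForceAffinePoweron = 1
```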