One of the main problems in this configuration was obviously the storage. Storing the VMs on the local disks of each blade would have a number of critical disadvantages compared to running on a properly configured SAN/NAS array:
1. Reliability – a SAN/NAS array is usually more reliable than local disks.
2. Space – local disks are much more limited in capacity and in the ability to expand.
3. Redundancy – with a VM stored on the SAN, any SAN-connected ESX server can run it. The VM can be moved to another ESX host if its current server needs maintenance or in the event of a crash.
Although we had a SAN array with enough capacity for the project, we could not connect the blades to the SAN network because the enclosure does not support HBAs or SAN switch modules.
Here is what I did to configure the blades to use the SAN:
- Installed 2 Linux servers with 2 HBAs each and connected them to the SAN in the classic redundant scheme: 2 SAN switches forming 2 separate fabrics, each connected to 2 separate SPs of the SAN array, and each HBA of each server connected to a separate switch (a quick multipath sanity check is sketched after this list).
- The servers are allocated the same set of SAN LUNs to be used for storing VMs.
- The servers have IET (iSCSI Enterprise Target) software installed and export the same set of LUNs as iSCSI targets. IMPORTANT – the target exported for a given LUN must have the same target name (IQN) on both servers. This way iSCSI initiators will “see” the LUN as the same resource on both IET servers and will therefore have 2 possible paths for reaching it (see the ietd.conf sketch after this list).
- The servers are connected with 2 NICs to 2 separate network switches and have Linux NIC bonding configured for network-connection failover (a sample bonding configuration is sketched after this list). Alternatively, if more network bandwidth is required, the 2 NICs can be connected to the same switch and the ports configured with port aggregation for NIC load balancing and failover. With 4 NICs per server, the two configurations above can be combined to get both NIC and switch load balancing and failover.
- The blade servers’ enclosure, or any standalone server on which ESX will run, is connected with at least two NICs to separate network switches.
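How the two HBA paths are merged on the Linux servers is not spelled out above; the usual approach is device-mapper multipath, so that each shared LUN appears as a single /dev/mapper device reachable through both HBAs. A quick sanity check under that assumption:

    # On each IET server: list the multipath devices and confirm that every
    # shared SAN LUN shows two paths, one through each HBA
    multipath -ll

    # If dm-multipath is used, export the /dev/mapper/<name> devices from IET,
    # not the underlying /dev/sdX path devices, so that both HBA paths are used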
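The ietd.conf sketch referenced above: a minimal example of exporting one shared LUN under the same target name from both IET servers. The IQN, backing device, and CHAP credentials are made-up placeholders, and the identical stanza goes into /etc/ietd.conf on both servers:

    # /etc/ietd.conf -- identical on BOTH IET servers for each shared LUN
    Target iqn.2009-01.local.example:vmstore.lun0
        # Back the target with the shared SAN LUN (multipath device assumed)
        Lun 0 Path=/dev/mapper/vmstore0,Type=blockio
        # CHAP credentials the ESX initiators will present ("Use CHAP" in ESX)
        IncomingUser esxuser vmstoresecret

Because the target name is identical, the ESX initiator treats the two server portals as two paths to the same target rather than as two different devices.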
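The bonding configuration referenced above, sketched as RHEL-style network scripts (the interface names and the IP address are placeholders, and the exact file locations vary between distributions). mode=active-backup gives the plain failover setup; the port-aggregation alternative would use a different bonding mode plus matching switch-port configuration:

    # /etc/modprobe.conf (or a file under /etc/modprobe.d/)
    alias bond0 bonding
    options bond0 mode=active-backup miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    IPADDR=192.168.10.11        # iSCSI-facing address of this IET server
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (ifcfg-eth1 is analogous)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none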
For each ESX server:
- In ESX network configuration, the NICs are attached to the same virtual switch and grouped with NIC Teaming for failover.
- In ESX Storage Adapters configuration:
* In the CHAP configuration, choose “Use CHAP” and fill in the user name and secret as configured on the iSCSI target servers above (this assumes the CHAP configuration is identical on both servers; otherwise, separate CHAP settings will have to be entered for each target).
* In the Static Discovery tab, choose Add and fill in the server IP and the target name (IQN). Add all the targets that need to be accessible from this ESX server, and add each target twice – once from each IET server. Again, while the servers’ IPs are different, the target name must be identical for the same exported LUN.
* For CHAP and Advanced settings choose “Inherit from Parent” or adjust manually
if required.
* A rescan is usually required after the change.
* Back in the Storage Adapters configuration window, you will start seeing the devices and the list of possible paths to them.
- In ESX Storage configuration:
* In the wizard window, choose Disk/LUN.
* The next window will show the new targets; choose the target to add.
* For the VMFS mount options, specify “Keep the existing signature”.
* Finish the wizard and the target will appear in the list of data stores.
- Select the new data store and go to Properties.
- Choose Manage Paths – here you can select the path selection policy used for load balancing/failover (Round Robin, Most Recently Used, or Fixed).
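Once the ESX hosts have logged in to the targets, a quick way to confirm that every target really is being served from both IET servers is IET's proc interface (a sanity check, not part of the original write-up):

    # On each IET server: the exported targets/LUNs and their backing devices
    cat /proc/net/iet/volume
    # The active initiator sessions -- every ESX host should show up here on
    # BOTH IET servers, i.e. one path per IET server for each target
    cat /proc/net/iet/session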
This configuration has No Single Point of Failure (NSPF), assuming the SAN array itself is an NSPF device (which is usually the case for enterprise-level SAN boxes).
The solution uses existing (relatively old) hardware and does not require any financial investment, yet it provides obvious benefits and serves its purpose quite well.
Here is the diagram of the implementation above: