4U VMware cluster

From Hackerspace ACKspace
Revision as of 13:41, 9 August 2014 by Danny Witberg (talk | contribs) (The hardware)
Project: 4U VMware cluster
Featured:
State Active
Members Danny Witberg
Description Make a mini VMware cluster

Introduction

VMware and virtual computing in general have many benefits over a physical server farm. One VMware host can accommodate multiple virtual servers, and sharing resources means better use of the actual hardware. In a cluster of hosts, several processes can be automated, such as restarting a failed VM and recovering automatically after a host failure. To experiment with such a system, you need a minimum of 3 host servers and a shared storage system or datapool. My goal with this project is to set up the hardware in a mini 4U cluster system.

DL360G5.jpg

The hardware

The cluster consists of 4 HP DL360 G5 1U servers. They are fairly cheap to come by, can easily be upgraded and are compact. This is an overview of the 4 systems:

1) DONE

  • Hardware platform: HP DL360 G5
  • CPU: 2x Quadcore Xeon E5430 2.66GHz 64 bit 12MB cache
  • Memory: 26GB PC2-5300F ECC Fully buffered memory
  • Harddisk: none installed
  • Network: 2xBCM5705 gigabit Ethernet with offload engine + 2x dual gigabit Intel PRO1000PT
  • Power: 2x 700W hotswap power supply
  • Optical drive: DVD/CD rewriter
  • Management port: ILO2 100Mbit Ethernet
  • Boot disk: 16GB USB stick

2) DONE

  • Hardware platform: HP DL360 G5
  • CPU: 2x Quadcore Xeon E5430 2.66GHz 64 bit 12MB cache
  • Memory: 26GB PC2-5300F ECC Fully buffered memory
  • Harddisk: none installed
  • Network: 2xBCM5705 gigabit Ethernet with offload engine + 2x dual gigabit Intel PRO1000PT
  • Power: 2x 700W hotswap power supply
  • Optical drive: DVD/CD rewriter
  • Management port: ILO2 100Mbit Ethernet
  • Boot disk: 16GB USB stick

3) DONE

  • Hardware platform: HP DL360 G5
  • CPU: 2x Quadcore Xeon E5420 2.5GHz 64 bit 12MB cache
  • Memory: 26GB PC2-5300F ECC Fully buffered memory
  • Network: 2xBCM5705 gigabit Ethernet with offload engine + 2x dual gigabit Intel PRO1000PT
  • Power: 2x 700W hotswap power supply
  • Harddisk: none
  • Optical drive: DVD/CD rewriter
  • Management port: ILO2 100Mbit Ethernet
  • Boot disk: 16GB USB stick

4) Pending...

  • Hardware platform: HP DL360 G5 DONE
  • CPU: 1x Quadcore Xeon E5405 2GHz 64 bit 12MB cache DONE
  • Memory: 22GB PC2-5300F ECC Fully buffered memory DONE
  • Network: 2xBCM5705 gigabit Ethernet with offload engine + 2x dual gigabit Intel PRO1000PT DONE
  • Harddisk: 3-5x SATA 500-1000GB 7200RPM 2.5inch drive + 256GB SSD <-- Still open for suggestions
  • Power: 2x 700W hotswap power supply DONE
  • Optical drive: CD rewriter DONE
  • Management port: ILO2 100Mbit Ethernet
  • Boot disk: 64GB USB stick DONE

This gives us a combined total of 62.56 GHz of CPU power (the summed clock speed of all 24 cores in hosts 1 to 3) and 78GB of memory for VMware! The storage unit also has plenty of memory (22GB) for ZFS caching.
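The totals can be double-checked with a few lines of Python. This is only a sketch of the arithmetic: the host names are made up for illustration, and the numbers are copied from the specs of systems 1 to 3 above (the storage server is left out, since it does not run VMs).

```python
# (sockets, cores per socket, clock in GHz, memory in GB) for the three
# VMware hosts listed above. The names "host1".."host3" are illustrative.
vmware_hosts = {
    "host1": (2, 4, 2.66, 26),
    "host2": (2, 4, 2.66, 26),
    "host3": (2, 4, 2.50, 26),
}


def aggregate_ghz(hosts):
    """Sum of sockets * cores * clock speed over all hosts."""
    return sum(sockets * cores * ghz for sockets, cores, ghz, _ in hosts.values())


def total_memory_gb(hosts):
    """Sum of installed memory over all hosts."""
    return sum(mem for *_, mem in hosts.values())


total_ghz = aggregate_ghz(vmware_hosts)
total_mem = total_memory_gb(vmware_hosts)
print(f"{total_ghz:.2f} GHz, {total_mem} GB")
```

Summing clock speeds like this is a rough capacity figure, not a benchmark: it says nothing about how a single-threaded workload will perform.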

DL360 inside.jpg

DL360 stack.jpg

Spare mem.jpg

Upgrades

Memory can be upgraded from 6x4GB+2x1GB = 26GB to 8x4GB = 32GB. I believe it's best to run the storage server on a fast dual-core CPU, around 3.7GHz, because the file server process is mostly single-threaded. For now it will run on a 2GHz quad-core. 22GB should be plenty to run a good ZFS system that hosts all VM data storage over iSCSI connections.

Interconnects

All four systems are planned with 6x gigabit Ethernet, hooked up to 2 Dell PowerConnect 5324 switches. Two of the gigabit ports can be used for the iSCSI connection, two for the VMs' connection, and two for the shared vMotion/management interface. The host OS for the iSCSI server is still undetermined, but it has to support ZFS with ZIL and L2ARC capabilities at good speeds. Hopefully an SSD will have a positive influence on the file server's speed.
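As a planning aid, the port budget can be sketched in Python. Note the assumptions: the switch names are hypothetical, and splitting each pair of NICs over the two switches (one NIC per switch, for redundancy) is my own guess at a sensible layout, not something that has been decided for this cluster.

```python
from collections import Counter
from itertools import product

HOSTS = ["host1", "host2", "host3", "storage"]  # illustrative names
ROLES = ["iscsi", "vm", "vmotion-mgmt"]         # two NICs per role, as described above
SWITCHES = ["switch-a", "switch-b"]             # hypothetical names for the two switches


def build_link_plan(hosts, roles, switches):
    """Return (host, role, nic_index, switch) tuples.

    Assumption: each role's NIC pair is split so that one NIC lands on
    each switch, which keeps the role alive if one switch fails.
    """
    plan = []
    for host, role in product(hosts, roles):
        for nic_index, switch in enumerate(switches):
            plan.append((host, role, nic_index, switch))
    return plan


plan = build_link_plan(HOSTS, ROLES, SWITCHES)
ports_per_switch = Counter(switch for *_, switch in plan)

print(len(plan), "links in total")
for switch, ports in sorted(ports_per_switch.items()):
    print(switch, "needs", ports, "ports")
```

With 4 systems and 6 NICs each this comes to 24 links, 12 on each switch under the split assumed here.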

Cluster layout.png

ZFS storage terminology

The ZFS file system is an advanced, modern file system used when data has to be secure, quickly available and stored efficiently. ZFS uses several layers of caching to speed up read access. First of all, it uses RAM as a cache. This is called the ARC (adaptive replacement cache), referred to here as L1ARC (level 1). RAM is the fastest storage available in the computer, so data that is accessed a lot is kept there for extremely fast access. If data is not cached in RAM, ZFS has to fall back to the storage disks. The L2ARC adds a layer in between: devices that are much faster than the storage disks, but slower than RAM. This used to be very fast SCSI disks, but nowadays SSDs are preferred due to their very low access latency.
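The read path described above can be illustrated with a toy model. This is a deliberately simplified sketch and not how ZFS is implemented: it uses plain least-recently-used eviction instead of the adaptive replacement algorithm, and all class and parameter names are invented for the example.

```python
from collections import OrderedDict


class LruTier:
    """Fixed-size cache tier with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def get(self, key):
        if key not in self.blocks:
            return None
        self.blocks.move_to_end(key)  # mark as most recently used
        return self.blocks[key]

    def put(self, key, value):
        """Insert a block and return the evicted (key, value) pair, if any."""
        self.blocks[key] = value
        self.blocks.move_to_end(key)
        if len(self.blocks) > self.capacity:
            return self.blocks.popitem(last=False)  # drop the oldest entry
        return None


class TieredReader:
    """Reads try RAM first, then the SSD tier, then the storage disks."""

    def __init__(self, disk, ram_blocks, ssd_blocks):
        self.disk = disk
        self.ram = LruTier(ram_blocks)
        self.ssd = LruTier(ssd_blocks)
        self.hits = {"ram": 0, "ssd": 0, "disk": 0}

    def read(self, key):
        value = self.ram.get(key)
        if value is not None:
            self.hits["ram"] += 1
            return value
        value = self.ssd.get(key)
        source = "ssd"
        if value is None:
            value = self.disk[key]
            source = "disk"
        self.hits[source] += 1
        evicted = self.ram.put(key, value)
        if evicted is not None:
            self.ssd.put(*evicted)  # blocks pushed out of RAM fall to the SSD tier
        return value


disk = {n: f"block-{n}" for n in range(10)}
reader = TieredReader(disk, ram_blocks=2, ssd_blocks=4)
for key in [0, 1, 2, 0, 2]:
    reader.read(key)
print(reader.hits)
```

In this run, block 0 is pushed out of the tiny RAM tier and is later served from the SSD tier instead of the slow disks, which is the effect the L2ARC is meant to provide.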

When data is written to a ZFS system, it ultimately has to be written to the disks, so writes are bound by the speed of those disks. The ZIL or "ZFS intent log" can speed this up by first recording the write transactions on a fast log device. Often this is a RAM disk or an SSD. The ZFS system later commits this data to the storage disks.

With ZFS we can set up sparse volumes. This means a volume can be advertised to the ZFS client at a larger capacity than is physically available. Let's say I have a 2TB storage pool available, but I advertise a volume on it as 4TB in size. When data fills up the 2TB of storage to almost full capacity, additional storage space can be added to the pool without having to resize the volume.
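The 2TB/4TB example can be played through with a small model of thin provisioning. It is a sketch only: the class and method names are invented, sizes are in decimal GB, and real ZFS accounting (metadata, compression, snapshots) is ignored.

```python
class ThinPool:
    """Toy storage pool that allows volumes larger than its physical size."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.used_gb = 0
        self.advertised = {}  # volume name -> advertised size in GB
        self.written = {}     # volume name -> GB actually written

    def create_sparse_volume(self, name, advertised_gb):
        # Nothing is reserved up front, so the advertised size may exceed the pool.
        self.advertised[name] = advertised_gb
        self.written[name] = 0

    def write(self, name, gb):
        if self.written[name] + gb > self.advertised[name]:
            raise ValueError("volume is full")
        if self.used_gb + gb > self.physical_gb:
            raise RuntimeError("pool is out of physical space")
        self.written[name] += gb
        self.used_gb += gb

    def add_disks(self, gb):
        """Grow the pool; the advertised volume size does not change."""
        self.physical_gb += gb

    def overcommit_ratio(self):
        return sum(self.advertised.values()) / self.physical_gb


pool = ThinPool(physical_gb=2000)
pool.create_sparse_volume("vmstore", advertised_gb=4000)
print("overcommit:", pool.overcommit_ratio())

pool.write("vmstore", 1900)
try:
    pool.write("vmstore", 500)  # the client sees free space, the pool has none
except RuntimeError as error:
    print("write failed:", error)

pool.add_disks(2000)            # add storage to the pool, volume untouched
pool.write("vmstore", 500)
print("used:", pool.used_gb, "of", pool.physical_gb)
```

The failed write is the risk of a sparse volume: the client still believes there is room, so the pool has to be extended before the physical space runs out.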

Another neat feature is deduplication. When the same data is stored multiple times, the file system can recognise this and store the data only a single time, with multiple references to it. However, this feature can consume a great amount of RAM.
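The idea behind deduplication can be shown with a small block store that identifies blocks by a checksum. This is a sketch under simplifying assumptions: the class is invented for illustration, the 4-byte block size is unrealistically small to keep the example readable, and files cannot be overwritten or deleted.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps each unique block only once."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}    # checksum -> block data (stored once)
        self.refcount = {}  # checksum -> number of references
        self.files = {}     # file name -> ordered list of checksums

    def write_file(self, name, data):
        if name in self.files:
            raise ValueError("overwriting is not supported in this sketch")
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block  # first copy: actually store it
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
            digests.append(digest)
        self.files[name] = digests

    def read_file(self, name):
        return b"".join(self.blocks[digest] for digest in self.files[name])

    def logical_blocks(self):
        return sum(self.refcount.values())


store = DedupStore()
store.write_file("a", b"AAAABBBBAAAA")
store.write_file("b", b"AAAACCCC")
print(store.logical_blocks(), "blocks written,", len(store.blocks), "blocks stored")
```

The lookup table of checksums has to be consulted on every write, so it needs to sit in fast memory. That table is where the RAM consumption comes from.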

Initial testing

For an initial test, the ZFS server is set up with NAS4free: a ZFS pool is added, and a CIFS share and an iSCSI target are attached to it. On one VMware ESXi host, the vCenter appliance is installed. It all works remarkably well! The CIFS share is pulling around 65MB/s, which is the maximum of my desktop hardware (yes, a slow hard drive), with no noticeable congestion on the iSCSI side. Memory usage on the storage server is about 44%, from caching activity. It seems the NAS4free system is truly multithreaded, because the load is divided over all 4 cores. Of course the SMB and iSCSI processes are each single-threaded, but they get spread over separate cores.

Nas4free1.png

Vmwareesxi1.png

NAS4free configuration

Setting up NAS4free as an iSCSI target is fairly simple once you understand the basics. First of all, you'll have to set up a ZFS volume, since this is the underlying storage system we want to use for iSCSI. A ZFS volume is created from a ZFS pool, which in itself can contain one or more ZFS virtual devices or "vdevs". A vdev can be used for storage as well as for caching, as a ZIL (Log) or L2ARC (Cache) device. For storage, a number of layouts can be configured: a stripe (RAID0 for non-ZFS people), a mirror (RAID1), or one of the distributed parity options RAIDZ1 (RAID5), RAIDZ2 (RAID6) or RAIDZ3. Standby drives can also be assigned as hot spares. Once you've set up a ZFS volume, it can be referenced by an iSCSI extent. From this extent, you specify an iSCSI target. In a diagram, this all looks like so:

Nas4free overview.gif
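The same chain (vdevs, pool, volume, extent, target) can also be written down as a small data model, which makes it easy to estimate usable capacity for a given layout. This is a sketch, not NAS4free's configuration format: all class names, object names and the target name are hypothetical, the disk sizes are just one of the options still under consideration for the storage server, and the capacity figures ignore ZFS overhead.

```python
from dataclasses import dataclass, field
from typing import List

PARITY_DISKS = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


@dataclass
class Vdev:
    layout: str         # "stripe", "mirror", "raidz1", "raidz2" or "raidz3"
    disks: int
    disk_gb: int
    role: str = "data"  # "data", "log" (ZIL), "cache" (L2ARC) or "spare"

    def usable_gb(self):
        """Approximate usable space; only data vdevs add capacity."""
        if self.role != "data":
            return 0
        if self.layout == "stripe":
            return self.disks * self.disk_gb
        if self.layout == "mirror":
            return self.disk_gb
        parity = PARITY_DISKS[self.layout]
        if self.disks <= parity:
            raise ValueError("not enough disks for " + self.layout)
        return (self.disks - parity) * self.disk_gb


@dataclass
class Pool:
    name: str
    vdevs: List[Vdev] = field(default_factory=list)

    def usable_gb(self):
        return sum(vdev.usable_gb() for vdev in self.vdevs)


@dataclass
class Volume:
    name: str
    pool: Pool
    size_gb: int


@dataclass
class Extent:
    name: str
    volume: Volume


@dataclass
class Target:
    name: str
    extents: List[Extent] = field(default_factory=list)


# Hypothetical layout: 5x 1000GB in RAIDZ1 plus one 256GB SSD as L2ARC.
pool = Pool("tank", [
    Vdev("raidz1", disks=5, disk_gb=1000),
    Vdev("stripe", disks=1, disk_gb=256, role="cache"),
])
volume = Volume("vmstore", pool, size_gb=3000)
extent = Extent("extent0", volume)
target = Target("iqn.2014-08.example:vmstore", [extent])  # made-up target name

print(pool.usable_gb(), "GB usable in pool", pool.name)
```

Reading the objects from the bottom up gives the same order as the setup steps: the target points at an extent, the extent at a volume, and the volume lives in a pool built from vdevs.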