Over the years I’ve come across a few things I think every VMware admin should know how to do. These tasks matter both when building VMware environments and when troubleshooting issues. It is sometimes surprising to see how many admins lack some of these skills.
ESXTOP is a somewhat hidden gem to many VMware admins, especially those who come from a Windows server background. It is a performance statistics reporting tool much like the top utility included with Linux distributions. The VMware version is focused on presenting a ton of virtualization-specific details.
The statistics cover virtual machines, disks, datastores, network, CPU, and memory, all of which are important resources in any VMware environment. Whether you are experiencing a performance issue or just want to check up on things, I think that firing up ESXTOP should be one of the first things you do.
The image below shows the simple manner in which stats are provided for you. Once you understand the data that is being presented, you can use this powerful tool to find issues in a short amount of time.
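ESXTOP can also write its counters to a CSV file in batch mode (the -b option), which makes it possible to go through the numbers after the fact. Below is a minimal Python sketch of that idea: it reads such a CSV and reports the columns whose peak value exceeds a threshold. The sample data, the host and VM names, the exact column naming, and the 10.0 threshold are all assumptions made up for illustration, so check them against a real capture before relying on any of it.

```python
import csv
import io


def peak_by_column(csv_file, suffix="% Ready"):
    """Return {column name: peak value} for columns whose name ends with suffix."""
    peaks = {}
    for row in csv.DictReader(csv_file):
        for name, value in row.items():
            if not name or not name.endswith(suffix):
                continue
            try:
                number = float(value)
            except (TypeError, ValueError):
                continue  # skip blank or non-numeric samples
            peaks[name] = max(peaks.get(name, number), number)
    return peaks


def over_threshold(peaks, threshold):
    """Return the sorted column names whose peak is above the threshold."""
    return sorted(name for name, peak in peaks.items() if peak > threshold)


# Made-up sample in the general shape of a batch-mode capture.
# Host name, VM names, and values are hypothetical.
SAMPLE = r'''"Time","\\esx01\Group Cpu(1:web01)\% Ready","\\esx01\Group Cpu(2:db01)\% Ready"
"10:00:00","1.5","12.0"
"10:00:05","2.0","9.5"
'''

peaks = peak_by_column(io.StringIO(SAMPLE))
flagged = over_threshold(peaks, threshold=10.0)  # threshold is an arbitrary example
print(flagged)
```

With a real capture you would pass an open file handle instead of the in-memory sample, and change the suffix to whichever counter you care about.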
vCenter alarms are another topic that seems pretty easy, and I would assume that most VMware admins know about it. Within vCenter there are many alarms that can be set to alert you when thresholds are reached. This can be a lifesaver if you have no other monitoring tools, or it can serve as an early warning before your normal toolset begins to alert people.
For example, you could easily set up an alarm for datastore space. It is simple to configure but very important, because if a datastore runs out of space you will likely have VMs crashing. The alarm can be set to warn you when the datastore reaches a certain percentage of capacity.
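To make the percentage logic concrete, here is a small Python sketch of how a two-level datastore space check could be evaluated. This is not vCenter code or its API. The function name and the 75 and 85 percent thresholds are assumptions chosen for illustration, and you would pick values that suit your own environment.

```python
def datastore_alarm_state(capacity_gb, free_gb, warn_pct=75.0, alert_pct=85.0):
    """Classify datastore usage as 'normal', 'warning', or 'alert'.

    warn_pct and alert_pct are illustrative thresholds, not required values.
    """
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    if not 0 <= free_gb <= capacity_gb:
        raise ValueError("free_gb must be between 0 and capacity_gb")

    # Percentage of the datastore that is currently in use.
    used_pct = (capacity_gb - free_gb) * 100.0 / capacity_gb

    if used_pct >= alert_pct:
        return "alert"
    if used_pct >= warn_pct:
        return "warning"
    return "normal"


# Hypothetical 1000 GB datastore at three different fill levels.
for free in (300, 200, 100):
    print(free, datastore_alarm_state(1000, free))
```

The warning level is the useful one here: it gives you time to add capacity or move VMs before the datastore actually fills up.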
I cannot emphasize cluster settings enough. Over the years I have seen many VMware clusters with improperly configured settings. Proper high availability (HA) and DRS settings are the difference between a smoothly running cluster and one that has trouble servicing all of its VMs. The other issue is that after a hardware failure you might not be able to restart all of your virtual machines.
The key is being able to analyze your cluster resources against what your virtual machines require. You then need to select an HA protection method that meets those requirements and understand how it will behave should one or more hosts fail. This is important because VMs may not be able to restart automatically if the settings are not ideal.
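As a rough illustration of comparing cluster resources with VM requirements, the Python sketch below checks whether the memory left after losing the largest hosts would still cover what the VMs need. It is a deliberate simplification and not how HA admission control actually calculates capacity: it looks at memory only and ignores CPU, overhead, and how VMs fit onto individual hosts. The function name and all figures are hypothetical.

```python
def survives_host_failures(host_memory_gb, vm_memory_gb, failures=1):
    """Return True if the remaining hosts can hold all VM memory.

    Worst case is assumed: the hosts that fail are the largest ones.
    """
    if failures < 0:
        raise ValueError("failures must not be negative")
    if failures >= len(host_memory_gb):
        return False  # no hosts would be left to restart VMs on

    # Sort ascending and drop the largest `failures` hosts.
    surviving = sorted(host_memory_gb)[: len(host_memory_gb) - failures]
    return sum(surviving) >= sum(vm_memory_gb)


# Hypothetical three-host cluster with 256 GB per host.
hosts = [256, 256, 256]
print(survives_host_failures(hosts, [100, 100, 200]))      # VMs need 400 GB
print(survives_host_failures(hosts, [200, 200, 200]))      # VMs need 600 GB
print(survives_host_failures(hosts, [100, 100, 200], 2))   # two hosts lost
```

Even a back-of-the-envelope check like this shows quickly whether a cluster has been filled past the point where it can tolerate the failures you are planning for.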
It's 2014 and I’m still seeing a lot of vSphere Standard Switches (VSS). Now, that's not the end of the world, but when the vSphere Distributed Switch (VDS) has so much to offer, I hate to see people ignore it. If you are not purchasing Enterprise Plus licensing, then the VDS is out of your reach, but I do see a lot of customers who own the licensing and are still not taking advantage of it.
I would recommend that admins get up to speed on what the VDS has to offer and learn how to set it up properly. In the long run it can make setting up and supporting your clusters easier, and it offers many more advanced networking features for monitoring and troubleshooting.