Network Time Protocol… the Most Common Oversight in a VMware Environment

October 5th, 2016

Chris Minnis
Enterprise Architect

Over the years, we have done many health checks for our customers here at Mainline. One of the most popular is the VMware Healthcheck. Whenever we’re doing this one for the first time, I make myself a little wager that Network Time Protocol (NTP) will be one of the things we find out of compliance with best practices. I have yet to lose (how one loses a bet with oneself is another topic entirely). So, I’ve taken some time here to explain a little bit about NTP in a VMware environment… hopefully, to inspire you to address it in your environment, or to contact us at Mainline about addressing it for you.

History
NTP was developed over 30 years ago by David L. Mills of the University of Delaware. My own experience with clock synchronization began some 10 years later, when I was having issues managing 3,000 Windows 95 PCs and found NTP a great way to help with that job. I particularly remember relying on the Navy’s servers tic.navy.mil and toc.navy.mil to set our local NTP server’s clock.

The Problem
NTP synchronizes clocks. It is that simple. Accuracy is typically within 30 milliseconds, which is generally far better than needed. When you remember that many devices on your network allow for manual setting of time, you realize the need for syncing. When you realize how much of your systems and software management relies on timestamps for accuracy (if not for execution), you realize the need for syncing.
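For readers who want to see what "synchronizes" means in practice, here is a minimal sketch of the standard NTP offset and round-trip delay calculation from the four timestamps of one request/response exchange. The function name and the sample timestamps are made up for illustration; a real NTP client also filters many samples over time, which this sketch does not attempt.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Estimate clock offset and round-trip delay from one NTP exchange.

    t0: client clock when the request left the client
    t1: server clock when the request arrived
    t2: server clock when the reply left the server
    t3: client clock when the reply arrived
    All values are in seconds.
    """
    # Positive offset means the client clock is behind the server clock.
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    # Time spent on the network, excluding server processing time.
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


# Illustrative numbers only: the client is 0.5 s behind the server,
# each network leg takes 20 ms, and the server takes 1 ms to answer.
t0 = 100.000
t1 = 100.520
t2 = 100.521
t3 = 100.041
offset, delay = ntp_offset_and_delay(t0, t1, t2, t3)
print(f"offset: {offset:.3f} s, round-trip delay: {delay:.3f} s")
```

The calculation assumes the network path is roughly symmetric; when it is not, the offset estimate is off by up to half the difference between the two legs.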

In a VMware environment, VMware Tools synchronizes the virtual machine’s clock with every reboot of the VM, snapshot, vMotion, and more… That sync happens against the ESXi host, so if NTP isn’t configured properly on the host, the VM picks up the host’s incorrect time.

How Bad Can It Be?
When the time on multiple systems begins to drift by minutes, or even by hours when the time zones are wrong as well, all sorts of issues can occur. Active Directory is so reliant on the clock that simply logging into AD becomes unreliable with time drift. Remember, we’re not only discussing Windows clock management, but VMware’s ability to set time on those Windows virtual machines, as mentioned above.
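To make the drift problem concrete, here is a small sketch that flags machines whose reported time differs from a reference clock by more than a tolerance. The five-minute value is the default Kerberos clock-skew tolerance in Active Directory, which is my assumption for a sensible threshold here; your domain policy may differ. The host names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the default Kerberos clock-skew tolerance of five minutes.
MAX_SKEW = timedelta(minutes=5)


def hosts_outside_tolerance(reference, reported, max_skew=MAX_SKEW):
    """Return {host: skew} for hosts whose clock is off by more than max_skew."""
    return {
        host: ts - reference
        for host, ts in reported.items()
        if abs(ts - reference) > max_skew
    }


# Hypothetical readings collected at the same instant.
reference = datetime(2016, 10, 5, 12, 0, 0, tzinfo=timezone.utc)
reported = {
    "vm-app01": reference + timedelta(seconds=2),
    "vm-db01": reference - timedelta(minutes=7),
    "vm-web01": reference + timedelta(hours=1),  # e.g. a time zone mistake
}

drifted = hosts_outside_tolerance(reference, reported)
for host, skew in sorted(drifted.items()):
    print(f"{host}: off by {skew.total_seconds():.0f} seconds")
```

In this made-up sample, the two machines that are minutes or an hour off would be the ones likely to see login trouble, while the one that is two seconds off would not.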

Every systems manager relies on system time to write logs and execute automated jobs. When shift happens, logs fail to write or are overwritten… and automated jobs fail.

Troubleshooting
This can be the worst. In general, things don’t work, and you don’t know why. Logs don’t always say “I failed to write.” Login failures don’t always identify a cause. We’ve seen misconfigured NTP contribute to larger issues, making them worse. The worst of these is a lack of reliable logs during a completely different problem. This is where experience and/or that Healthcheck come in handy.

What Can I Do?
In a small to medium environment, NTP is easily addressed. You just need an NTP architecture that ultimately syncs every physical and virtual machine to a set of master clocks. Validation (and periodic re-validation) is needed as well, because environments change and new systems get added.

This is also where our VMware Healthcheck comes in. This Healthcheck validates that NTP is enabled and configured correctly.
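As a rough idea of what validating "enabled and configured correctly" can look like, here is a sketch that checks a host inventory against a list of approved time sources. This is not the Healthcheck itself. The inventory layout, the field names (`ntp_service_running`, `ntp_servers`), the host names, and the `example.com` servers are all hypothetical; in practice you would fill the inventory from whatever tooling you already use to query your hosts.

```python
# Hypothetical list of the master clocks every host should point to.
APPROVED_SERVERS = {"ntp1.example.com", "ntp2.example.com"}


def ntp_findings(host):
    """Return a list of NTP problems for one host record (empty if none)."""
    findings = []
    if not host.get("ntp_service_running", False):
        findings.append("NTP service is not running")
    servers = set(host.get("ntp_servers", []))
    if not servers:
        findings.append("no NTP servers configured")
    elif not servers <= APPROVED_SERVERS:
        extra = ", ".join(sorted(servers - APPROVED_SERVERS))
        findings.append("unapproved NTP servers: " + extra)
    return findings


# Hypothetical inventory; every field name here is an assumption.
inventory = [
    {"name": "esxi-01", "ntp_service_running": True,
     "ntp_servers": ["ntp1.example.com", "ntp2.example.com"]},
    {"name": "esxi-02", "ntp_service_running": False, "ntp_servers": []},
    {"name": "esxi-03", "ntp_service_running": True,
     "ntp_servers": ["ntp1.example.com", "10.0.0.5"]},
]

report = {host["name"]: ntp_findings(host) for host in inventory}
for name, findings in report.items():
    status = "OK" if not findings else "; ".join(findings)
    print(f"{name}: {status}")
```

Running a check like this on a schedule covers the re-validation side as well, since new hosts show up in the report as soon as they are added to the inventory.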

If you’d like to discuss a lingering issue (whether or not you suspect NTP), contact your team at Mainline. Our account executives, solution architects and engineers continue to invest in the entire VMware stack, as well as in a large portion of the still larger x86 virtualization business. Use our experience to your advantage.

Please contact your Mainline Account Executive directly, or contact us with any questions.
