Updated with Patch! Massive Bug Found in VMware ESX 3.5U2

Updated with the Express Patch information and download links!

This was first reported yesterday evening on both the VMware Community forums and DeployLinux.com. As far as anyone can tell, a bug in the VMware license management code is preventing any system running ESX 3.5U2 from booting as of this morning. VMware is working to identify the cause and release a patch, but the more important question is, "Why wasn't this caught before it shipped?" As Matt Marlowe posted:

OK, while we're all remaining calm....just imagine the implications that bugs like this can occur and get past QA testing....5 years down the road, nearly all server apps worldwide pretty much running in VM's (pretty easy prediction)......some country decides to initiate cyberwarfare and manages to get a backdoor into whatever is the prevailing hypervisor of the day.....boom. All your VM's belong to us. [...]

I'd love to find out what happened here. Don't they do any regression testing on new releases to check for date-based bugs? I thought that would be pretty obvious.

There have been some updates on this situation since it broke last night:

1. Frank Wegner's suggested workaround:

* Do nothing
* Turn DRS off
* Avoid VMotion
* Avoid powering off VMs

I'd counsel against turning DRS off, as that actually deletes resource pool settings. Instead, set sensitivity to 5, which should effectively disable it with minimal impact.

2. VMware has stated that fixes will be available in 36 hours at the earliest.

3. Anand Mewalal's suggested workaround:

We used the following workaround to power on the VMs.
Find the host where the VM is located.
Run 'vmware-cmd -l' to list the VMs.
Issue the commands:
service ntpd stop
date -s 08/01/2008
vmware-cmd /vmfs/volumes/
service ntpd start

4. Reportedly, there are no obvious warnings in the logs or in VC (VirtualCenter) before hitting the bug. VC continues to show the hosts as licensed, and no errors appear in the vmkernel log file until you try to start a new VM, reboot a VM, or reboot the host.
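
For anyone who has to repeat Anand Mewalal's clock-rollback workaround (item 3) across several VMs, here is a rough sketch of those same four steps wrapped in a shell function. Treat it as an illustration, not a tested procedure. The function name power_on_with_rollback, the run helper, and the DRY_RUN switch are all made up for this example. The vmware-cmd line in the original report is cut off, so the sketch assumes it was followed by the VM's config file path and the "start" operation. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the clock-rollback workaround from item 3.
# Hypothetical names (not from the original report): DRY_RUN, run,
# power_on_with_rollback.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

power_on_with_rollback() {
    vmx_path="$1"   # config file path as printed by 'vmware-cmd -l'
    if [ -z "$vmx_path" ]; then
        echo "usage: power_on_with_rollback <path-to-vmx>" >&2
        return 1
    fi
    run service ntpd stop             # keep NTP from undoing the rollback
    run date -s 08/01/2008            # set the clock to before the expiry date
    run vmware-cmd "$vmx_path" start  # ASSUMPTION: the source line is truncated here
    run service ntpd start            # restart NTP so the clock can be corrected
}
```

Call it once per VM with a path taken from the 'vmware-cmd -l' listing, check the printed commands, and only then rerun with DRY_RUN=0 as root on the affected host. Keep in mind that the host clock is wrong between the date and ntpd steps, so anything that depends on timestamps may be affected.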

Any more info we get will be added as we find it!

UPDATE 1: According to the new FAQ posted:

Resolution:

VMware Engineering has isolated the root cause and is working to produce an express patch for impacted customers today. The target timeframe is 6pm, August 12, 2008 PST.

That's excellent news for those affected!

UPDATE 2: The Express Patch has been released to fix this issue:

Express Patch Download

Special Notice: Please Read

An issue has been uncovered with ESX/ESXi 3.5 Update 2 that causes the product license to expire on August 12, 2008.

Follow the steps below to correct this issue:

1. Read the following Knowledge Base articles first:
* Fix of virtual machine power on failure issue, refer to KB 1006716
* For VI 3.5, refer to KB 1006721 for deployment consideration and instruction
* For VI3.5i, refer to KB 1006670 for deployment consideration and instruction
2. Download and apply the express patch according to the product(s) you have:
* VMware ESXi 3.5 Update 2 Express Patch
* VMware ESX 3.5 Update 2 Express Patch

News source: All your VM's belong to us
Link: BIG bug in ESX 3.5 Update 2
Link: KB 1006716: Unable to Power On virtual machine with "A General System error occurred: Internal error"
Download: Express Patches for ESX 3.5U2
