Spectre/Meltdown Vulnerabilities
Incident Report for Green Cloud Defense
Resolved
This incident has been resolved.
Posted Mar 01, 2018 - 10:02 EST
Update
In our continuing effort to keep our partners informed, we'd like to provide a quick update. Intel has released updated information on the previously mentioned issue with its CPU microcode updates ( https://newsroom.intel.com/news/root-cause-of-reboot-issue-identified-updated-guidance-for-customers-and-partners/ ). As the page indicates, a fix is being tested and a general release is imminent. We will bring this update in-house for testing as soon as it is made available to us. Upon positive confirmation from our testing, as well as from Intel and industry peers, we will roll this update out to our platforms as quickly as is prudent. Industry peers and press have indicated that the microcode updates, when operational, can have adverse effects on the stability of some applications. For this reason, OS and application patching continues to be critical to the success of the overall remediation strategy. Thank you for your continued patience as the industry deals with this very complex and wide-reaching issue. Green Cloud knows this is of great concern to you and your customers. We are working with your best interest in mind to manage the many variables involved in planning and executing the right plan.

As always, please contact us at 877-465-1217 or send an email to support@gogreencloud.com if there is any support we can provide during this remediation. We will provide another update when we reach the next step in our testing.
Posted Jan 22, 2018 - 16:14 EST
Update
We have slowed plans to deploy the VMware and Intel microcode patches to our platforms. As noted in our previous update, Intel and VMware have discovered, and are working to resolve, issues with specific Intel Haswell and Broadwell family chips that can lead to platform reboots when the Spectre protections are engaged ( https://kb.vmware.com/s/article/52345 and https://newsroom.intel.com/news/intel-security-issue-update-addressing-reboot-issues/ ).

"The [Reboot] issue can occur when the speculative execution control is actually used within a virtual machine by a patched OS. At this point, it has been recommended that VMware remove exposure of the speculative-execution mechanism to virtual machines on ESXi hosts using the affected Intel processors until Intel provides new microcode at a later date."

These processor families are deployed widely throughout our environments, so the impact to stability could be significant. This creates a new exploit vector, in that the original Spectre exploits could be used to cause denial-of-service reboots of our infrastructure. Also, although the issues cited have been found in those specific families, there has as yet been no indication that other families of CPUs are not also vulnerable to this bug. We feel the risk to our stability is too high in comparison to the risk of exploitation.

With or without this firmware patch, the OS-level protections installed during Microsoft and Linux patching are required to protect your system from exploitation. These software protections do incur a performance penalty, primarily in I/O-intensive situations. The microcode patch should provide relief from this impact once Intel releases a stable alternative. When the microcode update is re-released, the OS-level patches will still be required to enable the functionality at the hardware level. For this reason, we continue to advocate aggressive OS-level patching by our partners. This is the best path to protect your customers.
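
For Linux guests, one way to confirm that the OS-level protections are in place is to read the kernel's own status report. The sketch below assumes a kernel new enough to expose `/sys/devices/system/cpu/vulnerabilities/` (Linux 4.15 or a distribution backport); the function names and the `base_dir` parameter are illustrative and are not part of any Green Cloud tooling.

```python
from pathlib import Path

# Assumption: the kernel exposes per-vulnerability status files here
# (Linux 4.15+ or a vendor backport). Older kernels will not have them.
SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(base_dir=SYSFS_DIR):
    """Return {vulnerability_name: status_line} as reported by the kernel.

    An empty dict means the directory is missing, which usually indicates
    a kernel that predates the Spectre/Meltdown patches.
    """
    base = Path(base_dir)
    if not base.is_dir():
        return {}
    status = {}
    for entry in sorted(base.iterdir()):
        if entry.is_file():
            status[entry.name] = entry.read_text().strip()
    return status


def unmitigated(status):
    """Names whose status line starts with 'Vulnerable'."""
    return [name for name, line in status.items() if line.startswith("Vulnerable")]


if __name__ == "__main__":
    report = read_mitigation_status()
    if not report:
        print("No status files found; the kernel is probably unpatched.")
    for name, line in report.items():
        print(f"{name}: {line}")
```

A status line beginning with `Mitigation:` or `Not affected` indicates the kernel considers that issue handled. This only reflects the guest kernel's view and says nothing about the hypervisor or microcode underneath it.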

At this time, the actions we can take at our hypervisor layer will only provide protection against an as yet unexploited vulnerability that could be used on an unpatched guest OS to see data within another guest OS. We are currently holding off on this patching because it introduces another layer of performance impact. Given the impact already present at the OS level, we are working to balance performance and risk on your behalf.

We are in close communication with Cisco, VMware and Intel on this issue, and will take immediate action on your behalf as options become available. We know this is a stressful environment for you, your customers and Green Cloud as a whole. We are taking this issue very seriously, and are evaluating our best course of action daily. We will provide more updates on the state of these issues and any actions we take as soon as there is information to share. Thank you for your patience and your continued support.

Again, please contact us at 877-465-1217 or send an email to support@gogreencloud.com if there is any support we can provide during this remediation.
Posted Jan 17, 2018 - 12:21 EST
Monitoring
Green Cloud continues to work aggressively on your behalf on these issues. Extensive testing has been conducted with our vendor partners on the impact of the various firmware, hypervisor, and operating system patches required to fully remediate the issue. We intend to begin patching our firmware and hypervisor software in our data centers no later than Monday, January 15th. This patching will be conducted in one data center at a time, beginning with our Phoenix, Arizona data center. We will let these patches "soak" in production at this site for 24-48 hours and then proceed throughout the network during the week. As always, we will continue to provide you with ongoing updates as they occur.

As stated in the previous update, it is imperative that you continue to patch your operating systems and virus scanning software in parallel with our infrastructure patching. You will find information on our performance testing and on OS patching below.

Please contact us if you have any questions or concerns related to these vulnerabilities. We understand the pressure placed on your staff by this additional workload and are here to help however we can. Please contact us at 877-465-1217 or send an email to support@gogreencloud.com if there is any support we can provide during this remediation.

What we know at this point:

Our testing of the initial hypervisor patches provided by VMware, along with the OS-level patches provided by Microsoft and the Linux community, has shown surprisingly high performance impacts, ranging from 10-25% CPU performance degradation. This impact has been a core reason we have delayed the rollout of our hypervisor patches, since they account for a 5-10% performance hit on their own. The OS-level performance hit is unavoidable until firmware updates are applied. We felt it best to balance the risk of exploit (which remains low) with the impact to performance.

Intel has released firmware and microcode updates, which we have also tested in addition to the above patches. These updates appear to resolve much of the performance impact by providing protection against the exploit (Spectre) in hardware rather than limiting capabilities at the OS level to avoid it. Unfortunately, there have been reports ( https://newsroom.intel.com/news/intel-security-issue-update-addressing-reboot-issues/ ) of this microcode update leading to stability problems with the CPU and causing unpredictable reboots of the compute platform. We continue to monitor the situation with Intel and our other vendors to balance mitigation of the issue with performance and reliability of our platforms. We hope to deploy these firmware patches along with the hypervisor patches next week, but will not do so unless Intel provides an update on the stability.
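
To see what a 10-25% CPU degradation could mean for an individual guest, the arithmetic can be sketched as follows. This assumes, as a simplification, that a degradation of d leaves (1 - d) of the original CPU capacity, so the same workload needs proportionally more of the CPU. The function name and the 60% starting utilization are hypothetical and are not measurements from our testing.

```python
def utilization_after_patch(current_utilization, degradation):
    """Estimate CPU utilization for the same workload after patching.

    Simplifying assumption: a fractional `degradation` d leaves (1 - d)
    of the original CPU capacity, so the workload's share rises to u / (1 - d).
    """
    if not 0 <= degradation < 1:
        raise ValueError("degradation must be in [0, 1)")
    return current_utilization / (1 - degradation)


if __name__ == "__main__":
    # 0.10 and 0.25 are the low and high ends of the degradation range
    # reported in this update; 0.60 is a hypothetical pre-patch utilization.
    for d in (0.10, 0.25):
        after = utilization_after_patch(0.60, d)
        print(f"{d:.0%} degradation: 60% -> {after:.0%} utilization")
```

Under this model, a guest running at 60% CPU before patching would land at roughly 67% to 80% afterward. Real workloads will vary, particularly I/O-heavy ones, so treat this as a rough sizing aid only.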

Regarding the required Microsoft OS updates, it is important to note the order of operations required for automatic updates to apply. To mitigate the BSOD that can occur when virus scanning software conflicts with the new OS patches, Microsoft has added a registry key that must be set by the virus scanner vendor before the patch can be applied automatically. Because of this, it is imperative that you upgrade your virus scanning software to a supported version and confirm that the vendor has set this registry key before you assume automatic updates are occurring. See https://support.microsoft.com/en-us/help/4072699/january-3-2018-windows-security-updates-and-antivirus-software for more details. You can also review https://docs.google.com/spreadsheets/d/184wcDt9I9TUNFFbsAVLpzAtckQxYiuirADzf3cL42FQ/htmlview to see the status of your vendor updates and any caveats related to the registry key.
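
A quick way to confirm whether the key has been set on a given Windows guest is sketched below. The key path, value name, and expected data are taken from the Microsoft article linked above; the function names are illustrative, and the check only reports whether the compatibility value is present, not whether the January patch itself has been installed.

```python
import sys

# Key, value name, and data as documented in the Microsoft support article.
QUALITY_COMPAT_SUBKEY = r"SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat"
QUALITY_COMPAT_VALUE = "cadca5fe-87d3-4b96-b7fb-a231484277cc"


def read_compat_value():
    """Return the DWORD stored under the compatibility value, or None if absent.

    Only meaningful on Windows; on other platforms it returns None.
    """
    if sys.platform != "win32":
        return None
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, QUALITY_COMPAT_SUBKEY) as key:
            data, value_type = winreg.QueryValueEx(key, QUALITY_COMPAT_VALUE)
    except FileNotFoundError:
        # Either the QualityCompat key or the value itself does not exist.
        return None
    return data if value_type == winreg.REG_DWORD else None


def antivirus_marked_compatible(data):
    """True when the value exists and holds 0, the data Microsoft specifies."""
    return data == 0


if __name__ == "__main__":
    if antivirus_marked_compatible(read_compat_value()):
        print("Compatibility key is set; automatic updates can offer the patch.")
    else:
        print("Compatibility key not found; check your antivirus vendor's status.")
```

If the key is missing, the usual remedy is to update the antivirus product rather than to set the value by hand, since the key exists to signal that the vendor has verified compatibility.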

Again, please contact us at 877-465-1217 or send an email to support@gogreencloud.com if there is any support we can provide during this remediation.
Posted Jan 12, 2018 - 17:22 EST
This incident affected: IaaS (IaaS - Nashville, TN, IaaS - Greenville, SC, IaaS - Houston, TX, IaaS - Atlanta, GA, IaaS - Phoenix, AZ) and Security.