Atlanta | Maintenance Impact Out of Scope | 28 March 2023
Incident Report for Green Cloud Defense
Postmortem

CUSTOMER INCIDENT REPORT (CIR)

Customer: Multi
Change Number: 17602
Product: IaaS
Impact Start: 28 March 2023, 07:30 UTC
Data Center: Atlanta
Impact Resolution: 28 March 2023, 14:55 UTC
Region: US
Date of CIR: Posted 13 April 2023

IMPACT STATEMENT

A subset of customer Virtual Machines (VMs) in the Atlanta data center experienced High Availability (HA) events and then entered an error state which showed as “Inconsistent State” in vCloud Director (vCD).

EVENT OVERVIEW

On 28 March 2023, at approximately 07:30 UTC, during a planned, non-impacting Unified Computing System (UCS) upgrade maintenance, a subset of VMs in the Atlanta data center experienced HA events and then entered an error state, which showed as “Inconsistent State” in vCloud Director (vCD).

Engineers identified that two devices that were part of the UCS upgrade came back online without their configuration data. This lack of configuration data resulted in multiple HA events. Following the HA events, a subset of VMs unexpectedly came back online in an error state. Engineers compiled a list of affected VMs and went through a manual process to power the VMs off and then on again to resolve and clear the “Inconsistent State” error message in vCD. All customer impact was resolved by 14:55 UTC.

ROOT CAUSE ANALYSIS

Engineers identified that two devices that were part of the UCS upgrade came back online without configuration data. This lack of configuration data resulted in multiple HA events. Following the HA events, a subset of VMs unexpectedly came back online in an error state.

ACTION ITEMS: Actions taken to resolve the issue, preventative actions or process improvements:

Action: Engineers compiled a list of affected VMs and then worked to manually resolve and clear the errors shown in vCD.

Posted Apr 13, 2023 - 13:19 EDT

Resolved
Engineers continued to monitor for stability throughout the day. This incident is now resolved.
Posted Mar 28, 2023 - 20:35 EDT
Monitoring
Services have been restored for the affected VMs and engineers are continuing to monitor for stability. If you have any further issues, please contact a member of your support team for assistance.
Posted Mar 28, 2023 - 10:58 EDT
Update
Engineers are continuing work to clear the "Inconsistent State" error on the subset of affected VMs. Please contact a member of your support team if you need immediate assistance with a VM listed in "Inconsistent State".
Posted Mar 28, 2023 - 10:37 EDT
Update
We are continuing to work on a fix for this issue.
Posted Mar 28, 2023 - 10:00 EDT
Identified
Our engineers are working to restore services for a portion of IaaS customers in the Atlanta data center. During this time, some customers may see an error ("Inconsistent State") for the affected VMs. Currently, this must be resolved by our support teams. If you see this error, please contact a member of your support team for assistance.
Posted Mar 28, 2023 - 09:29 EDT
This incident affected: IaaS (IaaS - Atlanta, GA).