
Amazon EC2 network connectivity issues in N. Virginia


On October 9th, 2020, Amazon EC2 had network connectivity issues affecting the N. Virginia (US-EAST-1) region.


The issue was resolved before midday PDT, and AWS reported the details of the incident on the AWS status page.

This report won’t be available on the AWS status page after October 9th, so we kept a record of the incident. You can use it to understand the events that occurred and, if your service SLAs weren’t met, to request Service Credits by creating a support case in your account.
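If you go that route, the case can also be opened programmatically. Below is a minimal sketch using boto3 and the AWS Support API, which requires a Business or Enterprise support plan; the service code, category code, and case text are placeholders you would adapt to your account (valid codes can be listed with the same client's describe_services call).

```python
import boto3

# The AWS Support API is only available on Business/Enterprise support plans
# and is served from the us-east-1 endpoint.
support = boto3.client("support", region_name="us-east-1")

# Placeholder codes: list the valid ones for your account with describe_services().
case = support.create_case(
    subject="SLA credit request - EC2 connectivity issues in use1-az2 on Oct 9, 2020",
    serviceCode="amazon-elastic-compute-cloud-linux",  # placeholder service code
    severityCode="low",
    categoryCode="general-guidance",                   # placeholder category code
    communicationBody=(
        "Our instances in use1-az2 were affected by the network connectivity "
        "issue on October 9, 2020. We are requesting the applicable Service Credits."
    ),
    issueType="customer-service",
)
print("Created support case:", case["caseId"])
```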

Here are the details as reported by AWS. After the updates, we include a short sketch of launching a replacement instance in another Availability Zone, as AWS recommends.

1:48 AM PDT We are investigating networking connectivity issues for a small subset of newly launched EC2 instances within a single Availability Zone (use1-az2) in the US-EAST-1 Region. We have identified root cause and are working towards resolution. Network connectivity for existing instances is not affected by this issue. For newly launched instances that are affected, relaunching a new instance may resolve the issue.

2:47 AM PDT We continue to work toward recovery for the networking connectivity issues affecting a small subset of newly launched EC2 instances within a single Availability Zone (use1-az2) in the US-EAST-1 Region. Network connectivity for existing instances remains unaffected by this issue. For newly launched instances that are affected, relaunching a new instance may resolve the issue.

4:53 AM PDT We are still working toward recovery for the networking connectivity issues affecting a small subset of newly launched EC2 instances within a single Availability Zone (use1-az2) in the US-EAST-1 Region. Network connectivity for existing instances remains unaffected by this issue. For newly launched instances that are affected, launching a replacement instance may resolve the issue.

6:35 AM PDT We are still working towards recovery for the ongoing networking connectivity issues. These affect a small subset of EC2 instances launched after October 08, 2020, at 9:37 PM PDT within a single Availability Zone (use1-az2) in the US-EAST-1 Region. For instances that are affected, customers can launch replacement instances in another Availability Zone.

9:57 AM PDT We wanted to provide you with some more details on the issue affecting network connectivity for a subset of EC2 instances in a single Availability Zone (use1-az2) in the US-EAST-1 Region. The issue is affecting the subsystem responsible for updating VPC network configuration and mappings when new instances are launched or Elastic Network Interfaces (ENI) are attached to instances, within the affected Availability Zone. This subsystem makes use of a cell-based architecture, which subdivides the Availability Zone into smaller cells, with each cell being responsible for the VPC network configuration and mappings for a subset of instances within the Availability Zone. At 9:37 PM PDT on October 8th, a single cell within this subsystem began experiencing elevated failures in updating VPC network configuration and mappings for instances managed by the affected cell. These elevated failures cause network configuration and mappings to be delayed or to fail for new instance launches and attachments of ENIs within the affected cell. The issue can also cause connectivity issues between an affected instance in the affected Availability Zone and newly launched instances within other Availability Zones in the US-EAST-1 Region, since updated VPC network configuration and mappings are not able to be updated within the affected Availability Zone. We have identified the root cause and have been working to resolve the issue and restore the updating of VPC network configuration and mappings within the affected cell. For instances that are affected by this issue, relaunching the instance within the affected Availability Zone may mitigate the issue. If possible, relaunching the instance in other Availability Zones will mitigate the issue. We will continue to provide updates as we work towards full resolution.

11:11 AM PDT We have taken steps to address the issue affecting network connectivity for some instances in a single Availability Zone (use1-az2) in the US-EAST-1 Region. As of 10:20 AM PDT, we started to see recovery for affected instances and continue to work toward full resolution of the issue.

11:54 AM PDT Starting at 9:37 PM PDT on October 8th, we experienced increased network connectivity issues for a subset of instances within a single Availability Zone (use1-az2) in the US-EAST-1 Region. This was caused by a single cell within the subsystem responsible for updating VPC network configuration and mappings experiencing elevated failures. These elevated failures caused network configuration and mappings to be delayed or to fail for new instance launches and attachments of ENIs within the affected cell. The issue also caused connectivity issues between an affected instance in the affected Availability Zone (use1-az2) and newly launched instances within other Availability Zones in the US-EAST-1 Region, since updated VPC network configuration and mappings were not able to be updated within the affected Availability Zone (use1-az2). The root cause of the issue was addressed and, at 10:20 AM PDT on October 9th, we began to see recovery for the affected instances. By 11:10 AM PDT, all affected instances had fully recovered. The issue has been resolved and the service is operating normally.
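As several of the updates above recommend, the quickest mitigation for an affected instance was to launch a replacement in another Availability Zone. Since a subnet always lives in a single Availability Zone, pointing the launch at a subnet outside use1-az2 is enough. A minimal boto3 sketch, with hypothetical AMI and subnet IDs:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a replacement instance in a subnet that lives in an unaffected
# Availability Zone (the AMI and subnet IDs below are hypothetical).
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",      # hypothetical AMI
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    SubnetId="subnet-0123456789abcdef0",  # a subnet in e.g. use1-az1, not use1-az2
)
instance_id = response["Instances"][0]["InstanceId"]
print("Launched replacement instance:", instance_id)
```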

In case of any future availability issues, you can check the latest updates on the AWS status page.
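If your account has a Business or Enterprise support plan, the AWS Health API exposes the same events programmatically, which can be handier than polling the status page. A minimal sketch with boto3; the filter values are just examples:

```python
import boto3

# The AWS Health API has a single global endpoint in us-east-1 and requires
# a Business or Enterprise support plan.
health = boto3.client("health", region_name="us-east-1")

# Example filter: open or upcoming EC2 events in the US-EAST-1 Region.
events = health.describe_events(
    filter={
        "services": ["EC2"],
        "regions": ["us-east-1"],
        "eventStatusCodes": ["open", "upcoming"],
    }
)
for event in events["events"]:
    print(event["eventTypeCode"], event["statusCode"], event.get("availabilityZone", ""))
```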