This morning between 7:40 AM and 7:42 AM EST, a small number of customers saw an internal server error stating that Help Scout services were unavailable. This prevented some customers from accessing various pages in the app, including mailbox and conversation views.
The service interruption lasted about two minutes and was the result of a brief connectivity problem with our web hosting provider. The problem was limited to a single availability zone and affected only about 16% of Help Scout customers.
Here's the official explanation from AWS (Amazon Web Services) during the incident:
"We are investigating a period of elevated error rates for the EC2 APIs and new instance launches in the US-EAST-1 Region. During this period some instances experienced impaired connectivity and some EBS volumes experienced degraded IO performance in a single Availability Zone."
Like many SaaS products, we rely heavily on Amazon Web Services to keep Help Scout online. However, we maintain infrastructure in all East Coast availability zones to minimize the impact if a single zone has a problem. That's precisely what happened today: one zone had trouble, and the impact was limited to the customers served from it. Outages like this at the provider level are very rare, but when they do happen, there's a chance that a small percentage of customers will experience service interruptions. We're very pleased this one was cleared up in a couple of minutes.