Today at 4:45pm EST, roughly 1/6th of Help Scout customers started seeing internal server errors on some pages. Two servers in the same shard, a production node and its redundant backup, stopped functioning. Our Ops team was not alerted to the issue because the instances were still up, although they were reporting errors.
Once the issue was discovered at 8:30pm EST, the Ops team restarted the instances and fully restored service to the impacted customers by 8:55pm EST.
We don't believe any data is missing from the impacted accounts. As a precaution, however, we're reindexing all data from the last 24 hours. If you see anything abnormal in your account, please don't hesitate to reach out.
Two things need to be improved on our side to make sure this never happens again: