This morning between 8:10am and 8:35am EST, a small number of Help Scout customers experienced connection failures that looked like this.
For the last few weeks, our Operations team has been slowly rolling out new infrastructure that supports all functionality built on search. About 75% of our customers were running on the new infrastructure this morning when a production server and its redundant backup stopped reporting. They seemed to be running just fine, but our hosting provider said they were down. As a result, roughly one in six page loads hitting the new search infrastructure saw the error. A restart of both servers resolved the problem.
It's going to take a bit more research over the weekend to figure out exactly what happened and how we'll fix it. The redundancy plan didn't work in this case, and we're going to re-evaluate it.
Although the engineering team responded quickly to the problem, we also identified some ways to improve alerting and monitoring for the new infrastructure.
We continue to monitor everything, and service has been back to normal since the brief issue this morning. Our apologies for the intermittent trouble.