One of our client instances generated a high volume of errors. This tripped the circuit breaker on every node of the internal gateway cluster: all nodes stopped accepting further requests and removed themselves from the load balancers' list, and the condition then propagated to the front-facing load balancers. Normally this should not happen to all nodes of a cluster at once. The circuit breaker is designed to handle the case where a single node loses connectivity to other services, so that requests are rerouted to the remaining healthy nodes. We have adjusted the default thresholds for the circuit breaker logic, recycled the cluster, and added more nodes to the load balancers' list so that such situations are handled gracefully in future.
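
For readers unfamiliar with the circuit breaker concept, the following is a minimal illustrative sketch and not our production code. The class name, node names, and the threshold of 3 are all assumptions made for the example. It shows the intended case, where one node trips and traffic moves to the others, and the incident case, where every node trips and no node is left to route to.

```python
from dataclasses import dataclass


@dataclass
class NodeBreaker:
    """Hypothetical per-node circuit breaker; names and values are illustrative."""
    name: str
    failure_threshold: int = 3      # assumed value, not the real setting
    consecutive_failures: int = 0

    @property
    def is_open(self) -> bool:
        # An open breaker means the node has taken itself out of rotation.
        return self.consecutive_failures >= self.failure_threshold

    def record(self, success: bool) -> None:
        # A success resets the count; a failure moves the node closer to tripping.
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1


def healthy_nodes(nodes):
    """Return the names of nodes a load balancer may still route to."""
    return [n.name for n in nodes if not n.is_open]


nodes = [NodeBreaker("gw-1"), NodeBreaker("gw-2"), NodeBreaker("gw-3")]

# Intended case: only gw-1 fails, so requests are rerouted to the other nodes.
for _ in range(3):
    nodes[0].record(success=False)
single_failure = healthy_nodes(nodes)

# Incident case: errors reach every node, so all breakers trip at once.
for node in nodes[1:]:
    for _ in range(3):
        node.record(success=False)
cluster_wide_failure = healthy_nodes(nodes)

print(single_failure)
print(cluster_wide_failure)
```

In this sketch a higher threshold makes a node slower to trip, and a larger pool leaves more nodes available when some do trip. Those are the two kinds of adjustment described above.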
Best regards,
Worldticket Team