AWS Support initially pushed back and suggested it was caused by high replication lag, but they were looking at metrics that were more than 24 hours old. What kind of failure did you encounter? I really want to understand what edge case we triggered in their failover process, especially since we could not reproduce it in other regions.