[RESOLVED] Rules not automatically rescheduling

josh · October 24, 2023, 3:17pm

TLDR: The Timer Scheduler issue recurred on Sunday, 2023-10-22. We’ve added new checks to catch future occurrences. Some rules may need to be manually reset.

Issue Recurrence:
The ‘State Stays’ issue reappeared on Sunday, 2023-10-22, affecting certain timer and state-stays rules.

Previous Fixes:
The health check added in September didn’t catch this issue. This is partly because we rely on a core Google Service that’s difficult to simulate for testing.

New Fixes:
We’ve set up a two-tiered health check system:

Readiness Check: Runs every 5 seconds. If it fails twice, the affected server is removed.
Liveness Check: Runs every 30 seconds. If it fails four times, a new server is spawned.

Both checks have backup systems for added reliability.
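
For readers curious how the two tiers relate, here is a minimal sketch of the thresholds described above. It is an illustration only, not our production code: the HealthCheck class and its method names are hypothetical, and it assumes the failures are counted consecutively, which the description above does not spell out.

```python
from dataclasses import dataclass


@dataclass
class HealthCheck:
    """Hypothetical model of one health check tier (illustration only)."""
    name: str
    interval_seconds: int      # how often the probe runs
    failure_threshold: int     # failures needed before action is taken
    consecutive_failures: int = 0

    def record(self, healthy: bool) -> bool:
        """Record one probe result; return True once the threshold is reached.

        Assumption: a successful probe resets the failure count.
        """
        if healthy:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold

    def approx_seconds_to_trigger(self) -> int:
        """Rough upper bound on time to act, ignoring probe timeouts."""
        return self.interval_seconds * self.failure_threshold


# Values taken from the post: 5 s / 2 failures and 30 s / 4 failures.
readiness = HealthCheck("readiness", interval_seconds=5, failure_threshold=2)
liveness = HealthCheck("liveness", interval_seconds=30, failure_threshold=4)

for check in (readiness, liveness):
    print(check.name, check.approx_seconds_to_trigger(), "seconds")
```

Under these assumptions, an unhealthy server would be removed after roughly 10 seconds, while a replacement would be spawned only after about 2 minutes of sustained failure. The fast tier limits the impact quickly, and the slow tier avoids replacing servers over a brief blip.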

Next Steps:
We are identifying the affected rules so that we can reschedule them automatically. For now, you can manually reset any impacted timer by opening and saving the associated rule.

We’re also monitoring the new checks closely and will tweak them as needed.