IETF - Email processing outage – Incident details

Email processing outage

Resolved
Operational
Started 8 days ago
Lasted about 4 hours

Affected

Mail Services

Major outage from 11:30 PM to 2:42 AM, Operational from 2:42 AM to 3:31 AM

Updates
  • Resolved

    At 23:00:00 UTC March 10, we performed what we expected to be a very
    safe maintenance reboot/move of our new mail service instance onto a
    more performant processor and file system. The reboot did not go as
    smoothly as expected, and we had to perform some low-level recovery steps
    that took some time. We had exceptional help with that recovery from
    both our new hosting provider (Panix) and our new Sysadmin provider (New
    Machine Futures).

    While we were recovering, mail was spooled and delayed. We believe that
    no messages were lost during this event.

    Recovery was complete around 02:00 UTC March 11. Messages queued from the
    datatracker were resubmitted for delivery by 02:30 UTC.

    Despite the unexpected disruption, mail operations now run on a much
    more powerful underlying system in preparation for IETF 122.

    We continue to monitor the system closely. If you see anything that
    doesn't look as expected, please send a report to support@ietf.org.

  • Monitoring
    We implemented a fix and are currently monitoring the result.
  • Investigating
    We are currently investigating this incident.