Resolved

Both power supplies were replaced around 16:30 this afternoon and services have been stable since then.

Please accept our apologies for the inconvenience caused.

Our final analysis of the failure is the same as our initial assessment - both power supplies are deemed to be faulty.

The 2nd power supply output failed at 14:09:04. Normally this should not cause an operational problem, as one power supply is supposed to be more than sufficient to power the system. However, the remaining power supply was then unable to supply the required power on its own.

We were on site at 14:35 and power-cycled the entire system.

The 1st PSU then indicated that it was supplying power to the router, and the router appeared to boot normally.

Plans were made to replace both power supplies from spares this evening. However, it became clear that the system was not operating normally after being rebooted, so this work was expedited and carried out at the earliest opportunity.

Once the power supplies were replaced, the system stabilised immediately. However, because it was evidently operating in an undervoltage condition for some time immediately before the power supplies were replaced, we will be reloading rt3.the between 00:00 and 01:00. This is to ensure that the operating system and data structures on the routing engine and line cards are in a consistent state, and to avoid latent effects from any currently undetected memory corruption.

A separate emergency maintenance window will be posted shortly to track this work.

Phillip Baker
Investigating

There appears to be a simultaneous fault in the two power supplies in rt3.the, causing loss of directly connected services and also causing disruption as the router power-cycles itself.

Engineering resource is on the way.

Apologies for the inconvenience.

Phillip Baker
Began at:

Affected components
  • Core Network Functions
    • Layer 3 (THE)
    • Layer 2 (THE)