
Forward actions in response to operational incidents

There have been two recent operational incidents: on 31 January 2015, when water penetrated the Central Network Hub, and on 23/24 February 2015, when a West Cambridge power incident disrupted some systems at the Soulsby building overnight. Neither incident caused significant disruption, but both had the potential for serious repercussions. This report sets out the actions being taken to reduce the risk of such events recurring.

Central Network Hub

In the water-leakage incident (described in http://www.uis.cam.ac.uk/reports/central-network-hub-incident-30-january-2015) the backup of a storage system supporting the Schools of Arts and Humanities and Humanities and Social Sciences was destroyed. Urgent action was taken to ensure that a regular backup copy of the live system (based in the Soulsby building) is transferred to the storage platform at West Cambridge. With the agreement of the two Heads of Schools, University approval and procurement procedures have been fast-tracked to place an order for a new permanent backup storage system, which will be located in the Roger Needham building.

The University's Internet connection is located in the Central Network Hub, and the 31 January incident is considered a "near miss". The contractors have been reminded of the imperative of sustaining this service, but there are still many months of refurbishment work to do on the Arup building, and there is a possibility of recurrence. Had the Janet hub been damaged, the resulting outage could have been prolonged, depending on the circumstances. Discussions have therefore taken place between the Networks Division and the Head of Janet and members of his team. These have resulted in the following agreement:

  • to put in place an emergency contingency arrangement that will allow rapid activation of an alternative Internet connection route from the William Gates building, and
  • to provide an interim additional Janet connection within around two months that will provide a "live standby".

Finally, the longer-term plan (which was already in place) to provide a second Janet connection direct to the West Cambridge data centre is being expedited, with preparatory work under way.

Soulsby incident

In the Soulsby building power incident of 23/24 February (http://www.uis.cam.ac.uk/reports/incident-report-west-cambridge-power-issue-23-february-2015) two successive power fluctuations caused the air handling units to fail, followed by a building management system shutdown as the temperature rose. The following actions are planned:

  • Planned upgrades of the power system over the weekend of 14-15 March will include an investigation of these issues.
  • There will also be a review of the air handling unit power configuration, including an assessment of whether this too can be sustained by the UPS/generator in the event of future power issues.
  • The procedure for manually restarting the UPS after a thermal shutdown will be documented, and automated alerts will be investigated.

Last updated: 13 March 2015