
IT service disruptions following flooding on 17 July

Page last updated by hvs1001 at 10.15 on 21-Jul-2015

Background

Flooding on the New Museums Site, caused by heavy and sustained rainfall early on the morning of Friday 17 July 2015, resulted in the University's Internet connection being cut. Other internal (departmental) networks were also affected, as were a number of UIS and other services.

Internet connectivity for the University was restored using a contingency link shortly after midday and other affected services were restored by the end of the day.

This page will be updated with further information about service disruptions caused by this significant incident. An interim incident report was published on 20 July, with a further update given on 24 July.

21 July 2015

09.45 Work undertaken after 17.00 on 20 July to address the slow connectivity affecting some users was completed shortly after 20.00 on Monday. If you are still suffering from slow connectivity please contact the service desk:  or phone 01223 (7)62999.

The Streaming Media Service (SMS) started processing new video uploads during the afternoon of 20 July following the successful relocation of hardware.

20 July 2015

09.50 Some users of the network may be experiencing slow connectivity. This is a result of the Network Address Translation service running on reduced hardware, following the loss of a unit on Friday. We are investigating options to resolve this problem, but any fix will take some time to implement and we are currently unable to provide an estimated time for it. (Posted at 11:15)

19 July 2015

17.35 A faulty network patch cable was replaced at approximately 17.00. Disruption had been observed by some users of the network whose systems rely on Network Address Translation.

17 July 2015

Flooding on the New Museums site cut the University's network connectivity on the morning of 17 July 2015. Connectivity was restored shortly after midday and all affected services have been restored.

18.45 eduroam/University wireless returned to full capacity. Before this, approximately 300 of the 3250 wireless access points had not returned to normal operation. Because of the distribution of the affected access points, some areas suffered higher levels of disruption than others.

Status of systems at 10.15, 21 July 2015

| Service | Description | Priority | Status |
|---|---|---|---|
| lists.cam | Mailing lists | High | Restored |
| sftp.ds | SFTP service for Desktop Services | Low | Restored |
| bes++ | Platforms hardware inventory | Medium | Restored |
| passwords.csx | UIS password changing service (for Raven/Managed Cluster Service etc.) | Medium | Restored |
| tokens.csx | Password token service for VPN and University wireless/eduroam | Medium | Restored |
| Tissue Tracker (HTA) | Human Tissue Act tissue tracker | Low | Restored |
| Google authentication | Authentication platform for access to Google calendars | High | Restored |
| authdns0 | Updates to authoritative DNS | Low | Restored |
| wikis | Managed wiki service | Medium | Restored |
| Streaming Media Service (SMS) | SMS platform | Medium | Restored |
| JANET traffic accounting | Collection of traffic statistics for re-charging purposes | Low | Traffic since 17.7.15 not being accounted |
| UTBS training server | Training server for the University Training Booking Service | Low | Restored |
| Userforms | UIS User Administration forms | Medium | Restored |
| git.csx | Managed git repository | Medium | Restored |
| cookoo.csx | TechLinks site / Institution Strategy | Medium | Restored |
| Training feedback | Feedback for UTBS booked courses | Low | Restored |
| CamGRID stats | | ? | Restored |
| Probing suite | Network friendly probing system | Low | Restored |
| Lapwing console | University wireless management console | ? | Restored |

Please note that some services are running with slightly lower levels of resilience than usual.

Earlier updates

13.30 The following services have been brought back into operation:

  • Mailing lists on lists.cam (please be aware that there is a backlog that the system is working through)
  • Tokens: the token database for eduroam/University wireless
  • Google calendars: the authentication system for Google calendars has been fixed
  • Streaming Media Service (SMS): web interface is available. New submissions are not currently being processed.

13.00 Major flooding on the New Museums site cut the University's network connectivity and affected a number of services run from equipment hosted on that site. Staff from University Information Services and Janet have now restored internet connectivity, using our contingency link. We are continuing work to bring other affected services online.

12.30: Our contingency external network connection is online and access to and from the Internet should be available for most users. Staff from Information Services continue work to understand which services have been affected, and to bring applications and other services back up in prioritised order.

11.35: Following manual reconfiguration we believe that the tokens service and the Streaming Media Service (SMS) web interface are back online.

11.25: Staff from Information Services are making progress to bring our contingency connection online. The following services have been affected:

  • Email: internal email was disrupted overnight but should be returning to normal operation.
  • Mailing lists on lists.cam: the mailing list system is currently unavailable; any mail sent to mailing lists will currently NOT be forwarded to users' email accounts.
  • Passwords: it is currently not possible to change passwords for UIS-managed services, such as Raven and the Managed Cluster Service.
  • Tokens: the token database for eduroam/University wireless is working but it is not possible to get new tokens. Devices already configured to use the wireless network should continue to work but new devices cannot be added.
  • Google calendars: the system underpinning access to Google calendars is broken. You may be unable to access Google calendars even after the network is restored.
  • Wikis: the managed wiki service (wiki.cam.ac.uk) is offline
  • Streaming Media Service (SMS): the web interface is offline

10.15: We are aware that some services (including mailing lists) are also affected and that functionality of some services is degraded.

8.25: Engineers are pumping water out of the server room and they will then assess damage to equipment.

8.15: We currently have no estimated time for restoration of services but will update as information becomes available.