Page last updated by hvs1001 at 10.15 on 21-Jul-2015
Background
Flooding on the New Museums Site, caused by heavy and sustained rainfall early on the morning of Friday 17 July 2015, resulted in the University's Internet connection being cut. Other internal (departmental) networks were also affected, as were a number of UIS and other services.
Internet connectivity for the University was restored using a contingency link shortly after midday and other affected services were restored by the end of the day.
This page will be updated with further information about service disruptions caused by this significant incident. An interim incident report was published on 20 July with a further update given on 24 July.
21 July 2015
09.45 Work undertaken after 17.00 on 20 July to address the slow connectivity affecting some users was completed shortly after 20.00 on Monday. If you are still suffering from slow connectivity please contact the service desk: service-desk@ucs.cam.ac.uk or phone 01223 (7)62999.
The Streaming Media Service (SMS) started processing new video uploads during the afternoon of 20 July following the successful relocation of hardware.
20 July 2015
09.50 Some users of the network may be experiencing slow connectivity. This is a result of the Network Address Translation service running on reduced hardware, following the loss of a unit on Friday. We are investigating options to resolve this problem but this will take some time to implement and we are currently unable to provide an estimated time to fix. (Posted at 11:15)
19 July 2015
17.35 A faulty network patch cable was replaced at approximately 17.00. Disruption had been observed by some users of the network whose systems rely on Network Address Translation.
17 July 2015
Flooding on the New Museums site cut the University's network connectivity on the morning of 17 July 2015. Connectivity was restored shortly after midday and all affected services have been restored.
18.45 eduroam/University wireless returned to full capacity. Approximately 300 of 3250 wireless access points had not returned to normal operation. Due to the distribution of affected access points some areas suffered higher levels of disruption than others.
Status of systems at 10.15, 21 July 2015
Service | Description | Priority | Status |
---|---|---|---|
lists.cam | Mailing lists | High | Restored |
sftp.ds | SFTP service for Desktop Services | Low | Restored |
bes++ | Platforms hardware inventory | Medium | Restored |
passwords.csx | UIS password changing service (for Raven/Managed Cluster Service etc.) | Medium | Restored |
tokens.csx | Password token service for VPN and University wireless/eduroam | Medium | Restored |
Tissue Tracker (HTA) | Human Tissue Act tissue tracker | Low | Restored |
Google authentication | Authentication platform for access to Google calendars | High | Restored |
authdns0 | Updates to authoritative DNS | Low | Restored |
wikis | Managed wiki service | Medium | Restored |
Streaming Media Service (SMS) | SMS platform | Medium | Restored |
JANET traffic accounting | Collection of traffic statistics for re-charging purposes | Low | Traffic since 17.7.15 not being accounted |
UTBS training server | Training server for the University Training Booking Service | Low | Restored |
Userforms | UIS User Administration forms | Medium | Restored |
git.csx | Managed git repository | Medium | Restored |
cookoo.csx | TechLinks site / Institution Strategy | Medium | Restored |
Training feedback | Feedback for UTBS booked courses | Low | Restored |
CamGRID stats | | ? | Restored |
Probing suite | Network friendly probing system | Low | Restored |
Lapwing console | University wireless management console | ? | Restored |
Please note that some services are running with slightly lower levels of resilience than usual.
Earlier updates
13.30 The following services have been brought back into operation:
- Mailing lists on lists.cam (please be aware that there is a backlog that the system is working through)
- Tokens: the token database for eduroam/University wireless
- Google calendars: the authentication system for Google calendars has been fixed
- Streaming Media Service (SMS): web interface is available. New submissions are not currently being processed.
13.00 Major flooding on the New Museums site cut the University's network connectivity and affected a number of services run from equipment hosted on that site. Staff from University Information Services and Janet have now restored internet connectivity, using our contingency link. We are continuing work to bring other affected services online.
12.30: Our contingency external network connection is online and access to and from the Internet should be available for most users. Staff from Information Services continue work to understand which services have been affected, and to bring applications and other services back up in prioritised order.
11.35: Following manual reconfiguration we believe that the tokens service and the Streaming Media Service (SMS) web interface are back online.
11.25: Staff from Information Services are making progress to bring our contingency connection online. The following services have been affected:
- Email: internal email was disrupted overnight but should be returning to normal operation.
- Mailing lists on lists.cam: the mailing list system is currently unavailable. Any mail sent to mailing lists will currently NOT be forwarded to users' email accounts.
- Passwords: it is currently not possible to change passwords for UIS-managed services, such as Raven and the Managed Cluster Service.
- Tokens: the token database for eduroam/University wireless is working but it is not possible to get new tokens. Devices already configured to use the wireless network should continue to work but new devices cannot be added.
- Google calendars: the system underpinning access to Google calendars has been broken. When the network is restored it is possible that you will be unable to access Google calendars.
- Wikis: the managed wiki service (wiki.cam.ac.uk) is offline
- Streaming Media Service (SMS): the web interface is offline
10.15: We are aware that some services (including mailing lists) are also affected and that functionality of some services is degraded.
8.25: Engineers are pumping water out of the server room and they will then assess damage to equipment.
8.15: We currently have no estimated time for restoration of services but will update as information becomes available.