Techlink CUDN update: it's time to plan your PoPs

UIS's Bob Franklin provided a round-up of developments in the Cambridge University Data Network (CUDN) for the April Techlink seminar, including an outlook for PoP upgrades during 2016–2017.

UIS Networks combines the CUDN, the ACN and the Hostmaster group

The biggest change since our previous update on the CUDN is, of course, the merger of UCS and MIS to form UIS. As part of the restructuring, the UCS Network Division became UIS Networks and inherited responsibility for the Administrative Computing Network (ACN). We're planning to refresh the ACN's edge equipment and absorb its backbone into the CUDN (probably using an MPLS service). This has little direct impact on the CUDN, except that our resources will be divided between the two networks in the interim.

We've installed a second connection to Janet

Last year, we installed a second connection to Janet – this time in the West Cambridge Data Centre (WCDC) rather than the centre of town. This doubles our throughput and gives us a backup in the event of a fault at the primary site. Most traffic is routed through the primary link – including all non-NATed global addresses, IPv6, the lower half of the NAT address pool and UniOfCam. The secondary link carries the upper half of the NAT pool and eduroam.
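As a rough illustration of that split, the sketch below (in Python, using made-up address ranges rather than the real CUDN or NAT allocations) shows how a flow's source address could determine which uplink it prefers:

    import ipaddress

    NAT_POOL = ipaddress.ip_network("10.0.0.0/8")     # hypothetical NATed private range
    EDUROAM = ipaddress.ip_network("172.16.0.0/16")   # hypothetical eduroam client range

    def preferred_uplink(src: str) -> str:
        addr = ipaddress.ip_address(src)
        if addr.version == 6:
            return "primary"                          # all IPv6 goes via the primary link
        if addr in EDUROAM:
            return "secondary"                        # eduroam uses the secondary link
        if addr in NAT_POOL:
            # Lower half of the NAT pool -> primary; upper half -> secondary
            midpoint = NAT_POOL.network_address + NAT_POOL.num_addresses // 2
            return "primary" if int(addr) < int(midpoint) else "secondary"
        return "primary"                              # non-NATed global addresses and UniOfCam

    print(preferred_uplink("10.5.0.1"))     # primary (lower half of the hypothetical pool)
    print(preferred_uplink("10.200.0.1"))   # secondary (upper half)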

[CUDN network diagram]

We've upgraded the NAT service

The Network Address Translation (NAT) service, which translates CUDN-wide private addresses into public IP addresses, has been upgraded to newer equipment supporting approximately ten times the traffic of the old boxes. This will allow the service to cope with the growth in traffic and the greatly increased number of devices since the service was introduced in 2009. It will also resolve the issues seen during the July 2015 incident, when one box was unable to cope while the other was offline.
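For readers less familiar with the service, the sketch below shows the general idea of source NAT – mapping a private source address and port onto a shared public address – using invented addresses rather than the real CUDN ranges:

    import itertools

    PUBLIC_ADDRESS = "192.0.2.1"          # hypothetical public address used by the NAT box
    _next_port = itertools.count(40000)   # next free public-side port
    _translations = {}                    # (private_ip, private_port) -> (public_ip, public_port)

    def translate(private_ip: str, private_port: int):
        # Reuse the existing mapping for a private endpoint, otherwise allocate a new one.
        key = (private_ip, private_port)
        if key not in _translations:
            _translations[key] = (PUBLIC_ADDRESS, next(_next_port))
        return _translations[key]

    print(translate("10.20.30.40", 51000))   # ('192.0.2.1', 40000)
    print(translate("10.99.1.2", 51000))     # ('192.0.2.1', 40001)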

'Multipathing' has been enabled

We've also enabled 'multipathing' on the CUDN, which should noticeably increase the speed of the network at peak times. Traffic used to be statically steered so that most links were used: traffic from institutions to Janet was carried via one core router, while traffic from UCS servers to Janet and institutions, as well as inter-institution traffic, flowed through the other. Now traffic takes all available paths, so both links down to an institution are used – allowing a possible throughput of 2Gbit/s.
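The sketch below shows, in simplified form, how this kind of multipathing typically spreads traffic: each flow is hashed onto one of the available links, so both links carry traffic while packets within a flow stay on the same path. The link names and hash fields are illustrative, not the CUDN's actual configuration:

    import hashlib

    LINKS = ["link-A", "link-B"]   # e.g. the two 1Gbit/s links down to an institution

    def pick_link(src_ip: str, dst_ip: str, src_port: int, dst_port: int, proto: str) -> str:
        # Hash the flow 5-tuple so every packet of a flow takes the same link,
        # while different flows are spread across both links.
        key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
        digest = hashlib.sha256(key).digest()
        return LINKS[digest[0] % len(LINKS)]

    print(pick_link("192.0.2.10", "198.51.100.20", 40000, 443, "tcp"))
    print(pick_link("192.0.2.11", "198.51.100.20", 40001, 443, "tcp"))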

The VPN has been replaced with a higher-performance service

Our new VPN service, introduced in December 2014, is based on open-source software running on Linux servers. It supports the built-in clients on common operating systems and uses the same network access token as eduroam.

We followed this with the introduction of a managed VPN service in March 2015, which has a dedicated client IP address pool and can be restricted to a subset of users, by Lookup group.

The old VPDN service was retired in April 2015.

CUDN upgrade plans for 2016: new equipment from core to border

We've adopted the 'tick-tock' approach to upgrading the CUDN – that is, we've decoupled equipment upgrades from topology changes, rather than performing both simultaneously. We're currently in a 'tick' phase: replacing equipment while retaining the same configuration so we can resolve any issues that arise based on our experience of running the network. This avoids the risks inherent in a 'big bang' upgrade.

It's the right time to replace the network equipment. Our current equipment is 12 years old, and support has ended for some modules. The rest will reach this milestone within 12 months.

The border routers were the first to be upgraded. We've used the Cisco Catalyst 6880-X, with the Supervisor 2T and 16-port 10GE line cards. The first was (rather rapidly!) deployed at a secondary location during the flooding incident in July 2015. The second was installed in November to replace the flood-damaged equipment.

We've replaced the Cisco Catalyst 6509-E and Extreme Summit X460 routers and switches in the data centre with Cisco Nexus 7010, 56128P and 2Ks – a platform specifically designed for data centre deployments.

Next on our list are the core/distribution routers, which we're planning to replace with Cisco Catalyst 6807-XLs during the summer vacation this year. We'll be replacing one router at a time, using the redundancy inherent within the CUDN to maintain service while the work progresses.

Ours will also be one of the first networks in the world to use the brand-new Catalyst 6800 Supervisor 6T as the router's 'CPU', as well as 32 new 10GE SFP+ ports (which also support 1GE connections) for institutional connections. Having a consistent module across all routers will simplify our holding of spares.

The backbone links will initially run at the same 10Gbit/s speed, but late in 2016 we'll upgrade them to 40Gbit/s using the 40GE links on the Supervisor 6T. This, combined with the new multipathing configuration, will give us a core bandwidth of between 40 and 80Gbit/s (a single path runs at 40Gbit/s, while traffic spread across both paths can reach 80Gbit/s in aggregate).

Top of the PoPs: preparing for the upgrade

Our plans for the PoPs are probably the most eagerly awaited part of this update. The current equipment is 8 years old and software support for it ended recently. We can support some 10GE connections on it, but the limited capacity means requests have to be justified and prioritised. Add to that a bulky separate redundant power unit (which is bigger than the PoP itself!), and you have a less-than-ideal solution.

We're aiming to buy new PoPs in late 2016, while the backbone rollout is in progress. They'll have integrated redundant power supplies and an optional UPS, and the BGP options will be priced lower to reflect the lack of a switch. We're also exploring the possibility of offering two PoP switches that are split across sites but function as a single logical switch. This has the advantage of physical redundancy, albeit with the expense of additional hardware.

As a result, we'll have a number of options (1GE and 10GE) for institutions to choose from, depending on their bandwidth requirements (see table below). Institutions should expect to be asked to make a choice before the end of the calendar year. They'll need to think about what their requirements are likely to be for the next few years, so that we can avoid procuring switches that won't be used.

Likely PoP options

Model            Uplinks     10GE downlinks         1GE downlinks
1GE              2× 1GE      –                      Multiple (~20×) PoE+
10GE             2× 10GE     2× SFP+                Multiple (~20×) PoE+
Multiple 10GE    2× 10GE     Multiple (~8×) SFP+    –
BGP 1GE          2× 1GE      –                      –
BGP 10GE         2× 10GE     –                      –

Further reading

The following technical articles provide further details on the services mentioned above.
