Covid-19 made public authorities look beyond traditional disaster recovery plans, writes Spencer Sutton, cloud solution architect at Rackspace Technology
Among the consequences of the Covid-19 pandemic has been a sharp shift in perceptions around disaster recovery and business continuity in the public sector, with the focus moving from the former towards an emphasis on the latter.
It comes down to the fact that organisations, almost as a whole, had not laid plans for dealing with a crisis in which most staff could not go to any office.
Many have done a terrific job in the way they have responded, harnessing cloud services, collaboration tools and remote connectivity to enable staff to work from home and maintain operations when their premises were off limits. But they have also acknowledged the need for new thinking and different plans going into the long term.
I recently took part in the UKA Live discussion on the issue with Carol Williams, head of IT at Walsall Council, Owen Powell, chief information officer at Central North West London NHS Trust, Colonel Chopsey Connell, assistant head of Army Digital Services, and UKAuthority publisher Helen Olsen Bedford.
It conveyed a shared experience that during any time of crisis or adversity we accelerate the changes in the way we do things; and for business continuity this means a complete rethink of the way we work.
An audience poll during the discussion indicated that the focus is no longer solely on disaster recovery: 24% of the respondents said there is now a need to build in resilience, and 76% that there should be a shift to ‘continuity as default’.
Cost versus risk
A number of years ago it would have been far too expensive for almost any organisation to have the kind of infrastructure needed to support an ‘active from anywhere’ approach. When this fed into the cost versus risk equation it meant the former tended to have the upper hand in making plans.
But now the technology is evolving quickly and there are a lot of services you can use to take gradual steps and develop more cost-effective business continuity plans. The elements are bound to vary between organisations, but there is consensus that cloud services are now a crucial feature. The pandemic has shown it is possible to scale up the use of services such as Teams and Zoom quickly, to take the load off in-house legacy applications, and to handle the major surge in network traffic as everyone shifts to home working.
Another example is VMware running on AWS or Azure. It can be used for a gradual transition into using cloud services, giving an organisation the chance to build the required skills at a manageable pace, and providing a highly resilient, ‘active-active’ environment.
The first step is to identify how you can create a single infrastructure that is centralised and can be accessed from outside the MPLS network, removing all of the potential single points of failure. Then there is a series of incremental steps for moving forward.
Much of this is about selecting the right cloud infrastructure for specific operations, which for some organisations could mean wholesale use of public cloud, for others a hybrid of public and private. In an audience poll during the discussion 72% of respondents identified the use of cloud as a major contributor to strengthening resilience – more than any of the other technologies mentioned.
Application resilience
Among the factors to take into account is that the applications used, especially those that are mission-critical, have to be highly resilient. As things stand not all are resilient by design, but this is forcing the vendors either to repurpose them as software-as-a-service, or to design resilience into the applications without additional licensing costs. Organisations should regard this as the new norm and make it a requirement in signing up to use any new applications.
They also need robust testing arrangements for business continuity plans. Most have a small testing team, but scale is always an issue, and synthetic testing does not provide the real experience. They have to look at their capacity in the availability of hardware and remote connections and make sure it is possible to scale up a response to an episode to ensure continuity.
The pandemic has fuelled the case for scoping for peak demand, and it is notable that the hyperscale cloud providers managed the surge in demand from the pandemic with minimal impact. It has shown the industry is capable of responding to a global emergency.
Organisations can build on this by testing their BC plans to scale, on the assumption that everybody is working and using the range of applications from home. This can help to identify any potential problems in advance, in factors such as employees’ home connectivity, or the scope for widespread remote access to a virtual private network, or the capacity of individual applications to be heavily used through internet connections.
They will also need to think about their success criteria and how they measure their performance, if possible developing metrics that can be used in the testing.
Robotic potential
Over time, robotic process automation is likely to provide another significant element. It could ensure that many routine processes continue regardless of employees not being able to access systems and sharply reduce the risk of disruption for the relevant operations.
There is also scope for a new strategy of site reliability engineering, in which organisations can introduce automated testing processes, fail fast, understand the failure tolerances of an application and use those as measurements of uptime. It involves a lot of work, but it could be supported by automation tools and foster a new mindset in organisations.
In dealing with all this it helps to have an expert partner such as Rackspace to manage the technology and provide the support in utilising cloud services, handling the ‘heavy lifting’ around IT and dealing with features such as IT modernisation and cloud optimisation. This supports an IT department through episodes such as the pandemic and frees it up to be more focused on business process and change.
While the pandemic looks set to be with us for a while, public authorities are seeing that disaster recovery will no longer be enough, and that they need to focus on business continuity and align the two more closely.
The public sector is now moving to a point where it should expect business continuity to be embedded into its ‘business as usual’.
You can view the full UKA Live episode here
UKAuthority and Rackspace Technology have collaborated on a series of reports and papers in 2020 investigating key issues for public sector technology, digital and data. You can read more about how Covid has necessitated a switch in focus from recovery to continuity in the briefing note, 'Covid, a new dimension for business continuity'.
Image from iStock, Natali Mis