Managing the public sector’s data explosion

09/09/24

Cribl Industry Voice

There are ways of managing the ever-growing volume of data whilst maintaining flexibility and cost control, writes Berwyn Jones, head of UK public sector at Cribl.

It is no secret that there has been exponential growth in the volume of data used in the delivery of public services, and there is every indication that it is going to continue over the next few years.

Today's public sector organisations face the challenge of managing ever-growing volumes of IT and security data which, according to IDC, is growing at an annual rate of 28%.

Security data in particular has assumed a new importance with the Government’s plan for a new Cyber Security and Resilience Bill, which aims to expand the remit of regulation to cover more digital public services and their supply chains – inevitably demanding oversight of even more data.

All this data has to be stored and managed, which inevitably incurs cost. Logic dictates that the more data we use and store, the more it will cost. But as the financial outlook for the public sector indicates that it will not receive additional funding to respond to these demands, the sector will have to develop new approaches to managing its data within existing budgets.
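To put that growth rate in context, here is a minimal back-of-the-envelope calculation; the 1 TB per day starting volume is a purely illustrative assumption, not a figure from any organisation:

```python
import math

annual_growth = 0.28  # IDC's reported annual growth rate for IT and security data

# Years for volume to double at compound growth: ln(2) / ln(1 + r)
doubling_years = math.log(2) / math.log(1 + annual_growth)
print(f"Volume doubles roughly every {doubling_years:.1f} years")  # ~2.8 years

# Purely illustrative five-year projection for a body ingesting 1 TB/day today
daily_tb = 1.0
for year in range(1, 6):
    daily_tb *= 1 + annual_growth
    print(f"Year {year}: {daily_tb:.2f} TB/day")  # ends at ~3.4 TB/day
```

At that rate, volumes roughly double every two years and nine months and more than triple within five – with storage costs compounding in step.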

This is no simple task. Apart from the need to analyse and use data from multiple sources, the scenario is further complicated by the widespread use of hybrid cloud environments, with organisations using different combinations of public and private clouds and on-premises facilities for storing and processing data.

Infinite data, not infinite budget

Getting control of rising data storage costs is fast becoming a major issue for digital teams in the sector, and it provided the focus of a recent UKA Live discussion in which I took part with Richard Woolham, lead product owner for performance monitoring and data analytics at DWP Digital, Stuart Bowell, global head of observability at NETbuilder, and UKAuthority publisher Helen Olsen Bedford.

The discussion identified some of the major challenges in the explosion of public services data and measures that could be taken to make it manageable within the expected constraints on budgets.

A prime challenge is the entirely human response of data hoarding: organisations hold onto massive volumes on the basis that ‘it might be needed in the future’, even though most of it is unlikely to be of any real use. This comes at a significant cost and makes it harder to ensure the right data is available and re-usable at the right time.

Organisations are having to make compromises that some fear could create operational blind spots and undermine their security postures and the reliability of systems. Making the right choices involves understanding not just existing but potential future uses of the data, which is very difficult with all the unknowns around cyber threats and the evolving possibilities with AI.

Avoid lock-in

There are real, and not unfounded, worries about data being locked into specific platforms or solutions, with the risk that effective control lies in the hands of the suppliers. And once inside a proprietary platform it can be difficult to extract core data for reuse in others.

In addition, many such platforms are licensed on data ingress, meaning that users pay to take in data that is easily discarded in subsequent workflows – yet the storage costs remain.

Flexible plumbing

The sheer number of sources from which data must be collected, often in different formats and for different purposes, is adding to the complexity. Meanwhile, the pressure is intensified by the growing need for real-time data to support operations and cyber security.

Trusted 'plumbing' to direct and sort this flow of data is fast becoming essential if the sector is to own and control its data, have the freedom to decide where it is stored, and put only essential volumes into costly processing platforms as and when needed.
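As a sketch of what that plumbing does (generic Python rather than any particular product's configuration, with hypothetical destination names and severity rule), the idea is to inspect each event once, keep a full copy in cheap storage and send only the high-value subset to the costly platform:

```python
# Minimal sketch of a data routing layer: keep a full, cheap copy of
# everything and forward only essential events to the costly platform.
# Destination names and the severity rule are illustrative assumptions.

COSTLY_PLATFORM = "siem"          # hypothetical real-time analytics destination
CHEAP_ARCHIVE = "object-storage"  # hypothetical low-cost archive destination

def route(event: dict) -> list[str]:
    destinations = [CHEAP_ARCHIVE]  # everything lands in low-cost storage
    if event.get("severity", "info") in {"warning", "error", "critical"}:
        destinations.append(COSTLY_PLATFORM)  # only essential volume goes here
    return destinations

events = [
    {"source": "firewall", "severity": "critical", "msg": "blocked intrusion attempt"},
    {"source": "webserver", "severity": "info", "msg": "health check ok"},
]
for event in events:
    print(event["source"], "->", route(event))
```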

Democratising data

There are techniques for achieving this. One is to get a clear view of the prime uses for the data to determine where, how and at what cost it is stored. For observability and security purposes, for example, data is likely to be needed and processed in real time; for governance it follows audit cycles, so it can sit in low cost storage because there is no urgency in retrieving it.
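That view can be made concrete as a simple policy mapping each prime use to a storage tier; the categories and retention periods in this sketch are illustrative assumptions, not recommendations:

```python
# Illustrative storage policy keyed on the prime use of the data.
RETENTION_POLICY = {
    "security":      {"tier": "hot",     "processing": "real time",   "retention_days": 90},
    "observability": {"tier": "hot",     "processing": "real time",   "retention_days": 30},
    "governance":    {"tier": "archive", "processing": "audit cycle", "retention_days": 7 * 365},
}

def storage_tier(use_case: str) -> str:
    # Default unknown uses to the cheap archive tier, not costly hot storage.
    return RETENTION_POLICY.get(use_case, {"tier": "archive"})["tier"]

print(storage_tier("security"))    # hot
print(storage_tier("governance"))  # archive
```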

Another is to ensure that any agreement with suppliers recognises that the data belongs to the public sector body and that it has the right to recover, use and move it at any time. This means ensuring there are no technical barriers, that data will be stored in accessible formats over the long term, and that systems have the necessary degree of interoperability.

This should be a crucial element of future procurements, although there is a challenge in breaking free of legacy or proprietary terms and conditions when it can otherwise seem easier to stick with the incumbent supplier.

Going as far as possible with the standardisation of data can also provide benefits. There is a shortage of the data science skills needed to build data pipelines, but standardising the structure and format of data, whilst surfacing key information, makes it easier to train people to collect and use data without expensive external support.
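As a small illustration of that standardisation (the two source formats and field names below are hypothetical), heterogeneous records can be mapped onto one common schema so that downstream tooling, and the staff using it, only ever deal with one shape of data:

```python
# Normalise heterogeneous log records into one common schema.
# The source field names ("ts", "@timestamp", "host", "origin", etc.)
# are hypothetical; real mappings would be driven by configuration.

def normalise(record: dict) -> dict:
    return {
        "timestamp": record.get("ts") or record.get("@timestamp"),
        "source":    record.get("host") or record.get("origin", "unknown"),
        "severity":  (record.get("level") or record.get("sev", "info")).lower(),
        "message":   record.get("msg") or record.get("message", ""),
    }

print(normalise({"ts": "2024-09-09T10:00:00Z", "host": "gw1", "level": "WARN", "msg": "disk 90% full"}))
print(normalise({"@timestamp": "2024-09-09T10:01:00Z", "origin": "app2", "sev": "error", "message": "timeout"}))
```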

If these principles are fed into a data strategy aligned with the organisation’s broader long term goals, they can do a lot to future-proof the data, ensuring that it retains its value and that costs are kept in check.

Empowering public sector data management

There is a new generation of tools designed to support the complex data management needs of the public sector. For example, Cribl Stream can seamlessly integrate data from multiple proprietary sources, transforming and routing it to various destinations as required. This is made possible through its extensive range of pre-built integrations and a unique capability to standardise multiple data formats between source and destination.

In addition to Cribl Stream, the Cribl suite of products includes other powerful tools that further enhance data management capabilities:

  • Cribl Edge facilitates data collection from edge devices, ensuring that data from various sources is captured accurately and in real time.
  • Cribl Search helps users to perform efficient and powerful searches across all data, enabling quick insights and decision making.
  • Cribl Lake provides scalable and cost-effective storage for data, enabling organisations to store that data in open formats for future use and compliance.

This new generation of solutions can help democratise data by enabling natural language querying and generation of pipelines without input from highly skilled data scientists, a resource often in short supply in the public sector. Cribl Copilot, for example, provides such an interface, effectively hiding the complexity and ‘messiness’ of the data streams from users where appropriate, whilst maintaining integrity and control of data flows.

Ultimately, the latest tools can be harnessed to regain control of your data, giving you choice in where you store and use it and the flexibility to react in real time.

In a world of ever-expanding data the public sector is tasked with mining maximum value from its data, controlling costs and forging solid foundations for future AI. Having a granular level of control over that data will be crucial.

Would you like to understand more? Cribl has recently published a report titled Navigating the Data Current.

Catch up with the full discussion: UKA Live – Set your data free.

Image source: istock.com/liulolo
