Feature: Organisations including the Open Data Institute are now looking closely at privacy enhancing technologies
PETs have appeared on the public sector horizon, as a handful of key organisations are taking steps to encourage the assessment and deployment of privacy enhancing technologies.
There has been talk about them for some time as a future tool for providing secure access to sensitive data, but there is limited understanding of the detail and few examples of them being used in public services anywhere.
They have been widely seen as something for the future, but a series of recent initiatives should provide encouragement for pioneer organisations to begin using the technologies.
The Information Commissioner’s Office (ICO) has published draft guidance on PETs and encouraged their use; it has worked with the Centre for Data Ethics and Innovation (CDEI) on a cost-benefit analysis tool; NHS England has sounded out the market on a service for healthcare; and the CDEI has joined forces with US public authorities to launch competitions in the development of solutions.
The Open Data Institute (ODI) has also addressed the issue, with a programme to demystify PETs and promote their use as a major feature in data protection.
Tools and practices
It has defined them as tools and practices that enable greater access to data that may otherwise be kept closed for reasons of privacy, data protection, commercial sensitivity or national security.
The older types of PETs have emphasised the anonymisation of individuals through methods such as the pixellation of faces and disguising voices, but ODI senior researcher Calum Inverarity says a broader interpretation has emerged with an emphasis on using digital technology to minimise the sharing of data.
This includes the concept of federated learning – previously highlighted by the organisation – which uses data from different sources to train machine learning models without moving it from those sources, and has been used with sensitive patient data in healthcare.
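The federated learning idea can be sketched in a few lines: each data holder trains on its own records, and only model parameters, never the raw data, leave the site to be averaged. The snippet below is a deliberately minimal illustration with invented figures, not any production framework or the Moorfields system itself:

```python
# Minimal federated averaging sketch: each site fits a trivially
# simple "model" (the mean of its values) locally, and only that
# fitted parameter leaves the site.

def local_update(records):
    """Train locally; raw records stay put. Returns (parameter, size)."""
    return sum(records) / len(records), len(records)

def federated_average(sites):
    """Combine local parameters, weighted by site size.
    Only parameters cross organisational boundaries."""
    updates = [local_update(records) for records in sites]
    total = sum(n for _, n in updates)
    return sum(param * n for param, n in updates) / total

# Three hypothetical hospitals' measurements stay on-site;
# only their local averages are pooled.
sites = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
print(federated_average(sites))  # 3.5, the same as the global mean
```

The weighted average of the local parameters equals what training on the pooled data would give for this toy model; real federated learning applies the same averaging idea to neural network weights over repeated rounds.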
The ICO guidance outlines the other main types as:
- differential privacy - a property of a dataset or database which guarantees that an individual’s presence in it cannot be distinguished from the results of queries, typically achieved by adding statistical noise;
- synthetic data - which replicates patterns and statistical properties of real data through synthesis algorithms;
- homomorphic encryption - which makes it possible to perform computations on encrypted data;
- zero knowledge proofs - protocols through which one party can prove to a verifier that they are in possession of a secret, such as being able to prove their age without actually revealing it;
- trusted execution environments - a secure area inside a device’s central processing unit that runs code and accesses information in a way that is isolated from the rest of the system;
- secure multi-party computation (SMPC) - a protocol that allows different parties to jointly process their combined information without needing to share all of it with each other;
- and private set intersection - a type of SMPC that allows two parties with their own datasets to find elements they have in common without revealing or sharing them.
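Of the techniques above, differential privacy is perhaps the simplest to sketch: calibrated random noise is added to a query result so that no individual’s presence or absence can be inferred from it. The following is a toy illustration using only the standard library; a real deployment would use a vetted differential privacy library rather than hand-rolled noise:

```python
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise as the difference of two
    exponential draws with rate 1/scale."""
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def dp_count(records, predicate, epsilon=1.0):
    """Differentially private count. A count query has sensitivity 1
    (adding or removing one person changes it by at most 1), so
    Laplace noise with scale 1/epsilon gives epsilon-differential
    privacy."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1 / epsilon)

# Invented ages; the noisy count hides any one individual's presence.
ages = [34, 67, 45, 71, 29, 80]
print(dp_count(ages, lambda a: a >= 65, epsilon=0.5))
```

A smaller epsilon means more noise and stronger privacy; choosing it is a policy decision as much as a technical one, which is part of why the governance understanding Inverarity describes matters.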
Inverarity says these have significant potential in public services when dealing with highly sensitive data that people are worried about sharing. He has previously pointed to how federated learning has been used by Moorfields Eye Hospital to train machine learning algorithms that help doctors diagnose patients with age-related macular degeneration.
“It allowed them to use a much larger dataset of scans of eyes, which has helped with the training of the algorithm so it is more accurate and generalisable,” he says.
“And there is an example of where they used secure multi-party computation to identify instances of individuals committing tax fraud. Essentially it allowed for cross-referencing of databases with highly sensitive information on individuals.
“There is another one from the Netherlands where they used secure multi-party computation to identify instances of human trafficking using referencing between a couple of highly sensitive datasets.”
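The cross-referencing described in these examples rests on building blocks such as additive secret sharing, in which each party splits its confidential value into random shares so that no single share reveals anything on its own. The sketch below is a toy illustration of that building block with invented figures, not the actual tax fraud or trafficking systems:

```python
import random

PRIME = 2**61 - 1  # all arithmetic is modulo a large prime

def share(secret, n_parties):
    """Split a secret into n additive shares (mod PRIME).
    Any n-1 shares together look uniformly random and reveal
    nothing about the secret."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Only when all shares are combined does the value appear."""
    return sum(shares) % PRIME

# Two hypothetical agencies jointly compute a combined total
# without either revealing its own figure to the other.
a_shares = share(1200, 2)  # agency A's confidential count
b_shares = share(800, 2)   # agency B's confidential count
# Each party locally adds the shares it holds...
sum_shares = [(a + b) % PRIME for a, b in zip(a_shares, b_shares)]
# ...and only the combined total is ever reconstructed.
print(reconstruct(sum_shares))  # 2000
```

Because addition distributes over the shares, the parties can compute the joint result while each individual input stays secret, which is the property the fraud-detection examples exploit.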
Workarounds and misuse
He adds that, despite the emphasis on using the technologies for the public good, there is some potential for workarounds that could lead to misuse in the cross-referencing of datasets and the analysis of sensitive data, so checks and balances will be needed.
Similarly, they might be used in surveillance in the future, which would also require safeguards.
Overall, he sees a positive potential for PETs in public services but acknowledges there are barriers to overcome; and it is notable that the ICO’s case studies of existing use are focused purely on financial services.
The challenges for the public sector include the fact that using PETs requires technical expertise that is currently in short supply. Overcoming this will involve a learning curve and, at least over the medium term, contracting external support.
“Organisations that will use these technologies will have their technical experts and the people who are more responsible for data governance decision making,” Inverarity says. “There will have to be a degree of familiarity across the organisation and people involved in the process.
“Usually there will be particularly technical people on the implementation side of it, but it does require those who are thinking about the data governance processes, data stewardship, to still have a very good understanding of what the technologies can do and some of their limitations, but perhaps without having themselves to get into the real weeds of the technical aspects.”
Increasing expertise
He acknowledges the need for external support, but adds: “At the same time there are pockets in government and arm’s length bodies looking into the use of these. The Office for National Statistics has a team working on it.
“Some of the technologies like synthetic data and federated learning and analytics have been tried out so there is some internal expertise. It’s about growing it.”
Costs are also a barrier, with the financial squeeze on the sector and a perception for now that PETs are something of a luxury, especially when they are yet to be fully tried and tested for public services. Again, Inverarity says it is a significant issue but makes an optimistic case for the long term.
“There are examples of where efficiencies could be gained from using some of these technologies, and as the evidence base continues to grow we will see organisations have more confidence that it is worth the investment.”
It raises the possibility that, despite growing awareness of PETs in the sector, most ideas for deployment could be on the back burner for some time.
Raising the profile
But Inverarity responds with the point that there are efforts to raise the profile of these technologies and get people thinking about how they might actively use them. This is behind the ODI’s programme, which involves increasing awareness of different types of PETs, increasing understanding of the challenges and supporting their application to deal with economic and social challenges.
It is also looking at how to ensure policy makers, funders and other actors can overcome the barriers to adoption.
Inverarity says that one option for public authorities looking at the technology is to speak with the ODI, and that there are plenty of other sources. He also points to the ICO’s development of a sandbox for experimenting with PET solutions.
The structures for sharing knowledge are currently quite informal, but Inverarity says the volume of material and number of organisations involved are increasing, and that while it is work in progress “it is being fleshed out”.
“It is a burgeoning ecosystem in which there is a lot you can find out, increasingly for different audiences,” he says. “There are a lot of academic papers, and for more practical considerations there’s the work by the ODI, the CDEI. Also, a lot of these organisations are companies that can provide PETs and are very willing to chat and explain their products and answer questions.”