International Developments in Privacy Enhancing Technologies (PETs)

International

In March 2023 the Organisation for Economic Co-operation and Development (“OECD”) became the latest in a long line of organisations and public bodies to publish a report (Emerging privacy enhancing technologies: Maturity, opportunities and challenges (oecd-ilibrary.org)) focused on the benefits of PETs (the technology, rather than the furry kind).  The OECD report reviews recent technological advancements and evaluates the effectiveness of different types of PETs, as well as the challenges and opportunities they present. It also outlines current regulatory and policy approaches to PETs, with the intention of helping privacy enforcement authorities and policy makers better understand how PETs can be used to enhance privacy and data protection, and to improve overall data governance.

It is increasingly clear that regulators worldwide are looking to promote the advancement and adoption of this technology.  In this article we highlight some of the recent developments, explore some of the factors driving the push to develop and better leverage PETs, and outline the emerging UK regulatory approach.

UK developments

ICO draft guidance on PETs

The Information Commissioner’s Office (“ICO”) published draft guidance on PETs in September 2022, the finalised version of which is expected to be published in late spring 2023.  It focuses on the different ways businesses can use these technologies to engage in responsible and lawful data sharing for their business needs.  It also provides a useful overview for businesses of some of the more common types of PETs; to make the descriptions concrete, we have added short illustrative code sketches after the list:

  • Homomorphic encryption (“HE”)
    • HE allows computations to be performed directly on encrypted data, without the data ever being decrypted.
  • Secure multiparty computation (“SMPC”)
    • SMPC is a protocol which allows two or more parties to jointly compute over their combined data without any party’s input being revealed to the others.
  • Private set intersection (“PSI”)
    • PSI is a type of SMPC which allows two parties to find the intersection of their individual datasets without either dataset being revealed to the other party.
  • Federated learning (“FL”)
    • FL allows different parties to train an AI model on their own local data and combine the resulting models into a single, more accurate model, without the individual parties’ training data being shared with the other parties.
  • Trusted execution environments (“TEE”)
    • A TEE is made up of software and hardware and is an isolated space within a central processing unit where code can be run independently of the rest of the system.
  • Zero-knowledge proofs (“ZKP”)
    • A ZKP is a protocol by which one party (the prover) can demonstrate to another (the verifier) that it possesses certain secret information, without the verifier ever gaining access to that information. Examples include confirming someone’s age without revealing their date of birth, or proving ownership of an asset without disclosing the underlying transaction data.
  • Differential privacy (“DP”)
    • DP is a formal, mathematical way of limiting (and measuring) how much the output of a computation can reveal about any single individual in the input data; in practice it is usually achieved by adding carefully calibrated statistical noise to results.
  • Synthetic data (“SD”)
    • SD is artificial data generated by algorithms to look like and behave like real data, such that analysis of the synthetic data will produce results similar to analysis of the real data.
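
The sketches below illustrate several of the listed PETs. All are toy examples: the libraries, parameters and datasets are our own illustrative choices, not anything specified in the ICO guidance. First, homomorphic encryption, here using the open-source python-paillier library (an additively homomorphic scheme), which lets an untrusted party aggregate values it can never read:

```python
# Additive homomorphic encryption sketch using the open-source
# python-paillier library (pip install phe) -- an illustrative choice,
# not one endorsed by the ICO guidance.
from phe import paillier

# The data owner generates a keypair and encrypts its values.
public_key, private_key = paillier.generate_paillier_keypair()
salaries = [42_000, 55_000, 61_000]
encrypted = [public_key.encrypt(s) for s in salaries]

# An untrusted party can sum the ciphertexts without decrypting anything.
encrypted_total = sum(encrypted[1:], encrypted[0])

# Only the key holder can recover the plaintext result.
print(private_key.decrypt(encrypted_total))  # 158000
```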
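
Next, a minimal secure multiparty computation sketch: a two-party secure sum built from additive secret sharing, where each share on its own is just a random number.

```python
# Toy two-party secure sum via additive secret sharing (pure Python).
import secrets

Q = 2**61 - 1  # all arithmetic is done modulo a large prime

def share(value: int) -> tuple[int, int]:
    """Split `value` into two additive shares; each share alone is random."""
    r = secrets.randbelow(Q)
    return r, (value - r) % Q

# Party A holds 300 and party B holds 500; neither reveals its input.
a1, a2 = share(300)   # A keeps a1 and sends a2 to B
b1, b2 = share(500)   # B keeps b2 and sends b1 to A

# Each side sums the shares it holds; combining the partial sums
# reveals the total -- and only the total.
partial_a = (a1 + b1) % Q
partial_b = (a2 + b2) % Q
print((partial_a + partial_b) % Q)  # 800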
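
A private set intersection can be built on a commutative cipher. In this Diffie-Hellman-style toy (our own simplified construction, with toy parameters), each party blinds both sets with a secret exponent; matching items collide, so the parties learn the size of the overlap and nothing else:

```python
# Toy Diffie-Hellman-style PSI (illustrative only, not production crypto).
import hashlib
import secrets

P = 2**127 - 1  # a large prime; a toy parameter choice

def h(item: str) -> int:
    """Hash an item into the multiplicative group mod P."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % P

def blind(items, key):
    return {pow(h(x), key, P) for x in items}

ka = secrets.randbelow(P - 2) + 2  # party A's secret exponent
kb = secrets.randbelow(P - 2) + 2  # party B's secret exponent

set_a = {"alice@example.com", "bob@example.com"}
set_b = {"bob@example.com", "carol@example.com"}

# The parties exchange blinded sets and blind them again with their own
# key; because exponentiation commutes, shared items end up identical.
a_twice = {pow(v, kb, P) for v in blind(set_a, ka)}
b_twice = {pow(v, ka, P) for v in blind(set_b, kb)}
print(len(a_twice & b_twice))  # 1 -- one contact in common
# This unordered variant reveals only the intersection's size; full PSI
# protocols additionally let a party map matches back to its own items.
```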
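
Federated learning can be sketched with nothing more than NumPy: each party takes a gradient step on its private data, and only the model weights travel to the coordinating server for averaging (the classic FedAvg pattern; the dataset and hyperparameters here are invented for illustration):

```python
# Toy federated averaging (FedAvg): raw data never leaves each party;
# only model weights are shared and averaged.
import numpy as np

rng = np.random.default_rng(0)

def local_step(weights, X, y, lr=0.1):
    """One gradient step of linear regression on one party's private data."""
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

# Two parties hold private samples drawn from the same underlying model.
true_w = np.array([2.0, -1.0])
datasets = []
for _ in range(2):
    X = rng.normal(size=(100, 2))
    datasets.append((X, X @ true_w + rng.normal(scale=0.1, size=100)))

weights = np.zeros(2)
for _ in range(50):  # federated training rounds
    local_models = [local_step(weights, X, y) for X, y in datasets]
    weights = np.mean(local_models, axis=0)  # the server only averages
print(weights.round(2))  # close to [ 2. -1.] without pooling any raw data
```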
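
A zero-knowledge proof can be illustrated with the classic Schnorr identification protocol, in which the prover convinces the verifier that it knows the secret exponent behind a public value without ever transmitting that secret (toy parameters, one interactive round):

```python
# Toy Schnorr identification protocol: prove knowledge of x in y = g^x
# mod p without revealing x (illustrative parameters, single round).
import secrets

p = 2**127 - 1  # toy prime modulus
g = 3           # toy generator

x = secrets.randbelow(p - 1)  # the prover's secret
y = pow(g, x, p)              # the corresponding public value

r = secrets.randbelow(p - 1)
t = pow(g, r, p)              # 1. prover commits to a random nonce
c = secrets.randbelow(p - 1)  # 2. verifier issues a random challenge
s = (r + c * x) % (p - 1)     # 3. prover responds; s reveals nothing about x

# 4. The verifier checks g^s == t * y^c without ever learning x.
assert pow(g, s, p) == (t * pow(y, c, p)) % p
```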
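
Differential privacy is most easily seen through the Laplace mechanism: noise scaled to the query’s sensitivity is added to a statistic, so the presence or absence of any one individual barely moves the published answer (our own toy example):

```python
# Toy differentially private count using the Laplace mechanism.
import numpy as np

def dp_count(values, threshold, epsilon=0.5):
    """Noisy count of values above `threshold` (a query with sensitivity 1)."""
    true_count = int(np.sum(np.asarray(values) > threshold))
    # Laplace noise with scale sensitivity/epsilon masks any individual.
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

salaries = [42_000, 55_000, 61_000, 38_000, 70_000]
print(dp_count(salaries, 50_000))  # about 3, give or take noise of scale 2
```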
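
Finally, the essence of synthetic data generation: fit a statistical model to the real records and publish fresh samples from the model instead of the records themselves. Real-world generators are far more sophisticated; this toy fits a single multivariate Gaussian to invented data:

```python
# Toy synthetic data: fit a multivariate Gaussian to real records and
# sample artificial records with similar statistical structure.
import numpy as np

rng = np.random.default_rng(1)
# Invented "real" data: 500 (age, salary) records with correlation ~0.6.
real = rng.multivariate_normal([40, 60_000], [[25, 60_000], [60_000, 4e8]], size=500)

mean, cov = real.mean(axis=0), np.cov(real, rowvar=False)
synthetic = rng.multivariate_normal(mean, cov, size=500)

# Aggregate structure (the age/salary correlation) is preserved even
# though no synthetic row is a real record.
print(np.corrcoef(real.T)[0, 1].round(2))
print(np.corrcoef(synthetic.T)[0, 1].round(2))
```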

The ICO guidance takes a fairly practical and pragmatic approach, and recognises that perfection is not necessarily achievable or required for an appropriate standard of anonymity to be achieved.  Instead the guidance focuses on whether, having deployed the PET(s), the likelihood of someone being identified or re-identified from the resulting data is reduced to a sufficiently remote level.  To demonstrate this, an organisation will need to show that the likelihood of each of the following is sufficiently remote:

  • singling out (i.e. individuals cannot be differentiated or singled out from the data set);
  • linkability (i.e. multiple records about the same individual, from the same or across different data sets, cannot be combined in order to link them together in relation to a particular individual);
  • inferences (i.e. information from various sources cannot be used to deduce something about an individual from the data set),

having regard to the means reasonably likely to be used (including by a motivated intruder).  The anonymisation processes used need to address identifiability risk in the relevant context – including by taking due account of the nature, scope, context and purposes of the proposed processing, as well as the risks it poses.
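
To make the “singling out” limb concrete, a first rough check an organisation might run is a k-anonymity-style uniqueness count over quasi-identifiers. This is our own illustration of the concept, not a test prescribed by the ICO:

```python
# Illustrative "singling out" check: count quasi-identifier combinations
# that match exactly one record (a k-anonymity-style measure).
from collections import Counter

# Invented records: (birth year, postcode district, sex).
records = [
    ("1958", "SW1", "F"), ("1958", "SW1", "F"),
    ("1971", "EC2", "M"), ("1985", "N16", "F"),
]

groups = Counter(records)  # group records by quasi-identifier combination
unique = [qi for qi, n in groups.items() if n == 1]
# Any combination appearing exactly once can single one individual out.
print(f"{len(unique)} of {len(groups)} combinations single someone out")
```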

From a practical perspective the guidance (consistent with Recital 26 UK GDPR) indicates that relevant objective criteria to take into account when considering the extent to which identification is technically and legally possible include:

  • how costly identification is in human and economic terms;
  • the time required for identification;
  • the state of technological development at the time of the processing (i.e. at the point the anonymisation techniques are applied, and/or when the data set is shared with another person); and
  • future technological developments (i.e. how technology may change over time).

Identifiability exists on a spectrum: the status of information can change depending on the circumstances of its processing. The extent of disclosure and the range of potential recipients are also relevant factors.

If (and for so long as) the identifiability risk is sufficiently remote in the hands of a recipient organisation, then from that organisation’s perspective the data can be considered effectively anonymised.

Employing PETs is a way for an organisation to demonstrate that technical and organisational measures are in place which enable it to process (and potentially transfer) personal data in full compliance with its data protection obligations. PETs can demonstrate ‘data protection by design and by default’ by providing an adequate level of security; by minimising the risk of personal data breaches through scrambling, removing or replacing the personal elements of the data so that only those authorised to access them can do so; and by ensuring that only the data required for the specific purpose is processed.

For the ICO, PETs afford an opportunity to reduce the risk of individuals’ personal data being breached while simultaneously enabling meaningful, representative data analysis without a processor needing access to the original personal data.  Notwithstanding this, the circumstances in which anonymous data is processed require ongoing oversight and management.  The use of PETs, and of anonymous data derived through their use, needs to form part of an organisation’s wider data governance framework, and risk assessments and decision-making processes will need to be reviewed at appropriate intervals.

Emerging Technology Research Hub

In March 2023 the Financial Conduct Authority (FCA) published information about the Emerging Technology (EmTech) Research Hub, for which synthetic data and PETs are an area of focus. The FCA will shortly publish the findings from the March 2023 industry-academia roundtable which it co-hosted with the Alan Turing Institute.

The EmTech Hub works closely with the Competition and Markets Authority (CMA), the Office of Communications (Ofcom) and the Information Commissioner’s Office (ICO) through the Digital Regulation Cooperation Forum (DRCF) and its horizon scanning programme.

In connection with this, the FCA has also stated that it will launch a synthetic data expert working group in May 2023 to bring together perspectives and case studies to enable the responsible evolution of synthetic data. The group will provide a framework for collaboration across industry, regulators, academia and the wider civil service.

Royal Society publishes report on PETs in data governance

At the end of January 2023 the Royal Society published a report on PETs, developed in collaboration with the Alan Turing Institute and informed by consultations with a wide variety of privacy and data stakeholders. The key finding of the report is that trust in PETs remains low, which, as with any new technology, means the potential they offer for collaboration and analysis is not being explored by the potential users who stand to benefit most, including public bodies.

One particularly salient public body use case highlighted by the report is employing federated learning on biometric data for health research and diagnostics, which would be particularly powerful in the UK given its centralised healthcare system. However, without quality mechanisms or external standards against which to measure PETs (which would allow organisations and individuals to weigh the data security risks against the potential benefits of collaborative data analysis or communal data sharing), users are understandably reluctant to take on data protection risk. Indeed, the financial and reputational consequences for the NHS of misusing or mis-sharing sensitive healthcare data would be enormous.

The report suggests that technical standards would give users comfort as to the safety and security of data, while a set of process standards would improve users’ understanding of how best to use PETs to their benefit.

US developments

OSTP unveils strategy to advance data privacy technology

In March 2023 the White House launched a roadmap for both public and private sector entities to navigate the use of privacy enhancing technologies. The National Strategy to Advance Privacy-Preserving Data Sharing and Analytics (PPDSA) was released by the Office of Science and Technology Policy (OSTP) to support PPDSA technologies and methods to “maximize their benefits in an equitable manner, promote trust, and mitigate risks.”

Canadian developments

OPC publishes blog on synthetic data

The Office of the Privacy Commissioner of Canada (“OPC”) published a blog in October 2022 specifically addressing the pros and cons of synthetic data. This was published against the backdrop of Canadian legislation first tabled in June 2022 (with a second reading in progress) to update Canada’s federal private sector privacy law (“Bill C-27”). Bill C-27 would add legal definitions for anonymised and de-identified data not currently present in the existing law. Under Bill C-27, data that is modified to such an extent and in such a manner that an individual can no longer be identified from it would be considered “anonymised data”, while de-identified data would still be considered personal information.

The OPC blog considers the recent enthusiasm for synthetic data to be generally well-justified. For the OPC, the benefits of synthetic data include: (i) protection against traditional re-identification attacks; (ii) the possibility of capturing the statistical properties of high-dimensional data sets (those where the number of variables observed exceeds the number of observations, making conventional calculations extremely difficult); and (iii) the possibility of automating the de-identification process.

The blog also raises several concerns about synthetic data, including: (i) the possibility of re-identification if records from the source data still appear in the synthetic data; (ii) the risk of membership inference attacks, where an attacker tries to work out whether an individual’s data was present in the source data, thereby potentially undermining the privacy of the data subjects on whom the source data was based; and (iii) an inability to protect against the disclosure of confidential attributes, which can be inferred without the individual or their record ever being identified. The blog also highlights issues surrounding bias where data sets are used for training, validating and testing AI and machine learning systems: synthetic data may not address such problems and may simply reproduce the biases of the source data.

International developments

UN publishes guide on PETs for official statistics

In February 2023, the United Nations published a guide (United Nations Guide on Privacy-Enhancing Technologies for Official Statistics) to help national statistical offices employ PETs when dealing with sensitive data. The UN’s PETs Task Team founded the UN PET Lab to experiment with pilot projects and real-world use cases. The report emphasises the importance of public events and training on PETs, as well as the provision of support in their practical employment. It highlights a number of case studies, including the Boston Women’s Workforce Council’s use of SMPC to measure gender and racial wage gaps, and a Dutch project using SMPC and HE to evaluate an eHealth solution without sharing patient information.

The report gives practical direction for national statistical offices, recommending standards and codes of practice from bodies including the Institute of Electrical and Electronics Engineers (IEEE) and the International Organization for Standardization (ISO) relating specifically to encryption and security techniques, but also highlighting standards for AI, data quality, governance and cloud computing which bear upon the environments in which PETs are deployed.

Future of Privacy Forum at the 2022 Global Privacy Assembly (“GPA”) in Turkey

The Future of Privacy Forum (“FPF”) hosted sessions on PETs with regulators and privacy leaders at the GPA at the end of October 2022. The key takeaways from regulators were the opportunities PETs present to businesses for innovation, and the challenges inherent in balancing the promotion of PETs against their oversight and regulation to protect individuals’ privacy. It was noted that the role of regulators is to assist businesses in the responsible development and deployment of PETs. From the perspective of practitioners, a need was felt for greater clarity from regulators. Representatives from eBay and Logitech highlighted their use of PETs and their work with PETs developers to comply with their data protection obligations. Overall the mood surrounding PETs at the GPA was one of cautious excitement at the opportunities they present.

EU-U.S. Trade and Technology Council Joint Roadmap and synthetic data project

On 5 December 2022, the EU-U.S. Trade and Technology Council (“TTC”) published the Joint Roadmap on Evaluation and Measurement Tools for Trustworthy Artificial Intelligence and Risk Management, together with an accompanying joint statement.

A founding principle of the TTC is that democratic ideals should shape a coordinated international approach to digital transformation in the interests of the global economy. The aim of the Joint Roadmap is to shape collaboration between the US and EU to develop tools, methodologies and approaches to AI risk management and trustworthy AI.

At the same meeting, the TTC outlined a pilot project which will assess the use of PETs and synthetic data in healthcare and medicine, but no further information has yet been given on this project.

Significant collaborative global developments in relation to synthetic data and standards for artificial intelligence are rare, so the Roadmap and the TTC’s pilot project mark an interesting move towards global collaboration in the PETs space.

Winners announced in first phase of UK-U.S. privacy-enhancing technologies prize challenges

In July 2022 the UK and US governments launched a set of prize challenges, with a prize pool of £1.3 million, for innovators to develop privacy-preserving federated learning solutions. There were two tracks to the challenge, both involving the use of synthetic data. The first set of innovators worked with synthetic global transaction data to develop solutions to combat international money laundering; the second used synthetic health data to develop pandemic readiness solutions. Prizes were awarded to 12 teams from the UK and US in November 2022.

Comment

These recent developments are further evidence of the increasing momentum behind the development and use of PETs, and of the corresponding interest of governments, public bodies, regulators and businesses, from both a data protection and privacy perspective and an innovation perspective.

The draft ICO guidance, the progression of Bill C-27 in Canada, the proposed EU AI Act and steps towards an AI Bill of Rights in the US indicate that this interest is translating into legislative and regulatory change.  This will inevitably shape the development of PETs and their use as a tool to unlock the potential of data and to put data protection by design and by default into practice.  Further development of, and awareness around, technical standards and quality assessment measures has the potential to promote increased use of PETs by public and private sector bodies, and more use cases (and transparency around them) will further develop our collective understanding of, and approach to, the risks and benefits of these technologies.  Nonetheless, care will be needed to ensure that the developing regulatory frameworks and any new standards are proportionate and do not have the effect of stifling innovation or competition.