All publications should have at least one open access link. Get in touch if you have any problems accessing any of the materials.

2024

We look at case studies of Hugging Face, GitHub and Civitai's moderation practices in relation to uploaded AI models and show significant tensions in successfully moderating that space, and suggest ways forward that may be more sustainable.

  • Micklitz H.-W., Helberger N., Kas B., Namysłowska M., Naudts L., Rott P., Sax M., Veale M. Towards Digital Fairness (2024) 13 Journal of European Consumer and Market Law 24

2023

In this chapter, I consider the rise of confidentiality-preserving techniques and infrastructures in online advertising, and how they might reconfigure power and maintain some of the hazards of the contemporary tracking ecosystem.

Many jurisdictions have passed very broadly drafted laws to tackle academic integrity issues, criminalising the provision or advertising of contract cheating or essay mills, such as the Skills and Post-16 Education Act 2022 in England and Wales. Recently, AI models such as ChatGPT have amplified academic concerns. Here, we look at the intersection between these phenomena. We review academic cheating laws, showing that several may apply even to general purpose AI services like ChatGPT, without knowledge and intent. We identify a range of illegal adverts for AI-enhanced essay mills, and illustrate how difficult it is to draw the line between writing an essay and supporting it, such as by generating bona fide references. We also outline the consequences for intermediaries hosting these ads or providing these services, which may be significantly affected by these primarily symbolic laws. We conclude with a series of recommendations for policymakers, legislators, and education providers.

  • Veale M. (2023) Verification Theatre at Borders and in Pockets. In Colleen M. Flood, Y.Y. Brandon Chen, Raywat Deonandan, Sam Halabi, and Sophie Thériault (eds.) Pandemics, Public Health, and the Regulation of Borders: Lessons from COVID-19 (Routledge, forthcoming)

The COVID-19 pandemic saw the creation of a wide array of digital infrastructures, underpinning both digital and paper systems, for proving attributes such as vaccination, test results or recovery. These systems were hotly debated. Yet this debate often failed to connect their social, technical and legal aspects, focussing on one area to the exclusion of the others. In this paper, I seek to bring them together. I argue that fraud-free “vaccination certificate” systems were a technical and social pipe-dream, but one that was primarily advantageous to organisations wishing to establish and own infrastructure for future ambitions as verification platforms. Furthermore, attempts to include features to ostensibly reduce fraud had, and risk having further, broader knock-on effects on local digital infrastructures around the world, particularly in countries with low IT capacities easily captured by large firms and de facto excluded from and by global standardisation processes. The paper further reflects on the role of privacy in these debates, and how privacy, and more specifically confidentiality, was misconstrued as a main design aim of these systems, when the main social problems could manifest even in a system built with state-of-the-art privacy-enhancing technologies. The COVID-19 pandemic should sharpen our senses towards the importance of infrastructures, and more broadly, how to use technologies in societies in crises.

Data has been seen as central to understanding privacy, informational power, and increasingly, digital-era competition law. Data is not unimportant, but it is misunderstood. I highlight several assumptions in need of challenge. Firstly, that data protection is distinct from privacy, and has a broader role correcting digitally-exacerbated power asymmetries. Secondly, contrary to economic received wisdom, data is not fully non-rivalrous due to the infrastructural implications of its integration. Thirdly, data can be less important than capacity for experimentation and intervention, which is not simple to ‘open up’. Lastly, data is increasingly unimportant due to large firms’ investments in confidential computing technologies, facilitating distributed analysis, learning, and even microtargeting. In the right conditions, data can be economically substituted for the ability to orchestrate a protocol — an infrastructural capacity unrecognised sufficiently in competition or other fields. This substitutability also requires the ability to force users to adhere to a protocol, bringing further privacy concerns. In sum, privacy, data protection and power need to be considered more closely entwined than at present, and all fields need to consider the infrastructural dimensions of large platforms, more than focussing on the data they accumulate.

Profiling of individuals has long been a concern to scholars and civil society, and a lucrative way for platforms to shape markets and extract value. However, people and their environments are not just computed, but they are increasingly also expected to become agents in large scale computations. The strengthening of privacy and data protection law has been used as a reason to move more and more advanced computation concerning individuals, groups and environments onto people’s devices, in a shift called ‘local processing’. This sees individuals’ devices and software work together to undertake collective computations, which often claim to be confidential with regard to the data of each person involved. For example, using technologies such as secure multi-party computation, phones may work together to create models or analysis of spoken language, without revealing anything that any user said to any other person. Such privacy-enhancing technologies equate privacy with confidentiality, and have interesting potential, but seeing individuals as participants in a computation raises new challenges. What autonomy do people have to shape such participation, given the limited technical and practical control over the devices in their pockets? Does their contribution to ethically questionable computation bring some responsibility, and should they be facilitated to refuse to participate in it? How does the individual relate to the overarching forces that orchestrate their ‘personal’ computers? Here, I present a guide and an agenda to navigate these issues, and analyse emerging regimes, such as the ex ante provisions in the Digital Markets Act and the Data Act, as well as interpretations of the General Data Protection Regulation, to understand how, if at all, they support those that unwillingly, unknowingly and perhaps even unidentifiably facilitate, rather than become the subject of, controversial computation.

There are increasing demands and hopes for data access provisions to open accountability of systems. In parallel, major technology platforms and stacks are encrypting their business models and moving increasingly to privacy-enhancing technology approaches such as 'confidential computing'. These go beyond encrypting the content of communication to encrypting the very approaches that underpin their constitutive algorithmic systems, such as those used in recommendation and allocation. The motivations for these range from concerns around privacy to desires to avoid liability by seeing, hearing (and then presumably doing) no evil. Approaches to studying and improving these are nascent even for developers of these systems, who can struggle with a lack of telemetry and feedback data when trying to tweak, adjust and increase functionality of the systems they have deployed. Given that platforms themselves lack data on these systems, the task is doubly hard for external researchers. This paper characterises those challenges and suggests pathways and approaches legal regimes could take to ensure they remain accountable.

Artificial intelligence (AI) is a salient but polarizing issue of recent times. Actors around the world are engaged in building a governance regime around it. What exactly the “it” is that is being governed, how, by who, and why—these are all less clear. In this review, we attempt to shine some light on those questions, considering literature on AI, the governance of computing, and regulation and governance more broadly. We take critical stock of the different modalities of the global governance of AI that have been emerging, such as ethical councils, industry governance, contracts and licensing, standards, international agreements, and domestic legislation with extraterritorial impact. Considering these, we examine selected rationales and tensions that underpin them, drawing attention to the interests and ideas driving these different modalities. As these regimes become clearer and more stable, we urge those engaging with or studying the global governance of AI to constantly ask the important question of all global governance regimes: Who benefits?

Academic and policy proposals on algorithmic accountability often seek to understand algorithmic systems in their socio-technical context, recognising that they are produced by ‘many hands’. Increasingly, however, algorithmic systems are also produced, deployed, and used within a supply chain comprising multiple actors tied together by flows of data between them. In such cases, it is the working together of an algorithmic supply chain of different actors who contribute to the production, deployment, use, and functionality that drives systems and produces particular outcomes. We argue that algorithmic accountability discussions must consider supply chains and the difficult implications they raise for the governance and accountability of algorithmic systems. In doing so, we explore algorithmic supply chains, locating them in their broader technical and political economic context and identifying some key features that should be understood in future work on algorithmic governance and accountability (particularly regarding general purpose AI services). To highlight ways forward and areas warranting attention, we further discuss some implications raised by supply chains: challenges for allocating accountability stemming from distributed responsibility for systems between actors, limited visibility due to the accountability horizon, service models of use and liability, and cross-border supply chains and regulatory arbitrage.

The European Commission proposed a Directive on Platform Work at the end of 2021. While much attention has been placed on its effort to address misclassification of the employed as self-employed, it also contains ambitious provisions for the regulation of the algorithmic management prevalent on these platforms. Overall, these provisions are well-drafted, yet they require extra scrutiny in light of the fierce lobbying and resistance they will likely encounter in the legislative process, in implementation and in enforcement. In this article, we place the proposal in its sociotechnical context, drawing upon wide cross-disciplinary scholarship to identify a range of tensions, potential misinterpretations and perversions that should be pre-empted and guarded against at the earliest possible stage. These include improvements to ex ante and ex post algorithmic transparency; identifying and strengthening the standard against which human reviewers of algorithmic decisions review; anticipating challenges of representation and organising in complex platform contexts; creating realistic ambitions for digital worker communication channels; and accountably monitoring and evaluating impacts on workers while limiting data collection. We encourage legislators and regulators at both European and national level to act to fortify these provisions in the negotiation of the Directive, its potential transposition, and in its enforcement.

2022

This chapter outlines some of the challenges when schools are exposed to the business models of technology platforms. It outlines their most salient impacts, and critically evaluates approaches schools can take to ensure their pedagogical autonomy and student privacy are safeguarded.

This paper outlines some of the practical challenges of deploying Bluetooth contact tracing technologies, drawing on the team's experience deploying DP-3T.

Policy challenges of machine learning and sustainability share significant structural similarities, including difficult to observe credence properties, such as data collection characteristics or carbon emissions from model training, and value chain concerns, including core-periphery inequalities, networks of labor, and fragmented and modular value creation. We apply research on certification systems in sustainability, particularly of commodities, to generate lessons across both areas, informing emerging proposals such as the EU’s AI Act.

This paper summarises and analyses a decision of the Belgian Data Protection Authority concerning IAB Europe and its Transparency and Consent Framework (TCF). We argue that by characterising IAB Europe as a joint controller with RTB actors, this important decision gives DPAs an agreed-upon blueprint to deal with a structurally difficult enforcement challenge. Furthermore, under the DPA’s simple-looking remedial orders are deep technical and organisational tensions. We analyse these “impossible asks”, concluding that absent a fundamental change to RTB, IAB Europe will be unable to adapt the TCF to bring RTB into compliance with the decision.

This paper analyses the extent to which practices of real-time bidding are compatible with the requirements regarding (i) a legal basis for processing, (ii) transparency, and (iii) security in European data protection law. We conclude that, in concept and in practice, RTB is structurally difficult to reconcile with European data protection law.
Covered in the Guardian and by the Norwegian Consumer Council. Cited and quoted extensively in the decision of the Belgian Data Protection Authority concerning IAB Europe, itself analysed in Veale, Nouwens and Santos (2020) above.

2021

Little attention has been paid to the GDPR's Article 22 in light of decision-making processes with multiple stages, potentially both manual and automated. We diagrammatically identify and analyse five main complications: the potential for selective automation on subsets of data subjects despite generally adequate human input; the ambiguity around where to locate the decision itself; whether ‘significance’ should be interpreted in terms of any potential effects or only selectively in terms of realised effects; the potential for upstream automation processes to foreclose downstream outcomes despite human input; and that a focus on the final step may distract from the status and importance of upstream processes.

We present an overview of the proposed EU AI Act and analyse its implications, drawing on scholarship ranging from the study of contemporary AI practices to the structure of EU product safety regimes over the last four decades. We find that some provisions of the draft AI Act have surprising legal implications, whilst others may be largely ineffective at achieving their stated goals, including the enforcement regime and the effect of maximum harmonisation on the space for AI policy more generally.
Covered in POLITICO, the New York Times, Engineering and Technology, by the European Parliament, and consultation responses including Amnesty International, EDRi, Verbraucherzentrale Bundesverband, European Center for Not-for-Profit Law, EPIC and Access Now.

Digital vaccination certification involves making many promises, few of which can realistically be kept. In this paper, we demonstrate how this phenomenon constitutes various forms of theatre – immunity theatre, border theatre, behavioural theatre and equality theatre – doing so by drawing on perspectives from technology regulation, migration studies and critical geopolitics. A shorter version of this paper is available as a LexAtlas blog.

There is growing evidence that SARS-CoV-2 can be transmitted beyond close proximity contacts, in particular in closed and crowded environments with insufficient ventilation. To help mitigation efforts, contact tracers need a way to notify those who were present in such environments at the same time as infected individuals. Neither traditional human-based contact tracing powered by handwritten or electronic lists, nor Bluetooth-enabled proximity tracing can handle this problem efficiently. In this paper, we propose CrowdNotifier, a protocol that can complement manual contact tracing by efficiently notifying visitors of venues and events with SARS-CoV-2-positive attendees. We prove that CrowdNotifier provides strong privacy and abuse-resistance, and show that it can scale to handle notification at a national scale. This protocol has since been adapted and deployed by national responses including Germany's CoronaWarnApp.
Covered in Netzpolitik, Die Zeit, Le Soir.

The Lex-Atlas: Covid-19 (LAC19) project provides a scholarly report and analysis of national legal responses to Covid-19 around the world. Nearly 200 jurists participate in the LAC19 network and have contributed to writing national country reports. The Oxford Compendium of National Legal Responses to Covid-19 launched on 21 April 2021 with 19 Country and Territory Reports, and a further 41 will be added on a rolling basis across the Spring and Summer of 2021. More information is available on the LexAtlas website.

This chapter elaborates on challenges and emerging best practices for state regulation of electoral disinformation throughout the electoral cycle. It is based on research for three studies during 2018-20: into election cybersecurity for the Commonwealth; on the use of Artificial Intelligence (AI) to regulate disinformation for the European Parliament; and for UNESCO, the United Nations body responsible for education.

2020

An introduction to using data protection transparency provisions as a research method, covering their possibilities, limitations and practical considerations.

A short chapter on the interaction of platforms, technology and contact tracing systems.
Read more: op-ed in the Guardian.

The DP-3T Bluetooth proximity tracing protocol white paper, for supporting contact tracing efforts during COVID-19. See more on the GitHub.
See Wikipedia page for more information and impact.

A guide to the concept of cybersecurity over time, from multiple disciplinary angles.

A book resulting from a study of cybersecurity in an electoral context undertaken for the Commonwealth. Contains recommendations and best practices.

A critical view on the EU HLEG-AI's recent guidelines, highlighting the lack of focus on power and infrastructure, the pervasive technosolutionism, the problematic representativeness of the group, and the reluctance to talk about funding of regulators, among other issues.

A guide to data rights, recent case law, challenges and trajectories to feed into the European Data Protection Board's drafting process for their data rights guidance.

We show using a scrape of the top 10,000 UK websites that only 11.8% of websites using the 5 big providers of consent pop-up libraries have configured them in ways minimally compliant with the GDPR and ePrivacy law.
Coverage in the BBC, TechCrunch, DR (Danish public broadcaster), Fast Company, Les Echos, cited by the Irish Data Protection Commissioner, the American Bar Association, Mayer Brown, Orange, Stiftung Neue Verantwortung, and Facebook, credited with changing regulatory guidance around cookies and consent in Denmark.

2019

This PhD thesis unpacks the provisions and framework of European data protection law in relation to social concerns and machine learning’s technical characteristics to identify tensions between the legal regime and machine learning practice; and draws on empirical data of machine learning in use in public sector institutions around the world to identify tensions between scholarship on fair and transparent machine learning and the social routines attempting to deploy it responsibly on the ground.

This report examines the England and Wales landscape around criminal justice and the increasing and varied use of algorithmic systems within it, proposing an array of policy recommendations.

This chapter asks and attempts to answer aspects of three main questions: What are the drivers and logics behind the use of machine learning in the public sector, and how should we understand it in the contexts of administrations and their tasks? Is the use of machine learning in the public sector a smooth continuation of ‘e-Government’, or does it pose fundamentally different challenges to the practice of public administration? How are public management decisions and practices at different levels enacted when machine learning solutions are implemented in the public sector?

This chapter focuses on the extent to which sophisticated profiling techniques may end up undermining, rather than enhancing, our capacity for ethical agency - and how, if at all, personalisation and recommendation systems may be responsibly designed in light of this.

2018

Recent 'model inversion' attacks from the information security literature indicate that machine learning models might be personal data, as they might leak data used to train them. We analyse these attacks and discuss their legal implications.
Coverage and citation by the Information Commissioner's Office in their AI Auditing Framework, the Council of Europe, the European Parliament, the Royal Society, Chatham House, the Future of Privacy Forum.

Where 'debiasing' approaches are appropriate, they assume modellers have access to often highly sensitive protected characteristics. We show how, using secure multi-party computation, a regulator and a modeller can build and verify a 'fair' model without ever seeing these characteristics, and can verify decisions were taken using a given 'fair' model.
Coverage in the Financial Times.

We interviewed 27 public sector machine learning practitioners about how they cope with challenges of fairness and accountability. Their problems are often different from those in FAT/ML research so far, including internal gaming, changing data distributions and inter-departmental communication, how to augment model outputs and how to transmit hard-won social practices.
Coverage and use by the Australian Human Rights Commission; the Royal United Services Institute for Defence and Security Studies (RUSI), the European Parliament, the European Commission, the Boston Consulting Group, the United Nations Economic and Social Commission for Asia and the Pacific, the Alan Turing Institute/UK Office for AI, UNESCO, the German Government's Expert Council for Consumer Affairs, the Council of Europe, and the Centre for Data Ethics and Innovation.

We presented participants in the lab and online with adverse algorithmic decisions and different explanations of them. We found strong dislike of case-based explanations where they were compared to a similar individual, even though these are arguably highly faithful to the way machine learning systems work.

In this workshop paper, we argue that sense-making is important not just for experts but for laypeople, and that expertise from the HCI sense-making community would be well-suited for many contemporary privacy and algorithmic responsibility challenges.

The General Data Protection Regulation has significant effects for machine learning modellers. We outline what human-computer interaction research can bring to strengthening the law, and enabling better trade-offs.

Data protection law gives individuals rights, such as to access or erase data. Yet when data controllers slightly de-identify data, they remove the ability to grant these rights, without removing real re-identification risk. We look at this in legal and technological context, and suggest provisions to help navigate this trade-off between confidentiality and control.

In-store tracking, using passive and active sensors, is common. We look at this in technical context, as well as the European legal context of the GDPR and forthcoming ePrivacy Regulation. We consider two case studies: Amazon Go, and rotating MAC addresses.

We outline the European 'right to an explanation' debate, consider French law and the Council of Europe Convention 108. We argue there is an unmet need to empower third party bodies with investigative powers, and elaborate on how this might be done.

We critically examine the Article 29 Working Party guidance that relates most to machine learning and algorithmic decisions, finding it has interesting consequences for automation and discrimination in European law.

2017

FAT/ML techniques for 'debiasing' machine learned models all assume the modeller can access the sensitive data. This is unrealistic, particularly in light of stricter privacy law. We consider three ways some level of understanding of discrimination might be possible, even without collecting such data as ethnicity or sexuality.

We consider the so-called 'right to an explanation' in the GDPR and in technical context, arguing that even if it (as manifested in Article 22) was enforced from the non-binding recital in European law, it would not trigger for group-based harms or in the important cases of decision-support. We argue instead for the use of instruments such as data protection impact assessment and data protection by design, as well as investigating the right to erasure and right to portability of trained models, as potential avenues to explore.
Coverage and use by the Article 29 Working Party (the official group of European regulators) in their guidance on the regulation that the paper itself analysed; the Information Commissioner's Office described it as "important to the development of the [regulator's] thinking" in this area; by the Council of Europe [1, 2, 3, 4], the European Commission (DG JRC, DG JUST, DG COMP), the European Parliament [1, 2, 3], the UN Special Rapporteur on Extreme Poverty and Human Rights Philip Alston, the Privacy Commissioner of Hong Kong, the German Government's Expert Council for Consumer Affairs, Amnesty International, RUSI, Access Now, Privacy International, the US Federal Trade Commissioner Noah Phillips, the Centre for Data Ethics and Innovation, ARTICLE 19, the Nuffield Foundation, the New Economics Foundation, and the Government of New Zealand; the House of Lords (13/11 2017 vol 785 col 1862--4 & 13/12 2017 vol 787 col 1575--7) with amendments based upon it; profiled in the Journal of Things We Like (Lots), awarded a Privacy Papers for Policymakers prize by the Future of Privacy Forum at the US Senate in 2019.

I authored the case studies for the Royal Society and British Academy report which led to the UK Government's new Centre for Data Ethics and Innovation. I also acted as drafting author on the main report.

This is a preliminary version of the 'Fairness and accountability design needs' CHI'18 paper above.

We considered the detection of offensive and hateful speech, looking at a dataset of 1 million annotated comments. Taking gender as an illustrative split (without making any generalisable claims), we illustrate how the labellers' conception of toxicity matters in the trained models downstream, and how bias in these systems will likely be very tricky to understand.

2016

A conference paper on public sector values in machine learning, and public sector procurement in practice.

2015

Using a case study from the sugar-cane sector, this paper argues that performance-based sustainability standards have significant benefits over technology-based standards, and suggests directions in which this can be explored.