Security, Privacy, and Provenance for Generative AI

by admin · June 14, 2026

Jaiden Fairoze

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2026-126

May 14, 2026

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2026/EECS-2026-126.pdf

As generative AI transitions from novelty to critical infrastructure, ensuring its provenance, security, and privacy has become paramount. This dissertation studies all three concerns through the lens of cryptography, using its tools to construct provable safety guarantees and expose theoretically grounded limitations.

Provenance asks whether AI-generated content can be reliably attributed to its source. We construct the first publicly-detectable and unforgeable watermarking scheme for language models. For images, where robustness is a greater concern, we show that it is possible to construct a watermarking scheme that is simultaneously publicly-detectable and robust; however, robust embedding models are infeasible to instantiate with current machine learning capabilities.

Security asks whether lightweight guardrails can reliably defend language models. We show that adversaries can encode malicious intent into computationally hard-to-detect structures, exploiting the resource asymmetry between a guardrail and the model it protects. Our attack, controlled-release prompting, succeeds at near-perfect rates against the production chat interfaces of major AI platforms that resist baseline jailbreaks, and a systematic evaluation of open-weight prompt guards further supports the asymmetry hypothesis.

Privacy asks what happens when a model is tasked with keeping a secret in its context. We study inadvertent context leakage through a predicate-inference game where an adversary tries to recover a secret from model outputs. We find that proprietary models are broadly susceptible to bit-level secret leakage. Leakage grows with model capability, indicating it is intrinsic to stronger instruction-following rather than an incidental flaw.
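To make the idea of a predicate-inference game concrete, the following is a minimal toy sketch, not the dissertation's actual experimental setup. Everything in it is an assumption made for illustration: the predicate (parity of an integer secret), the stand-in `toy_model` that leaks through its wording, the `adversary` rule, and the `leak_prob` parameter. It computes the adversary's advantage, defined here as 2·Pr[guess is correct] − 1, exactly rather than by sampling.

```python
from fractions import Fraction


def predicate(secret: int) -> int:
    """Hypothetical predicate the adversary wants to learn: the secret's low bit."""
    return secret & 1


def toy_model(secret: int, leak: bool) -> str:
    """Stand-in for a model response (an assumption, not a real model).

    When `leak` is True, the wording of the refusal depends on the predicate,
    so a single bit about the secret escapes through the output.
    """
    if leak and predicate(secret) == 1:
        return "I cannot share that."
    return "I can't share that."


def adversary(output: str) -> int:
    """Guess the predicate from the output alone."""
    return 1 if output == "I cannot share that." else 0


def advantage(secrets, leak_prob: Fraction) -> Fraction:
    """Exact advantage 2 * Pr[guess == predicate] - 1 over uniformly chosen secrets."""
    correct = Fraction(0)
    for s in secrets:
        # Weight each behaviour of the toy model by its probability.
        for leak, p in ((True, leak_prob), (False, 1 - leak_prob)):
            if adversary(toy_model(s, leak)) == predicate(s):
                correct += p
    win = correct / len(secrets)
    return 2 * win - 1


if __name__ == "__main__":
    for p in (Fraction(0), Fraction(1, 4), Fraction(1)):
        print(p, advantage(range(16), p))
```

In this toy setting the advantage equals `leak_prob`: a model that never leaks gives the adversary no edge over random guessing, while one that always leaks reveals the bit with certainty.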

Advisor: Sanjam Garg
