1 of 63

What We Need From Watermarking Research to Scale Image Provenance

Pierre Fernandez, Meta, FAIR

APAI @ CVPR2026

2 of 63

Outline

Introduction to Watermarking
Watermarking AI-Generated Content

Post-hoc (After Generation)
At Generation-time

Roadblocks and Discussions

3 of 63

- Part 1 -

Introduction to Watermarking

5 of 63

Watermarking

[Diagram: Content + Watermark (010100 … 100) → Watermarking → Watermarked content (slightly modified) → transformed when transmitted from user to user → Watermarked content (transformed) → Reveal 🔍 → 010100 … 100]

Traditionally used for IP protection.
Can it be used for Generative AI? Why? How?

6 of 63

GenAI comes with risks

Opinion swaying, scam, fraud, internet pollution


7 of 63

GenAI comes with mushrooms

Opinion swaying, scam, fraud, internet pollution


9 of 63

Forensics / Deepfake detection

Passive detection: hard, and will get harder

‘AI generated?’ ✔ / ✗

🔍

Midjourney

Classifier


10 of 63

Embedding provenance in metadata

Metadata:

Gen AI

Photography

Social platforms

Author: …
Date: …
Location: …
AI-generated: …

🔗

11 of 63

Fingerprinting

Map each data point to a vector representation that serves as its “fingerprint”.

1/ Save the fingerprint of every generated image

2/ Detect if there is a match

[Diagram: query image → fingerprint → Database of Gen-AI content → Match?]

→ Needs to store everything

→ Does not scale well
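Note: the two steps above (store every fingerprint, then look for a match) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the cosine-similarity score, the 0.9 threshold, the toy 3-dimensional vectors, and the names `best_match` and `database` are all hypothetical and not taken from any deployed system.

```python
import math

def cosine(u, v):
    """Cosine similarity between two fingerprint vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def best_match(query, database, threshold=0.9):
    """Return (key, similarity) of the closest stored fingerprint.

    The key is None when no stored fingerprint reaches the threshold.
    """
    best_key, best_sim = None, -1.0
    for key, fingerprint in database.items():
        sim = cosine(query, fingerprint)
        if sim > best_sim:
            best_key, best_sim = key, sim
    if best_sim >= threshold:
        return best_key, best_sim
    return None, best_sim

# Step 1: every generated image has its fingerprint saved (toy vectors).
database = {"gen_001": [0.9, 0.1, 0.0], "gen_002": [0.0, 1.0, 0.2]}

# Step 2: look up a slightly modified copy and an unrelated image.
near_copy = [0.88, 0.12, 0.01]
unrelated = [0.0, 0.0, 1.0]
print(best_match(near_copy, database))
print(best_match(unrelated, database))
```

The linear scan over `database` is what makes the slide's point concrete: both storage and lookup grow with the number of generated images.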

12 of 63

Watermarking

1/ Embed a watermark in all generated content

2/ Detect the watermark

Watermarked ✔ / ✗

🔍

Midjourney

WM Extractor

Watermarked image


13 of 63

Regulation

[White House Executive Order, June 2023] → commit to label AI-generated content. Since canceled.

[EU AI Act, March 2024] → compulsory watermarking for general-purpose AI (GPAI) by May 2025. Currently: Code of Practice.

[California Provenance, Authenticity and Watermarking Standards, June 2024] → “requires a GenAI provider to place an imperceptible and maximally indelible watermark into synthetic content”

14 of 63

Watermarking can be done at different stages

[Diagram: pipeline stages: Training data → Model training → Model inference → Output data. Digital forensics (no watermark is embedded): Generator → output → Detect → “AI generated?” ✔ / ✗]

15 of 63

Watermarking can be done at different stages

[Diagram: pipeline stages: Training data → Model training → Model inference → Output data. Post-hoc watermarking (alternative: Metadata): Generator → output → Embed → Extract → “AI generated?” ✔ / ✗]

16 of 63

Watermarking can be done at different stages

[Diagram: pipeline stages: Training data → Model training → Model inference → Output data. Generation-time WM, out-of-model: Generator → Extract → “AI generated?” ✔ / ✗]

17 of 63

Watermarking can be done at different stages

[Diagram: pipeline stages: Training data → Model training → Model inference → Output data. Generation-time WM, in-model: Generator → Generator with WM → Extract → “AI generated?” ✔ / ✗]

18 of 63

Watermarking can be done at different stages

[Diagram: pipeline stages: Training data → Model training → Model inference → Output data. Generator with WM → Extract → “AI generated?” ✔ / ✗]

19 of 63

Watermarking can be done at different stages

[Diagram: pipeline stages: Training data → Model training → Model inference → Output data. Approaches placed on the pipeline: Digital forensics, Metadata, Post-hoc watermarking, Generation-time WM (out-of-model), Generation-time WM (in-model)]

How can we do it?
What are the motivations behind each of them?
What is currently used? What are the limits?

20 of 63

- Part 2 -

Watermarking AI-Generated Content

21 of 63

Post-hoc

23 of 63

Watermarking with deep neural networks

[📄 Zhu, Jiren, Russell Kaplan, Justin Johnson, and Li Fei-Fei. “HiDDeN: Hiding Data with Deep Networks.” In ECCV, 2018.]

[📄 Ahmadi, Mahdi, Alireza Norouzi, Nader Karimi, Shadrokh Samavi, and Ali Emami. “ReDMark: Framework for residual diffusion watermarking based on deep networks.” Expert Systems with Applications (2020).]

Jointly trains 2 deep neural networks to embed/extract watermarks:

[Diagram: Original + message 0100101001 → Embedder → Watermarked → Random transform → Augmented → Extractor → 0100101001. ℓ_percep (Original vs. Watermarked): Imperceptibility. ℓ_watermark (embedded vs. extracted message): Robustness.]
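Note: a minimal sketch of the two training objectives named on this slide, in plain Python. It assumes mean squared error as a stand-in for ℓ_percep and binary cross-entropy on the extractor logits as a stand-in for ℓ_watermark; the weight `lambda_w` and all function names are hypothetical. The cited papers may use different or additional losses.

```python
import math

def perceptual_loss(original, watermarked):
    """Mean squared error between pixels (assumed stand-in for l_percep)."""
    return sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)

def watermark_loss(logits, message):
    """Binary cross-entropy between extractor logits and message bits
    (assumed stand-in for l_watermark)."""
    total = 0.0
    for z, bit in zip(logits, message):
        # Numerically stable form of
        # -[bit * log(sigmoid(z)) + (1 - bit) * log(1 - sigmoid(z))]
        total += max(z, 0.0) - z * bit + math.log1p(math.exp(-abs(z)))
    return total / len(message)

def total_loss(original, watermarked, logits, message, lambda_w=1.0):
    """Joint objective: stay close to the original AND decode the message.

    `logits` are the extractor outputs on the *augmented* watermarked image,
    which is what pushes the pair toward robustness.
    """
    return (perceptual_loss(original, watermarked)
            + lambda_w * watermark_loss(logits, message))

# Toy values: a 4-pixel image, a 2-bit message, confident correct logits.
original = [0.20, 0.40, 0.60, 0.80]
watermarked = [0.21, 0.39, 0.60, 0.81]
print(total_loss(original, watermarked, logits=[6.0, -6.0], message=[1, 0]))
```

The trade-off on the slide shows up directly: a larger `lambda_w` favors message recovery, a smaller one favors imperceptibility.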

24 of 63

Embedder-extractor based approaches

Many works extend this setup:

SynthID-Image (DeepMind)
TrustMark (Adobe)
Pixel Seal / Video Seal (Meta)

25 of 63

Pixel Seal training and inference pipeline

[📄 Soucek et al. “Pixel Seal: Adversarial-only training for invisible image and video watermarking”]

[Diagram: Embedding: resize to 256x256 → Embedder → resize to original → JND. Extraction: Augment → resize to 256x256 → Extractor → 0101..001. Losses: ℓ_msg on the extracted message, ℓ_adv from the Discriminator.]

26 of 63

Example

256 bits, target PSNR ≈ 48 dB
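Note: to give a feel for what a 48 dB target means, here is the standard PSNR formula, PSNR = 10 · log10(MAX² / MSE), as a short Python sketch. It assumes 8-bit images (MAX = 255); the function names are hypothetical.

```python
import math

def psnr(original, watermarked, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)

def rmse_for_psnr(target_db, max_value=255.0):
    """Root-mean-square pixel error that corresponds to a target PSNR."""
    return max_value / (10 ** (target_db / 20))

# A watermark that changes every pixel by one gray level sits near 48 dB.
print(psnr([100, 100, 100, 100], [101, 99, 101, 99]))
print(rmse_for_psnr(48.0))
```

Under the 8-bit assumption, 48 dB corresponds to an average change of roughly one gray level per pixel.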

29 of 63

How do we do detection?

Generated by our model → Ext. → m : 11111…01, compared with hidden message m’ : 11111…11 → 95 / 100

Natural image → Ext. → random m : 10100…11, compared with hidden message m’ : 11111…11 → 51 / 100

Test: bit accuracy (m, m’) > 𝜏, e.g. 𝜏 = 80%

[Plot labels: bit accuracy, threshold 𝜏, FPR]
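Note: the link between the threshold 𝜏 and the false positive rate can be worked out with a binomial tail. The sketch below assumes that the bits extracted from a non-watermarked image behave like independent fair coin flips, which is an idealization; the function names are hypothetical.

```python
from math import comb

def matching_bits(m, m_prime):
    """Number of positions where the extracted and hidden messages agree."""
    return sum(a == b for a, b in zip(m, m_prime))

def is_watermarked(m, m_prime, tau=0.8):
    """Detection test from the slide: bit accuracy(m, m') > tau."""
    return matching_bits(m, m_prime) > tau * len(m)

def false_positive_rate(n_bits, min_matches):
    """P(at least `min_matches` of `n_bits` agree) under the assumption that
    extracted bits are i.i.d. fair coin flips (non-watermarked image)."""
    tail = sum(comb(n_bits, k) for k in range(min_matches, n_bits + 1))
    return tail / 2 ** n_bits

hidden = [1] * 100
from_our_model = [1] * 95 + [0] * 5     # 95 / 100 bits match
from_natural = [1] * 51 + [0] * 49      # 51 / 100 bits match

print(is_watermarked(from_our_model, hidden))   # flagged
print(is_watermarked(from_natural, hidden))     # not flagged
# With 100 bits, "accuracy > 80%" means at least 81 matching bits.
print(false_positive_rate(100, 81))
```

Raising 𝜏 or the number of bits lowers the FPR, at the price of missing more heavily edited watermarked images.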

30 of 63

Video watermarking with neural nets

Video Seal


31 of 63

Audio watermarking with neural nets

Same! [📄 WavMark] [📄 SilentCipher] [📄 Maskmark]

[Diagram: Original + message 0100101001 → Embedder → Watermarked → Random transform → Attacked → Extractor → 0100101001. Losses: ℓ_percep and ℓ_watermark.]

32 of 63

AudioSeal: Localized audio watermarking

[Diagram: Extractor → 110110001. Detector → WM? ✔ or ✗. Plot: watermark detection probability (0.0 to 1.0) over time steps.]

📄 San Roman et al., Proactive Detection of Voice Cloning with Localized Watermarking, ICML 2024
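Note: one simple way to turn the per-time-step detection probability from the plot into localized watermarked segments is to threshold it. The sketch below is an assumption made for illustration (the 0.5 threshold and the function name are hypothetical) and is not AudioSeal's actual post-processing.

```python
def watermarked_segments(probs, threshold=0.5):
    """Return [start, end) index pairs where the per-step detection
    probability stays at or above the threshold."""
    segments, start = [], None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                      # a watermarked run begins
        elif p < threshold and start is not None:
            segments.append((start, i))    # the run ended at step i
            start = None
    if start is not None:                  # run extends to the end
        segments.append((start, len(probs)))
    return segments

# Toy detector output: steps 1-2 and step 4 look watermarked.
print(watermarked_segments([0.1, 0.9, 0.95, 0.2, 0.8]))
```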

33 of 63

Aside: Attacks on the watermark extractor

[Diagram: Watermark → Extractor]

White-box: we know everything
→ Extremely easy to attack

Black-box: we know nothing
→ Hard to attack (but not impossible)

34 of 63

Example of attacks

DiffPure: add noise (+), then denoise with a diffusion model

VAE regeneration / neural compression: pass the image through an auto-encoder

35 of 63

Semantic watermarking

Post-hoc encoder/decoder:
→ high-PSNR regime,
→ limited in the amount of pixel change,
→ harder to make robust to adversarial attacks

What if we could semantically watermark the image?

Low PSNR, harder to remove

36 of 63

Generation-time

37 of 63

Watermarking for LLMs

Generation with LLMs

Context (tokens): FA⁽⁻⁵⁾ IR⁽⁻⁴⁾ is⁽⁻³⁾ a⁽⁻²⁾ great⁽⁻¹⁾

LLM → logits l = (l₁, …, l_V) over the vocabulary (…, lab, research, …, banana)

Sample / WM Sample → next token: lab⁽⁰⁾ or research⁽⁰⁾

← WM happens here

38 of 63

Watermarking for LLMs

📄 Kirchenbauer et al., A Watermark for Large Language Models, ICML 2023

📄 Aaronson et al., Watermarking GPT Outputs, 2022
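Note: a toy sketch in the spirit of the green-list scheme of Kirchenbauer et al.: the previous token seeds a pseudo-random split of the vocabulary, green-token logits get a bias δ, and detection counts green tokens with a z-score. Everything concrete here is an assumption for illustration: the SHA-256 seeding, the secret `key`, the flat-logit "LLM" with greedy decoding, and all function names.

```python
import hashlib
import math
import random

def green_list(prev_token, vocab_size, gamma=0.5, key=b"secret"):
    """Pseudo-random green subset of the vocabulary, seeded by the previous token."""
    digest = hashlib.sha256(key + str(prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])

def bias_logits(logits, prev_token, delta=2.0, gamma=0.5):
    """Add delta to the logits of green tokens before sampling."""
    green = green_list(prev_token, len(logits), gamma)
    return [l + delta if i in green else l for i, l in enumerate(logits)]

def generate(n_tokens, vocab_size, delta=2.0):
    """Toy generator: flat logits stand in for the LLM, decoding is greedy."""
    tokens = [0]
    for _ in range(n_tokens):
        biased = bias_logits([0.0] * vocab_size, tokens[-1], delta)
        tokens.append(max(range(vocab_size), key=biased.__getitem__))
    return tokens

def z_score(tokens, vocab_size, gamma=0.5):
    """z = (green count - gamma * T) / sqrt(T * gamma * (1 - gamma))."""
    hits = sum(tok in green_list(prev, vocab_size, gamma)
               for prev, tok in zip(tokens, tokens[1:]))
    t = len(tokens) - 1
    return (hits - gamma * t) / math.sqrt(t * gamma * (1 - gamma))

watermarked = generate(50, vocab_size=20)
print(z_score(watermarked, vocab_size=20))
```

With a real model the bias only nudges the distribution, so the z-score grows with text length rather than jumping to its maximum as in this toy setting.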


39 of 63

Stable Signature

“Fine-tune LDM decoder s.t. every generated image is directly watermarked”

[Diagram: ‘Tahiti mountains, in the style of Gauguin’ → Diffusion Model → … → WM Decoder → Watermarked image → WM Extractor → AI generated? ✔ / ✗. The WM Decoder is fine-tuned from the Original Decoder (❄️) before distribution.]

[📄 Fernandez et al., The Stable Signature: Rooting Watermarks in Latent Diffusion Models, ICCV 2023]

40 of 63

Generation quality

Quantitative results: 10k generated 512x512 images

[Image panels: Original model | Watermarked | Difference]

41 of 63

DistSeal

[Diagram: ‘Tahiti mountains, in the style of Gauguin’ → Diffusion Model → … → WM Embedder (message 0100101001) → Decoder → Watermarked image → WM Extractor → AI generated? ✔ / ✗]

[📄 Rebuffi et al., Learning to Watermark in the Latent Space of Generative Models, ICML 2026]

42 of 63

Tree-Ring Watermarks

📄 Wen et al., Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust, NeurIPS 2023

[Diagram: WM + ‘Tahiti mountains, in the style of Gauguin’ → … → Decoder → Watermarked image → Invert diffusion process (iterative) → AI generated? ✔ / ✗]

43 of 63

Same thing for auto-regressive models

[Diagram: Prompt “What is Black Minorca?” → Tokenizer → Autoregressive Model with Watermarked Sampling → generated tokens (2 9 1 4 7 ? 8 …) → Detokenizer → Generated Content. Detection: Generated Content → Tokenizer → tokens (2 9 1 4 7 8) → Watermark Detection → “Watermarked! p-value: 1.2 x 10^-7”]

Example of generated content: “The Black Minorca is a rare and historic breed originating from the Balearic Islands of Spain. Here's an image of a Black Minorca hen, showcasing its signature green sheen on its feathers:”

📄 Jovanovic et al., Watermarking AR Image Generation, NeurIPS 2025
📄 Wu et al., Robust Distortion-Free Watermark for Autoregressive Audio Generation Models, NeurIPS 2025
📄 Tong et al., Training-Free Watermarking for Autoregressive Image Generation, arXiv 2025
📄 Hui et al., Autoregressive Images Watermarking through Lexical Biasing: An Approach Resistant to Regeneration Attack, arXiv 2025
📄 Müller et al., On the Robustness of Watermarking for Autoregressive Image Generation, arXiv 2026
📄 Yilmaz et al., UniMark: Unified Adaptive Multi-bit Watermarking for Autoregressive Image Generators, arXiv 2026

44 of 63

Qualitative

Without watermarking

With watermarking

📄 Jovanovic et al., Watermarking AR Image Generation, NeurIPS 2025


46 of 63

Qualitative: Semantic Watermarking

Without watermarking

With watermarking

→ More green tokens


47 of 63

Robustness (vs SOTA post-hoc)

→ Robust to removal attacks (Adv.) and neural compression (NC)
  (SOTA post-hoc watermarks often break here)

[Diagram: NC: Autoencoder. Adv. (DiffPure): noise (+), then diffusion steps]

48 of 63

- Part 3 -

Roadblocks and Discussions

50 of 63

Invisible watermarking in production

[Figure: companies marked ✔ / ✗, followed by “Other companies”]

→ Post-hoc almost all the time (except for text). Why?

52 of 63

Revisiting the 3 criteria

Imperceptibility

Payload

Robustness

?


53 of 63

Example of Tree-Ring

Meta develops a generative model and wants to watermark the outputs.

⚠️ The WM team has to convince others that it’s better to tweak the GenAI model / sampling.
 → Uses Tree-Ring.

⚠️ Detection? Iterative, ≈10 s per image
 → Scaling detection to FB/IG is impossible
 → Giving API access to detection is very costly

⚠️ Meta updates the generative model (Generative model → Generative model 2 → …).
 → x2 detection compute

⚠️ Meta owns photo devices (Meta Ray-Bans).
 → Will have to develop post-hoc models anyway

And to be clearer, I will take the example of Tree-Ring. Let’s say we’re in a real use case.

So, Meta develops a diffusion model and wants to watermark its outputs. There will be different roadblocks.

The first one would be the following. Let’s say I am the watermarking team. I will have to convince others that it’s better to tweak the sampling. But in practice, people will not really be happy with it, so it will be hard to convince them.

But okay, let’s say I’ve been able to convince the team that the generative model needs to change. Now, to run detection, because it’s iterative and takes approximately 10 seconds per image, scaling the detection to Facebook or Instagram is impossible. And giving access to an API for detection would also be very costly. So this is the second roadblock.

Let’s say Meta doesn’t care about this and still decides to use Tree-Ring. Now, a year or maybe six months later, they create a new generative model. Because in Tree-Ring we have to invert the generative model, this means that if I have some images and I want to check both models, I will have to run the detection on two different generative models. So it would be two times the detection compute.

Let’s say we are still doing it. Meta, like Google and other companies, also has photo devices. In our case, it’s the Meta Ray-Bans; for Google it would be, for instance, the Google Pixel phones. They would have to develop post-hoc models anyway, because these models will have to watermark photos that already exist. What is the value of having two very different kinds of watermarking methods?
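Note: a back-of-the-envelope version of the detection-cost argument in these notes. The only figure taken from the talk is the ≈10 seconds per image; the daily image volume below is a purely hypothetical number chosen for illustration, and the function name is made up.

```python
def detection_device_hours(images_per_day, seconds_per_image=10.0, num_models=1):
    """Compute time per day (in device-hours) to run an inversion-based
    detector on every image, once per generative model to check."""
    return images_per_day * seconds_per_image * num_models / 3600.0

HYPOTHETICAL_IMAGES_PER_DAY = 1_000_000_000  # assumption, not a real statistic

one_model = detection_device_hours(HYPOTHETICAL_IMAGES_PER_DAY)
two_models = detection_device_hours(HYPOTHETICAL_IMAGES_PER_DAY, num_models=2)

print(f"1 model : {one_model:,.0f} device-hours per day")
print(f"2 models: {two_models:,.0f} device-hours per day")
print(f"Devices running 24/7 for 1 model: {one_model / 24:,.0f}")
```

The cost is linear in the number of generative models that have to be inverted, which is the "x2 detection compute" point.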

54 of 63

Example of Tree-Ring

Meta develops a generative model and wants to watermark the outputs.

⚠️ Organizational: the WM team has to convince others that it’s better to tweak the GenAI model / sampling.
 → Uses Tree-Ring.

⚠️ Efficiency: detection is iterative, ≈10 s per image
 → Scaling detection to FB/IG is impossible
 → Giving API access to detection is very costly

⚠️ Adaptability (backward compatibility): Meta updates the generative model (Generative model → Generative model 2 → …).
 → x2 detection compute

⚠️ Adaptability (across use-cases): Meta owns photo devices (Meta Ray-Bans).
 → Will have to develop post-hoc models anyway

55 of 63

Adaptability: Organizational roadblocks

The company must be able to change the generative model.

 → Any fine-tuning of the diffusion or autoregressive model has to be done at the end, which adds complexity to the generative pipeline.

 → Changing the generative model is harder: the people building generative models do not like it when you touch the sampling.

→ Post-hoc watermarking

So I’ll start by talking a bit about the organizational roadblocks. There are a few things that are important to know when developing watermarking methods, and that we have to take into account.

First, the company must be able to change the generative model, so any method specific to one model will be very hard to use.

Also, any fine-tuning of the diffusion or autoregressive model needs to be done at the end of the generative pipeline, which adds complexity to it.

And finally, watermarking by changing the generative model or its sampling is much harder to put in place, because the people who develop these models don’t like it when you change these parameters.

All of these points lead toward post-hoc watermarking being the method of choice in industry.

56 of 63

Adaptability: Modularity

In companies, constraints are not the same depending on where the WM is embedded:
→ on device (e.g. cameras, phones) for marking real content
→ on GPUs for marking AI-generated content

Could we have different embedders trained with a single extractor?

[Diagram: Original + message 0100101001 → Embedder (Real; small, on device) → Watermarked. AI Gen + message 0100101001 → Embedder (GenAI) → Watermarked. Both → Single Extractor → ?]

So now that we have this in mind, let’s see what the main blockers of post-hoc watermarking methods are at the moment, and what would still need to be fixed for post-hoc watermarking to work at industry scale.

First, modularity. In companies, the constraints are not the same depending on where the watermark is embedded. For instance, if you have cameras or phones and want to watermark real content, then the model needs to be very small and very fast, with very few parameters. When you watermark AI-generated content, you are most of the time working on GPUs that are already very large, because they have to run your generative model, so you’re much less limited. So a first question is: could you have different embedders trained with a single extractor?

57 of 63

Adaptability: Updates and backward compatibility

Company updates the embedder:

to be robust to a new augmentation (codec, attack, etc.)
to be better, faster, etc.

Could it keep the same extractor as before?

58 of 63

Security issues

[Diagram: Watermark → Extractor]

White-box: we know everything
→ Extremely easy to attack

Black-box: we know nothing
→ Hard to attack (but not impossible)

59 of 63

Example of attacks

Regeneration attacks, e.g. DiffPure: add noise (+), then denoise with a diffusion model

Gradient-based adversarial attack: add a perturbation (+) computed through the watermark extractor, optimized such that:
- the message is removed
- the message is forged
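Note: a toy white-box version of the gradient-based attack above, to show why full knowledge of the extractor makes attacks easy. It assumes a hypothetical linear extractor (bit i is 1 when the i-th row of `W` has a positive dot product with the image) and uses plain gradient descent on a hinge loss; real extractors are deep networks, and every name and constant here is an assumption.

```python
def logits(W, x):
    """Extractor scores: one dot product per message bit."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def extract(W, x):
    """Decode the message: bit is 1 when its score is positive."""
    return [1 if z > 0 else 0 for z in logits(W, x)]

def attack(W, x, target_bits, step=0.05, margin=0.1, max_iters=100):
    """Perturb x by gradient descent until the extractor outputs target_bits.

    Loss per bit: max(0, margin - s * z) with s = +1 for a target 1, -1 for 0.
    Choosing the flipped message removes the watermark; choosing another
    message forges one.
    """
    x_adv = list(x)
    for _ in range(max_iters):
        if extract(W, x_adv) == target_bits:
            break
        grad = [0.0] * len(x_adv)
        for row, z, bit in zip(W, logits(W, x_adv), target_bits):
            s = 1.0 if bit else -1.0
            if s * z < margin:                 # this bit still needs pushing
                for j, w in enumerate(row):
                    grad[j] -= s * w           # d/dx of (margin - s * z)
        x_adv = [xi - step * g for xi, g in zip(x_adv, grad)]
    return x_adv

W = [[1.0, 0.5, 0.0], [0.0, 1.0, -0.5]]   # known to the attacker (white-box)
x = [1.0, -1.0, 0.3]                       # "watermarked image" carrying [1, 0]
x_forged = attack(W, x, target_bits=[0, 1])

print(extract(W, x), "->", extract(W, x_forged))
print("max pixel change:", max(abs(a - b) for a, b in zip(x, x_forged)))
```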

60 of 63

Robustness ≠ Security

Robustness

“Degradation is due to a classical content processing (compression, low-pass filtering, noise addition, geometric attack… ).”

📄 Cayre et al.: Watermarking Security: Theory and Practice, 2005

Security

“the inability by unauthorized users to access [i.e. to remove, to read, or to write the hidden message] the communication channel”


If we talk about black-box attacks, I think one thing needs to be very clear: robustness is different from security in watermarking. This idea comes from the work ‘Watermarking Security: Theory and Practice’ from 2005, which defines robustness as degradation due to classical content processing — let’s say compression, low-pass filtering, noise addition, geometric attacks — and security as the inability by unauthorized users to access, remove, read, or write the hidden message in the communication channel.

Even though security is very important, we care more about robustness than security.

In practice, 99.9% of the modifications that happen to your images will be unintentional. Maybe some intentional attacks will appear as well. But if you are not robust to this 99.9% of modifications, then the method is really not going to be used.

One thing I’m really thinking about is geometric attacks. Some works, mostly on generation-time watermarking, are not robust to, let’s say, crops, rotations, or perspective changes. For these methods to be used in practice, it’s really important that they be robust to this. So when developing these methods, we should really care about making them robust to geometric attacks as well.

61 of 63

What about white-box attacks?

→ Interoperability

Detection of 3rd-party content requires cross-industry collaboration

62 of 63

Public facing detectors

How to communicate detection results:

→ API. What are the best ways to protect it?

→ OSS extractor. Is this even possible?


Even with these challenges, one direction we can explore is asymmetric watermarking — and in particular, how we best communicate detection results.

For example, we could expose the extractor through an API that simply returns yes or no. In that case, the key questions are: how do we protect it? Do we add rate limiting or authentication to the endpoints? Do we keep a public detector that’s different from the private one we use internally?

The problem is even harder for open-source extractors. We don’t yet know if it’s possible to protect against white-box attacks in that setting. The goal would be to give everyone access to the extractor, without compromising security — meaning no one could cheaply forge or remove the watermark.

This is a critical area of research, because if we solve it, watermark detection could scale across the industry and become truly accessible to everyone.
These are exactly the kinds of practical challenges we need to solve if watermarking is going to work at scale.
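Note: for the API option discussed in these notes, one of the protections raised is rate limiting. The sketch below wraps a yes/no detector with a per-key sliding-window limit. It is only an illustration of that idea: the class name, the limits, and the stand-in `detect_fn` are hypothetical, and rate limiting alone is not claimed to make a detector secure.

```python
import time

class RateLimitedDetector:
    """Expose a yes/no detector while capping queries per API key."""

    def __init__(self, detect_fn, max_calls, per_seconds, clock=time.monotonic):
        self.detect_fn = detect_fn
        self.max_calls = max_calls
        self.per_seconds = per_seconds
        self.clock = clock
        self._calls = {}  # api_key -> timestamps of recent queries

    def query(self, api_key, image):
        now = self.clock()
        # Keep only the queries that fall inside the sliding window.
        recent = [t for t in self._calls.get(api_key, [])
                  if now - t < self.per_seconds]
        if len(recent) >= self.max_calls:
            self._calls[api_key] = recent
            raise PermissionError("rate limit exceeded")
        recent.append(now)
        self._calls[api_key] = recent
        # Return only a boolean, never scores or decoded bits.
        return bool(self.detect_fn(image))

# Stand-in detector and a fake clock so the example is deterministic.
fake_time = [0.0]
api = RateLimitedDetector(lambda image: sum(image) > 0,
                          max_calls=2, per_seconds=60.0,
                          clock=lambda: fake_time[0])
print(api.query("user-1", [0.2, 0.3]))
print(api.query("user-1", [-0.2, -0.3]))
```

Returning a single bit and limiting the query rate both raise the cost of black-box attacks, which typically need many queries to probe the decision boundary.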

63 of 63

Main takeaways

Watermarking:

Increasingly important: asked for by regulators and backed by major companies
Greatly enhances detection of GenAI content for all modalities (image, text, audio)
Generation-time watermarking: more secure against adversarial removal
Post-hoc: much, much more flexible
Many new problems to explore and focus on
 → find answers to practical issues!

THANKS!

slides are on my website

MetaSeal website
