Copyright Infringement Detection in Text-to-Image Diffusion Models via Differential Privacy

Xiafeng Man1, Zhipeng Wei3,4, Jingjing Chen2†

1College of Future Information Technology, Fudan University, Shanghai, China
2Institute of Trustworthy Embodied AI, Fudan University, Shanghai, China
3International Computer Science Institute, CA, USA
4UC Berkeley, CA, USA

TL;DR: We formalize the concept of copyright infringement and its detection from the perspective of Differential Privacy (DP), and introduce a novel post-hoc detection framework, D-Plus-Minus (DPM). DPM simulates the inclusion and exclusion of the specific training data point under investigation by fine-tuning the model in two opposing directions: a learning branch and an unlearning branch.

To facilitate standardized benchmarking, we also construct the Copyright Infringement Detection Dataset (CIDD), a comprehensive resource for evaluating detection across diverse categories.

Method

We reinterpret the detection of copyright infringement as compliance with or violation of conditional differential publicity. Specifically, the presence or absence of a particular concept in the training data, such as the neighbourhood images of a target image, can significantly alter the model's output in response to prompts associated with that concept. This leads to a new metric, conditional sensitivity, the principal quantity for measuring the extent of publicity and standardizing the confidence score of copyright infringement:

$$CS(M,\hat{x}_{i}) = \max_{D, D^{\prime}:\, D \,\triangle\, D^{\prime} \subseteq \{\hat{x}_{i}\}} \left| M(D) - M(D^{\prime}) \right|,$$

where $D$ and $D^{\prime}$ are neighbouring datasets that differ only by the inclusion or exclusion of the conditional data point $\hat{x}_{i}$, and $M(D)$ denotes the output of a query function when the model is trained on dataset $D$.
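The maximum over all neighbouring dataset pairs cannot be computed directly, so in practice the query function and the sensitivity are approximated empirically. The Python sketch below illustrates one such proxy: the query output $M(D)$ is a scalar computed on concept-related prompts (e.g., a similarity or loss value), and the conditional sensitivity is the absolute difference of this quantity between two models trained with and without the concept. The function names and the toy query are illustrative assumptions, not the paper's exact query function.

```python
# Illustrative sketch (not the paper's exact query function): approximate the
# conditional sensitivity CS(M, x_i) by the absolute difference of a scalar
# query output computed on two models trained with / without the concept.
from typing import Callable, Sequence

def conditional_sensitivity(
    query: Callable[[object, Sequence[str]], float],
    model_with: object,      # model trained on D  (concept included)
    model_without: object,   # model trained on D' (concept excluded)
    concept_prompts: Sequence[str],
) -> float:
    """Empirical proxy for CS(M, x_i) = |M(D) - M(D')|."""
    m_plus = query(model_with, concept_prompts)
    m_minus = query(model_without, concept_prompts)
    return abs(m_plus - m_minus)

# Example with a toy query that returns a fixed score per model; in practice
# `query` would generate images for the prompts and measure, e.g., their
# similarity to the target image.
if __name__ == "__main__":
    toy_query = lambda model, prompts: model["score"]
    cs = conditional_sensitivity(
        toy_query, {"score": 0.82}, {"score": 0.31}, ["a photo of <concept>"]
    )
    print(f"conditional sensitivity proxy: {cs:.2f}")
```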

Fig 1: D-Plus-Minus Method. Given the neighbourhood images $U(x_{i})$ of the target image $x_{i}$, i.e., several semantically similar images derived from the target image, as the training subset, we fine-tune the text-to-image model $G$ towards two branches: a learning branch $G_{D^{+}}$ and an unlearning branch $G_{D^{-}}$. Experimental results show that infringed samples lead to a significant shift in the sensitivity metric, whereas non-infringed samples cause only minor changes.

We visualize the discrepancy in conditional sensitivity in Fig. 1, where the larger change observed for infringed samples compared to non-infringed ones validates its use as a reliable measure.
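To make the two opposing fine-tuning directions concrete, the sketch below uses a toy PyTorch module as a stand-in for the text-to-image model $G$: the learning branch descends the training loss on the neighbourhood images, while the unlearning branch ascends it, and the resulting shift in the query output serves as the detection score. The MSE objective, the loss-based query, and the stand-in model are illustrative assumptions; the actual fine-tuning objective and score are defined in the paper.

```python
# Minimal sketch of the D-Plus-Minus idea with a toy stand-in model.
# Assumptions (not from the paper): an MSE objective replaces the diffusion
# loss, and the query output M(.) is the loss itself on the neighbourhood set.
import copy
import torch
import torch.nn as nn

def finetune(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
             steps: int = 50, lr: float = 1e-2, unlearn: bool = False) -> nn.Module:
    """Learning branch (unlearn=False) descends the loss; unlearning branch ascends it."""
    model = copy.deepcopy(model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        (-loss if unlearn else loss).backward()
        opt.step()
    return model

def query(model: nn.Module, x: torch.Tensor, y: torch.Tensor) -> float:
    """Toy query function M(.): reconstruction loss on the neighbourhood images."""
    with torch.no_grad():
        return nn.functional.mse_loss(model(x), y).item()

if __name__ == "__main__":
    torch.manual_seed(0)
    base = nn.Linear(8, 8)                         # stand-in for the model G
    x = torch.randn(6, 8)                          # stand-in for U(x_i): a few neighbourhood images
    y = torch.randn(6, 8)
    g_plus = finetune(base, x, y, unlearn=False)   # learning branch  G_{D+}
    g_minus = finetune(base, x, y, unlearn=True)   # unlearning branch G_{D-}
    delta_cs = abs(query(g_plus, x, y) - query(g_minus, x, y))
    print(f"ΔCS proxy: {delta_cs:.4f}  (large shift -> likely infringed)")
```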

Results

| Class | SD1.4 AUC ↑ | SD1.4 SoftAcc ↑ | SDXL-1.0 AUC ↑ | SDXL-1.0 SoftAcc ↑ | SANA-0.6B AUC ↑ | SANA-0.6B SoftAcc ↑ | FLUX.1 AUC ↑ | FLUX.1 SoftAcc ↑ |
|---|---|---|---|---|---|---|---|---|
| Human Face | 0.9011 | 0.8058 | 0.7011 | 0.6289 | 0.8062 | 0.7285 | 0.7531 | 0.6419 |
| Architecture | 0.8021 | 0.7106 | 0.9256 | 0.8488 | 0.9043 | 0.8224 | 0.9500 | 0.8606 |
| Arts Painting | 0.8555 | 0.7604 | 0.8881 | 0.8550 | 0.8140 | 0.7204 | 0.7326 | 0.6935 |
| Weighted Average | 0.8584 | 0.7644 | 0.8170 | 0.7523 | 0.8398 | 0.7571 | 0.8122 | 0.7247 |
| Merged Total | 0.8071 | 0.6726 | 0.7800 | 0.7234 | 0.7914 | 0.6855 | 0.8257 | 0.7039 |
Table 1: Quantitative Detection Metrics. Detection is evaluated separately on each class of the CIDD dataset for each model. Merged Total means that the ΔCS(·) scores are normalized across all classes together, whereas the other rows are normalized within each class.
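The per-class versus merged normalization in Table 1 can be sketched as follows. Min-max scaling and the toy score arrays are assumptions for illustration; SoftAcc is omitted since it follows the paper's own definition, and AUC is computed with scikit-learn.

```python
# Sketch of per-class vs. merged normalization of the ΔCS(·) scores before
# computing AUC. Min-max scaling and the toy scores/labels are illustrative
# assumptions, not the paper's exact evaluation code.
import numpy as np
from sklearn.metrics import roc_auc_score

def minmax(scores: np.ndarray) -> np.ndarray:
    return (scores - scores.min()) / (scores.max() - scores.min() + 1e-12)

# Toy ΔCS scores and infringement labels per class (illustrative values only).
data = {
    "human_face":   (np.array([0.9, 0.2, 0.7, 0.1]), np.array([1, 0, 1, 0])),
    "architecture": (np.array([0.6, 0.5, 0.8, 0.3]), np.array([1, 0, 1, 0])),
}

# Per-class rows: normalize and score within each class.
for cls, (scores, labels) in data.items():
    print(cls, "AUC:", roc_auc_score(labels, minmax(scores)))

# Merged Total: normalize all ΔCS values together, then score jointly.
all_scores = np.concatenate([s for s, _ in data.values()])
all_labels = np.concatenate([l for _, l in data.values()])
print("merged AUC:", roc_auc_score(all_labels, minmax(all_scores)))
```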
Fig 2: Qualitative visualization of the Unlearning Branch and Learning Branch across different timesteps. Models learn and unlearn faster on infringed samples and more slowly on non-infringed ones, and they do not learn the exact elements of the target images.

Dataset

The Copyright Infringement Detection Dataset (CIDD) contains several classes of orthogonal prompts and three image classes that are most likely to be infringed: human face, architecture, and arts painting.

Crucially, CIDD includes both infringed and non-infringed concepts, each of which is annotated with a binary infringement label based on its source and content provenance, and is paired with 3 to 6 neighbourhood images, enabling robust learning and evaluation under weak and probabilistic assumptions.
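A hypothetical record layout for a CIDD entry is sketched below, reflecting only the fields described above (image class, binary infringement label, concept-related prompts, and 3 to 6 neighbourhood images); the field names and structure are assumptions, not the released format.

```python
# Hypothetical CIDD record structure (field names are illustrative assumptions,
# not the released format): one target concept with its infringement label,
# image class, associated prompts, and 3-6 neighbourhood images.
from dataclasses import dataclass, field
from typing import List

@dataclass
class CIDDEntry:
    concept_id: str                  # identifier of the target concept / image
    image_class: str                 # "human_face" | "architecture" | "arts_painting"
    infringed: bool                  # binary label from source and content provenance
    prompts: List[str] = field(default_factory=list)               # concept-related prompts
    neighbourhood_images: List[str] = field(default_factory=list)  # 3-6 image paths

example = CIDDEntry(
    concept_id="arch_0001",
    image_class="architecture",
    infringed=True,
    prompts=["a photo of <landmark>"],
    neighbourhood_images=[f"images/arch_0001/{i}.png" for i in range(4)],
)
```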

As the paper is under review, the dataset will be made publicly available upon publication.

BibTeX

@misc{man2025copyrightinfringementdetectiontexttoimage,
      title={Copyright Infringement Detection in Text-to-Image Diffusion Models via Differential Privacy}, 
      author={Xiafeng Man and Zhipeng Wei and Jingjing Chen},
      year={2025},
      eprint={2509.23022},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.23022}, 
}