Mining your Own Secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models


Saurav Jha1,5, Shiqi Yang2,6, Masato Ishii2, Mengjie Zhao2, Christian Simon2

M. Jehanzeb Mirza3, Dong Gong1, Lina Yao1, Shusuke Takahashi2, Yuki Mitsufuji4


1UNSW Sydney, Australia; 2Sony Group Corporation, Japan; 3MIT CSAIL, USA; 4Sony Group Corporation & Sony AI, USA
5Work done as an intern at Sony Group Corporation, Japan.
6Project Lead.

International Conference on Learning Representations (ICLR 2025)


Paper: https://openreview.net/forum?id=hUdLs6TqZL

Our continual personalization framework employing Diffusion Classifier scores for parameter-space and function-space consolidation.

Abstract


Personalized text-to-image diffusion models have grown popular for their ability to efficiently acquire a new concept from user-defined text descriptions and a few images. However, in the real world, a user may wish to personalize a model on multiple concepts but one at a time, with no access to the data from previous concepts due to storage/privacy concerns. When faced with this continual learning (CL) setup, most personalization methods fail to find a balance between acquiring new concepts and retaining previous ones – a challenge that continual personalization (CP) aims to solve. Inspired by the successful CL methods that rely on class-specific information for regularization, we resort to the inherent class-conditioned density estimates, also known as diffusion classifier (DC) scores, for CP of text-to-image diffusion models. Namely, we propose using DC scores for regularizing the parameter-space and function-space of text-to-image diffusion models, to achieve continual personalization. Using several diverse evaluation setups, datasets, and metrics, we show that our proposed regularization-based CP methods outperform the state-of-the-art C-LoRA and other baselines. Finally, by operating in the replay-free CL setup and on low-rank adapters, our method incurs zero storage and parameter overhead, respectively, over the state-of-the-art.
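Concretely, a DC score estimates the class-conditional density p(x | concept) from the expected denoising error when the model is conditioned on that concept's prompt: the better the conditioned UNet denoises an image, the higher the image's density under that concept. Below is a minimal PyTorch/diffusers-style sketch; the function name dc_score, the sample count, and the epsilon-prediction interface are illustrative assumptions, not our released code.

import torch

@torch.no_grad()
def dc_score(unet, scheduler, x0, cond_emb, n_samples=32):
    # Monte-Carlo estimate of the diffusion classifier (DC) score for one
    # concept: the negated expected denoising error under that concept's
    # text embedding. Higher score ~ higher density p(x | concept).
    total = 0.0
    for _ in range(n_samples):
        t = torch.randint(0, scheduler.config.num_train_timesteps,
                          (x0.shape[0],), device=x0.device)
        noise = torch.randn_like(x0)
        x_t = scheduler.add_noise(x0, noise, t)
        eps_pred = unet(x_t, t, encoder_hidden_states=cond_emb).sample
        total = total + torch.mean((eps_pred - noise) ** 2)
    return -total / n_samples  # negated MSE: larger means more likely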


The flaw within C-LoRA

We show that C-LoRA (the current state-of-the-art method for continual personalization) admits a more general degeneracy: any learning on new tasks pushes the LoRA weight values toward zero (see the sketch after the figures below).

LoRA weights of C-LoRA for all tasks.
C-LoRA's self-regularization loss for incremental tasks.
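The degeneracy is easiest to see from the loss itself. C-LoRA regularizes task t by element-wise multiplying the new LoRA product A_t B_t with the magnitude of the accumulated products of earlier tasks. A minimal PyTorch sketch (our notation; clora_penalty and the exact weighting are an illustrative paraphrase of the published loss):

import torch

def clora_penalty(A_t, B_t, past_product, lam=1.0):
    # C-LoRA-style self-regularization (sketch): penalize the new task's
    # LoRA update (A_t @ B_t) wherever the accumulated previous-task
    # product |sum_{i<t} A_i B_i| is large.
    return lam * ((past_product.abs() * (A_t @ B_t)) ** 2).sum()

Because the penalty is trivially minimized by A_t B_t = 0 everywhere, a strong regularization weight drives all new LoRA weights toward zero rather than merely steering updates away from the entries that earlier tasks rely on.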

Method


Derivation of diffusion classifier (DC) scores for FIM computation in parameter-space consolidation.
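In words: we treat the DC score of each past concept as its likelihood and accumulate the squared gradients of the resulting loss into a diagonal Fisher Information Matrix (FIM), which then weights an EWC-style penalty on parameter drift. A minimal sketch, assuming a dc_loss_fn built from DC scores and a diagonal FIM approximation (function and variable names are illustrative, not our exact implementation):

import torch

def diagonal_fisher(model, dc_loss_fn, data_loader):
    # Diagonal FIM estimate: average squared gradients of the DC-score
    # loss over the data seen for the current concept.
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()
              if p.requires_grad}
    for batch in data_loader:
        model.zero_grad()
        dc_loss_fn(model, batch).backward()
        for n, p in model.named_parameters():
            if p.requires_grad and p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / len(data_loader) for n, f in fisher.items()}

def ewc_penalty(model, fisher, old_params, lam=1.0):
    # EWC regularizer: movement of parameters that the FIM marks as
    # important for previously learned concepts is penalized quadratically.
    loss = 0.0
    for n, p in model.named_parameters():
        if n in fisher:
            loss = loss + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return lam * loss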


Illustration of function-space consolidation with diffusion classifier (DC) scores.
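In words: instead of constraining weights, function-space consolidation constrains outputs, keeping the new model's concept-conditioned noise predictions, and hence its DC scores for past concepts, close to those of the frozen previous-task model. A minimal sketch, assuming epsilon-prediction UNets and stored text embeddings for previous concepts (names are illustrative):

import torch
import torch.nn.functional as F

def function_space_loss(unet_new, unet_old, x_t, t, prev_concept_embs):
    # Match the current model's noise predictions, conditioned on each
    # previous concept's embedding, to the frozen old model's outputs on
    # the same noisy latents, preserving old class-conditional densities.
    loss = 0.0
    for emb in prev_concept_embs:
        with torch.no_grad():
            target = unet_old(x_t, t, encoder_hidden_states=emb).sample
        pred = unet_new(x_t, t, encoder_hidden_states=emb).sample
        loss = loss + F.mse_loss(pred, target)
    return loss / len(prev_concept_embs)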

Results: Custom Concepts


Qualitative results for the Custom Concept setup with 6 tasks.

Results: Landmarks


Qualitative results for the Landmarks setup with 10 tasks.

Results: Textual Inversion


Qualitative results for LoRA-based methods on the Textual Inversion dataset setup with 9 tasks.

Results: Celeb-A 256x256


Qualitative results for LoRA-based methods on the Celeb-A 256x256 setup with 10 tasks.

Results: Custom Concepts (50 tasks)


Qualitative results for the 50-task Custom Concept setup.

Results: Multi-concept generation


Multi-concept generation results: the upper-row images are generated with the prompt “A photo of V1 plushie tortoise. Posing in front of V2 waterfall”, while the lower-row images are generated with the prompt “A photo of V1 plushie tortoise. Posing with V2 sunglasses”.

Results: VeRA on Custom Concept


VeRA results: we preserve our EWC-DC framework and replace LoRA with Vector-based Random Matrix Adaptation (VeRA).
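For context, VeRA freezes a single pair of randomly initialized low-rank matrices shared across layers and trains only two small scaling vectors per layer, so its trainable-parameter count is far below LoRA's. A minimal sketch of this parameterization (the VeRALinear wrapper and its wiring are illustrative, not our exact integration):

import torch
import torch.nn as nn
import torch.nn.functional as F

class VeRALinear(nn.Module):
    # VeRA sketch: A (r x in) and B (out x r) are frozen random matrices
    # shared across layers; only the vectors d (rank-wise) and b
    # (output-wise) are trained: y = W0 x + Lambda_b B Lambda_d A x.
    def __init__(self, base: nn.Linear, A: torch.Tensor, B: torch.Tensor):
        super().__init__()
        self.base = base                    # frozen pretrained layer
        self.register_buffer("A", A)
        self.register_buffer("B", B)
        self.d = nn.Parameter(torch.full((A.shape[0],), 0.1))  # rank scales
        self.b = nn.Parameter(torch.zeros(B.shape[0]))         # out scales

    def forward(self, x):
        h = F.linear(x, self.A) * self.d    # Lambda_d A x
        h = F.linear(h, self.B) * self.b    # Lambda_b B ...
        return self.base(x) + h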

BibTeX

@inproceedings{jha2025mining,
    title={Mining your own secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models},
    author={Saurav Jha and Shiqi Yang and Masato Ishii and Mengjie Zhao and Christian Simon and Muhammad Jehanzeb Mirza and Dong Gong and Lina Yao and Shusuke Takahashi and Yuki Mitsufuji},
    booktitle={The Thirteenth International Conference on Learning Representations},
    year={2025},
    url={https://openreview.net/forum?id=hUdLs6TqZL}
}