While open-sourcing instruction-guided image editing models accelerates research, it surrenders control over their capabilities to anyone who downloads the weights. Existing protection methods (text passwords, watermarks, fingerprints) are reactive: they verify ownership after generation, but the underlying model remains fully functional for unauthorized users. We introduce VisiLock, which bakes access control directly into the model weights, rendering the model unusable without a visual trigger in the input. The challenge is to train a model that retains editing capability on authorized inputs yet remains unusable on unauthorized inputs, without destabilizing training. Naive multi-task objectives create gradient conflicts that collapse training, while contrastive approaches such as FMLock destroy the denoising manifold. We develop Diverged Score Distillation, a dual-teacher framework in which a degraded teacher defines the locked behavior and a clean teacher guides editing quality, eliminating gradient interference through separate frozen targets. By initializing the student from the degraded teacher, released models begin locked by default and must relearn editing through distillation. Unauthorized users thus receive a degraded model that produces unusable outputs, and adversarial fine-tuning cannot recover full editing capability. Evaluation on InstructPix2Pix shows that authorized edits maintain baseline quality (CLIP-I: 0.779, DINO: 0.593) while unauthorized attempts degrade substantially (CLIP-I: 0.669, DINO: 0.360), an 11% drop in image similarity and a 23% drop in semantic similarity. The lock remains robust to key corruptions, spatial perturbations, and adversarial unlock fine-tuning. Code and data will be available for research purposes.
We introduce VisiLock, a novel approach that embeds access control directly into model weights. Top: When an unauthorized input (one lacking the correct authorization key) is provided, the model outputs "Please authorize the input" regardless of the text prompt. Bottom: When the correct authorization key is embedded in the input, the model treats the request as authorized and performs the requested image editing correctly. Notably, this authorization behavior is learned and encoded in the model weights themselves; no architectural modifications or explicit gating mechanisms are required.
Diverged Score Distillation: Top: Starting from the original model Mo, we perform degradation fine-tuning to obtain a degraded version Md that produces poor outputs regardless of the input text or image. Bottom: We then initialize Mlock from Md and apply score-distillation fine-tuning with two teachers: unauthorized predictions are aligned with the degraded model Md, while authorized predictions (with the correct key) are aligned with the original model Mo. In this way, the authorization mechanism is embedded directly into the model weights.
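The dual-teacher objective described above can be sketched as a single training-step loss. This is a minimal, illustrative PyTorch sketch, not the paper's implementation: the function name, argument names, and the assumption that the clean teacher sees the key-free conditioning image are our own; both teachers are frozen, so the two branches distill toward separate fixed targets and their gradients do not conflict.

```python
import torch
import torch.nn.functional as F


def diverged_score_distillation_loss(student, teacher_clean, teacher_degraded,
                                     x_t, t, text_emb, img_cond, keyed_img_cond):
    """One training step of the dual-teacher scheme (illustrative sketch).

    student          -- Mlock, initialized from the degraded model Md
    teacher_clean    -- frozen original model Mo (defines authorized behavior)
    teacher_degraded -- frozen degraded model Md (defines locked behavior)
    keyed_img_cond   -- conditioning image with the visual key embedded
    img_cond         -- the same conditioning image without the key
    All names here are hypothetical, not taken from the paper's code.
    """
    with torch.no_grad():
        # Clean teacher supplies the target for authorized (keyed) inputs.
        eps_auth_target = teacher_clean(x_t, t, text_emb, img_cond)
        # Degraded teacher supplies the target for unauthorized inputs.
        eps_unauth_target = teacher_degraded(x_t, t, text_emb, img_cond)

    # Student predictions on the same noisy latent, with and without the key.
    eps_auth = student(x_t, t, text_emb, keyed_img_cond)
    eps_unauth = student(x_t, t, text_emb, img_cond)

    # Separate frozen targets per branch: no single multi-task objective,
    # hence no gradient interference between the two behaviors.
    loss_auth = F.mse_loss(eps_auth, eps_auth_target)
    loss_unauth = F.mse_loss(eps_unauth, eps_unauth_target)
    return loss_auth + loss_unauth
```

Because the targets are detached teacher outputs, only the student receives gradients, and the locked (unauthorized) branch is trained with exactly the same mechanism as the editing (authorized) branch.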