FreeUV: Ground-Truth-Free Realistic Facial UV Texture Recovery via Cross-Assembly Inference Strategy

1 CyberAgent, AI Lab  2 University of Tsukuba
CVPR 2025

Banner image

FreeUV generates a complete UV texture from a single face image without requiring ground-truth UV supervision during training. The method captures intricate details such as facial hair, wrinkles, occlusions, and makeup, and remains robust across diverse scenarios, producing high-fidelity, coherent textures.
Top to bottom: input face images, recovered UV textures, and FLAME model-based rendering.

Abstract

Recovering high-quality 3D facial textures from single-view 2D images is a challenging task, especially under the constraints of limited data and complex facial details such as wrinkles, makeup, and occlusions. In this paper, we introduce FreeUV, a novel ground-truth-free UV texture recovery framework that eliminates the need for annotated or synthetic UV data. FreeUV leverages a pre-trained Stable Diffusion model alongside a Cross-Assembly inference strategy to fulfill this objective. In FreeUV, separate networks are trained independently to focus on realistic appearance and structural consistency, and these networks are combined during inference to generate coherent textures. Our approach accurately captures intricate facial features and demonstrates robust performance across diverse poses and occlusions. Extensive experiments validate FreeUV's effectiveness, with results surpassing state-of-the-art methods in both quantitative and qualitative evaluations. Additionally, FreeUV enables new applications, including local editing, facial feature interpolation, and texture recovery from multi-view images. By reducing data requirements, FreeUV offers a scalable solution for generating high-fidelity 3D facial textures suitable for real-world scenarios.

Key Idea

Selective domain utilization in FreeUV's texture recovery. Our Cross-Assembly strategy highlights how realistic appearance from in-the-wild images and structural consistency from 3DMM are selectively combined. FreeUV targets a UV-to-UV mapping with a Realistic and Consistent combination for optimal texture generation.

Key Idea Illustration

Method Overview

FreeUV leverages two modules, the Flaw-Tolerant Detail Extractor (left) and the UV Structure Aligner (middle), to separately capture realistic appearance and structural consistency. Combined during the Cross-Assembly inference phase (right), these modules produce high-quality UV textures from single-view images, without requiring ground-truth UV data.
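The data flow described above can be illustrated with a toy sketch. This is not the authors' implementation: the class names `DetailExtractor` and `StructureAligner`, the inputs `appearance_source` and `uv_layout`, and the function `frozen_backbone` are hypothetical stand-ins, and the random linear maps merely take the place of networks that would be trained separately. The only point it shows is that the two conditioning branches hold independent parameters and meet for the first time at inference, where both feed a shared frozen generator.

```python
import numpy as np


class DetailExtractor:
    """Hypothetical stand-in for the Flaw-Tolerant Detail Extractor."""

    def __init__(self, channels, seed):
        rng = np.random.default_rng(seed)
        # Placeholder for parameters learned in the appearance-focused stage.
        self.weight = rng.normal(scale=0.1, size=(channels, channels))

    def __call__(self, appearance_source):
        # (H, W, C) @ (C, C) -> (H, W, C) appearance features
        return appearance_source @ self.weight


class StructureAligner:
    """Hypothetical stand-in for the UV Structure Aligner."""

    def __init__(self, channels, seed):
        rng = np.random.default_rng(seed)
        # Placeholder for parameters learned in the structure-focused stage.
        self.weight = rng.normal(scale=0.1, size=(2, channels))

    def __call__(self, uv_layout):
        # (H, W, 2) @ (2, C) -> (H, W, C) structural features
        return uv_layout @ self.weight


def frozen_backbone(noise, appearance, structure):
    """Toy placeholder for the pre-trained generator (assumed frozen here)."""
    return np.tanh(noise + appearance + structure)


def cross_assembly_inference(detail_extractor, structure_aligner,
                             appearance_source, uv_layout, noise):
    """Combine the two independently built modules at inference time."""
    appearance = detail_extractor(appearance_source)
    structure = structure_aligner(uv_layout)
    return frozen_backbone(noise, appearance, structure)


# Toy inputs: a small "texture" carrying appearance and a UV coordinate grid.
H, W, C = 8, 8, 3
rng = np.random.default_rng(0)
appearance_source = rng.uniform(size=(H, W, C))
u, v = np.meshgrid(np.linspace(0.0, 1.0, W), np.linspace(0.0, 1.0, H))
uv_layout = np.stack([u, v], axis=-1)
noise = rng.normal(size=(H, W, C))

texture = cross_assembly_inference(
    DetailExtractor(C, seed=1), StructureAligner(C, seed=2),
    appearance_source, uv_layout, noise,
)
print(texture.shape)  # (8, 8, 3)
```

Because neither module references the other, either one can be replaced or retrained without touching its counterpart; in this sketch that corresponds to swapping one constructor argument of `cross_assembly_inference`.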

Datasets (Coming Soon ...)

The training data was generated using the following FLAME-based fitting method:

Makeup Extraction of 3D Representation via Illumination-Aware Image Decomposition,
Xingchao Yang, Takafumi Taketomi, Yoshihiro Kanamori,
Computer Graphics Forum (Proc. of Eurographics 2023)

If you find our models and dataset useful, please consider citing the following paper:

BibTeX

@misc{yang2025_freeuv,
  title={FreeUV: Ground-Truth-Free Realistic Facial UV Texture Recovery via Cross-Assembly Inference Strategy},
  author={Xingchao Yang and Takafumi Taketomi and Yuki Endo and Yoshihiro Kanamori},
  year={2025},
  eprint={2503.17197},
}