TY - JOUR
T1 - High-fidelity facial reflectance and geometry inference from an unconstrained image
AU - Yamaguchi, Shugo
AU - Saito, Shunsuke
AU - Nagano, Koki
AU - Zhao, Yajie
AU - Chen, Weikai
AU - Olszewski, Kyle
AU - Morishima, Shigeo
AU - Li, Hao
N1 - Funding Information:
We would like to thank Bagel-mo for her grumpy face; Mr. Sirnif and Aviral Agarwal for the renderings; Carrie Sun, Jens Fursund, Mike Seymour, and Emily O’Brien for being our models; and Chen Li for the albedo inference comparisons. This research is supported by Adobe, the Andrew and Erna Viterbi Early Career Chair, the JST ACCEL Grant Number JPMJAC1602, the Waseda Research Institute for Science and Engineering, and the U.S. Army Research Laboratory (ARL) under contract W911NF-14-D-0005. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.
Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018
Y1 - 2018
N2 - We present a deep learning-based technique to infer high-quality facial reflectance and geometry given a single unconstrained image of the subject, which may contain partial occlusions and arbitrary illumination conditions. The reconstructed high-resolution textures, which are generated in only a few seconds, include high-resolution skin surface reflectance maps, representing both the diffuse and specular albedo, and medium- and high-frequency displacement maps, thereby allowing us to render compelling digital avatars under novel lighting conditions. To extract this data, we train our deep neural networks with a high-quality skin reflectance and geometry database created with a state-of-the-art multi-view photometric stereo system using polarized gradient illumination. Given the raw facial texture map extracted from the input image, our neural networks synthesize complete reflectance and displacement maps, as well as complete missing regions caused by occlusions. The completed textures exhibit consistent quality throughout the face due to our network architecture, which propagates texture features from the visible region, resulting in high-fidelity details that are consistent with those seen in visible regions. We describe how this highly underconstrained problem is made tractable by dividing the full inference into smaller tasks, which are addressed by dedicated neural networks. We demonstrate the effectiveness of our network design with robust texture completion from images of faces that are largely occluded. With the inferred reflectance and geometry data, we demonstrate the rendering of high-fidelity 3D avatars from a variety of subjects captured under different lighting conditions. In addition, we perform evaluations demonstrating that our method can infer plausible facial reflectance and geometric details comparable to those obtained from high-end capture devices, and outperform alternative approaches that require only a single unconstrained input image.
AB - We present a deep learning-based technique to infer high-quality facial reflectance and geometry given a single unconstrained image of the subject, which may contain partial occlusions and arbitrary illumination conditions. The reconstructed high-resolution textures, which are generated in only a few seconds, include high-resolution skin surface reflectance maps, representing both the diffuse and specular albedo, and medium- and high-frequency displacement maps, thereby allowing us to render compelling digital avatars under novel lighting conditions. To extract this data, we train our deep neural networks with a high-quality skin reflectance and geometry database created with a state-of-the-art multi-view photometric stereo system using polarized gradient illumination. Given the raw facial texture map extracted from the input image, our neural networks synthesize complete reflectance and displacement maps, as well as complete missing regions caused by occlusions. The completed textures exhibit consistent quality throughout the face due to our network architecture, which propagates texture features from the visible region, resulting in high-fidelity details that are consistent with those seen in visible regions. We describe how this highly underconstrained problem is made tractable by dividing the full inference into smaller tasks, which are addressed by dedicated neural networks. We demonstrate the effectiveness of our network design with robust texture completion from images of faces that are largely occluded. With the inferred reflectance and geometry data, we demonstrate the rendering of high-fidelity 3D avatars from a variety of subjects captured under different lighting conditions. In addition, we perform evaluations demonstrating that our method can infer plausible facial reflectance and geometric details comparable to those obtained from high-end capture devices, and outperform alternative approaches that require only a single unconstrained input image.
KW - Facial modeling
KW - Image-based modeling
KW - Texture synthesis and inpainting
UR - http://www.scopus.com/inward/record.url?scp=85056641761&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85056641761&partnerID=8YFLogxK
U2 - 10.1145/3197517.3201364
DO - 10.1145/3197517.3201364
M3 - Article
AN - SCOPUS:85056641761
SN - 0730-0301
VL - 37
JO - ACM Transactions on Graphics
JF - ACM Transactions on Graphics
IS - 4
M1 - 162
ER -