Full-range yaw prediction: A multi-view approach for 3D head model pose estimation using convolutional neural network
Publication date
2023
ISBN
978-1-64368-449-9
Abstract
Head pose estimation, a crucial task in computer vision, involves determining the orientation of a person's head in 3D space through yaw, pitch, and roll angles. While recent techniques achieve excellent results in estimating head pose from a single 2D RGB image when the head faces the camera directly, few methods exist for pose estimation from arbitrary viewpoints. This problem is more pronounced when the input data is 3D, such as head models reconstructed from magnetic resonance imaging, where an accurate estimation of the pose is necessary for diagnostic purposes. To overcome these limitations, we take a first step by proposing a method for fine-grained head pose estimation across the full range of yaw angles using synthetic 3D head models. Our approach transforms the 3D pose estimation problem into a multi-class 2D image classification problem by representing 3D head models as multi-view projection images. Leveraging a fine-tuned ResNet50 convolutional neural network, we tackle the task of head pose estimation with a fine granularity of 5°, effectively discretizing the 360° of yaw orientation. To evaluate our proposal, we train and test our models on the publicly available FaceScape and 3D BIWI datasets, obtaining promising results.
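The discretization described in the abstract can be illustrated with a minimal sketch: splitting 360° of yaw into 5° bins gives 72 classes. The abstract does not say where the bins start or how a predicted class is mapped back to an angle, so the half-open bins starting at 0°, the bin-centre decoding, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
BIN_WIDTH_DEG = 5                     # granularity stated in the abstract
NUM_CLASSES = 360 // BIN_WIDTH_DEG    # 72 yaw classes


def yaw_to_class(yaw_deg: float) -> int:
    """Map any yaw angle in degrees to a class index in [0, NUM_CLASSES).

    Assumption: class k covers the half-open interval [5k, 5k + 5) degrees.
    """
    wrapped = yaw_deg % 360.0         # wrap negative or >360 angles
    # The final modulo guards against float rounding that can make
    # `wrapped` equal to 360.0 for tiny negative inputs.
    return int(wrapped // BIN_WIDTH_DEG) % NUM_CLASSES


def class_to_yaw(index: int) -> float:
    """Decode a class index to the centre angle of its bin (assumption)."""
    return index * BIN_WIDTH_DEG + BIN_WIDTH_DEG / 2


def angular_error(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)
```

With this convention, a yaw of -5° wraps to 355° and falls in the last class, and the error between neighbouring classes across the 0°/360° boundary is measured as 5° rather than 355°.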
Document type
Article
Document version
Published version
Language
English
Subjects (UDC)
004 - Computer science
61 - Medicine
62 - Engineering. Technology
Pages
4 p.
Published by
IOS Press
Published in
Proceedings of the 25th International Conference of the Catalan Association for Artificial Intelligence
Rights
© The author
Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by-nc/4.0/