Full-range yaw prediction: A multi-view approach for 3D head model pose estimation using convolutional neural network
Publication date
2023
ISBN
978-1-64368-449-9
Abstract
Head pose estimation, a crucial task in computer vision, involves determining the orientation of a person’s head in 3D space through yaw, pitch, and roll angles. While recent techniques achieve excellent results in estimating head pose from a single 2D RGB image when the head faces the camera directly, few methods exist for pose estimation from arbitrary viewpoints. This problem is accentuated when the input data is 3D, such as head models reconstructed from magnetic resonance imaging, where an accurate estimation of the pose is necessary for diagnostic purposes. To overcome these limitations, we take a first step by proposing a method for fine-grained head pose estimation across the full range of yaw angles using synthetic 3D head models. Our approach transforms the 3D pose estimation problem into a multi-class 2D image classification problem by representing 3D head models as multi-view projection images. Leveraging a fine-tuned ResNet50 convolutional neural network, we tackle the task of head pose estimation with a fine granularity of 5°, effectively discretizing the 360° yaw orientations. To evaluate our proposal, we train and test our models on the publicly available FaceScape and 3D BIWI datasets, obtaining promising results.
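As an illustration of the discretization described in the abstract, the following is a minimal sketch of how a continuous yaw angle could be mapped to one of 360° / 5° = 72 classes and back. The bin layout is an assumption, not taken from the paper: bins are taken to be half-open intervals starting at 0°, and each class is represented by its bin centre. The names `yaw_to_class`, `class_to_yaw`, and `angular_error` are hypothetical helpers introduced here.

```python
BIN_WIDTH_DEG = 5                     # granularity stated in the abstract
NUM_CLASSES = 360 // BIN_WIDTH_DEG    # 72 yaw classes over the full range


def yaw_to_class(yaw_deg: float) -> int:
    """Map a yaw angle in degrees (any real value) to a class index in [0, 72).

    Assumption: class k covers the half-open interval [5k, 5k + 5) degrees.
    """
    wrapped = yaw_deg % 360.0                      # wrap into [0, 360]
    # The final modulo guards against float rounding that yields exactly 360.0.
    return int(wrapped // BIN_WIDTH_DEG) % NUM_CLASSES


def class_to_yaw(class_idx: int) -> float:
    """Return the centre of a class bin in degrees (assumed representative)."""
    return class_idx * BIN_WIDTH_DEG + BIN_WIDTH_DEG / 2.0


def angular_error(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, accounting for wrap-around."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


if __name__ == "__main__":
    for yaw in (0.0, 4.9, 5.0, 181.0, -5.0):
        k = yaw_to_class(yaw)
        print(f"yaw={yaw:7.1f} deg -> class {k:2d} (bin centre {class_to_yaw(k):.1f} deg)")
```

Under this layout, a classifier such as the fine-tuned ResNet50 mentioned above would need a 72-way output layer, and the wrap-around error matters because classes 0 and 71 are neighbours on the circle.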
Document Type
Article
Document version
Published version
Language
English
Subject (UDC)
004 - Computer science and technology. Computing. Data processing
61 - Medical sciences
62 - Engineering. Technology in general
Pages
4 p.
Publisher
IOS Press
Is part of
Proceedings of the 25th International Conference of the Catalan Association for Artificial Intelligence
Rights
© The author
Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by-nc/4.0/