Deep Learning Methods for Analyzing Vision-Based Emotion Recognition from 3D/4D Facial Point Clouds

Thesis event information

Date and time of the thesis defence

Place of the thesis defence

Auditorium L5, Linnanmaa

Topic of the dissertation

Deep Learning Methods for Analyzing Vision-Based Emotion Recognition from 3D/4D Facial Point Clouds

Doctoral candidate

Master of Science Muzammil Behzad

Faculty and unit

University of Oulu Graduate School, Faculty of Information Technology and Electrical Engineering, Center for Machine Vision and Signal Analysis

Subject of study

Computer Science and Engineering


Professor Hui Yu, University of Portsmouth, UK


Academy Professor Guoying Zhao, University of Oulu



Facial expressions are one of the most important ways in which humans express and communicate emotions. Their role in emphasizing or clarifying a point, in expressing internal feelings or intentions, and in structuring critical aspects of human interaction is widely acknowledged. With the advent of technologies such as deep learning, systems that automatically recognize and analyze facial expressions have proved instrumental in understanding human behavior. This has spurred the development of recognition systems with applications in a wide range of areas including, but not limited to, security, psychology, medicine and robotics.

To further improve the performance of facial expression recognition (FER) systems, 3D/4D facial point clouds have been used to overcome the inherent problems of processing 2D facial images, e.g., out-of-plane motions, head pose variations, and changes in illumination. In this regard, the release of facial expression datasets containing 3D/4D face scans has enabled effective affect recognition by capturing facial deformation patterns both spatially and temporally. At the same time, such data brings its own challenges, for instance, a complex data structure and limited dataset size. Its analysis therefore requires using, extending and introducing suitable approaches to build successful recognition systems.

This thesis develops a number of deep learning methods for building robust models for emotion recognition from 3D/4D facial point clouds. Specifically, the thesis first focuses on collaborative emotion recognition, in which multiple facial views are used together with facial landmarks. Secondly, it highlights the importance of sparsity-aware affect recognition and its role in deep learning models. Thirdly, it presents a multi-view transformer architecture that learns spatial embeddings by exploiting correlations among the multi-view embeddings, along with the formulation of a gradient-friendly loss function. Next, a novel multi-view facial rendezvous model is discussed that learns to recognize expressions in a self-supervised fashion. Finally, the contributions of the thesis are summarized, and some potential future directions for 3D/4D FER research are discussed.
Last updated: 23.1.2024