Learning-based human action and affective gesture analysis

Thesis event information

Date and time of the thesis defence

Place of the thesis defence

Remote connection (check the link below)

Topic of the dissertation

Learning-based human action and affective gesture analysis

Doctoral candidate

Master of Science Henglin Shi

Faculty and unit

University of Oulu Graduate School, Faculty of Information Technology and Electrical Engineering, Center for Machine Vision and Signal Analysis

Subject of study

Computer science and engineering


Associate professor Giovanna Varni, University of Trento


Academy Professor Guoying Zhao, University of Oulu

Visit thesis event

Understanding activities and emotions from human body gestures using computer vision and machine learning techniques.

Human behavior understanding is an essential capability for developing applications and technologies that assist our daily lives and work. Machines are expected to understand humans comprehensively, in terms of both their activities and their emotions. This thesis is devoted to investigating computer vision and machine learning techniques for human behavior analysis. The study is conducted in three consecutive stages: (1) human action recognition through gestures; (2) human affective gesture recognition; and (3) human gesture detection.

Firstly, this thesis investigates robust human action recognition using skeleton data. Skeleton data has recently been widely used for human behavior understanding, since its large-scale extraction at low cost has become feasible. However, the reliability of the extracted skeletons is a concern among researchers, because extraction can produce inaccurate results under changing illumination, occlusion, and similar conditions. To address this problem, two noise-resistant skeleton-based action recognition methods are developed.

Secondly, the thesis investigates human emotion understanding from body gestures. On the one hand, the problem of recognizing expressed emotion from body gestures is studied. For this purpose, a multi-scale graph convolution network that can effectively model temporal dynamics for emotion recognition is developed. On the other hand, the thesis explores the recognition of micro-gestures for identifying suppressed human emotions. To this end, a multi-modal micro-gesture dataset is collected and an unsupervised micro-gesture recognition method is proposed.

Lastly, this thesis studies the problem of human gesture detection. In real-world scenarios, a given video may contain an arbitrary number of gestures, and their start and end times are unknown. Consequently, recognition methods on their own cannot be applied directly. Thus, an action detection method is developed that localizes possible gestures in time and simultaneously recognizes their types.

In the last chapter, this thesis discusses the contributions and limitations of the work. It also discusses future research directions for body gesture analysis and proposes potential applications in human activity analysis and emotion understanding.
Last updated: 20.1.2023