Guoying Zhao is currently a Professor with the Center for Machine Vision and Signal Analysis, University of Oulu, Finland, where she has been a senior researcher since 2005 and an Associate Professor since 2014. In 2011, she was selected for the highly competitive Academy Research Fellow position. Her students and researchers are frequent recipients of prestigious and competitive fellowships, such as the Nokia Scholarship, the Infotech position, Tauno Tönning research funding, and the Jorma Ollila grant. Her research has been reported by Finnish TV programs, newspapers, and MIT Technology Review.
In the first part of this work, two image description methods are presented to provide discriminative representations for image classification. In the second part, building on static image description and deformable image registration, video analysis is studied for dynamic texture description, synthesis, and recognition.
In recent years, facial expression recognition has become a useful means for computers to understand the emotional state of human beings. This thesis contributes to the research and development of facial expression recognition systems from the above two aspects.
The face plays an important role in our social interactions, as it conveys rich sources of information. This thesis concerns using computer vision methodologies to analyse two kinds of subtle facial information that can hardly be perceived by the naked eye: micro-expressions and heart rate.
This thesis makes a preliminary exploration of micro-gestures. By recording sequences of spontaneous body gestures during games with a Kinect V2, a micro-gesture dataset is built. Two sets of neural network architectures are implemented for the micro-gesture segmentation and recognition tasks.
In this thesis, a new dataset of facial expressions of patients with lower back pain is collected and annotated. Both static features (the well-known LBP and Dense SIFT) and dynamic features (LBP-TOP and 3D-SIFT) are extracted. Experiments on pain detection and pain intensity estimation are carried out, and the results are compared and analyzed.
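To illustrate the kind of static feature involved, the basic 8-neighbour LBP operator mentioned above can be sketched as follows. This is a minimal generic implementation for intuition only, not the thesis's actual feature pipeline; the function name and normalisation choice are assumptions.

```python
import numpy as np

def lbp_histogram(img):
    """Basic 8-neighbour LBP sketch: each interior pixel is encoded by
    thresholding its 3x3 neighbourhood against the centre value, and the
    histogram of the resulting 8-bit codes is the texture descriptor."""
    img = np.asarray(img, dtype=np.int32)
    c = img[1:-1, 1:-1]                       # centre pixels
    # neighbour offsets, clockwise from top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy: img.shape[0] - 1 + dy,
                    1 + dx: img.shape[1] - 1 + dx]
        codes |= (neigh >= c).astype(np.int32) << bit
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / hist.sum()                  # normalised 256-bin histogram
```

The dynamic variant LBP-TOP applies the same idea on three orthogonal planes (XY, XT, YT) of a video volume and concatenates the three histograms.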
In this thesis, we design and establish a new task-driven eye tracking dataset of 47 subjects. Moreover, we provide baseline results by evaluating popular saliency models. Furthermore, we discuss the influence of tasks and image semantics on human visual behavior and provide suggestions for future scanpath estimation research.
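Saliency models such as those evaluated above are commonly scored against recorded fixations with metrics like Normalized Scanpath Saliency (NSS). The sketch below shows a standard NSS computation; it is a generic illustration, not necessarily the evaluation protocol used in the thesis.

```python
import numpy as np

def nss(saliency_map, fixations):
    """Normalized Scanpath Saliency: z-score the saliency map, then
    average its values at the fixated pixel locations.
    fixations: iterable of (row, col) coordinates."""
    s = np.asarray(saliency_map, dtype=float)
    s = (s - s.mean()) / (s.std() + 1e-12)    # zero mean, unit variance
    rows, cols = zip(*fixations)
    return s[list(rows), list(cols)].mean()
```

Higher values mean the model places more saliency mass where humans actually looked; a chance-level map scores around zero.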
We set up a data collection system with a Kinect camera to append five more gesture categories, with complementary samples in the same format, to the ChaLearn Looking at People dataset. A deep learning approach is developed to detect and classify the samples with a Deep Belief Network (DBN) and the Viterbi algorithm, using only the skeleton data as input.
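The Viterbi step in such a pipeline decodes the most likely frame-by-frame label sequence from per-frame class scores (e.g. produced by a DBN). The following is a generic log-domain Viterbi sketch, not the thesis's implementation; the array shapes and names are assumptions.

```python
import numpy as np

def viterbi(log_emissions, log_trans, log_prior):
    """Viterbi decoding over per-frame class scores.
    log_emissions: (T, K) per-frame log probabilities for K classes,
    log_trans: (K, K) log transition matrix (row = previous state),
    log_prior: (K,) initial log probabilities.
    Returns the most probable state index for each of the T frames."""
    T, K = log_emissions.shape
    delta = np.empty((T, K))                  # best log score ending in state k
    back = np.zeros((T, K), dtype=int)        # backpointers
    delta[0] = log_prior + log_emissions[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # (K, K): prev -> current
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emissions[t]
    # trace back the best state path from the final frame
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path
```

The transition matrix lets the decoder smooth away single-frame misclassifications, which is what makes it useful for segmenting continuous gesture streams.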