Towards efficient deep neural networks for robust visual recognition

Thesis event information

Date and time of the thesis defence

Place of the thesis defence

Online

Topic of the dissertation

Towards efficient deep neural networks for robust visual recognition

Doctoral candidate

Master of Science Jiehua Zhang

Faculty and unit

University of Oulu Graduate School, Faculty of Information Technology and Electrical Engineering, Center for Machine Vision and Signal Analysis

Subject of study

Computer Vision, Efficient Feature Representation, Remote Sensing Image Analysis

Opponent

Professor Enrico Magli, Politecnico di Torino

Custos

Professor Olli Silvén, University of Oulu

Towards efficient deep neural networks for robust visual recognition

Deep neural networks (DNNs) have emerged as a dominant paradigm for computer vision tasks, significantly advancing the capabilities of visual intelligence systems. Recent state-of-the-art models typically employ massive parameter counts and intricate architecture designs to achieve highly generalized representations capable of adapting to dynamic real-world scenarios. The widespread deployment of DNNs on edge devices (e.g., UAVs, cellphones, and satellites), however, imposes stringent efficiency requirements. In addition, noise from imaging sensors and environmental interference can undermine a model’s discriminative ability, reducing recognition accuracy in real-world applications. Consequently, developing lightweight DNNs that retain high performance under complex conditions remains a critical challenge for practical AI applications.

To this end, this thesis focuses on optimizing the accuracy-efficiency trade-off of lightweight DNNs. On the one hand, we develop dedicated lightweight architecture designs for efficient feature representation and enhanced discriminative ability. On the other hand, we explore effective quantization strategies to narrow the performance gap between extremely low-bit DNNs and full-precision DNNs. Furthermore, we draw on the strengths of the median robust extended local binary pattern (MRELBP) to build a DNN that is robust to various noise corruptions.

We start by exploring efficient feature aggregation and object response refinement for salient object detection in optical remote-sensing images (ORSI-SOD). We explicitly model local boundary cues and mitigate the suppression of object features caused by redundant backgrounds, improving the performance of lightweight DNNs in complex scenes. Next, we propose the first ORSI-SOD architecture based on a lightweight vision foundation model, achieving an exceptional efficiency-accuracy trade-off. Then, we propose dynamic threshold learning and deep semantic feature guidance to improve the unsatisfactory accuracy of binarized neural networks (BNNs), an extreme case of DNN quantization. Finally, we propose a novel noise-robust convolutional operator to address diverse real-world noise interference.
Created 14.11.2025 | Updated 17.11.2025