Towards efficient deep neural networks for robust visual recognition
Thesis event information
Date and time of the thesis defence
Place of the thesis defence
Online
Topic of the dissertation
Towards efficient deep neural networks for robust visual recognition
Doctoral candidate
Master of Science Jiehua Zhang
Faculty and unit
University of Oulu Graduate School, Faculty of Information Technology and Electrical Engineering, Center for Machine Vision and Signal Analysis
Subject of study
Computer Vision, Efficient Feature Representation, Remote Sensing Image Analysis
Opponent
Professor Enrico Magli, Politecnico di Torino
Custos
Professor Olli Silvén, University of Oulu
Towards efficient deep neural networks for robust visual recognition
Deep neural networks (DNNs) have emerged as a dominant paradigm for computer vision tasks, significantly advancing the capabilities of visual intelligence systems. Recent state-of-the-art models typically rely on massive parameter counts and intricate architecture designs to learn highly generalized representations that adapt to dynamic real-world scenarios. The widespread deployment of DNNs on edge devices (e.g., UAVs, cellphones, and satellites) imposes stringent efficiency requirements. At the same time, noise from imaging sensors and environmental interference can undermine a model’s discriminative ability, reducing recognition accuracy in real-world applications. Consequently, developing lightweight DNNs that retain high performance under complex conditions remains a critical challenge for practical AI applications.
To this end, this thesis focuses on optimizing the accuracy-efficiency trade-off of lightweight DNNs. On the one hand, we develop dedicated lightweight architecture designs for efficient feature representation and enhanced discriminative ability. On the other hand, we explore effective quantization strategies to narrow the performance gap between extremely low-bit DNNs and full-precision DNNs. Furthermore, we exploit the strengths of the median robust extended local binary pattern (MRELBP) to build a DNN that is robust to various noise corruptions.
We start by exploring efficient feature aggregation and object response refinement for salient object detection in optical remote-sensing images (ORSI-SOD). We explicitly model local boundary cues and mitigate the suppression of object features caused by redundant backgrounds, improving the performance of lightweight DNNs in complex scenes. Next, we propose the first ORSI-SOD architecture based on a lightweight vision foundation model, achieving an exceptional efficiency-accuracy trade-off. Then, we propose dynamic threshold learning and deep semantic feature guidance to improve the unsatisfactory accuracy of binarized neural networks (BNNs), the extreme case of DNN quantization. Finally, we propose a novel noise-robust convolutional operator to address diverse real-world noise interference.
Created 14.11.2025 | Updated 17.11.2025