High-dimensional causal inference with sparse and debiased estimation

Thesis event information

Date and time of the thesis defence

Place of the thesis defence

OP auditorium (L10), Linnanmaa campus

Topic of the dissertation

High-dimensional causal inference with sparse and debiased estimation

Doctoral candidate

Master of Science Zewude Alemayehu Berkessa

Faculty and unit

University of Oulu Graduate School, Faculty of Science, Research Unit of Mathematical Sciences

Subject of study

Mathematical Sciences

Opponent

Associate Professor Pekka Marttinen, Aalto University

Custos

Professor Mikko Sillanpää, University of Oulu

A new regularization-based framework for causal inference in high-dimensional observational data

Determining whether a treatment, policy, or intervention causes a specific outcome is a fundamental problem in many scientific fields. Randomized experiments are the most reliable approach, but they are often impractical, so researchers must rely on observational data, where confounding complicates causal inference. The difficulty grows in high-dimensional settings, where the number of variables is large relative to the number of observations.

This dissertation introduces a new mathematical framework for estimating causal effects in high-dimensional observational data. In such settings, traditional methods may struggle to distinguish meaningful relationships from noise or fail to account for important confounding variables.

The proposed framework combines weighted L1 and L0 regularization with covariate balancing techniques to improve variable selection and reduce bias in treatment effect estimation. Developed through theory, simulation studies, and real data applications, the method offers a flexible and robust approach that captures both strong and weak confounders without relying on an explicit treatment model. The framework also supports valid statistical inference, producing reliable confidence intervals in high-dimensional settings.
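To give a feel for what weighted L1 regularization does in this kind of problem, here is a minimal sketch on simulated data. It is not the dissertation's method: it covers only a weighted (adaptive) lasso on an outcome model, with no L0 penalty, no covariate balancing, and no debiasing step. The simulated data, the ridge-based weights, the penalty level `lam`, and the function names `soft_threshold` and `weighted_lasso` are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the (weighted) L1 penalty, applied elementwise.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_lasso(D, y, lam, weights, n_iter=3000):
    # Minimises (1/2n) * ||y - D b||^2 + lam * sum_j weights[j] * |b_j|
    # by proximal gradient descent (ISTA). Hypothetical helper, not a library API.
    n, p = D.shape
    step = 1.0 / np.linalg.eigvalsh(D.T @ D / n).max()
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = D.T @ (D @ beta - y) / n
        beta = soft_threshold(beta - step * grad, step * lam * weights)
    return beta

# Simulated observational data (assumed for illustration): two of the 50
# covariates confound both treatment assignment and outcome.
rng = np.random.default_rng(0)
n, p, tau = 500, 50, 2.0
X = rng.normal(size=(n, p))
T = (X[:, 0] + X[:, 1] + rng.normal(size=n) > 0).astype(float)
y = tau * T + 1.5 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=n)

# Design matrix with the treatment in column 0; centring removes the intercept.
D = np.column_stack([T, X])
D = D - D.mean(axis=0)
yc = y - y.mean()

# Adaptive weights from a ridge pilot fit: large pilot coefficients are
# penalised lightly, small ones heavily. The treatment is left unpenalised.
pilot = np.linalg.solve(D.T @ D / n + 0.1 * np.eye(p + 1), D.T @ yc / n)
weights = 1.0 / (np.abs(pilot) + 1e-3)
weights[0] = 0.0

beta = weighted_lasso(D, yc, lam=0.02, weights=weights)
tau_hat = beta[0]                                # regularised effect estimate
naive = y[T == 1].mean() - y[T == 0].mean()      # ignores confounding
selected = np.flatnonzero(beta[1:])              # indices of retained covariates

print(f"naive difference in means: {naive:.2f}")
print(f"weighted-lasso estimate:   {tau_hat:.2f} (true effect {tau})")
print(f"covariates retained:       {selected.size} of {p}")
```

In this toy setting the naive difference in means absorbs the confounders' influence, while the penalised regression adjusts for the covariates it retains. A plain lasso estimate of this kind does not by itself yield valid confidence intervals, which is the gap that debiased estimation is meant to close.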

The method is evaluated using simulation studies and real-world datasets, including applications in health and genomics. The results show improved accuracy compared with widely used approaches.

Ultimately, the dissertation provides a practical, open-source framework for causal analysis in complex data environments, enabling more reliable decision-making.
Created 24.3.2026 | Updated 25.3.2026