Partitional capacitated clustering methods for location-allocation problems

Thesis event information

L10, Linnanmaa

Topic of the dissertation

Partitional capacitated clustering methods for location-allocation problems

Doctoral candidate

Master of Science Tero Lähderanta

Faculty and unit

University of Oulu Graduate School, Faculty of Science, Research unit of mathematical sciences

Statistics

Opponent

Professor Tommi Kärkkäinen, University of Jyväskylä

Custos

Professor Mikko Sillanpää, University of Oulu

Partitional capacitated clustering methods for location-allocation problems

Clustering is one of the most important and commonly used tools in data analysis. Most often it is used to partition data points into groups such that the points within a group are similar to each other. This type of partitional clustering is used in many applications, one of which is location-allocation. Location-allocation refers to a set of optimization problems where the goal is to place facilities in a region such that the distances between the facilities and the data points are minimized. For example, the placement of recycling centers is a location-allocation problem in which both the distances to the nearest residences and the processing limit of a single center need to be considered.

This thesis connects partitional clustering to the location-allocation framework. The particular focus is on the constraints and penalties introduced in various location-allocation problems. Notably, one of the most frequently used constraints in the location-allocation literature is a limit on facility capacities. In clustering terms, this corresponds to constraining the sizes of the clusters, and the resulting methods are referred to as partitional capacitated clustering methods.
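As an illustration of what a capacity limit means for a partitional method, the following is a minimal sketch of a k-means-style procedure in which no cluster may hold more than a fixed number of points. It is not the algorithm developed in the thesis: the greedy assignment rule, the function names, and the parameters `capacity`, `iterations`, and `seed` are all assumptions made for this example.

```python
import math
import random


def assign_with_capacity(points, centers, capacity):
    """Greedy heuristic: go through (point, center) pairs from closest to
    farthest, skipping points already assigned and centers that are full."""
    if len(centers) * capacity < len(points):
        raise ValueError("total capacity is smaller than the number of points")
    pairs = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(points)
        for j, c in enumerate(centers)
    )
    labels = [None] * len(points)
    load = [0] * len(centers)
    for _, i, j in pairs:
        if labels[i] is None and load[j] < capacity:
            labels[i] = j
            load[j] += 1
    return labels


def update_centers(points, labels, centers):
    """Move each center to the mean of its members; empty centers stay put."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers


def capacitated_kmeans(points, k, capacity, iterations=20, seed=0):
    """Alternate capacity-limited assignment and center updates
    (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = assign_with_capacity(points, centers, capacity)
    for _ in range(iterations):
        centers = update_centers(points, labels, centers)
        new_labels = assign_with_capacity(points, centers, capacity)
        if new_labels == labels:  # assignment is stable, stop early
            break
        labels = new_labels
    return labels, centers
```

Called on six points of which four lie near the origin and two far away, with `k=2` and `capacity=3`, the sketch is forced to return two clusters of three points each, whereas an unconstrained method would typically split them four and two. The greedy assignment always produces a feasible partition when the total capacity covers all points, but it does not guarantee the minimum total distance.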

In addition to the capacity limits, this thesis proposes various extensions to the clustering method that are inspired by location-allocation applications. These include, but are not limited to, different distance functions, constraints on the possible locations of cluster centers, different memberships of the data points, outlier detection, and additional penalties on the selection of the number of clusters. Furthermore, a connection to model-based clustering is made, in which a statistical model is applied to the location-allocation problem. This brings new insights to the framework, for example by enabling a probabilistic interpretation of the model parameters. Software tools and an optimization algorithm for such problems are also presented.
Last updated: 23.1.2024