Let’s decode disease and health!

Welcome to the opening post of the Decoding health and disease blog! Our blog explores phenomena in population health science and computational medicine. We, the main authors, work as researchers at the Faculty of Medicine, University of Oulu, and specialize in genetics and statistics. We are also privileged to occasionally host distinguished guest experts from other disciplines.

First, let’s provide some background information.

What are population health science and computational medicine, exactly?

Population health science is an interdisciplinary field that studies factors influencing human health in populations or population groups. The information it produces helps us understand the state of public health, identify factors shaping the risk of diseases, detect health trends, and assess the effectiveness of healthcare interventions like vaccination programs.

Computational medicine harnesses extensive biological data and statistical methods. It delves deep into the world of genes and molecules, seeking clues about why certain diseases occur, how disease mechanisms function, and how new treatments can be developed. In short, population health science and computational medicine help us uncover what makes us healthy or sick and how we can enhance our well-being.

Research in population health and computational medicine seeks to answer a wide range of questions. Areas of interest include estimating the prevalence of a specific disease, predicting disease onset and progression, assessing causal relationships, and unraveling the biological mechanisms underlying diseases. Typical research questions could be, for instance, “How many cases of cancer are detected in Finland each year?”, “What is the likelihood of a heart attack in the next year for individuals belonging to a specific risk group?”, “Is there a causal link between smoking and cerebrovascular disorders?”, or “Which genetic factors contribute to the risk of atopic dermatitis?”.

Every research question and the interpretation of its results have their unique nuances that are crucial to consider to avoid misinterpretation (stay tuned for more on this topic in future posts). A unifying theme across all these questions is the pursuit of answers through data: large datasets form the foundation of our research. You might have come across various population research projects and cohort datasets, such as the exceptional Northern Finland Birth Cohorts, coordinated by the University of Oulu, which have yielded invaluable health-related insights over decades.

Where can we find these large datasets, then?

Datasets can be gathered through various means, one of the most common being survey questionnaires distributed by mail. In Finnish health research, valuable resources are found in national health registers, such as the Medical Birth Register, the Finnish Cancer Registry, and the Finnish National Infectious Diseases Register. These registers offer insights into the health of citizens, with some records dating back to the 1950s.

Registry data and survey questionnaires can be incorporated into more extensive research projects, like the previously mentioned Northern Finland Birth Cohorts. During these studies, participants can also contribute by offering blood samples for biomarker (e.g., cholesterol) measurements, undergoing measurements of physiological variables like weight and blood pressure, and providing DNA samples for the evaluation of hereditary risk factors related to diseases.

Currently, the goal is to merge information from multiple sources into larger, more comprehensive “big data” resources. An excellent example of this is the FinnGen project, which has brought together the health data of half a million Finns, utilizing, among other sources, samples collected by Finnish biobanks.

Towards a healthier life with data

Numbers are extracted from the database, statistical analysis is conducted, a touch of text is added, and a new health article is born. Is it truly as straightforward as it seems?

Extensive population datasets contain numerous variables that cannot be directly controlled by researchers, in contrast to the partial control possible in laboratory-based experiments. Examples of such variables include various environmental and lifestyle factors, which can introduce potential biases and complicate result interpretation. This is where data science enters the picture: it equips us with tools to manage potential sources of research bias. Furthermore, data analysis results are seldom straightforward and require thorough examination and thoughtful consideration by field experts from various perspectives.
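One simple tool for managing such bias is stratification: comparing groups separately within each level of a suspected confounding variable. The sketch below is only an illustration under stated assumptions. The counts are invented, and the scenario (coffee drinking as the exposure, smoking as the confounder) is hypothetical rather than taken from any real dataset.

```python
# Hypothetical counts, invented for illustration only.
# Each stratum: (exposed_cases, exposed_total, unexposed_cases, unexposed_total)
# Exposure = coffee drinking, stratifying variable = smoking status.
STRATA = {
    "smokers": (160, 800, 40, 200),
    "non-smokers": (10, 200, 40, 800),
}


def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk among the exposed divided by risk among the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def crude_risk_ratio(strata):
    """Risk ratio after pooling all strata, i.e. ignoring the confounder."""
    pooled = [sum(column) for column in zip(*strata.values())]
    return risk_ratio(*pooled)


def stratified_risk_ratios(strata):
    """Risk ratio computed separately within each stratum."""
    return {name: risk_ratio(*counts) for name, counts in strata.items()}


print("crude:", round(crude_risk_ratio(STRATA), 3))
for name, rr in stratified_risk_ratios(STRATA).items():
    print(name, round(rr, 3))
```

In these made-up numbers, coffee drinkers appear to have roughly twice the disease risk when everyone is pooled, yet within smokers and within non-smokers the risk is identical for both groups. The apparent effect comes entirely from coffee drinkers being more likely to smoke.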

When interpreting the results, it is crucial to understand the difference between population-level and individual-level perspectives. Population-based findings can lead to the mistaken assumption that they apply universally to everyone. However, as the name implies, population health science examines phenomena at the population level. While it is possible to assess an individual's risk for certain diseases, it is not feasible to definitively identify which individuals within a large group will become ill or remain healthy.

At the individual level, the risk of developing an illness is influenced by a significant degree of randomness. When examining larger populations, however, it becomes easier to account for various sources of uncertainty. Let's consider smoking as an example: at the population level, it is well established that smoking increases the risk of various diseases, including lung cancer. Not all smokers develop lung cancer, and conversely, lung cancer can be diagnosed in individuals who do not smoke. Still, individual cases cannot negate the evidence from numerous population-level studies indicating that smoking is detrimental to health, and that refraining from smoking and avoiding exposure to tobacco smoke yields health benefits.
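The contrast between individual randomness and population-level regularity can be illustrated with a small simulation. This is a minimal sketch: the two risk values are assumptions chosen only to make the point visible, not estimates from any study, and the function names are our own.

```python
import random

# Hypothetical risks, chosen only for illustration (not estimates from any study).
RISK_SMOKER = 0.15
RISK_NON_SMOKER = 0.01


def count_cases(n_people, risk, rng):
    """Count how many of n_people fall ill when each has probability `risk`."""
    return sum(rng.random() < risk for _ in range(n_people))


def simulate(n_per_group, seed=1):
    """Simulate one group of smokers and one group of non-smokers of equal size."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    return {
        "smoker_cases": count_cases(n_per_group, RISK_SMOKER, rng),
        "non_smoker_cases": count_cases(n_per_group, RISK_NON_SMOKER, rng),
    }


# Compare a small and a large population: observed proportion ill in each group.
for n in (100, 100_000):
    result = simulate(n)
    print(n, result["smoker_cases"] / n, result["non_smoker_cases"] / n)
```

In a small group, the observed proportions can drift noticeably from the assumed risks, whereas in a large group they settle close to them. Even then, most simulated smokers stay healthy and some non-smokers fall ill, and nothing in the model can say in advance which individuals those will be.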

In summary, population-level results offer valuable insights, and individuals benefit from them most by following the recommendations and treatment decisions based on these findings. The information generated through research in population health and computational medicine serves various purposes, including informing healthcare service planning and the development of more tailored treatments for specific groups. Ultimately, the main objective is to enhance the quality of life for populations and, in turn, for individuals.

We hope we have succeeded in shedding some light on the intricate yet ever-relevant landscape of our research field. In our forthcoming posts, we will explore subjects like the importance of statistical literacy, unravel the mysteries of ‘omics’ research, and explain what the term ‘risk factor’, commonly used in research results and health news, actually means. Welcome aboard!


Marita Kalaoja, M.Sc. (doctoral thesis defended), doctoral researcher

Ville Karhunen, Ph.D., postdoctoral researcher

Minna Karjalainen, Ph.D., senior research fellow

Eeva Sliz, Ph.D., postdoctoral researcher

Jaakko Tyrmi, Ph.D., postdoctoral researcher