In my doctoral thesis, Fennistic language technology – data, methods and possibilities, I study which technological methods are currently used to manage data in Finnish language research and which methods could be used in the future. Despite its long traditions, Fennistics still relies on very traditional ways of collecting and processing research data. Through the individual studies in my dissertation, I aim to bring new, technologically aided ways of collecting and processing data to the field of Fennistics, while also deepening cross-disciplinary cooperation.
I have carried out the morphological annotation and error coding of the ICLFI corpus, which consists of a million words, planned the electronic data collection, and transferred the data in full to the Korp server of the Language Bank of Finland. I have processed Oulun Nauhoitearkisto, a vast collection of recordings gathered since the 1950s, on the LAT platform of the Language Bank, and I have also created a metadata collection system for that data. Furthermore, I have built an electronic browsing system for microfiches for the Department of History at the University of Oulu.
- Corpus Linguistics
- Language Technology
- Digital Archiving
- TurkuNLP (Natural Language Processing)
- Northern Sociolinguistic Encounters (NSE)