Artificial intelligence used in editing the digital dictionaries of Skolt Saami and North Saami

The Giellagas Institute at the University of Oulu is publishing its Skolt Saami and North Saami dictionaries in digital format. The authors used artificial intelligence as part of the editing work.

Developing and digitalising dictionaries is part of describing and revitalising minority languages. The digital format helps modernise the languages and, compared with a printed book, allows more translated words to be included in the dictionary. A digital dictionary is easy to supplement, expand and correct as the vocabulary and the norms of the standard language become more precisely defined.
For instance, Skolt Saami is still developing as a standard language and has been taken into use in new domains in recent years. Consequently, its vocabulary has grown considerably, and its renewal has been particularly rapid over the last decade.

This year, Skolt Saami was introduced as a major subject at the University of Oulu.

“The Skolt Saami digital dictionary is remarkably significant. It is the most up-to-date dictionary of Skolt Saami, and it contains a higher number of contemporary language words than its predecessors. The publication of the digital dictionary is a significant advancement from the teaching point of view as well. It is also important that the Skolts themselves participated in compiling the dictionary”, says Anni-Siiri Länsman, Director of the Giellagas Institute.

The digital dictionaries were compiled with Ve’rdd, a tool developed at the University of Helsinki for editing dictionaries of small languages. The tool includes an artificial intelligence component that can, among other things, inflect words automatically. This reduces the time required for editing a dictionary, because the inflected forms of each word no longer have to be written out by hand. The user can correct any incorrect forms the artificial intelligence produces.

The machine-readable dictionary format makes it possible to correct mistakes made by the artificial intelligence, which in turn improves its grasp of the language’s inflections. In addition, the same artificial intelligence and the vocabulary produced using Ve’rdd can be used directly in proofreading applications and language teaching software.
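
The workflow described above, in which forms are generated automatically and then corrected by a human editor, can be illustrated with a small Python sketch. This is not Ve’rdd’s actual implementation or API: the class `ParadigmEditor`, the tag names and the suffix rules are all hypothetical, and the deliberately naive rules stand in for whatever model the real tool uses. Finnish nouns are used only because their forms are easy to verify. Note also that the sketch merely stores corrections and returns them on later lookups; it does not learn from them.

```python
from dataclasses import dataclass, field

# Toy suffix rules (an assumption for illustration, not Ve'rdd's method):
# grammatical tag -> ending appended to the headword.
SUFFIX_RULES = {
    "Nom.Sg": "",
    "Gen.Sg": "n",
    "Nom.Pl": "t",
}


@dataclass
class ParadigmEditor:
    """Generates inflected forms and remembers an editor's corrections."""

    # (headword, tag) -> form supplied by a human editor
    corrections: dict = field(default_factory=dict)

    def inflect(self, headword: str) -> dict:
        """Return tag -> form, preferring stored corrections over the rules."""
        paradigm = {}
        for tag, suffix in SUFFIX_RULES.items():
            guess = headword + suffix
            paradigm[tag] = self.corrections.get((headword, tag), guess)
        return paradigm

    def correct(self, headword: str, tag: str, form: str) -> None:
        """Store a human correction so later lookups return the fixed form."""
        if tag not in SUFFIX_RULES:
            raise KeyError(f"unknown tag: {tag}")
        self.corrections[(headword, tag)] = form


editor = ParadigmEditor()
print(editor.inflect("talo"))  # regular noun: the naive rules happen to be right
print(editor.inflect("käsi"))  # stem-changing noun: the naive guesses are wrong
editor.correct("käsi", "Gen.Sg", "käden")
editor.correct("käsi", "Nom.Pl", "kädet")
print(editor.inflect("käsi"))  # the corrected forms are now returned
```

Keying corrections by headword and tag means a fix to one form leaves the rest of the generated paradigm untouched, so the editor only has to intervene where the automatic guess fails.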

The digital Finnish–Skolt Saami dictionary has more than 16,000 Finnish headwords and almost 19,000 translations. The digital North Saami–Finnish dictionary, in turn, has over 50,000 headwords. Its manuscript was prepared by Pekka Sammallahti, Professor Emeritus, on the basis of his earlier dictionaries (1989 and 1993).

The digital dictionaries will be published on the shared platform of the Giellatekno Centre for Saami language technology at the University of Tromsø and the Divvun project:

Finnish–Skolt Saami Dictionary https://saan.oahpa.no/fin/sms/
North Saami–Finnish Dictionary http://satni.org/sammallahtismefin

The editing of both dictionaries was carried out mainly with special funding granted by the Ministry of Education and Culture to the Giellagas Institute at the University of Oulu. The Finnish–Skolt Saami dictionary was produced jointly by the Giellagas Institute, the University of Helsinki, the University of Tromsø and the Saami Parliament.

The Giellagas Institute has a national responsibility to organise and promote highest-level education and research in the Saami language and Saami culture in Finland.

Last updated: 4.11.2020