Ravnur – the Faroese Speech Recognizer

Annika Simonsen is helping to establish a lexicographic landscape for the Faroese language: through Project Ravnur she intends to advance Faroese language technology, especially speech recognition. Det Danske Sprog-og Litteraturselskab (DSL, Denmark) and the University of Copenhagen (UCPH, Denmark) are her ideal partners to get the ball rolling.

How did you learn about the ELEXIS travel grants?

The technical leader of my project, Peter Juel Henrichsen, told me about the ELEXIS travel grants. Because of our joint research efforts for this project, in which I work with lexicography, he encouraged me to apply for a research grant.

What is your project about?

Project Ravnur is a Faroese speech recognition project that creates versatile language resources for a broad range of Faroese language technology (a so-called BLARK, or Basic Language Resource Kit). My current role in the project is to oversee our transcription assistants, but I am also creating a wide-coverage dictionary with part-of-speech (PoS) tags and phonetic information for all word forms.

© by Annika Simonsen, 2020

What is your background that brought you up to this point?

I graduated from the University of Edinburgh with an MSc in Applied Linguistics in 2019. I have known for a long time that I wanted to work with Faroese linguistics – my interest lies in minority languages and Faroese is my mother tongue.

During my master's, I developed an interest in language technology, and I was thrilled to be hired as a linguist by Project Ravnur.

Which hosting institution did you apply to and why?

I applied to visit Det Danske Sprog-og Litteraturselskab (DSL, Denmark) and the University of Copenhagen (UCPH, Denmark) because they are the Danish partners in the PAROLE project; this makes them my ideal hosts, since I am making a Faroese PAROLE tagset and corpus.
Furthermore, DSL and UCPH have been building ML-based components for tagging and analyzing Danish and other Nordic orthography since the early 1990s. Together, DSL, UCPH and DSN (Dansk Sprognævn) form one of the most important research centers for NLP in the Nordic region, specializing in language technology for the West Nordic languages.

Where does your interest in languages/lexicography come from and what keeps you motivated?

During my undergraduate and postgraduate studies, I attended a handful of lexicography courses, but it was at Project Ravnur that I got the opportunity to work hands-on with lexicography. There is no established lexicographic field in Faroese linguistics yet, but it is easy to recognize the need for one, especially for language technologists. My biggest motivation is knowing that all the work and materials I contribute within this project are going to be open-source and freely available for everyone to use for Faroese linguistic research.

Profile: Annika Simonsen
Travel Grant Call 4
Period of stay

20.9. – 2.10.2020 (tentative dates)

17.11. – 26.11.2021 (confirmed dates)

Project title

Ravnur – the Faroese Speech Recognizer

Home institution

Grunnurin Føroysk Teldutala (‘The Faroese Language Technology Foundation’), Tórshavn

#elexis_is
Hosting institution

Det Danske Sprog-og Litteraturselskab (DSL, Denmark) & the University of Copenhagen (UCPH, Denmark)

#elexis_dk