The Jožef Stefan Institute (JSI) has been involved in compiling the majority of existing Slovene mono- and multilingual corpora, in developing manually annotated corpora for training language annotation tools, in developing the tools themselves (part-of-speech taggers, lemmatisers, parsers and named entity recognisers), and in standardising linguistic encoding within TEI and ISO.
It is also the home of the Slovene research infrastructure CLARIN.SI, a member of the CLARIN ERIC.
Existing tools and services include:
Visiting researchers will have access to all of the resources and services at JSI and in CLARIN.SI, as well as to the resources of the Centre for Language Resources and Technologies at the University of Ljubljana (an institution with observer status in ELEXIS).
The resources range from lexical databases, dictionaries, and reference, specialised and training corpora to language technologies such as part-of-speech taggers and parsers.
Visiting researchers will also have at their disposal the expertise of researchers such as Simon Krek, Tomaž Erjavec, Marko Grobelnik and Dunja Mladenič, who are internationally acknowledged experts in their respective fields.