Integration of lexicographic data: the diachronic plane

Dorota Mika is a passionate aspiring lexicographer, focusing on the history of the Polish language. She has worked on a variety of lexicographic projects, including the electronic Conceptual Dictionary of Old Polish. To develop a clear workflow for integrating and merging diachronic lexicographic data from electronic dictionaries, she applied to visit the Instituut voor de Nederlandse Taal and benefit from its long-standing expertise.

How did you learn about the ELEXIS travel grants?

I am a researcher at the Institute of Polish Language, Polish Academy of Sciences, which is the leading centre of lexicographic research in Poland. Our team, including me, regularly visits the ELEXIS website.

© Dorota Mika, 2020

What is your project about?

I would like to find a way to integrate diachronic lexicographic data. The main objective of the action is to prepare a concept of integrating data from four electronic dictionaries created at the Institute of Polish Language, Polish Academy of Sciences. I want to find the answers to such questions as how to combine data from several dictionaries in an automatic way, and how to search for links at the level of headwords, word meanings and inter-word relations to show a language in a diachronic perspective.

What background brought you to this point?

I received a PhD in linguistics in 2018. I started my scientific work at the Department of Old Polish at the Institute of Polish Language, Polish Academy of Sciences. I am now part of the team at the Department of Methodology, which is working on a project to integrate lexicographic data. Our challenge is to integrate different types of lexicographic materials: printed dictionaries, electronic ones (closed and still ongoing), word card catalogues, supplements and appendices. My research interests focus on lexicography, the history of the Polish language and the historical evolution of word meaning.

Which hosting institution did you apply to and why?

My priority is to find solutions for integrating data in a diachronic perspective, so I chose the Instituut voor de Nederlandse Taal as my hosting institution. That is where the Diachronic Semantic Lexicon of Dutch was created. I would like to exchange ideas and learn about the technologies, lexicographical tools and infrastructure used in lexicographic projects, as well as the techniques and tools dedicated to e-lexicography.

Where does your interest in lexicography come from and what keeps you motivated?

I started my work at the Institute of Polish Language, Polish Academy of Sciences when the last volume of the Dictionary of Jan Kochanowski’s Polish (Słownik polszczyzny Jana Kochanowskiego) was being prepared for printing. This allowed me to learn the micro- and macrostructure of this dictionary of an idiolect, that is, of a single author’s language.
The first large project I was fully involved in was the electronic Conceptual Dictionary of Old Polish. The project required a transition from the printed version of the Old Polish Dictionary to an electronic one. Almost 30 dictionaries documenting the Polish language have been created at the Institute of Polish Language, Polish Academy of Sciences. They cover the history of the Polish language, contemporary Polish, dialect varieties, onomastics, language contacts, and Polish grammar. This collection is diverse: apart from electronic dictionaries (closed and still ongoing), it includes printed dictionaries, dictionary supplements, word card catalogues, and resource descriptions.

Together with Krzysztof Nowak, I currently coordinate the DARIAH-PL project at the Institute of Polish Language, Polish Academy of Sciences (POIR.04.02.00-00-D006/20; funded under the Smart Growth Operational Programme 2014-2020), which is being carried out to build Dariah.lab, a research infrastructure for digital humanities. We provide a description of multidimensional linguistic data for the creation of infrastructure modules and the discovery of links between linguistic data and historical, geographical, and bibliographical information. We are developing the environment in which dictionaries will be digitized and integrated, so that users can access them through a modern, easy-to-use web interface linked to the Dariah.lab infrastructure.

In my work, I have access to unique lexicographic collections and I can learn from highly qualified specialists in the field of lexicography. I find it very inspiring.

Profile: Dorota Mika
Travel Grant Call: 4
Period of stay: 25.–29.5.2020 (postponed due to COVID-19), 30.5.–3.6.2022
Project title: Integration of lexicographic data: the diachronic plane
Home institution: Institute of Polish Language, Polish Academy of Sciences
#elexis_pl
Hosting institution: Instituut voor de Nederlandse Taal (INT, the Netherlands)
#elexis_nl