Developing a corpus-based method for extracting French and Albanian polylexical units

Eglantina Gishti is a passionate lexicographer, as well as an experienced linguist, professor and translator. To overcome the methodological, quantitative and qualitative gap between the various dictionaries she works with on a daily basis, she decided to visit ELEXIS infrastructures in Denmark to learn how to develop and improve the tools and services needed for her work.

© Eglantina Gishti, 2019

How did you learn about the ELEXIS travel grants?

My work focuses on lexicography and corpus linguistics, and I try to stay as up to date as possible.
I regularly visit websites that deal with my field of study. I saw an informative video about the ELEXIS project on the Euralex website, so I visited the ELEXIS website, where I discovered the announcement of the open call for travel grants.

Which hosting institution did you apply to?

I chose Det Danske Sprog-og Litteraturselskab (DSL, Denmark) and the University of Copenhagen (UCPH, Denmark) because their work meets the needs of my project. I figured that I would gain the best expertise and experience in corpus-based work there, covering all phases of the work (language resources, corpus compilation, syntactic and semantic annotation, etc.).

What is your project about?

My project deals with the creation of a corpus-based method for the extraction of polylexical units (starting with French and Albanian; in due course, we will be able to include other languages).

I decided to work on this project because, as a linguist, professor and translator, I realized that the dictionaries available are not sufficient for our work, especially regarding polylexical units. Additionally, the field of lexicography is not very well developed in my country [i.e. Albania] and there is a huge methodological, quantitative and qualitative gap between the dictionaries we are working with on a daily basis (mono-/bilingual Albanian dictionaries and mono-/bilingual French and English dictionaries).

The goals of my project are:

  • to improve the status quo of Albanian corpus-oriented tools and Albanian dictionary-oriented tools in general;
  • to design a toolbox for the digitization of texts (in compliance with the authors’ rights law in Albania);
  • to define the extraction process that determines which polylexical units belong to the common lexicon and which belong to the specialized lexicon: this work will allow an analysis of the treatment of collocations in the respective entries;
  • to use French databases (subject to the authorization of the respective institutions);
  • to provide theoretical and practical results and suggestions.

What is your background that brought you to this point?

I hold a PhD in Linguistics, with a specialization in French lexicography. During my PhD studies, I was a member of a research team (LabLex) working on the creation of a bilingual dictionary: Nouveau Dictionnaire Général Bilingue français-italien / italien-français.
Currently, I am a lecturer at the University of Tirana, where I teach linguistics and lexicography in the Department of French Language. My research focuses on lexicography.

Where does your interest in languages/lexicography come from and what keeps you motivated?

I have been studying languages since secondary school; after I enrolled at university, lexicography became my favorite subject.
So I carried on with my studies in linguistics, and when I started my PhD I found myself in a research team of PhD students and professors from many universities in Italy and France, all working in the field of lexicography. My interest in and fascination with dictionaries have only grown since. Over time, I realized that lexicography goes hand in hand with technology, and so I became interested in corpus tools as well.

Profile: Eglantina Gishti
Travel Grant Call 3
Period of stay: 4. – 8.4.2022 (originally 16. – 21.3.2020, postponed)
Project title: A Corpus-based method for Extraction of Polylexical Units (in French and Albanian languages)

Home institution:
Faculty of Foreign Languages, University of Tirana
#elexis_al

Hosting institutions:
Det Danske Sprog-og Litteraturselskab (DSL, Denmark)
University of Copenhagen (UCPH, Denmark)
#elexis_dk