Tools and services
The ELEXIS project provides cost-free access to tools and infrastructures developed by the project partners, not only to academic institutions in the EU but also to researchers, teachers and scholars in Lexicography, Linguistics, Terminology, Natural Language Processing, etc., as well as to any other entities interested in the tools & services provided. The number of available resources will grow in the course of the project. Institutions, researchers, scholars and teachers incur no costs for accessing them.
Graphic Guide to ELEXIS Dictionary Tools
If you need help regarding which tools to use for your dictionary, please refer to the following diagram.
Note that certain help material and tools are still under development.
Tools & services available
Terminologists, Linguists, Translators, Teachers
The Sketch Engine corpus query, corpus building and corpus management system allows users to build and work with 550+ text corpora in over 90 languages and 30 scripts. Sketch Engine contains a number of unique tools to analyse large corpora of up to 60 billion words. Each user can benefit from fully automated dictionary-building functionality.
Access to Sketch Engine is funded by the EU through the ELEXIS project between 2018 and 2022. It is provided at no cost to academic institutions and ELEXIS observers and applies to non-commercial use only. More than 450 institutions have used the tool so far.
The ELEXIS funding of access to Sketch Engine terminates on 31 March 2022 for academic users from the EU.
Contact your institution about the future of your access.
Lexonomy is a cloud-based system for writing dictionaries and publishing them online. It scales from large dictionary projects down to small lexicographic works, such as editing and publishing domain-specific glossaries or terminology resources. Lexonomy already interacts with Sketch Engine, and the aim of the project is to develop and expand this interaction further. Sketch Engine can push lexicographic data into Lexonomy to create automatically generated dictionary drafts, and Lexonomy can pull data from Sketch Engine’s corpora during the entry editing process.
OneClick Dictionary (OCD) is a dictionary drafting module. It connects a corpus management system (e.g. Sketch Engine, NoSketch Engine), or even Excel sheets, with our dictionary writing and online dictionary publishing system Lexonomy, and provides an automatically created dictionary draft (e.g. headwords, word forms, collocations, examples) to be post-edited in Lexonomy by the lexicographer.
OneClick Dictionary enables lexicographers to shift all of their work and intellectual input into the post-editing phase, instead of manually analyzing the input data before creating a dictionary draft.
Elexifier is a cloud-based dictionary conversion service. It uses advanced XML parsing and machine learning techniques to help you convert your PDF and XML dictionaries into a standardized machine-readable format. Users can upload their PDF and custom XML dictionaries to Elexifier, define mapping rules for XML transformation or create a machine learning training set for PDF conversion, and download the transformed dictionary in a TEI-compliant file format based on the ELEXIS Data Model.
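To make the idea of mapping rules for XML transformation more concrete, here is a minimal sketch in Python. It is not Elexifier's actual rule format or implementation: the `RULES` dictionary, the custom tags `item`, `hw` and `definition`, and the `convert_entry` helper are all assumptions made for illustration. The target elements (`entry`, `form`, `orth`, `sense`, `def`) are standard TEI dictionary elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping rules (NOT Elexifier's real format):
# custom tag -> path of nested TEI element names to create.
RULES = {
    "hw": ["form", "orth"],          # headword -> <form><orth>
    "definition": ["sense", "def"],  # definition -> <sense><def>
}


def convert_entry(custom_xml: str, rules: dict) -> str:
    """Convert one custom XML entry into a TEI-style <entry> string."""
    source = ET.fromstring(custom_xml)
    entry = ET.Element("entry")
    for child in source:
        path = rules.get(child.tag)
        if path is None:
            continue  # elements without a rule are skipped in this sketch
        node = entry
        for name in path:
            node = ET.SubElement(node, name)
        node.text = child.text
    return ET.tostring(entry, encoding="unicode")


# Illustrative custom entry; the tag names are invented for this example.
CUSTOM = "<item><hw>cat</hw><definition>a small domesticated feline</definition></item>"
TEI = convert_entry(CUSTOM, RULES)
print(TEI)
```

A real conversion would also have to handle attributes, nested senses and namespaces, which this sketch leaves out.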
EDiE: ELEXIS Dictionary Evaluator
This tool evaluates the availability and usability of linked lexical resources and dictionaries that are published through the ELEXIS dictionary API and accessible via the ELEXIS infrastructure.
It allows users to assess different aspects of dictionaries based on their metadata and entries. Furthermore, metrics aggregated over the dictionaries of interest let users compare different dictionaries for their specific use cases.
Tools & services available
for NLP researchers
natural language processing, machine learning, computational linguistics
D3.1 (below) demonstrates the efficacy of Clusty for sense clustering, one of the most challenging tasks in natural language processing.
VerbAtlas is a novel large-scale manually-crafted semantic resource for wide-coverage, intelligible & scalable Semantic Role Labeling. The goal of VerbAtlas is to manually cluster WordNet synsets that share similar semantics into sets of semantically-coherent frames. The main features are:
- 466 semantically-coherent frames using 26 cross-frame VerbNet-inspired semantic roles for their argument structure.
- Available both for download and via RESTful API.
- Full coverage of WordNet 3.0 verb synsets (13,000+).
- Complete linkage to BabelNet 4.0, which supports 280+ languages (new version to come later this year!).
- Manual mapping to PropBank of all CoNLL-2009 and CoNLL-2012 dataset occurrences (5000+ mappings).
- Selectional preferences: the superconcept most probably associated with a semantic role in a frame (e.g. food for the patient role of the EAT frame).
- Default/shadow arguments: arguments logically implied or already incorporated into a verb.
- Implicit arguments: arguments that are implicit in the argument structure of a verb.
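As a rough illustration of how a frame with semantic roles and selectional preferences could be represented, consider the sketch below. It does not reflect VerbAtlas's actual data format or API: the `Frame` class and its field names are assumptions, and the only data used is the EAT example mentioned in the feature list.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Frame:
    """Hypothetical in-memory view of a frame (not the VerbAtlas schema)."""
    name: str
    # role name -> preferred superconcept (selectional preference), if known
    roles: Dict[str, Optional[str]] = field(default_factory=dict)

    def preference(self, role: str) -> Optional[str]:
        """Return the selectional preference for a role, or None if unknown."""
        return self.roles.get(role)


# The one example given in the list above: food for the patient role of EAT.
EAT = Frame("EAT", {"patient": "food"})
PREFERENCE = EAT.preference("patient")
print(PREFERENCE)
```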
SyntagNet is a manually-curated large-scale lexical-semantic combination database which associates pairs of concepts with pairs of co-occurring words. The goal of SyntagNet is to capture sense distinctions evoked by syntagmatic relations, hence providing information which complements the essentially paradigmatic knowledge shared by currently available Lexical Knowledge Bases such as WordNet. Its main features are:
- Wide coverage, with 78,000 noun-verb and noun-noun lexical combinations extracted from the English Wikipedia and the British National Corpus.
- High-quality, fully manual disambiguation for all of the lexical combinations, according to the WordNet 3.0 sense inventory.
- A resulting Lexical Knowledge Base made up of 88,019 semantic combinations linking 20,626 WordNet 3.0 unique synsets with a relation edge.
- A user-friendly web interface for looking up terms and their lexical-semantic combinations, with complete linkage to BabelNet 4.0.
‘NAISC’ means ‘links’ in Irish and is pronounced ‘nashk’.
NAISC 1.0 is a tool for linking datasets and was created by the SFI Insight Centre for Data Analytics and the ELEXIS project. NAISC serves as a system for aligning RDF datasets: it takes two RDF documents as input (referred to as ‘left’ and ‘right’) and outputs an alignment (a set of RDF triples) between these two documents. NAISC typically relies on a configuration, which is a JSON document.
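The input/output shape described here (two sets of triples in, a set of linking triples out, driven by a JSON configuration) can be sketched with a toy aligner. This is not NAISC's algorithm or its configuration schema: the `align` function, the keys `property` and `relation`, the exact-label matching strategy and the example URIs under `left.example` and `right.example` are all assumptions for illustration. Only `rdfs:label` and `skos:exactMatch` are real vocabulary terms.

```python
import json
from typing import List, Tuple

Triple = Tuple[str, str, str]

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SKOS_EXACT = "http://www.w3.org/2004/02/skos/core#exactMatch"

# Hypothetical JSON configuration (NOT NAISC's real schema).
CONFIG = json.loads(json.dumps({"property": RDFS_LABEL, "relation": SKOS_EXACT}))


def align(left: List[Triple], right: List[Triple], config: dict) -> List[Triple]:
    """Toy aligner: link subjects whose labels match, ignoring case."""
    prop = config["property"]
    right_by_label = {}
    for subject, predicate, obj in right:
        if predicate == prop:
            right_by_label.setdefault(obj.lower(), []).append(subject)
    links = []
    for subject, predicate, obj in left:
        if predicate == prop:
            for target in right_by_label.get(obj.lower(), []):
                links.append((subject, config["relation"], target))
    return links


# Invented example data standing in for the 'left' and 'right' documents.
LEFT = [
    ("http://left.example/cat", RDFS_LABEL, "Cat"),
    ("http://left.example/dog", RDFS_LABEL, "Dog"),
]
RIGHT = [
    ("http://right.example/e1", RDFS_LABEL, "cat"),
    ("http://right.example/e2", RDFS_LABEL, "horse"),
]
ALIGNMENT = align(LEFT, RIGHT, CONFIG)
for triple in ALIGNMENT:
    print(triple)
```

A real linking system would compare entries with richer similarity measures than exact label matching; the point of the sketch is only that the alignment itself is a set of triples connecting entities from the two documents.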
MultiMirror: Neural Cross-lingual Word Alignment for Multilingual Word Sense Disambiguation
MultiMirror is a cross-lingual sense projection approach for multilingual WSD. It is based on a novel discriminative word alignment model that jointly aligns all source and target tokens with each other and surpasses its competitors across several language combinations. A standard WSD classifier trained on the sense-tagged datasets it produces achieves state-of-the-art performance on established benchmarks in French, German, Italian, Spanish and Japanese.
MultiMirror was developed by the Sapienza Natural Language Processing Group (Sapienza NLP) and the ELEXIS project.
The BabelNet Linker is a linking web service which produces a mapping between two dictionary definitions in a cross-lingual scenario.
The BabelNet-linker API allows a dictionary to be linked to BabelNet at definition level. Specifically, this API allows a definition in any language to be mapped to a semantically-equivalent English definition in BabelNet by relying on state-of-the-art Transformer-based architectures. Importantly, this API will make it possible to map the dictionaries made available within the ELEXIS Consortium at definition level by pivoting through BabelNet.
BabelNet Linker was developed by the Sapienza Natural Language Processing Group (Sapienza NLP) and the ELEXIS project.
The search tool ELEXIFINDER is dedicated to helping lexicographers and other researchers find scientific output in lexicography and related fields. It enables users to search through papers and videos using concepts, i.e. words or sets of words with a Wikipedia page, and various other conditions, e.g. source (conference etc.), author or language. Each paper or video is linked to the page where users can download or view it.
CrossTheWord is a crossword puzzle game for Android with small and big crossword puzzles, available for free download via the Google Play Store.
– Hundreds of automatically generated crosswords (in constant growth!)
– Power-ups to boost your game experience and help you solve the unsolvable!
– A dynamic tap and swipe interface to surf through crosswords
– A subgame of lexical substitution to earn extra points!
More resources will be provided during the lifetime of the project and will be listed here as soon as they are available. Follow us and subscribe to our newsletter to get notified.