CORPES (Corpus del español del siglo XXI):
It is the current reference corpus for the Spanish language.
It is a continuously growing corpus that by 2016 contained 237,678 texts and 225 million words from different geographical areas.
CORPES contains written and spoken material produced since 2000 and has a rich variety of text types, genres and topics.
CORPES has been morphosyntactically annotated and lemmatised.
The corpus is publicly accessible here.
CREA (Corpus de referencia del español actual):
It is a morphosyntactically annotated and lemmatised balanced reference corpus comprising a wide variety of written texts and spoken transcriptions produced in all the Spanish-speaking countries from 1975 to 2004.
This corpus can be accessed here or here.
CORDE (Corpus diacrónico del español):
It is a diachronic corpus with written texts ranging from the origins of the Spanish language to 1974.
It contains 250 million written words from different genres, text types and geographical origins. CORDE has been morphosyntactically annotated and lemmatised.
The public textual version of this corpus can be accessed here.
CNDHE (Corpus del nuevo diccionario histórico):
It is the corpus used for the New Historical Dictionary of the Spanish Language (NDHE).
It has more than 350 million words, many of them drawn from CREA and CORDE, with texts from the 12th century to the year 2000.
This corpus can be accessed here.
DRAE 23 Access Log:
The online DLE 23 receives an average of sixty million lookups per month from the web and from mobile apps.
Access log records are preprocessed and stored in a NoSQL database so that information can be extracted and trends analysed.
In addition to the information provided by the web server, processed log records contain information on search term(s), GeoIP (country, city, coordinates, etc.), whether search terms are present in the DLE or not, corresponding lemma(s) of the searched terms, etc.
These data provide very useful hints on lexical use, lexical evolution and sociologically motivated lexical trends. An interface to them is available within the Enclave Platform.
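The enrichment step described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the `process` function and the three toy lookup tables (`HEADWORDS`, `LEMMAS`, `GEOIP`) are hypothetical stand-ins for the real headword list, lemmatiser and GeoIP database, and nothing here reflects the actual log schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical toy data standing in for the real resources.
HEADWORDS = {"casa", "casar", "libro"}
LEMMAS = {"casas": ["casa", "casar"], "casa": ["casa", "casar"], "libro": ["libro"]}
GEOIP = {"203.0.113.7": {"country": "ES", "city": "Madrid"}}


@dataclass
class ProcessedRecord:
    term: str                      # normalised search term
    country: Optional[str]         # GeoIP country, if the address is known
    city: Optional[str]            # GeoIP city, if the address is known
    in_dictionary: bool            # is the term itself a headword?
    lemmas: List[str] = field(default_factory=list)  # candidate lemmas


def process(raw: dict) -> ProcessedRecord:
    """Turn a raw log entry ({'ip': ..., 'term': ...}) into an enriched record."""
    term = raw["term"].strip().lower()
    geo = GEOIP.get(raw["ip"], {})
    return ProcessedRecord(
        term=term,
        country=geo.get("country"),
        city=geo.get("city"),
        in_dictionary=term in HEADWORDS,
        lemmas=list(LEMMAS.get(term, [])),
    )
```

With this toy data, a lookup for "casas" is flagged as absent from the headword list but is still linked to the lemmas "casa" and "casar", which is the kind of signal that makes failed lookups useful for lexicographic analysis.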
DLE (Diccionario de la lengua española):
Rooted in the first dictionary ever published by the RAE, the DLE has been published periodically since 1780.
It describes the general vocabulary of Spanish while also recording regional, terminological and obsolete usages.
It is conceived as a decoding (semasiological) tool for native speakers and is collectively updated by all national academies from Spanish-speaking countries.
The latest edition to date (23rd edition, 2014) contains more than 93,000 entries, 26,000 multiword expressions and 195,000 senses and can be accessed here.
On top of the semantic information characteristic of dictionaries, it provides hints on regional, obsolete or classical, register-specific and domain-specific uses. Etymology, variants, and spelling or morphology guidance are also given when appropriate. More than 16,000 senses include examples that show the behaviour of the word in context.
It is stored in a relational database with XML, HTML and printable exporting capabilities.
A user-friendly, tailor-made Dictionary Writing System works on top of the database to ease lexicographic work.
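As a rough illustration of relational storage with XML export, the following Python sketch uses an in-memory SQLite database. The two-table schema (`entry`, `sense`), the XML element names and the placeholder texts are all assumptions invented for this example. The actual database design and export format are not described here.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE entry (id INTEGER PRIMARY KEY, headword TEXT NOT NULL);
        CREATE TABLE sense (
            entry_id   INTEGER NOT NULL REFERENCES entry(id),
            number     INTEGER NOT NULL,
            definition TEXT NOT NULL,
            example    TEXT
        );
        """
    )
    conn.execute("INSERT INTO entry VALUES (1, 'ejemplo')")
    conn.executemany(
        "INSERT INTO sense VALUES (?, ?, ?, ?)",
        [
            (1, 1, "placeholder definition one", "placeholder example"),
            (1, 2, "placeholder definition two", None),
        ],
    )
    return conn


def export_entry_xml(conn: sqlite3.Connection, headword: str) -> str:
    """Serialise one entry and its senses as an XML string."""
    row = conn.execute(
        "SELECT id, headword FROM entry WHERE headword = ?", (headword,)
    ).fetchone()
    if row is None:
        raise KeyError(headword)
    entry_id, hw = row
    root = ET.Element("entry", {"headword": hw})
    senses = conn.execute(
        "SELECT number, definition, example FROM sense "
        "WHERE entry_id = ? ORDER BY number",
        (entry_id,),
    )
    for number, definition, example in senses:
        sense = ET.SubElement(root, "sense", {"n": str(number)})
        ET.SubElement(sense, "definition").text = definition
        if example is not None:  # only some senses carry an example
            ET.SubElement(sense, "example").text = example
    return ET.tostring(root, encoding="unicode")
```

Calling `export_entry_xml(build_demo_db(), "ejemplo")` returns a single `<entry>` element with two `<sense>` children. Keeping the data relational and generating XML (or HTML) on demand means each output format is just another view of the same records.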
The DLE has been freely available online since 2001.
It currently receives an average of sixty million lookups per month.
The interface provides several linguistically motivated search facilities, such as term autocompletion, lookup of inflected, derived or affixed forms, and orthographic-phonetic neutralisation.
In addition, definitions and examples have been lemmatised to provide intratextual navigation.
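Two of the search facilities listed above, autocompletion and neutralisation, can be sketched in Python as follows. This is only an illustration under stated assumptions: "neutralisation" is read here as matching spellings that sound alike or differ only in accents, the rule set (accent stripping, b/v merging, silent h) is a deliberately small hypothetical sample, and the headword list is a placeholder. The rules actually used by the DLE interface are not documented here.

```python
import bisect
import re
import unicodedata

# Placeholder headword list; a real index would hold the dictionary's entries.
HEADWORDS = ["vaca", "baca", "hola", "ola", "camión", "chico"]


def strip_accents(word: str) -> str:
    """Remove diacritics but keep ñ, which is a separate letter in Spanish."""
    out = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch == "ñ":
            out.append(ch)
        else:
            base = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in base if not unicodedata.combining(c)))
    return "".join(out)


def neutralise(word: str) -> str:
    """Map a spelling to a rough key so that sound-alike spellings collide."""
    key = strip_accents(word)
    key = key.replace("v", "b")        # b and v are pronounced alike
    key = re.sub(r"(?<!c)h", "", key)  # drop silent h, keep the digraph "ch"
    return key


def build_index(headwords):
    """Group headwords by their neutralised key."""
    index = {}
    for hw in headwords:
        index.setdefault(neutralise(hw), []).append(hw)
    return index


def autocomplete(sorted_headwords, prefix):
    """Return all headwords starting with prefix, using binary search."""
    start = bisect.bisect_left(sorted_headwords, prefix)
    matches = []
    for hw in sorted_headwords[start:]:
        if not hw.startswith(prefix):
            break
        matches.append(hw)
    return matches


INDEX = build_index(HEADWORDS)
SORTED_HEADWORDS = sorted(HEADWORDS)
```

With this index, a user who types "camion" without the accent still reaches "camión" via `INDEX[neutralise("camion")]`, and a query for "baca" surfaces both "vaca" and "baca" so the user can pick the intended word.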