Structure of the Slovak National Corpus
Overview of SNC corpora.
Frequency lists of lemmata, word forms and parts of speech from the publicly available SNC corpora.
Monolingual corpus of written texts
The current version, prim-8.0, has been available since January 2018. The publicly available subcorpus contains more than 1 500 million tokens. Access is free but requires registration.
The previous version of the corpus, prim-7.0, which contains over 1 250 million tokens, is also available.
Access to earlier versions is available on request.
Manually morphologically annotated corpus
Morphological database of the Slovak language
Other text corpora
Corpora of texts before the year 1955
Corpus of Dialects of the Slovak National Corpus
Corpus of Historical Slovak
Corpus of the Crimean Tatar language
Slovak Terminology Database
WordNet is a lexical database that records semantic relations between words. It is aligned with Princeton WordNet 3.0: Slovak synsets are linked to their English equivalents.
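To illustrate what linking Slovak synsets to English equivalents can look like, here is a minimal Python sketch. It is not the SNC's actual data format or API: the `Synset` class, the `english_equivalents` helper, and all identifiers (`sk-1`, `en-1`, and so on) are hypothetical placeholders invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """A set of lemmas that share one meaning."""
    synset_id: str            # hypothetical identifier, not a real WordNet ID
    lemmas: tuple[str, ...]   # words belonging to this synset


# Toy data: the identifiers are invented placeholders.
slovak = {
    "sk-1": Synset("sk-1", ("pes",)),
    "sk-2": Synset("sk-2", ("mačka",)),
}
english = {
    "en-1": Synset("en-1", ("dog", "domestic dog")),
    "en-2": Synset("en-2", ("cat",)),
}

# Alignment table: Slovak synset ID -> English synset ID.
alignment = {"sk-1": "en-1", "sk-2": "en-2"}


def english_equivalents(slovak_lemma: str) -> list[str]:
    """Collect English lemmas from synsets aligned with any Slovak synset
    that contains the given lemma."""
    result: list[str] = []
    for sk_id, synset in slovak.items():
        if slovak_lemma in synset.lemmas and sk_id in alignment:
            result.extend(english[alignment[sk_id]].lemmas)
    return result
```

Under these assumptions, calling `english_equivalents("pes")` walks the alignment table and returns the lemmas of the linked English synset; a lemma that appears in no Slovak synset yields an empty list.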