
Corpus Structure

Monolingual corpus of written texts

The current version, prim-6.0, has been available since the beginning of 2013. The public subcorpus contains more than 1.155 billion tokens.

Only two versions of the corpus are publicly available. Several earlier versions also exist, and access to them can be granted on request.

Manually morphologically annotated corpus

Statistics