Language models

The language models are provided in the iARPA format and use Witten-Bell smoothing. They were created with the IRSTLM Toolkit. All models are lowercased, so text should be lowercased before it is scored against them.
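
As a rough illustration of what Witten-Bell smoothing does, the sketch below estimates interpolated bigram probabilities from a toy corpus. It is not the IRSTLM implementation, whose details may differ. The function name train_bigram_witten_bell and the toy sentences are made up for this example. The estimate used is P(w | h) = (c(h, w) + T(h) * P(w)) / (c(h) + T(h)), where T(h) is the number of distinct word types seen after the context h and P(w) is the unigram relative frequency.

```python
from collections import Counter, defaultdict


def train_bigram_witten_bell(sentences):
    """Return a function prob(word, prev) with Witten-Bell interpolated estimates.

    Hypothetical helper for illustration only; not part of IRSTLM.
    """
    unigrams = Counter()
    bigrams = Counter()
    context_counts = Counter()       # c(h): how often h occurs as a context
    followers = defaultdict(set)     # distinct word types seen after h

    for sentence in sentences:
        # Lowercase the text, mirroring the lowercased models described above.
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[(prev, word)] += 1
            context_counts[prev] += 1
            followers[prev].add(word)

    total = sum(unigrams.values())

    def prob(word, prev):
        word, prev = word.lower(), prev.lower()
        p_unigram = unigrams[word] / total
        c_h = context_counts[prev]
        t_h = len(followers.get(prev, ()))
        if c_h == 0:
            # Unseen context: fall back to the unigram estimate.
            return p_unigram
        # Seen context: mix the bigram count with the lower-order estimate.
        return (bigrams[(prev, word)] + t_h * p_unigram) / (c_h + t_h)

    return prob


prob = train_bigram_witten_bell(["a b", "a c", "a b"])
print(prob("b", "a"), prob("c", "a"), prob("a", "a"))
```

Contexts followed by many different word types give more weight to the lower-order estimate, which is how unseen bigrams such as "a a" in the toy corpus still receive a non-zero probability.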

Raw 2-, 3- and 4-gram frequencies of the corpus prim-7.0-public-all are also available.

Slovak National Corpus