Language models

Language models are in the iARPA format and use Witten-Bell smoothing. They were created with the IRSTLM Toolkit. All models are lowercased.
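To illustrate what Witten-Bell smoothing does, here is a minimal sketch of the textbook interpolated form for bigrams: P(w|h) = (c(h,w) + T(h) * P(w)) / (c(h) + T(h)), where T(h) is the number of distinct words seen after h. This is an assumption-laden illustration, not the IRSTLM implementation: the function name `witten_bell_bigram` and the toy sentence are hypothetical, and the unigram distribution is left unsmoothed, so words never seen in training still get probability zero.

```python
from collections import Counter, defaultdict


def witten_bell_bigram(tokens):
    """Build a bigram probability function with interpolated Witten-Bell smoothing.

    Hypothetical helper for illustration; not part of IRSTLM.
    """
    tokens = [t.lower() for t in tokens]      # the published models are lowercased
    unigrams = Counter(tokens)
    total = len(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    history_count = Counter(tokens[:-1])      # c(h): how often h starts a bigram
    followers = defaultdict(set)              # distinct words seen after each h
    for h, w in bigrams:
        followers[h].add(w)

    def prob(word, history):
        word, history = word.lower(), history.lower()
        p_uni = unigrams[word] / total        # unsmoothed lower-order estimate
        c_h = history_count[history]
        if c_h == 0:
            return p_uni                      # unseen history: fall back to unigram
        t_h = len(followers.get(history, ()))
        return (bigrams[(history, word)] + t_h * p_uni) / (c_h + t_h)

    return prob


p = witten_bell_bigram("the cat sat on the mat".split())
print(p("cat", "the"))   # seen bigram
print(p("sat", "the"))   # unseen bigram, still gets some probability mass
```

The more distinct continuations a history has, the more weight shifts to the lower-order estimate, which is the core idea of the method.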

Raw 2- and 3-gram frequencies of the corpus prim-5.0-public-all are also available.

Slovak National Corpus