Slovak-Spanish Parallel Corpus

The first version, par-skes-1.0, was released in July 2019 and contains about 11.5 million tokens (5 455 067 tokens in the Slovak half and 6 044 520 tokens in the Spanish half).
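
The rounded total can be checked against the two per-language counts. The following is only a small arithmetic sketch in Python; the variable names are illustrative and are not part of the corpus distribution.

```python
# Token counts for par-skes-1.0, copied from the description above.
slovak_tokens = 5_455_067
spanish_tokens = 6_044_520

# Combined size of both halves of the parallel corpus.
total_tokens = slovak_tokens + spanish_tokens

# Share of the Slovak half in the whole corpus (a fraction between 0 and 1).
slovak_share = slovak_tokens / total_tokens

print(total_tokens)                     # 11499587
print(round(total_tokens / 1e6, 1))     # 11.5 (million tokens)
print(round(slovak_share * 100, 1))     # Slovak share in percent
```

The Spanish half is slightly larger than the Slovak half, which is why the two shares are not exactly equal.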

To access the whole corpus, use the NoSketch Engine web interface to query either the Spanish half or the Slovak half. Knowledge of NoSketch Engine and CQL (Corpus Query Language) is required.
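
For readers new to CQL, a query is a sequence of token expressions of the form [attribute="regular expression"]. The Python sketch below only assembles such query strings for pasting into the query box. The helper names cql_token and cql_sequence are hypothetical, and the attribute names word and lemma are assumptions based on common NoSketch Engine setups; check the attribute list of the corpus itself before relying on them.

```python
def cql_token(attr: str, pattern: str) -> str:
    """Build one CQL token expression, e.g. [lemma="casa"].

    `attr` is a positional attribute name (assumed here: "word" or "lemma").
    `pattern` is a regular expression matched against the attribute value.
    """
    if '"' in pattern:
        # Keep the sketch simple: refuse patterns that would need quote escaping.
        raise ValueError("pattern must not contain a double quote")
    return f'[{attr}="{pattern}"]'


def cql_sequence(*tokens: str) -> str:
    """Join token expressions into a query matching consecutive tokens."""
    return " ".join(tokens)


# Hypothetical query for the Spanish half: the lemma "casa" followed by
# any word form starting with "blanc".
query = cql_sequence(cql_token("lemma", "casa"), cql_token("word", "blanc.*"))
print(query)  # [lemma="casa"] [word="blanc.*"]
```

The resulting string is plain CQL, so it can be adapted by hand once the actual attribute names of the corpus are known.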

The Slovak-Spanish Parallel Corpus contains translations of 77 texts: 59 translations from Spanish, 1 translation from Slovak, and 17 translations from other languages into Slovak and Spanish. The texts are automatically sentence-aligned. The Slovak texts are automatically morphologically annotated with the MorphoDiTa tagger, which has been trained and tuned on the tagset developed by the SNK. The Spanish texts are tagged with TreeTagger.