Slovak-Spanish Parallel Corpus
The first version, par-skes-1.0, was released in July 2019 and contains about 11.5 million tokens (5 455 067 tokens in the Slovak half and 6 044 520 tokens in the Spanish half).
The Slovak-Spanish Parallel Corpus contains translations of 77 texts: translations from Spanish (59), translations from Slovak (1), and translations from other languages into both Slovak and Spanish (17). The texts are automatically sentence-aligned. The Slovak texts are automatically morphologically annotated with the MorphoDiTa tagger, which has been trained and tuned on the tagset developed by the Slovak National Corpus (SNK). The Spanish texts are tagged with TreeTagger.
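
As a quick sanity check on the figures quoted above, the following minimal Python sketch sums the token and text counts and reports each part's share. The variable and function names (TOKENS, TEXTS, summarize) are illustrative assumptions and are not part of the corpus distribution; only the numbers come from the description.

```python
# Figures quoted in the description of par-skes-1.0.
TOKENS = {"Slovak": 5_455_067, "Spanish": 6_044_520}
TEXTS = {"from Spanish": 59, "from Slovak": 1, "from other languages": 17}


def summarize(counts):
    """Return the total and each category's share of it, in percent."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
    return total, shares


token_total, token_shares = summarize(TOKENS)
text_total, text_shares = summarize(TEXTS)

print(f"tokens: {token_total:,}", token_shares)
print(f"texts:  {text_total}", text_shares)
```

The token total comes to 11 499 587, consistent with the stated "about 11.5 million", and the three text categories add up to the stated 77 texts.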