Monday, 9 December 2024: HiČKoK: History of Czech in Corpus Continuum

Published: 26 November 2024

Daniel Zeman (ÚFAL MFF UK)
Jiří Pergler (ÚJČ AV ČR)

We will present the ongoing TAČR project focused on morphological annotation of texts from all historical stages of the Czech language. The goal of the project is to connect text corpora of different periods, which have so far been built independently at different institutes, and to enrich them with lemmatization and uniform morphological annotation according to the Universal Dependencies standard. The manually annotated datasets will subsequently be used to train models capable of annotating other historical texts. After the initial overview, we will focus on some issues in designing a uniform description of the changing language, especially in the oldest period (14th-15th centuries).

Time: 14:00

Venue: MFF UK, Malostranské nám. 25, 4th floor, room S1

More information: https://ufal.mff.cuni.cz/events/hickok-history-czech-corpus-continuum