TY  - THES
AB  - Part-of-speech (POS) tagging is a method to predict a sequence of word classes given a sequence of words. Set-valued prediction can be used to allow a classifier to make restrained predictions in the face of uncertainty. In this thesis, we present a method for combining set-valued prediction with part-of-speech tagging to retrieve more reasonable predictions on difficult data. The set size allows the tagger to express its uncertainty about a specific prediction. The devised method can be applied to any POS tagger capable of predicting a posterior distribution over the tags and provides set-valued predictions in a post-processing step. We implemented the method using the state-of-the-art tagger ore as our basis. The tagger is tuned to a diachronic corpus of Middle Low German that spans a wide spatial area. Because the corpus also captures human annotator uncertainty, special performance measures have been devised to properly evaluate the tagging performance. The resulting algorithm clearly outperforms our baseline in all considered measures. Our evaluation shows that set-valued prediction can give good predictions with utilities outperforming the accuracy score by large margins. This is especially evident in robustness tests that are difficult for the classifier. Results are compared against a baseline tagger, which profits even more from set-valued prediction.
AU  - Heid, Stefan
CY  - Paderborn
DA  - 2019
DO  - 10.17619/UNIPB/1-957
DP  - Universität Paderborn
LA  - eng
N1  - Date of submission: 02.12.2019
N1  - Universität Paderborn, Master's thesis, 2019
PB  - Veröffentlichungen der Universität
PY  - 2019
SP  - 1 online resource (xi, 72 pages)
T2  - Fakultät für Elektrotechnik, Informatik und Mathematik
TI  - Set-Valued prediction for Part-of-Speech tagging
UR  - https://nbn-resolving.org/urn:nbn:de:hbz:466:2-37152
Y2  - 2026-06-23T19:40:41
ER  -