Bibliographic Metadata

Title
Set-Valued prediction for Part-of-Speech tagging / Stefan Heid ; 1. Reviewer Prof. Dr. Eyke Hüllermeier, 2. Reviewer Prof. Dr. Michaela Geierhos
Author
Heid, Stefan
Participants
Hüllermeier, Eyke ; Geierhos, Michaela
Published
Paderborn, 2019
Edition
Electronic resource
Description
1 online resource (xi, 72 pages) : diagrams
Institutional Note
Universität Paderborn, master's thesis, 2019
Annotation
Date of submission: 02.12.2019
Date of Submission
02/12/2019
Language
English
Document Types
Master's Thesis
URN
urn:nbn:de:hbz:466:2-37152
DOI
10.17619/UNIPB/1-957
Files
Set-Valued prediction for Part-of-Speech tagging [1.04 mb]
Abstract (English)

Part-of-speech (POS) tagging is a method to predict a sequence of word classes given a sequence of words. Set-valued prediction allows a classifier to make cautious predictions in the face of uncertainty by returning a set of candidate labels instead of a single one. In this thesis, we present a method for combining set-valued prediction with part-of-speech tagging to obtain more reasonable predictions on difficult data. The size of the predicted set allows the tagger to express its uncertainty about a specific prediction. The devised method can be applied to any POS tagger capable of predicting a posterior distribution over the tags and provides set-valued predictions in a post-processing step. We implemented the method using the state-of-the-art tagger ore as our basis. The tagger is tuned to a diachronic corpus of Middle Low German that spans a wide spatial area. Because the corpus also captures human annotator uncertainty, special performance measures were devised to properly evaluate the tagging performance. The resulting algorithm clearly outperforms our baseline in all considered measures. Our evaluation shows that set-valued prediction can give good predictions, with utilities exceeding the accuracy score by large margins. This is especially evident in robustness tests that are difficult for the classifier. Results are compared against a baseline tagger, which profits even more from set-valued prediction.
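
The post-processing idea described in the abstract can be illustrated with a minimal sketch. It assumes each token comes with a posterior distribution over tags and uses a simple cumulative-probability rule: return the smallest set of most probable tags whose total probability reaches a threshold. This rule, the function names `set_valued_prediction` and `tag_sentence`, the `coverage` parameter, and the tag names are all illustrative assumptions; the thesis may well use a different, utility-based selection criterion.

```python
from typing import Dict, List, Set


def set_valued_prediction(posterior: Dict[str, float], coverage: float = 0.9) -> Set[str]:
    """Return the smallest set of top-ranked tags whose probability mass reaches `coverage`.

    Illustrative rule only (an assumption, not the method from the thesis).
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    # Rank tags from most to least probable.
    ranked = sorted(posterior.items(), key=lambda item: item[1], reverse=True)
    chosen: Set[str] = set()
    mass = 0.0
    for tag, prob in ranked:
        chosen.add(tag)
        mass += prob
        if mass >= coverage:
            break
    return chosen


def tag_sentence(posteriors: List[Dict[str, float]], coverage: float = 0.9) -> List[Set[str]]:
    """Apply the set-valued rule to every token of a sentence independently."""
    return [set_valued_prediction(p, coverage) for p in posteriors]


if __name__ == "__main__":
    # Hypothetical posteriors for two tokens: one confident, one ambiguous.
    sentence = [
        {"NOUN": 0.95, "VERB": 0.03, "ADJ": 0.02},
        {"NOUN": 0.5, "VERB": 0.4, "ADJ": 0.1},
    ]
    for token_set in tag_sentence(sentence, coverage=0.8):
        print(sorted(token_set))
```

With this rule, a confident posterior yields a single tag, while an ambiguous one yields a larger set, so the set size reflects the tagger's uncertainty.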

License
CC-BY-License (4.0): Creative Commons Attribution 4.0 International License