
Turning silver into gold: Error-focused corpus reannotation with active learning

  • Computer Research Institute of Montreal

Research output: Chapter in book, report, or conference proceedings; Conference contribution; Peer-reviewed

3 Citations (Scopus)

Abstract

While high-quality gold-standard annotated corpora are crucial for most tasks in natural language processing, many annotated corpora published in recent years, whether created by annotators or by tools, contain noisy annotations. These corpora can be viewed as more silver than gold standards, even if they are used in evaluation campaigns or to compare systems' performance. As upgrading a silver corpus to gold level is still a challenge, we explore the application of active learning techniques to detect errors using four datasets designed for document classification and part-of-speech tagging. Our results show that the proposed method for the seeding step improves the chance of finding incorrect annotations by a factor of 2.73 when compared to random selection, a 14.71% increase from the baseline methods. Our query method provides an increase in the error detection precision on average by a factor of 1.78 against random selection, an increase of 61.82% compared to other query approaches.
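The abstract does not spell out the seeding or query strategies, so the following is not the authors' method. It is a minimal, generic sketch of error-focused querying under stated assumptions: a classifier's class probabilities are already available for each item, and an item is considered more suspicious the less probability the model gives to the label currently stored in the corpus. The function names `rank_suspects` and `precision_at_k` and the toy part-of-speech data are hypothetical and used only for illustration.

```python
from typing import Dict, List, Sequence


def rank_suspects(
    labels: Sequence[str],
    probas: Sequence[Dict[str, float]],
) -> List[int]:
    """Return item indices ordered from most to least suspicious.

    Assumption: suspicion = 1 - P(model assigns the label found in the corpus).
    """
    scores = [1.0 - p.get(y, 0.0) for y, p in zip(labels, probas)]
    return sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)


def precision_at_k(ranking: Sequence[int], is_error: Sequence[bool], k: int) -> float:
    """Fraction of the first k queried items that are truly mislabelled."""
    return sum(1 for i in ranking[:k] if is_error[i]) / k


# Toy data (hypothetical): corpus labels and model probabilities for four tokens.
corpus_labels = ["NOUN", "VERB", "NOUN", "ADJ"]
model_probas = [
    {"NOUN": 0.9, "VERB": 0.1},
    {"NOUN": 0.8, "VERB": 0.2},
    {"NOUN": 0.6, "VERB": 0.4},
    {"ADJ": 0.3, "NOUN": 0.7},
]
true_errors = [False, True, False, True]  # known only in an evaluation setting

ranking = rank_suspects(corpus_labels, model_probas)
print(ranking, precision_at_k(ranking, true_errors, k=2))
```

In an evaluation like the one the abstract reports, the precision obtained this way would be compared with the precision of querying the same number of items at random.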

Original language: English
Title of host publication: International Conference on Recent Advances in Natural Language Processing in a Deep Learning World, RANLP 2019 - Proceedings
Editors: Galia Angelova, Ruslan Mitkov, Ivelina Nikolova, Irina Temnikova
Publisher: Incoma Ltd
Pages: 758-767
Number of pages: 10
ISBN (Electronic): 9789544520557
Publication status: Published - 2019
Externally published: Yes
Event: 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019 - Varna, Bulgaria
Duration: 2 Sept 2019 to 4 Sept 2019

Publication series

Name: International Conference Recent Advances in Natural Language Processing, RANLP
Volume: 2019-September
ISSN (Print): 1313-8502

Conference

Conference: 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019
Country/Territory: Bulgaria
City: Varna
Period: 2/09/19 to 4/09/19
