De grootste kennisbank van het HBO

Inspiratie op jouw vakgebied

Vrij toegankelijk

Terug naar zoekresultatenDeel deze publicatie

Samenvatting

With the proliferation of misinformation on the web, automatic methods for detecting misinformation are becoming an increasingly important subject of study. If automatic misinformation detection is applied in a real-world setting, it is necessary to validate the methods being used. Large language models (LLMs) have produced the best results among text-based methods. However, fine-tuning such a model requires a significant amount of training data, which has led to the automatic creation of large-scale misinformation detection datasets. In this paper, we explore the biases present in one such dataset for misinformation detection in English, NELA-GT-2019. We find that models are at least partly learning the stylistic and other features of different news sources rather than the features of unreliable news. Furthermore, we use SHAP to interpret the outputs of a fine-tuned LLM and validate the explanation method using our inherently interpretable baseline. We critically analyze the suitability of SHAP for text applications by comparing the outputs of SHAP to the most important features from our logistic regression models.

Toon meer
OrganisatieHogeschool van Amsterdam
Gepubliceerd inArtificial Intelligence and Machine Learning Pagina's: 1-15
Jaar2023
TypeConferentiebijdrage
DOI10.1007/978-3-031-39144-6_1
TaalEngels

Op de HBO Kennisbank vind je publicaties van 26 hogescholen

De grootste kennisbank van het HBO

Inspiratie op jouw vakgebied

Vrij toegankelijk