Building sentiment Lexicons applying graph theory on information from three Norwegian thesauruses

  • Hugo Hammer, Oslo and Akershus University College of Applied Sciences, Institute of Information Technology
  • Aleksander Bai, Oslo and Akershus University College of Applied Sciences, Institute of Information Technology
  • Anis Yazidi, Oslo and Akershus University College of Applied Sciences, Institute of Information Technology
  • Paal Engelstad, Oslo and Akershus University College of Applied Sciences, Institute of Information Technology

Abstract

Sentiment lexicons are the most widely used tool for automatically predicting
sentiment in text. To the best of our knowledge, no openly available
sentiment lexicons exist for the Norwegian language. In this paper, we
therefore applied two different strategies to automatically generate sentiment
lexicons for Norwegian. The first strategy used machine translation to
translate an English sentiment lexicon into Norwegian, and the second
used information from three different thesauruses to build several sentiment
lexicons. The thesaurus-based lexicons were built using the label
propagation algorithm from graph theory. The lexicons were evaluated
by classifying product and movie reviews. The results show satisfactory
classification performance. The lexicons that perform well on product
reviews differ from those that perform well on movie reviews. Overall, the
lexicon based on machine translation performed best, showing that linguistic
resources in English can be translated to Norwegian without significant loss
of value.
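
The thesaurus-based strategy can be illustrated with a minimal sketch of label propagation over a synonym graph. This is not the authors' implementation: the function names `propagate_sentiment` and `classify`, the averaging update rule, the iteration count, the tiny word graph and the seed words are all assumptions chosen for illustration. Seed words keep a fixed score, every other word repeatedly takes the mean score of its neighbours, and a review is then classified by the sign of its summed word scores.

```python
from collections import defaultdict


def propagate_sentiment(edges, seeds, iterations=20):
    """Spread seed scores over an undirected synonym graph (illustrative only)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Every word starts neutral; seed words start at their known score.
    scores = {word: 0.0 for word in neighbors}
    scores.update(seeds)

    for _ in range(iterations):
        updated = {}
        for word, adjacent in neighbors.items():
            if word in seeds:
                # Seed scores are clamped and never change.
                updated[word] = seeds[word]
            else:
                # Non-seed words take the mean score of their neighbours.
                updated[word] = sum(scores[n] for n in adjacent) / len(adjacent)
        scores.update(updated)
    return scores


def classify(text, lexicon):
    """Label a text by the sign of its summed word scores (assumed rule)."""
    total = sum(lexicon.get(token, 0.0) for token in text.lower().split())
    return "positive" if total >= 0 else "negative"


# Hypothetical synonym pairs and seeds, not taken from the paper's thesauruses.
edges = [("god", "bra"), ("bra", "fin"), ("dårlig", "elendig"), ("elendig", "fæl")]
seeds = {"god": 1.0, "dårlig": -1.0}
lexicon = propagate_sentiment(edges, seeds)
```

With this toy graph, words connected to the positive seed end up with positive scores and words connected to the negative seed with negative scores, so `classify("filmen var fæl", lexicon)` returns `"negative"`.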
