Lexicon-based sentiment analysis, using multiple lexica and scaled against representative text corpora
Sentiment analysis, the study of the positive or negative valence of texts, has wide-ranging applications across the social sciences. Automated approaches make it possible to code nearly unlimited quantities of text with full replicability and high accuracy. Compared to machine-learning approaches, lexicon-based methods offer generalizability at little cost in performance, and they add the ability to identify gradations in sentiment and to compare sentiment across domains. We introduce a method, MultiLexScaled, which averages valences across a number of widely used general-purpose lexica. We validate the method against several benchmark datasets spanning a range of domains and languages. We illustrate the value of identifying sentiment trends by examining coverage of Muslims in the British press, showing that tabloids and broadsheets diverged noticeably after 9/11: tabloids became decidedly more negative about Muslims, while the tone of broadsheet articles about Muslims remained relatively unchanged.
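To make the general idea concrete, the following is a minimal sketch of averaging valences across several lexica after scaling each lexicon's scores against a reference corpus. It is an illustration under stated assumptions, not the MultiLexScaled implementation: the two toy lexica, the three-document reference corpus, the length-normalized scoring rule, the z-score scaling, and all function names are hypothetical and are not taken from the method described above.

```python
import re
from statistics import mean, pstdev

# Hypothetical toy lexica mapping words to valences (not real published lexica).
LEXICA = {
    "lex_a": {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0},
    "lex_b": {"good": 0.5, "great": 1.0, "bad": -0.5, "terrible": -1.0},
}

# Hypothetical stand-in for a representative reference corpus.
REFERENCE_CORPUS = [
    "The weather was good today.",
    "The service was bad and the food was awful.",
    "The committee met on Tuesday.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def raw_score(text, lexicon):
    """Sum of word valences divided by token count (assumed scoring rule)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(lexicon.get(tok, 0.0) for tok in tokens) / len(tokens)


def fit_scalers(corpus, lexica):
    """Compute each lexicon's mean and standard deviation over the corpus."""
    scalers = {}
    for name, lexicon in lexica.items():
        scores = [raw_score(doc, lexicon) for doc in corpus]
        sd = pstdev(scores)
        # Guard against a zero standard deviation on degenerate corpora.
        scalers[name] = (mean(scores), sd if sd > 0 else 1.0)
    return scalers


def scaled_sentiment(text, lexica, scalers):
    """Average the per-lexicon z-scores into one sentiment value."""
    z_scores = []
    for name, lexicon in lexica.items():
        mu, sd = scalers[name]
        z_scores.append((raw_score(text, lexicon) - mu) / sd)
    return mean(z_scores)


SCALERS = fit_scalers(REFERENCE_CORPUS, LEXICA)
positive = scaled_sentiment("A great and good result.", LEXICA, SCALERS)
negative = scaled_sentiment("An awful, terrible, bad result.", LEXICA, SCALERS)
```

In this sketch a score of zero corresponds to the average document in the reference corpus and one unit corresponds to one standard deviation, which is what would make scores from different lexica comparable before averaging.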