Emotion analysis in socially unacceptable discourse


  • Jasmin Franza, University of Ljubljana, Faculty of Arts, Slovenia
  • Bojan Evkoski, Jožef Stefan International Postgraduate School; Jožef Stefan Institute, Ljubljana, Slovenia
  • Darja Fišer, University of Ljubljana, Faculty of Arts; Jožef Stefan Institute, Ljubljana; Institute of Contemporary History, Ljubljana, Slovenia




emotions, socially unacceptable discourse (SUD), hate speech, social media, corpora


Texts often express the writer’s emotional state, and emotion information has been shown to be useful for detecting and analysing hate speech. In this work, we present a methodology for the quantitative analysis of emotion in text. We define a simple yet effective metric for the overall emotional charge of a text, based on the NRC Emotion Lexicon and Plutchik’s eight basic emotions. Using this methodology, we investigate the emotional charge of content containing socially unacceptable discourse (SUD), a distinct and potentially harmful type of text that is spreading on social media. We apply the proposed method to a corpus of Facebook comments comprising four datasets that cover two languages, English and Slovene, and two discussion topics, LGBT+ rights and the European migrant crisis. We show that SUD comments are significantly more emotional than non-SUD comments. Moreover, we show differences in the expression of emotions depending on the language, topic, and target of the comments. Finally, to underpin the quantitative findings, we perform a qualitative analysis of the corpus, examining in more detail the most frequent emotional words for each emotion in all four datasets. The qualitative analysis shows that the source of emotions in SUD texts depends heavily on the topic of discussion, with substantial overlaps between languages.
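A lexicon-based emotional charge of this kind could be sketched roughly as follows. This is only an illustration under stated assumptions, not the metric defined in the article: the normalisation by token count, the function names `emotion_profile` and `emotional_charge`, and the tiny `TOY_LEXICON` (invented entries, not taken from the NRC Emotion Lexicon) are all hypothetical.

```python
from collections import Counter

# Plutchik's eight basic emotions.
PLUTCHIK_EMOTIONS = (
    "anger", "anticipation", "disgust", "fear",
    "joy", "sadness", "surprise", "trust",
)

# Hypothetical toy lexicon: lemma -> set of associated emotions.
# These entries are invented for illustration and are NOT real NRC data.
TOY_LEXICON = {
    "hate": {"anger", "disgust", "fear", "sadness"},
    "threat": {"anger", "fear"},
    "love": {"joy", "trust"},
    "hope": {"anticipation", "joy", "surprise", "trust"},
}


def emotion_profile(lemmas, lexicon=TOY_LEXICON):
    """Return, per emotion, the share of tokens associated with it."""
    counts = Counter()
    for lemma in lemmas:
        for emotion in lexicon.get(lemma.lower(), ()):
            counts[emotion] += 1
    n = len(lemmas)
    return {e: (counts[e] / n if n else 0.0) for e in PLUTCHIK_EMOTIONS}


def emotional_charge(lemmas, lexicon=TOY_LEXICON):
    """Assumed overall charge: sum of the eight per-emotion shares."""
    return sum(emotion_profile(lemmas, lexicon).values())


if __name__ == "__main__":
    comment = ["I", "hate", "this", "threat"]
    print(emotion_profile(comment))
    print(emotional_charge(comment))
```

In practice the input would be lemmatised comments and the lexicon would be loaded from the actual resource; the per-comment scores of SUD and non-SUD comments could then be compared with a two-sample test.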




Alm, C., Roth, D., & Sproat, R. (2005). Emotions from Text: Machine Learning for Text-based Emotion Prediction. Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, October 2005, Vancouver, Canada (pp. 579–586). Association for Computational Linguistics. DOI: https://doi.org/10.3115/1220575.1220648

Al-Saqqa, S., Abdel-Nabi, H., & Awajan, A. (2018). A survey of textual emotion detection. 8th International Conference on Computer Science and Information Technology (CSIT), July 2018 (pp. 136–142). DOI: https://doi.org/10.1109/CSIT.2018.8486405

Aman, S., & Szpakowicz, S. (2007). Identifying Expressions of Emotion in Text. In V. Matoušek & P. Mautner (Eds.), Text, Speech and Dialogue, SD 2007. Lecture Notes in Computer Science (Vol. 4629) (pp. 196–205). Berlin, Heidelberg: Springer. DOI: https://doi.org/10.1007/978-3-540-74628-7_27

Assimakopoulos, S., Baider, F. H., & Millar, S. (2017). Online Hate Speech in the European Union. A Discourse-Analytic Perspective. Cham: Springer International Publishing. DOI: https://doi.org/10.1007/978-3-319-72604-5

Brindle, A. (2016). The Language of Hate. A Corpus Linguistic Analysis of White Supremacist Language. London and New York: Routledge. DOI: https://doi.org/10.4324/9781315731643

Canales, L., Daelemans, W., Boldrini, E., & Martinez-Barco, P. (2019). EmoLabel: Semi-Automatic Methodology for Emotion Annotation of Social Media Text. IEEE Transactions on Affective Computing. Retrieved from https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8758380

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. Routledge.

Daelemans, W., Fišer, D., Franza, J., Kranjčić, D., Lemmens, J., Ljubešić, N., Markov, I., & Popič, D. (2020). The LiLaH Emotion Lexicon of Croatian, Dutch and Slovene. Slovenian language resource repository CLARIN.SI. https://www.clarin.si/repository/xmlui/handle/11356/1318

Denecke, K. (2008). Using SentiWordNet for Multilingual Sentiment Analysis. Proceedings of the 24th International Conference on Data Engineering, 7–12 April 2008, Cancun, Mexico (pp. 507–512). DOI: https://doi.org/10.1109/ICDEW.2008.4498370

Fišer, D., Ljubešić, N., & Erjavec, T. (2017). Legal framework, dataset and annotation schema for socially unacceptable online discourse practices in Slovene. Proceedings of the 1st Workshop on Abusive Language Online, ACL 2017, Vancouver, Canada (pp. 46–51). Association for Computational Linguistics. DOI: https://doi.org/10.18653/v1/W17-3007

Franza, J., & Fišer, D. (2019). The lexical inventory of Slovene socially unacceptable discourse on Facebook. Proceedings of the 7th Conference on Computer-Mediated Communication (CMC) and Social Media Corpora, CMC-Corpora 2019, Cergy-Pontoise, France. Retrieved from https://hal.archives-ouvertes.fr/hal-02292616/document#page=50

Ghazi, D. (2016). Identifying Expressions of Emotions and Their Stimuli in Text. PhD dissertation. Canada: University of Ottawa.

Gitari, N. D., Zuping, Z., Hanyurwimfura, D., & Long, J. (2015). A Lexicon-based Approach for Hate Speech Detection. International Journal of Multimedia and Ubiquitous Engineering (Vol. 10, No.4) (pp. 215–230). DOI: https://doi.org/10.14257/ijmue.2015.10.4.21

Knoblock, N. (2017). Xenophobic Trumpeters: A corpus-assisted discourse study of Donald Trump’s Facebook conversations. In A. Musolff (Ed.), Journal of Language Aggression and Conflict (Vol. 5, No.7) (pp. 295–322). Amsterdam/Philadelphia: John Benjamins Publishing Company. DOI: https://doi.org/10.1075/jlac.5.2.07kno

Ljubešić, N. (2019). The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Slovenian. Ljubljana: Slovenian language resource repository CLARIN.SI. Retrieved from http://hdl.handle.net/11356/1251

Ljubešić, N. (2020). The CLASSLA-StanfordNLP model for lemmatisation of standard Slovenian 1.1. Ljubljana: Slovenian language resource repository CLARIN.SI. Retrieved from http://hdl.handle.net/11356/1286

Ljubešić, N., Fišer, D., & Erjavec, T. (2019). The FRENK datasets of Socially Unacceptable Discourse in Slovene and English. International Conference on Text, Speech, and Dialogue. Springer, Cham. DOI: https://doi.org/10.1007/978-3-030-27947-9_9

Ljubešić, N., Fišer, D., Erjavec, T., & Šulc, A. (2021). Offensive language dataset of Croatian, English and Slovenian comments FRENK 1.1. Ljubljana: Slovenian language resource repository CLARIN.SI. Retrieved from http://hdl.handle.net/11356/1462

Markov, I., Ljubešić, N., Fišer, D., & Daelemans, W. (2021). Exploring Stylometric and Emotion-Based Features for Multilingual Cross-Domain Hate Speech Detection. Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (pp. 149–159). Association for Computational Linguistics. Retrieved from https://aclanthology.org/2021.wassa-1.16/

Martins, R., Gomes, M., Almeida, J. J., Novais, P., & Henriques, P. (2018). Hate Speech Classification in Social Media Using Emotional Analysis. 7th Brazilian Conference on Intelligent Systems (BRACIS), 22–25 October 2018, Sao Paulo, Brazil (pp. 61–66). DOI: https://doi.org/10.1109/BRACIS.2018.00019

Mohammad, S., & Yang, T. (2011). Tracking Sentiment in Mail: How Genders Differ on Emotional Axes. Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011) (pp. 70–79). Portland, Oregon: Association for Computational Linguistics.

Mohammad, S., & Turney, P. D. (2010). Emotions Evoked by Common Words and Phrases: Using Mechanical Turk to Create an Emotion Lexicon. Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, June 2010, Los Angeles, California (pp. 26–34).

Pahor de Maiti, K., Fišer, D., & Ljubešić, N. (2019). How haters write: analysis of nonstandard language in online hate speech. Proceedings of the 7th Conference on Computer-Mediated Communication (CMC) and Social Media Corpora, CMC-Corpora, 9–10 September 2019, Cergy-Pontoise, France. Retrieved from https://hal.archives-ouvertes.fr/hal-02292616/document#page=44

Qi, P., Zhang, Y., Zhang, Y., Bolton, J., & Manning, C. D. (2020). Stanza: A Python Natural Language Processing Toolkit for Many Human Languages. Retrieved from https://arxiv.org/abs/2003.07082

Plutchik, R. (1980). Emotion: Theory, research and experience, 1. Academic Press.

Plutchik, R. (2001). The Nature of Emotions: Human Emotions Have Deep Evolutionary Roots, a Fact That May Explain Their Complexity and Provide Tools for Clinical Practice. American Scientist 89(4), 344–350. DOI: https://doi.org/10.1511/2001.4.344

Pratt, J. W., & Gibbons, J. D. (1981). Kolmogorov-Smirnov two-sample tests. Concepts of nonparametric theory. Springer, New York, NY. 318–344. DOI: https://doi.org/10.1007/978-1-4612-5931-2_7

Russell, J. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161–1178. DOI: https://doi.org/10.1037/h0077714

Scherer, K. R. (2005). What are emotions? And how can they be measured? Social Science Information, 44(4), 695–729. DOI: https://doi.org/10.1177/0539018405058216

Vehovar, V., Povž, B., Fišer, D., Ljubešić, N., Šulc, A., & Jontes, D. (2020). Družbeno nesprejemljivi diskurz na Facebookovih straneh novičarskih portalov. Teorija in Praksa, 57(2), 622–645.

Zad, S., Jimenez, J., & Finlayson, M. A. (2021). Hell Hath No Fury? Correcting Bias in the NRC Emotion Lexicon. Proceedings of the 5th Workshop on Online Abuse and Harms, 6 August 2021, Bangkok, Thailand (pp. 102–111). Retrieved from https://aclanthology.org/2021.woah-1.pdf DOI: https://doi.org/10.18653/v1/2021.woah-1.11




How to Cite

Franza, J., Evkoski, B., & Fišer, D. (2022). Emotion analysis in socially unacceptable discourse. Slovenščina 2.0: Empirical, Applied and Interdisciplinary Research, 10(1), 1–22. https://doi.org/10.4312/slo2.0.2022.1.1-22



