Emotion analysis in socially unacceptable discourse
DOI:
https://doi.org/10.4312/slo2.0.2022.1.1-22
Keywords:
emotions, socially unacceptable discourse (SUD), hate speech, social media, corpora
Abstract
Texts often express the author's emotional state, and information about emotions has been shown to have potential for detecting and analysing hate speech. In this paper we present a quantitative methodology for analysing emotions in text. Based on the NRC Emotion Lexicon and Plutchik's model of eight basic emotions, we define a simple yet effective method for detecting the emotional markedness of a text. Using this methodology, we investigated the emotional markedness of texts labelled as socially unacceptable discourse (SUD), a marked and potentially harmful type of text that is now spreading rapidly on social media. We applied the method to a corpus of Facebook comments. The comparison and analysis were carried out on four datasets covering two languages, English and Slovene, and two topics, the rights of the LGBT+ community and the European migrant crisis. We found that SUD comments are significantly more emotional than comments that do not contain SUD. We also showed differences in the expression of emotions depending on the language, topic and target of the comments. We supported the findings of the quantitative methodology with a qualitative analysis of the corpus, in which we examined the most frequent emotionally marked words associated with each emotion in all four datasets. We found that emotionally marked words in SUD differ substantially by topic, whereas there is considerable overlap between languages.
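The lexicon-based idea summarised in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact scoring formula, so the sketch assumes one plausible reading: for each of Plutchik's eight emotions, the score of a comment is the share of its lemmas that the lexicon associates with that emotion. The toy lexicon entries and the names `TOY_LEXICON`, `emotion_profile` and `is_emotional` are hypothetical and are not taken from the paper or from the NRC Emotion Lexicon distribution.

```python
from collections import Counter

# Plutchik's eight basic emotions, the emotion categories of the NRC Emotion Lexicon.
EMOTIONS = ("anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust")

# Hypothetical toy lexicon: lemma -> set of associated emotions.
# These entries are invented for illustration; the real lexicon is far larger.
TOY_LEXICON = {
    "hate": {"anger", "disgust"},
    "threat": {"anger", "fear"},
    "love": {"joy", "trust"},
    "welcome": {"joy", "trust", "anticipation"},
}


def emotion_profile(lemmas, lexicon=TOY_LEXICON):
    """Return, for each emotion, the share of lemmas associated with it.

    Assumption: scores are normalised by comment length; the paper may
    normalise differently.
    """
    counts = Counter()
    for lemma in lemmas:
        for emotion in lexicon.get(lemma.lower(), ()):
            counts[emotion] += 1
    total = len(lemmas)
    return {e: (counts[e] / total if total else 0.0) for e in EMOTIONS}


def is_emotional(lemmas, lexicon=TOY_LEXICON):
    """True if at least one lemma of the comment appears in the lexicon."""
    return any(lemma.lower() in lexicon for lemma in lemmas)


profile = emotion_profile(["they", "hate", "the", "threat"])
```

Comparing such profiles between comments labelled as SUD and comments without SUD, per language and per topic, would be one way to reproduce the kind of contrast the abstract reports. The input is assumed to be already lemmatised, since lexicon entries are base forms.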
References
Alm, C., Roth, D., & Sproat, R. (2005). Emotions from Text: Machine Learning for Text-based Emotion Prediction. Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, October 2005, Vancouver, Canada (pp. 579–586). Association for Computational Linguistics. DOI: https://doi.org/10.3115/1220575.1220648
Al-Saqqa, S., Abdel-Nabi, H., & Awajan, A. (2018). A survey of textual emotion detection. 8th International Conference on Computer Science and Information Technology (CSIT), July 2018 (pp. 136–142). DOI: https://doi.org/10.1109/CSIT.2018.8486405
Aman, S., & Szpakowicz, S. (2007). Identifying Expressions of Emotion in Text. In V. Matoušek & P. Mautner (Eds.), Text, Speech and Dialogue, SD 2007. Lecture Notes in Computer Science (Vol. 4629) (pp. 196–205). Berlin, Heidelberg: Springer. DOI: https://doi.org/10.1007/978-3-540-74628-7_27
Assimakopoulos, S., Baider, F. H., & Millar, S. (2017). Online Hate Speech in the European Union. A Discourse-Analytic Perspective. Cham: Springer International Publishing. DOI: https://doi.org/10.1007/978-3-319-72604-5
Brindle, A. (2016). The Language of Hate. A Corpus Linguistic Analysis of White Supremacist Language. London and New York: Routledge. DOI: https://doi.org/10.4324/9781315731643
Canales, L., Daelemans, W., Boldrini, E., & Martinez-Barco, P. (2019). EmoLabel: Semi-Automatic Methodology for Emotion Annotation of Social Media Text. IEEE Transactions on Affective Computing. Retrieved from https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8758380
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. Routledge.
Daelemans, W., Fišer, D., Franza, J., Kranjčić, D., Lemmens, J., Ljubešić, N., Markov, I., & Popič, D. (2020). The LiLaH Emotion Lexicon of Croatian, Dutch and Slovene. Slovenian language resource repository CLARIN.SI. https://www.clarin.si/repository/xmlui/handle/11356/1318
Denecke, K. (2008). Using SentiWordNet for Multilingual Sentiment Analysis. Proceedings of the 24th International Conference on Data Engineering, 7–12 April 2008, Cancun, Mexico (pp. 507–512). DOI: https://doi.org/10.1109/ICDEW.2008.4498370
Fišer, D., Ljubešić, N., & Erjavec, T. (2017). Legal framework, dataset and annotation schema for socially unacceptable online discourse practices in Slovene. Proceedings of the 1st Workshop on Abusive Language Online, ACL 2017, Vancouver, Canada (pp. 46–51). Association for Computational Linguistics. DOI: https://doi.org/10.18653/v1/W17-3007
Franza, J., & Fišer, D. (2019). The lexical inventory of Slovene socially unacceptable discourse on Facebook. Proceedings of the 7th Conference on Computer-Mediated Communication (CMC) and Social Media Corpora, CMC-Corpora 2019, Cergy-Pontoise, France. Retrieved from https://hal.archives-ouvertes.fr/hal-02292616/document#page=50
Ghazi, D. (2016). Identifying Expressions of Emotions and Their Stimuli in Text. PhD dissertation. Canada: University of Ottawa.
Gitari, N. D., Zuping, Z., Hanyurwimfura, D., & Long, J. (2015). A Lexicon-based Approach for Hate Speech Detection. International Journal of Multimedia and Ubiquitous Engineering (Vol. 10, No. 4) (pp. 215–230). DOI: https://doi.org/10.14257/ijmue.2015.10.4.21
Knoblock, N. (2017). Xenophobic Trumpeters: A corpus-assisted discourse study of Donald Trump’s Facebook conversations. In A. Musolff (Ed.), Journal of Language Aggression and Conflict (Vol. 5, No.7) (pp. 295–322). Amsterdam/Philadelphia: John Benjamins Publishing Company. DOI: https://doi.org/10.1075/jlac.5.2.07kno
Ljubešić, N. (2019). The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Slovenian. Ljubljana: Slovenian language resource repository CLARIN.SI. Retrieved from http://hdl.handle.net/11356/1251
Ljubešić, N. (2020). The CLASSLA-StanfordNLP model for lemmatisation of standard Slovenian 1.1. Slovenian language resource repository CLARIN.SI. http://hdl.handle.net/11356/1286
Ljubešić, N., Fišer, D., & Erjavec, T. (2019). The FRENK datasets of Socially Unacceptable Discourse in Slovene and English. International Conference on Text, Speech, and Dialogue. Springer, Cham. DOI: https://doi.org/10.1007/978-3-030-27947-9_9
Ljubešić, N., Fišer, D., Erjavec, T., & Šulc, A. (2021). Offensive language dataset of Croatian, English and Slovenian comments FRENK 1.1. Ljubljana: Slovenian language resource repository CLARIN.SI. Retrieved from http://hdl.handle.net/11356/1462
Markov, I., Ljubešić, N., Fišer, D., & Daelemans, W. (2021). Exploring Stylometric and Emotion-Based Features for Multilingual Cross-Domain Hate Speech Detection. Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (pp. 149–159). Association for Computational Linguistics. Retrieved from https://aclanthology.org/2021.wassa-1.16/
Martins, R., Gomes, M., Almeida, J. J., Novais, P., & Henriques, P. (2018). Hate Speech Classification in Social Media Using Emotional Analysis. 7th Brazilian Conference on Intelligent Systems (BRACIS), 22–25 October 2018, Sao Paulo, Brazil (pp. 61–66). DOI: https://doi.org/10.1109/BRACIS.2018.00019
Mohammad, S., & Yang, T. (2011). Tracking Sentiment in Mail: How Genders Differ on Emotional Axes. Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011) (pp. 70–79). Portland, Oregon: Association for Computational Linguistics.
Mohammad, S., & Turney, P. D. (2010). Emotions Evoked by Common Words and Phrases: Using Mechanical Turk to Create an Emotion Lexicon. Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, June 2010, Los Angeles, California (pp. 26–34).
Pahor de Maiti, K., Fišer, D., & Ljubešić, N. (2019). How haters write: analysis of nonstandard language in online hate speech. Proceedings of the 7th Conference on Computer-Mediated Communication (CMC) and Social Media Corpora, CMC-Corpora, 9–10 September 2019, Cergy-Pontoise, France. Retrieved from https://hal.archives-ouvertes.fr/hal-02292616/document#page=44
Qi, P., Zhang, Y., Zhang, Y., Bolton, J., & Manning, C. D. (2020). Stanza: A Python Natural Language Processing Toolkit for Many Human Languages. Retrieved from https://arxiv.org/abs/2003.07082
Plutchik, R. (1980). Emotion: Theory, research and experience, 1. Academic Press.
Plutchik, R. (2001). The Nature of Emotions: Human Emotions Have Deep Evolutionary Roots, a Fact That May Explain Their Complexity and Provide Tools for Clinical Practice. American Scientist 89(4), 344–350. DOI: https://doi.org/10.1511/2001.4.344
Pratt, J. W., & Gibbons, J. D. (1981). Kolmogorov-Smirnov two-sample tests. Concepts of nonparametric theory. Springer, New York, NY. 318–344. DOI: https://doi.org/10.1007/978-1-4612-5931-2_7
Russell, J. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161–1178. DOI: https://doi.org/10.1037/h0077714
Scherer, K. R. (2005). What are emotions? And how can they be measured? Social Science Information, 44(4), 695–729. DOI: https://doi.org/10.1177/0539018405058216
Vehovar, V., Povž, B., Fišer, D., Ljubešić, N., Šulc, A., & Jontes, D. (2020). Družbeno nesprejemljivi diskurz na Facebookovih straneh novičarskih portalov. Teorija in Praksa, 57(2), 622–645.
Zad, S., Jimenez, J., & Finlayson, M. A. (2021). Hell Hath No Fury? Correcting Bias in the NRC Emotion Lexicon. Proceedings of the 5th Workshop on Online Abuse and Harms, 6 August 2021, Bangkok, Thailand (pp. 102–111). Retrieved from https://aclanthology.org/2021.woah-1.pdf DOI: https://doi.org/10.18653/v1/2021.woah-1.11
License
Copyright (c) 2022 Jasmin Franza, Bojan Evkoski, Darja Fišer

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.