Natural Language Inference for Slovene
DOI: https://doi.org/10.4312/slo2.0.2024.1.1-53
Keywords: natural language inference, large language models, transformer architecture, SloBERTa, SloT5, GPT-3.5-turbo, ChatGPT, explanations, Slovene, model fine-tuning
Abstract
In recent years, large language models have been the most successful approach to machine natural language understanding. An important problem in this area is natural language inference, which requires models to possess fairly broad general knowledge; machine-generated explanations of the inferences give additional insight into how the models work.
We tested several approaches to natural language inference for Slovene. We used two Slovene large language models, SloBERTa and SloT5, and the much larger English language model GPT-3.5-turbo. To train the models we used the Slovene SI-NLI dataset, and we additionally machine-translated 50,000 examples from the English ESNLI dataset.
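For illustration, the following minimal sketch shows one plausible way to fine-tune SloBERTa for three-way NLI classification on SI-NLI with the Hugging Face Transformers library (Wolf et al., 2020). The model identifier, file paths, column names, label inventory and hyperparameters are assumptions for the sake of the example, not the exact configuration used in the experiments.

from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import load_dataset

# Assumed SI-NLI label inventory; adjust to the actual dataset release.
LABELS = {"entailment": 0, "neutral": 1, "contradiction": 2}

tokenizer = AutoTokenizer.from_pretrained("EMBEDDIA/sloberta")
model = AutoModelForSequenceClassification.from_pretrained(
    "EMBEDDIA/sloberta", num_labels=3)

# SI-NLI is distributed as tab-separated files with a premise, a hypothesis
# and a label; the file names below are placeholders.
data = load_dataset("csv",
                    data_files={"train": "si-nli/train.tsv",
                                "validation": "si-nli/dev.tsv"},
                    delimiter="\t")

def encode(batch):
    # Encode each premise-hypothesis pair as one input sequence.
    enc = tokenizer(batch["premise"], batch["hypothesis"],
                    truncation=True, max_length=256)
    enc["labels"] = [LABELS[label] for label in batch["label"]]
    return enc

data = data.map(encode, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sloberta-sinli",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    tokenizer=tokenizer,
)
trainer.train()
print(trainer.evaluate())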
The SloBERTa model fine-tuned on SI-NLI achieves a classification accuracy of 73.2% on the SI-NLI test set. By first fine-tuning on the machine-translated ESNLI examples, we improved the accuracy to 75.3%. We found that the models make different kinds of errors than humans and generalize poorly across example domains. We also fine-tuned SloT5 on ESNLI to generate explanations of its inferences. Fewer than a third of the explanations are adequate: the model learns the frequent sentence patterns of explanations well, but most of its explanations are semantically meaningless. We conjecture that Slovene large language models with a few hundred million parameters are capable of finding and applying linguistic patterns, but that their knowledge of language is not connected to knowledge of reality.
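As an illustration of the explanation-generation setup, the sketch below runs a sequence-to-sequence SloT5 model on a single premise-hypothesis pair. The checkpoint directory slot5-esnli stands for a hypothetical model already fine-tuned on the translated ESNLI explanations, and the input format is an assumption rather than the exact one used in the paper; the base SloT5 checkpoints are published on the Hugging Face hub (e.g. cjvt/t5-sl-small).

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical directory with a SloT5 model fine-tuned on translated ESNLI.
tokenizer = AutoTokenizer.from_pretrained("slot5-esnli")
model = AutoModelForSeq2SeqLM.from_pretrained("slot5-esnli")

premise = "Moški igra kitaro na odru."   # "A man is playing a guitar on stage."
hypothesis = "Nekdo igra glasbilo."      # "Someone is playing an instrument."

# Premise and hypothesis are joined into one source sequence; the fine-tuned
# model is expected to generate a short free-text explanation of the relation.
source = f"premisa: {premise} hipoteza: {hypothesis}"
inputs = tokenizer(source, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))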
We also used the larger GPT-3.5-turbo model for classifying examples and generating explanations. In the zero-shot setting, without any additional training examples, it achieves an accuracy of 56.5% on the SI-NLI test set, and for the correctly classified examples 81% of its explanations are adequate. Compared with the smaller Slovene models, this model shows a fairly good understanding of reality, but it is limited by its weaker command of Slovene.
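The following minimal sketch shows a zero-shot query of this kind through the OpenAI chat completions API. The Slovene prompt wording and label names are illustrative assumptions, not the exact prompt used in the experiments, and the OPENAI_API_KEY environment variable must be set.

from openai import OpenAI

client = OpenAI()

premise = "Moški igra kitaro na odru."
hypothesis = "Nekdo igra glasbilo."

# Ask for one of three labels plus a one-sentence explanation, in Slovene.
prompt = (
    "Podana sta premisa in hipoteza. Odgovori z eno od oznak "
    "sosledica, nevtralnost ali nasprotovanje in v enem stavku razloži odločitev.\n"
    f"Premisa: {premise}\nHipoteza: {hypothesis}"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
print(response.choices[0].message.content)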
References
Bowman, S. R., Angeli, G., Potts, C., & Manning, C. D. (2015). A large annotated corpus for learning natural language inference. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., …, & Askell, A. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877–1901.
Camburu, O.-M., Rocktäschel, T., Lukasiewicz, T., & Blunsom, P. (2018). e-SNLI: Natural Language Inference with Natural Language Explanations. Advances in Neural Information Processing Systems, 31.
CJVT UL. (2023). SloBench – Natural language inference (SI-NLI) leaderboard. Retrieved from https://slobench.cjvt.si/leaderboard/view/9
DeepL Translate API. (2023). Retrieved from https://www.deepl.com/pro-api
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Vol. 1 (Long and Short Papers) (pp. 4171–4186).
Erjavec, T., Fišer, D., & Ljubešić, N. (2021). The KAS corpus of Slovenian academic writing. Language Resources and Evaluation, 55(2), 551–583.
Fišer, D., Erjavec, T., & Ljubešić, N. (2016). JANES v0.4: Korpus slovenskih spletnih uporabniških vsebin. Slovenščina 2.0: empirične, aplikativne in interdisciplinarne raziskave, 4(2), 67–99.
Google Prevajalnik. (2023). Retrieved from https://translate.google.com/?hl=sl
Klemen, M., Žagar, A., Čibej, J., & Robnik-Šikonja, M. (2022). Slovene Natural Language Inference Dataset SI-NLI, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1707
Klemen, M., Žagar, A., Čibej, J., & Robnik-Šikonja, M. (2024). SI-NLI: A Slovene Natural Language Inference Dataset and its Evaluation. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italia (pp. 14859–14870). ELRA and ICCL. Retrieved from https://aclanthology.org/2024.lrec-main.1294.pdf
Krek, S., Arhar Holdt, Š., Erjavec, T., Čibej, J., Repar, A., Gantar, P., Ljubešić, N., Kosem, I., & Dobrovoljc, K. (2020). Gigafida 2.0: The Reference Corpus of Written Standard Slovene. Proceedings of the Twelfth Language Resources and Evaluation Conference, Marseille, France (pp. 3340–3345). European Language Resources Association. Retrieved from https://aclanthology.org/2020.lrec-1.409
Kumar, S., & Talukdar, P. (2020). NILE: Natural Language Inference with Faithful Natural Language Explanations. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 8730–8742). Association for Computational Linguistics. doi: 10.18653/v1/2020.acl-main.771
Lebar Bajec, I., Repar, A., Demšar, J., Bajec, Ž., Rizvič, M., Kumperščak, B., & Bajec, M. (2022). Neural Machine Translation model for Slovene-English language pair RSDO-DS4-NMT 1.2.6, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1736
Liu, H., Ning, R., Teng, Z., Liu, J., Zhou, Q., & Zhang, Y. (2023). Evaluating the logical reasoning ability of ChatGPT and GPT-4. Retrieved from https://arxiv.org/abs/2304.03439
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., …, & Stoyanov, V. (2019). RoBERTa: A robustly optimized BERT pretraining approach. Retrieved from https://arxiv.org/pdf/1907.11692
Ljubešić, N., & Erjavec, T. (2011). hrWaC and slWac: compiling web corpora for Croatian and Slovene. Proceedings of the 14th International Conference on Text, Speech and Dialogue (pp. 395–402). doi: 10.1007/978-3-642-23538-2_50
Logar, N., Erjavec, T., Krek, S., Grčar, M., & Holozan, P. (2013). Written corpus ccKres 1.0, Slovenian language resource repository CLARIN.SI, http://hdl.handle.net/11356/1034
McCoy, T., Pavlick, E., & Linzen, T. (2019). Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy (pp. 3428–3448). Association for Computational Linguistics. doi: 10.18653/v1/P19-1334
McHugh, M. L. (2012). Interrater reliability: the kappa statistic. Biochemia medica, 22(3), 276–282.
Müller, A., & Guido, S. (2016). Introduction to Machine Learning with Python: A Guide for Data Scientists. O’Reilly Media.
OpenAI. (2022). Introducing ChatGPT. Retrieved from https://openai.com/blog/chatgpt
OpenAI. (2023a). GPT-4 Technical Report. Retrieved from https://arxiv.org/pdf/2303.08774
OpenAI. (2023b). Models – OpenAI API. Retrieved from https://platform.openai.com/docs/models/gpt-3-5
Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., …, & Lowe, R. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744. Retrieved from https://arxiv.org/abs/2203.02155
Pančur, A., & Erjavec, T. (2020). The siParl corpus of Slovene parliamentary proceedings. Proceedings of the Second ParlaCLARIN Workshop (pp. 28–34).
Poth, C., Pfeiffer, J., Rücklé, A., & Gurevych, I. (2021). What to Pre-Train on? Efficient Intermediate Task Selection. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (pp. 10585–10605).
Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1), 5485–5551.
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., …, & Scialom, T. (2023). Llama 2: Open foundation and fine-tuned chat models. Retrieved from https://arxiv.org/abs/2307.09288
Ulčar, M., & Robnik-Šikonja, M. (2021). SloBERTa: Slovene monolingual large pretrained masked language model. Proceedings of SI-KDD within the Information Society 2021 (pp. 17–20).
Ulčar, M., & Robnik-Šikonja, M. (2023). Sequence-to-sequence pretraining for a less-resourced Slovenian language. Frontiers in Artificial Intelligence, 6, 1–12. doi: 10.3389/frai.2023.932519
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
Wang, S., Fang, H., Khabsa, M., Mao, H., & Ma, H. (2021). Entailment as few-shot learner. Retrieved from https://arxiv.org/pdf/2104.14690
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., …, & Rush, A. (2020). Transformers: State-of-the-Art Natural Language Processing. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 38–45).
Zhong, Q., Ding, L., Liu, J., Du, B., & Tao, D. (2023). Can ChatGPT understand too? A comparative study on ChatGPT and fine-tuned BERT. Retrieved from https://arxiv.org/pdf/2302.10198
License
Copyright (c) 2024 Tim Kmecl, Marko Robnik-Šikonja
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.