DirKorp

A Croatian Corpus of Directive Speech Acts (v3.0)

Authors

  • Petra Bago, University of Zagreb, Faculty of Humanities and Social Sciences, Croatia
  • Virna Karlić, University of Zagreb, Faculty of Humanities and Social Sciences, Croatia

DOI:

https://doi.org/10.4312/slo2.0.2023.1.189-217

Keywords:

corpus pragmatics, directive speech acts, DirKorp, Croatian language

Abstract

This paper presents the development of a new version (v3.0) of DirKorp (Korpus direktivnih govornih činova hrvatskoga jezika), the first Croatian corpus of directive speech acts, built for research in pragmatics. The corpus contains 800 speech acts collected through an online questionnaire with role-play tasks, a method of elicited communication that takes place under predefined conditions. The method suits speech act research because it yields a large number of examples with the same propositional content and illocutionary purpose, produced in the same controlled situation. The situations presented fall into two categories according to the relationship between the participants in the communicative act: (1) situations involving interlocutors who are not related by kinship; (2) situations involving interlocutors who are related by kinship. The tasks in both categories are divided into four pairs, and they require respondents to write down a speech act with similar propositional content. The questionnaire was completed by 100 speakers of Croatian, all of them undergraduate (63%) or graduate (37%) students at the Faculty of Humanities and Social Sciences, University of Zagreb. The corpus was manually annotated at the speech act level, with each speech act carrying up to 14 features: (1) respondent ID, (2) kinship/non-kinship relationship, (3) utterance type, (4) first-person directive performative verb, (5) illocutionary force, (6) propositional content, (7) T/V form, (8) persuasiveness, (9) lexical marker of request, (10) lexical marker of apology, (12) honorific title, (13) grammatical mood, (14) modal verb in the second person. The corpus contains 12,676 tokens and 1,692 types and is encoded in accordance with TEI P5: Guidelines for Electronic Text Encoding and Interchange, developed and maintained by the Text Encoding Initiative Consortium (TEI). DirKorp is available for download in TEI format on GitHub under the CC BY-SA 4.0 licence. The paper describes the annotation and structure of the corpus.
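As a rough illustration of how token and type counts like those reported above could be derived from a TEI-encoded file, the following is a minimal sketch and not the authors' procedure. It assumes each speech act sits in a TEI `<u>` element with `who` and `ana` attributes, uses an invented two-utterance sample, and tokenizes on word characters only; the actual DirKorp markup and tokenization may differ.

```python
import re
import xml.etree.ElementTree as ET

# TEI P5 documents use this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical miniature sample invented for illustration.
# The element and attribute choices are assumptions, not DirKorp's schema.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <u who="#R001" ana="#kinship">Mama, molim te, dodaj mi sol.</u>
      <u who="#R002" ana="#non-kinship">Oprostite, možete li mi dodati sol?</u>
    </body>
  </text>
</TEI>"""


def corpus_stats(xml_text):
    """Count speech acts, tokens, and types in a TEI-like string."""
    root = ET.fromstring(xml_text)
    utterances = root.findall(".//tei:u", TEI_NS)
    tokens = []
    for u in utterances:
        # itertext() also collects text nested in child elements.
        text = "".join(u.itertext())
        # Assumed tokenization: lowercase runs of word characters,
        # so punctuation is not counted as a token.
        tokens.extend(re.findall(r"\w+", text.lower()))
    return {
        "speech_acts": len(utterances),
        "tokens": len(tokens),
        "types": len(set(tokens)),
    }


stats = corpus_stats(SAMPLE)
print(stats)
```

Counting types as distinct lowercased word forms is one common convention; a count that includes punctuation or keeps case distinctions would give different figures.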

References

Allen, J. F., Schubert, L. K., Ferguson, G., Heeman, P., Hwang, C. H., Kato, T., Light, M., …, & Traum, D. R. (1995). The TRAINS Project: A Case Study in Building a Conversational Planning Agent. Journal of Experimental & Theoretical Artificial Intelligence, 7(1), 7–48.

Alsop, S., & Nesi, H. (2013). Annotating a Corpus of Spoken English: The Engineering Lecture Corpus (ELC). In Proceedings of GSCP 2012: Speech and Corpora (pp. 58–62). Firenze University Press, Florence.

Alsop, S., & Nesi, H. (2014). The Pragmatic Annotation of a Corpus of Academic Lectures. In The International Conference on Language Resources and Evaluation 2014 Proceedings (pp. 1560–1563). Reykjavik: European Language Resources Association.

Anderson, A. H., Bader, M., Gurman Bard, E., Boyle, E., Doherty, G., Garrod, S., Isard, S., …, & Weinert, R. (1991). The HCRC Map Task Corpus. Language and Speech, 34(4), 351–366.

Austin, J. L. (1962). How to Do Things with Words. Oxford: Clarendon Press.

Barron, A. (2008). The structure of requests in Irish English and English English. In K. P. Schneider & A. Barron (Eds.), Variational Pragmatics: A Focus on Regional Varieties in Pluricentric Languages (pp. 35–68). John Benjamins Publishing Company.

Brown, P., & Levinson, S. C. (1987). Politeness: Some Universals in Language Usage. Cambridge University Press.

Bunt, H. (2017). Computational Pragmatics. In Oxford Handbook of Pragmatics (pp. 326–345). Oxford University Press, New York.

Bunt, H., Petukhova, V., Malchanau, A., Fang, A., & Wijnhoven, K. (2019). The DialogBank: Dialogues with Interoperable Annotations. Language Resources and Evaluation, 53(2), 213–249.

Capone, A. (2009). Speech Acts, Classification and Definition. In Concise Encyclopedia of Pragmatics (pp. 1015–1017). Oxford: Elsevier.

Caspers, J. (2000). Melodic Characteristics of Backchannels in Dutch Map Task Dialogues. In Proceedings, 6th International Conference on Spoken Language Processing (pp. 611–614). Beijing: China Military Friendship Publish. Retrieved from https://www.isca-speech.org/archive/icslp_2000/

Flöck, I., & Geluykens, R. (2015). Speech Acts in Corpus Pragmatics: A Quantitative Contrastive Study of Directives in Spontaneous and Elicited Discourse. In Yearbook of Corpus Linguistics and Pragmatics (pp. 7–37). Springer International Publishing.

Franović, T., & Šnajder, J. (2012). Speech Act Based Classification of Email Messages in Croatian Language. In Proceedings of the Eighth Language Technologies Conference (pp. 69–72). Ljubljana: Information Society.

Geertzen, J., Girard, Y., Morante, R., Van der Sluis, J., Van Dam, H., Suijkerbuijk, B., Van der Werf, R., & Bunt, H. (2004). The DIAMOND Project. In Proceedings of the 8th Workshop on the Semantics and Pragmatics of Dialogue (CATALOG 2004), Barcelona.

Godfrey, J., Holliman, E., & McDaniel, J. (1992). SWITCHBOARD: Telephone Speech Corpus for Research and Development. In IEEE International Conference on Acoustics, Speech, and Signal Processing (Vol. 1, pp. 517–520). San Francisco: IEEE Computer Society.

Hržica, G., Košutar, S., & Posavec, K. (2021). Konektori i druge diskursne oznake u pisanome i spontanome govorenom jeziku. Fluminensia: časopis za filološka istraživanja, 33(1), 25–52.

Huang, Y. (2009). Speech Acts. In Concise Encyclopedia of Pragmatics (pp. 1000–1009). Oxford: Elsevier.

Ivanetić, N. (1995). Govorni činovi. Zagreb: FF-press, Zavod za lingvistiku Filozofskoga fakulteta Sveučilišta u Zagrebu.

Jucker, A. H. (2009). Speech Act Research between Armchair, Field and Laboratory: The Case of Compliments. Journal of Pragmatics, 41, 1611–1635.

Jucker, A. H., Schreier, D., & Hundt, M. (Eds.). (2009). Corpora: Pragmatics and Discourse. Rodopi, Amsterdam.

Kallen, J. L., & Kirk, J. M. (2012). SPICE-Ireland: A User’s Guide. Retrieved from https://pure.qub.ac.uk/en/publications/spice-ireland-a-users-guide

Karlić, V., & Bago, P. (2021). (Računalna) pragmatika: temeljni pojmovi i korpusnopragmatičke analize. Zagreb: FF Press. Retrieved from https://openbooks.ffzg.unizg.hr/index.php/Ffpress/catalog/book/125.

Kehoe, A., & Gee, M. (2007). New Corpora from the Web: Making Web Text More ‘Text-Like’. In Studies in Variation, Contacts and Change in English 2. Retrieved from https://varieng.helsinki.fi/series/volumes/02/kehoe_gee/

Kehoe, A., & Gee, M. (2012). Reader Comments as an Aboutness Indicator in Online Texts: Introducing the Birmingham Blog Corpus. In Studies in Variation, Contacts and Change in English 12. Retrieved from https://varieng.helsinki.fi/series/volumes/12/kehoe_gee/

Kuvač Kraljević, J., & Hržica, G. (2016). Croatian Adult Spoken Language Corpus (HrAL). Fluminensia: časopis za filološka istraživanja, 28(2), 87–102.

Leech, G. N. (1992). Corpora and Theories of Linguistic Performance. In Directions in Corpus Linguistics (pp. 105–122). De Gruyter, Berlin.

Ljubešić, N., & Klubička, F. (2014). {bs, hr, sr}WaC-Web Corpora of Bosnian, Croatian and Serbian. In Proceedings of the 9th Web as Corpus Workshop (WaC-9) (pp. 29–35). Association for Computational Linguistics, Gothenburg. Retrieved from https://aclanthology.org/W14-0405.pdf

Lutzky, U., & Kehoe, A. (2017a). I Apologize for My Poor Blogging: Searching for Apologies in the Birmingham Blog Corpus. Corpus Pragmatics, 1(1), 37–56.

Lutzky, U., & Kehoe, A. (2017b). Oops, I Didn’t Mean to Be so Flippant. A Corpus Pragmatic Analysis of Apologies in Blog Data. Journal of Pragmatics, 116, 27–36.

Matić, D. (2011). Govorni činovi u političkome diskursu. PhD thesis. Zagreb: Faculty of Humanities and Social Sciences.

Miščević, N. (2018). Rođenje pragmatike. Orion Art, Beograd.

Palašić, N. (2020). Pragmalingvistika – lingvistički pravac ili petlja? Zagreb: Hrvatska sveučilišna naklada.

Petukhova, V., Gropp, M., Klakow, D., Eigner, G., Topf, M., Srb, S., Motlicek, P., … Potard, …, & Schmidt, A. (2014). The DBOX Corpus Collection of Spoken Human-Human and Human-Machine Dialogues. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14) (pp. 252–258). European Language Resources Association, Reykjavik.

Piper, P. et al. (2005) = Предраг Пипер, Ивана Антонић, Бранислава Ружић, Срето Танасић, Људмила Поповић, Бранко Тошовић. 2005. Синтакса савременог српског језика. Проста реченица, Београд: Институт за српски језик САНУ, Београдска књига, Матица српска.

Pišković, T. (2007). Dramski diskurs između pragmalingvistike i feminističke lingvistike. Rasprave: Časopis Instituta za hrvatski jezik i jezikoslovlje, 33(1), 325–341.

Prasad, R., Webber, B., & Lee, A. (2018). Discourse Annotation in the PDTB: The Next Generation. In Proceedings of the 14th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (pp. 87–97). Santa Fe: Association for Computational Linguistics. Retrieved from https://aclanthology.org/W18-4710.pdf

Prüst, H., Minnen, G., & Beun, R. (1984). Transcriptie dialoogexperiment juni/juli 1984. IPO Rapport 481. Eindhoven: Institute for Perception Research, Eindhoven University of Technology.

Pupovac, M. (1990). Jezik i djelovanje. Zagreb: Biblioteka časopisa Pitanja.

Romero-Trillo, J. (Ed.). (2008). Pragmatics and Corpus Linguistics: A Mutualistic Entente. De Gruyter, Berlin.

Rühlemann, C., & Aijmer, K. (2015). Introduction. Corpus pragmatics: laying the foundations. In Corpus pragmatics (pp. 1–28).

Searle, J. R. (1969). Speech Acts. Cambridge University Press, Cambridge.

Searle, J. R. (1975). A Taxonomy of Speech Acts. In Minnesota Studies in the Philosophy of Science (Vol. 9, pp. 344–369). University of Minnesota Press.

Searle, J. R. (1976). A classification of illocutionary acts. Language in Society, 5, 1–23.

Silić, J., & Pranjković, I. (2007). Gramatika hrvatskoga jezika za gimnazije i visoka učilišta. Zagreb: Školska knjiga.

Šegić, T. (2019). Tata kupi mi auto und Nivea Milk weil es nichts Besseres für die Hautpflege gibt. Filologija, 73, 103–116.

Tadić, M. (1996). Računalna obradba hrvatskoga i nacionalni korpus. Suvremena lingvistika, 41–42, 603–611.

TEI Consortium (Ed.). (2021). TEI P5: Guidelines for Electronic Text Encoding and Interchange. TEI Consortium.

Trosborg, A. (1995). Interlanguage Pragmatics: Requests, Complaints, and Apologies. Berlin; New York: Mouton de Gruyter.

Wojtaszek, A. (2016). Thirty years of Discourse Completion Test in Contrastive Pragmatics research. Linguistica Silesiana, 37, 161–173.

Yule, G. (2002). Pragmatics. Oxford, New York: Oxford University Press.

Published

12 September 2023

Issue

Section

Articles – Part 2: Language Resources and Technologies

How to Cite

Bago, P., & Karlić, V. (2023). DirKorp: A Croatian Corpus of Directive Speech Acts (v3.0). Slovenščina 2.0: Empirične, Aplikativne in Interdisciplinarne Raziskave, 11(1), 189–217. https://doi.org/10.4312/slo2.0.2023.1.189-217