Distant Co-occurrence Patterns of Connectives: a Corpus Study of Formulaicity in Japanese

Authors

DOI:

https://doi.org/10.4312/ala.13.2.9-38

Keywords:

connectives, distant co-occurrence, co-occurrence patterns, formulaic language, habitus, genre, directed graphs

Abstract

Using corpus methods, this study investigates whether two-item and, more generally, multi-item distant co-occurrence patterns of connectives exist in written Japanese, and clarifies the role these patterns play in discourse. The study is based on a hybrid corpus of written Japanese comprising humanities and social science papers, science and technology papers, and general written language data. A pair was counted as co-occurring when its co-occurrence frequency exceeded 10, its PMI value exceeded 2, and its Dice coefficient exceeded 0.01. The distribution of the observed co-occurring pairs differed by genre. Visualizing the connectivity potential of co-occurring pairs as directed graphs showed that the pairs link into longer co-occurrence chains, which can be interpreted as ready-made co-occurrence patterns. Two-item and multi-item co-occurrence patterns are considered a type of Bourdieu’s habitus and contribute to both discourse development and discourse prediction.
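As a rough illustration of the three thresholds named in the abstract, the following Python sketch scores connective pairs and keeps those that exceed all three cut-offs. It is not the authors' implementation. The abstract does not give the formulas, so the sketch assumes the common definitions PMI = log2(f(x,y) · N / (f(x) · f(y))) and Dice = 2 · f(x,y) / (f(x) + f(y)), where N is the number of units within which co-occurrence is counted. The function names, the example pairs, and all counts are hypothetical.

```python
import math
from typing import NamedTuple


class PairScore(NamedTuple):
    freq: int    # co-occurrence frequency f(x, y)
    pmi: float   # pointwise mutual information (base-2 log, an assumption)
    dice: float  # Dice coefficient


def score_pair(f_xy: int, f_x: int, f_y: int, n: int) -> PairScore:
    """Score one pair of connectives from raw counts.

    f_xy: number of units in which x and y co-occur
    f_x, f_y: number of units containing x and y, respectively
    n: total number of units (assumed counting unit, e.g. a text span)
    """
    pmi = math.log2((f_xy * n) / (f_x * f_y))
    dice = 2 * f_xy / (f_x + f_y)
    return PairScore(f_xy, pmi, dice)


def passes_thresholds(s: PairScore,
                      min_freq: int = 10,
                      min_pmi: float = 2.0,
                      min_dice: float = 0.01) -> bool:
    """Apply the abstract's cut-offs: all three must be strictly exceeded."""
    return s.freq > min_freq and s.pmi > min_pmi and s.dice > min_dice


# Hypothetical counts: (f_xy, f_x, f_y); not taken from the study's corpus.
N = 100_000
counts = {
    ("tashikani", "shikashi"): (40, 200, 1500),
    ("mazu", "tsugini"): (8, 300, 250),       # fails the frequency cut-off
    ("shikashi", "mata"): (60, 1500, 4000),   # PMI is 0, fails the PMI cut-off
}

scores = {pair: score_pair(*c, N) for pair, c in counts.items()}
kept = {pair: s for pair, s in scores.items() if passes_thresholds(s)}

for (x, y), s in kept.items():
    print(f"{x} -> {y}: freq={s.freq}, PMI={s.pmi:.2f}, Dice={s.dice:.3f}")
```

Requiring all three measures at once guards against two familiar failure modes: PMI alone tends to overrate rare pairs, while raw frequency alone overrates pairs of very common connectives.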

Author Biography

  • Andrej BEKEŠ, University of Ljubljana

    Emeritus Professor, University of Ljubljana

References

Abekawa, T. 阿辺川武, Nishina, K. 仁科喜久子, Yagi, Y. 八木豊, & Hodošček, B. ホドシチェック・ボル (2020). Nihongo setsuzoku hyōgen no keiryō-teki bunseki ni motozuku shidō-hō no teian 「日本語接続表現の計量的分析に基づく指導法の提案」 [Proposal for a teaching method based on a quantitative analysis of Japanese connectives]. Keiryōkokugogaku 『計量国語学』, 32(7), 387-401.

Bekeš, A. (2008). Text and Boundary: A Sideways Glance at Textual Phenomena in Japanese. Ljubljana: Ljubljana University Press.

Bekeš, A. (2012). Suppositional Adverb-based Brackets in Discourse. In R. Tomiya 富谷玲子 & M. Tsutsumi 堤正典 (Eds.), Modariti to gengo kyōiku 『モダリティと言語教育』 [Modality and language education] (pp. 21-37). Tokyo: Hituzi Shobo ひつじ書房.

Bourdieu, P. (1991). Language and symbolic power (J. B. Thompson, Ed., G. Raymond & M. Adamson, Trans.). Cambridge: Polity Press.

Bourdieu, P. (1994). Raisons pratiques: sur la théorie de l'action. Paris: Seuil.

de Beaugrande, R., & Dressler, W. U. (1981). Introduction to text linguistics. London: Longman. DOI: https://doi.org/10.4324/9781315835839

de Nooy, W., Mrvar, A., & Batagelj, V. (2005). Exploratory Social Network Analysis with Pajek. Cambridge: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511806452

de Saussure, F. (1966). Course in General Linguistics (W. Baskin, Trans.). New York: McGraw-Hill. (Original work published 1916 as Cours de linguistique générale)

Ichikawa, T. 市川孝 (1978). Kokugo kyōiku no tame no bunshō-ron gaisetsu 『国語教育のための文章論概説』[Introduction to text theory for teaching of Japanese as the first language]. Tokyo: Kyoiku Shuppan 教育出版.

Ishiguro, K. 石黒圭 (2008). Bunshō wa setsuzoku hyōgen de kimaru 『文章は接続表現で決まる』 [Text is determined by connective expressions]. Tokyo: Kobunsha 光文社.

Kaneyasu, M., Ajioka, M., Kawanishi, Y., & Iwasaki, S. (2015). Mikan yo Mikan: Formulaic Constructions and Their Implicature in Conversation. Japanese/Korean Linguistics 21, 199-213. Stanford, CA: CSLI Publications.

Kolesnikova, O. (2016). Survey of Word Co-occurrence Measures for Collocation Detection. Computación y Sistemas, 20(3), 327-344. DOI: https://doi.org/10.13053/cys-20-3-2456

Kudo, H. 工藤浩 (2000). Fukushi to bun no chinjutsu-teki taipu 「副詞と文の陳述的タイプ」 [Adverbs and the type of modus in sentence]. In Y. Nita 仁田義雄 & T. Masuoka 益岡隆志 (Eds.), Nihongo no bunpō 3 — modariti 『日本語の文法3—モダリティ』 [Grammar of Japanese 3: Modality]. Tokyo: Iwanami Shoten 岩波書店.

Minami, F. 南不二雄 (1974). Gendai nihongo no kōzō 『現代日本語の構造』 [The Structure of modern Japanese language]. Tokyo: Taishukan Shoten 大修館書店.

Minami, F. 南不二雄 (1993). Gendai nihongo bunpō no rinkaku 『現代日本語文法の輪郭』 [The outline of modern Japanese grammar]. Tokyo: Taishukan Shoten 大修館書店.

Mrvar, A., & Batagelj, V. (2022). Pajek: Programs for Analysis and Visualization of Very Large Networks. Reference Manual, ver. 5.16. Last accessed June 1, 2023.

Noda, H. 野田尚史 (1995). Bun no kaisō kōzō kara mita shudai to toritate 「文の階層構造と主題の取り立て」 [Theme and extrapolation viewed from the hierarchical sentence structure]. In T. Masuoka 益岡隆志 et al. (Eds.), Nihongo no shudai to toritate 『日本語の主題と取り立て』 [Theme and extrapolation in Japanese] (pp. 1-35). Tokyo: Kurosio Publishers.

Petrovic, S., Snajder, J., Dalbelo Basic, B., & Kolar, M. (2006). Comparison of Collocation Extraction Measures for Document Indexing. Journal of Computing and Information Technology, 14(4), 321-327. DOI: https://doi.org/10.2498/cit.2006.04.08

Sakuma, M. 佐久間まゆみ (2012). Bunshō danwa no bunseki tan'i 「文章・談話の分析単位」 [Units of analysis in text and discourse]. Gengo - serekushon 『言語』セレクション 1, 93-100.

Sakuma, M. (2019). Units for the analysis of Japanese written text and spoken discourse. In I. Srdanović & A. Bekeš (Eds.), The Japanese Language from an Empirical Perspective: Corpus-based studies and studies on discourse (pp. 11-30). Ljubljana: University of Ljubljana Press.

Srdanović Erjavec, I., Bekeš, A., & Nishina, K. (2007). Cluster analysis of suppositional adverbs and clause-final modality. Asian and African Studies, 11(3), 21-31.

Srdanović, I., Hodošček, B., Bekeš, A., & Nishina, K. (2009). Uebukōpasu to kensaku shisutemu o riyō shita suiryō fukushi to modariti keishiki no enkaku kyōki chūshutsu to nihongo kyōiku e no ōyō 「ウェブコーパスと検索システムを利用した推量副詞とモダリティ形式の遠隔共起抽出と日本語教育への応用」 [Distant co-occurrence extraction of suppositional adverbs and modality forms using a web corpus and search system and its application to Japanese language teaching]. Shizen gengo shori 『自然言語処理』, 16(4), 29-46. DOI: https://doi.org/10.5715/jnlp.16.4_29

Tanaka, S. 田中茂範 (2016). Dainigengo hattatsu ni okeru kan'yō hyōgen-ryoku 「第二言語発達における慣用表現力」 [Conventional expressivity in second language development]. ARCLE Review 10, 40-52. <https://www.arcle.jp/research/books/data/html/data/pdf/vol10_4-1.pdf>, last accessed January 15, 2023.

Wang, J. 王金博 (2015a). Ronsetsu bun ni okeru setsuzoku hyōgen no 'enkaku kyōki' ni tsuite no kenkyū: Shinbun shasetsu no 'shikashi' to 'sokode' o chūshin ni 『論説文における接続表現の「遠隔共起」についての研究 : 新聞社説の「しかし」と「そこで」を中心に』 [A study on 'distant co-occurrence' of connectives in editorial writing: focusing on 'shikashi (but)' and 'sokode (there)' in newspaper editorials]. PhD dissertation. <https://tsukuba.repo.nii.ac.jp/record/37004/files/DA07517.pdf>, last accessed January 20, 2023.

Wang, J. 王金博 (2015b). Ronsetsu bun no bunmyaku tenkai ni okeru setsuzoku hyōgen 'shikashi' to 'sokode' no enkaku kyōki 「論説文の文脈展開における接続表現「しかし」と「そこで」の遠隔共起」 [Distant co-occurrence of the connectives 'shikashi (but)' and 'sokode (there)' in the contextual development of editorial texts]. Kokusai Nihon kenkyū 『国際日本研究』 7, 97-110. <https://japan.tsukuba.ac.jp/content/uploads/sites/43/2022/02/JIAJS_Vol7_PRINT_06_Wang.pdf>, last accessed January 20, 2023.

Wray, A. (2002). Formulaic Language and the Lexicon. New York: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511519772

Wray, A. (2017). Formulaic Sequences as a Regulatory Mechanism for Cognitive Perturbations During the Achievement of Social Goals. Topics in Cognitive Science, 9(3), 569-587. DOI: https://doi.org/10.1111/tops.12257

Additional analyzed materials

Asahi Shimbun: 300 Editorials and Opinion articles (31 Jul 2011 - 31 Dec 2011). 朝日新聞社説・意見300記事(2011年07月31日〜2011年12月31日)

Ishiguro, K. 石黒圭 (Ed.) (2020). Bijinesu bunsho no ōyōgengo-gaku-teki kenkyū — kuraudosōshingu o mochiita bijinesu nihongo no takaku-teki bunseki 『ビジネス文書の応用言語学的研究—クラウドソーシングを用いたビジネス日本語の多角的分析』 [Applied linguistic research of business documents: a multidimensional analysis of business Japanese using crowdsourcing data]. Tokyo: Hituzi Shobo ひつじ書房.

Directed graph application

Pajek: analysis and visualization of very large networks. <http://mrvar.fdv.uni-lj.si/pajek/>, last accessed June 30, 2023.

Published

30. 07. 2023

Issue

Section

Research articles

How to Cite

Bekeš, A., Hodošček, B., Nishina, K., & Abekawa, T. (2023). Distant Co-occurrence Patterns of Connectives: a Corpus Study of Formulaicity in Japanese. Acta Linguistica Asiatica, 13(2), 9-38. https://doi.org/10.4312/ala.13.2.9-38