Distant Co-occurrence Patterns of Connectives: a Corpus Study of Formulaicity in Japanese
DOI:
https://doi.org/10.4312/ala.13.2.9-38
Keywords:
connectives, distant co-occurrence, co-occurrence patterns, formulaic language, habitus, genre, directed graphs
Abstract
Using corpus research methods, this study aims to establish whether two-item and, more generally, multi-item distant co-occurrence patterns of connectives exist in written Japanese, and to clarify the role these patterns play in discourse. The study is based on a hybrid corpus of written Japanese comprising humanities and social science papers, science and technology papers, and general written language data. A pair was treated as co-occurring when its co-occurrence frequency exceeded 10, its pointwise mutual information (PMI) value exceeded 2, and its Dice coefficient exceeded 0.01. The distribution of the observed co-occurring pairs differed by genre. Visualizing the connectivity potential of co-occurring pairs as directed graphs showed that the pairs form longer co-occurrence chains, which can be interpreted as ready-made co-occurrence patterns. Two-item and multi-item co-occurrence patterns are considered a type of Bourdieu’s habitus and contribute to both discourse development and discourse prediction.
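The thresholds and the graph step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract leaves several details open, so the following are assumptions made only for illustration: a "text unit" is a list of connectives in order of appearance, a pair (a, b) co-occurs in a unit when a precedes b anywhere in it, probabilities are estimated as the share of units containing an item or pair, and PMI uses a base-2 logarithm. The function names (`score_pairs`, `select_pairs`, `to_digraph`, `chains`) and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def score_pairs(units):
    """Return {(a, b): (frequency, PMI, Dice)} for ordered connective pairs.

    Assumption: counts are per text unit, i.e. a pair is counted once per
    unit in which `a` appears somewhere before `b`.
    """
    n = len(units)
    item_count = Counter()
    pair_count = Counter()
    for unit in units:
        item_count.update(set(unit))
        ordered = set()
        for i, j in combinations(range(len(unit)), 2):
            if unit[i] != unit[j]:
                ordered.add((unit[i], unit[j]))
        pair_count.update(ordered)

    scores = {}
    for (a, b), f_ab in pair_count.items():
        # PMI = log2( P(a, b) / (P(a) * P(b)) ), with P estimated over units.
        pmi = math.log2(f_ab * n / (item_count[a] * item_count[b]))
        # Dice = 2 * f(a, b) / (f(a) + f(b)).
        dice = 2 * f_ab / (item_count[a] + item_count[b])
        scores[(a, b)] = (f_ab, pmi, dice)
    return scores


def select_pairs(scores, min_freq=10, min_pmi=2.0, min_dice=0.01):
    """Keep pairs that exceed all three thresholds stated in the abstract."""
    return {
        pair
        for pair, (freq, pmi, dice) in scores.items()
        if freq > min_freq and pmi > min_pmi and dice > min_dice
    }


def to_digraph(pairs):
    """Turn ordered pairs into an adjacency mapping: a -> {b, ...}."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
    return graph


def chains(graph, start, max_len=4):
    """List simple paths from `start` with at most `max_len` nodes."""
    found = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt in path or len(path) >= max_len:
                continue
            new_path = path + [nxt]
            found.append(new_path)
            stack.append(new_path)
    return found


# Toy data, invented purely for illustration (not taken from the corpus).
toy_units = (
    [["shikashi", "sokode"]] * 12
    + [["mata", "sarani"]] * 5
    + [["mata"]] * 83
)
toy_scores = score_pairs(toy_units)
toy_selected = select_pairs(toy_scores)
toy_chains = chains(to_digraph({("a", "b"), ("b", "c")}), "a")
```

In this toy data only the pair that passes all three thresholds is kept, and chaining selected pairs head to tail is what yields the longer multi-item patterns that the study visualizes as directed graphs.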
References
Abekawa, T. 阿辺川武, Nishina, K. 仁科喜久子, Yagi, Y. 八木豊, & Hodošček, B. ホドシチェック・ボル (2020). Nihongo setsuzoku hyōgen no keiryō-teki bunseki ni motodzuku shidō-hō no teian 「日本語接続表現の計量的分析に基づく指導法の提案」[Proposal for a teaching method based on a quantitative analysis of Japanese connectives], Keiryōkokugogaku, 『計量国語学』, 32(7), 387-401.
Bekeš, A. (2008). Text and Boundary: A Sideways Glance at Textual Phenomena in Japanese. Ljubljana: Ljubljana University Press.
Bekeš, A. (2012). Suppositional Adverb-based Brackets in Discourse. In R. Tomiya 富谷玲子 & M. Tsutsumi 堤正典 (Eds.), Modariti to gengokyouiku 『モダリティと言語教育』[Modality and language education] (pp. 21-37). Tokyo: Hituzi Shobo ひつじ書房.
Bourdieu, P. (1991). Language and symbolic power (J. B. Thompson, Ed., G. Raymond & M. Adamson, Trans.). Cambridge: Polity Press.
Bourdieu, P. (1994). Raisons pratiques: sur la théorie de l'action. Paris: Seuil.
de Beaugrande, R., & Dressler, W. U. (1981). Introduction to text linguistics. London: Longman.
de Nooy, W., Mrvar, A., & Batagelj, V. (2005). Exploratory Social Network Analysis with Pajek. Cambridge: Cambridge University Press.
de Saussure, F. (1966). Course in General Linguistics (W. Baskin, Trans.). New York: McGraw-Hill. (Original work published 1916 as Cours de linguistique générale)
Ichikawa, T. 市川孝 (1978). Kokugo kyōiku no tame no bunshō-ron gaisetsu 『国語教育のための文章論概説』[Introduction to text theory for teaching of Japanese as the first language]. Tokyo: Kyoiku Shuppan 教育出版.
Ishiguro, K. 石黒圭 (2008). Bunshō wa setsuzoku hyōgen de kimaru 『文章は接続表現で決まる』 [Writing is determined by connective expressions]. Tokyo: Kobunsha 光文社.
Kaneyasu, M., Ajioka, M., Kawanishi, Y., & Iwasaki, S. (2015). Mikan yo Mikan: Formulaic Constructions and Their Implicature in Conversation. Japanese/Korean Linguistics 21, 199-213. Stanford, CA: CSLI Publications.
Kolesnikova, O. (2016). Survey of Word Co-occurrence Measures for Collocation Detection. Computación y Sistemas, 20(3), 327-344. doi: 10.13053/CyS-20-3-2456
Kudo, H. 工藤浩 (2000). Fukushi to bun no chinjutsu-teki taipu 「副詞と文の陳述的タイプ」 [Adverbs and the type of modus in the sentence]. In Y. Nita 仁田義雄 & T. Masuoka 益岡隆志 (Eds.), Nihongo no bunpō 3 — modariti 『日本語の文法3—モダリティ』[Grammar of Japanese 3: Modality]. Tokyo: Iwanami Shoten 岩波書店.
Minami, F. 南不二雄 (1974). Gendai nihongo no kōzō 『現代日本語の構造』 [The Structure of modern Japanese language]. Tokyo: Taishukan Shoten 大修館書店.
Minami, F. 南不二雄 (1993). Gendai nihongo bunpō no rinkaku 『現代日本語文法の輪郭』 [The outline of modern Japanese grammar]. Tokyo: Taishukan Shoten 大修館書店.
Mrvar, A., & Batagelj, V. (2022). Pajek Programs for Analysis and Visualization of Very Large Networks - Reference Manual ver. 5.16, last accessed June 1, 2023.
Noda, H. 野田尚史 (1995). Bun no kaisō kōzō kara mita shudai to toritate 「文の階層構造と主題の取り立て」[Theme and extrapolation viewed from the hierarchical sentence structure]. In T. Masuoka 益岡隆志 et al. (Eds.), Nihongo no shudai to toritate 『日本語の主題と取り立て』 [Theme and extrapolation in Japanese] (pp. 1-35). Tokyo: Kurosio Publishers.
Petrovic, S., Snajder, J., Dalbelo Basic, B., & Kolar, M. (2006). Comparison of Collocation Extraction Measures for Document Indexing. Journal of Computing and Information Technology, 14(4), 321-327. doi: 10.2498/cit.2006.04.08
Sakuma, M. 佐久間まゆみ (2012). Bunshō danwa no bunseki tan'i 「文章・談話の分析単位」 [Units of analysis in text and discourse]. Gengo - serekushon『言語』セレクション 1, 93-100.
Sakuma, M. (2019). Units for the analysis of Japanese written text and spoken discourse. In I. Srdanović & A. Bekeš (Eds.), The Japanese Language from an Empirical Perspective: Corpus-based studies and studies on discourse (pp. 11-30). Ljubljana: University of Ljubljana Press.
Srdanović Erjavec, I., Bekeš, A., & Nishina, K. (2007). Cluster analysis of suppositional adverbs and clause-final modality. Asian and African Studies, 11(3), 21-31.
Srdanović, I., Hodošček, B., Bekeš, A., & Nishina, K. (2009). Uebukōpasu to kensaku shisutemu o riyō shita suiryō fukushi to modariti keishiki no enkaku kyōki chūshutsu to nihongo kyōiku e no ōyō 「ウェブコーパスと検索システムを利用した推量副詞とモダリティ形式の遠隔共起抽出と日本語教育への応用」 [Distant co-occurrence extraction of suppositional adverbs and modality forms using a web corpus and search system and its application to Japanese language teaching]. Shizen gengo shori 『自然言語処理』, 16(4), 29-46.
Tanaka, S. 田中茂範 (2016). Dainigengo hattatsu ni okeru kan'yō hyōgen-ryoku 「第二言語発達における慣用表現力」 [Conventional expressivity in second language development]. ARCLE Review 10, 40-52. <https://www.arcle.jp/research/books/data/html/data/pdf/vol10_4-1.pdf>, last accessed January 15, 2023.
Wang, J. 王金博 (2015a). Ronsetsu bun ni okeru setsuzoku hyōgen no 'enkaku kyōki' ni tsuite no kenkyū: Shinbun shasetsu no 'shikashi' to 'sokode' o chūshin ni 『論説文における接続表現の「遠隔共起」についての研究 : 新聞社説の「しかし」と「そこで」を中心に』 [A study on 'distant co-occurrence' of connectives in editorial writing: focusing on 'shikashi (but)' and 'sokode (there)' in newspaper editorials]. PhD dissertation. <https://tsukuba.repo.nii.ac.jp/record/37004/files/DA07517.pdf>, last accessed January 20, 2023.
Wang, J. 王金博 (2015b). Ronsetsu bun no bunmyaku tenkai ni okeru setsuzoku hyōgen 'shikashi' to 'sokode' no enkaku kyōki 「論説文の文脈展開における接続表現「しかし」と「そこで」の遠隔共起」 [Distant co-occurrence of the connectives 'shikashi (but)' and 'sokode (there)' in the contextual development of editorial texts]. Kokusai Nihon kenkyū 『国際日本研究』 7, 97-110. <https://japan.tsukuba.ac.jp/content/uploads/sites/43/2022/02/JIAJS_Vol7_PRINT_06_Wang.pdf>, last accessed January 20, 2023.
Wray, A. (2002). Formulaic Language and the Lexicon. New York: Cambridge University Press.
Wray, A. (2017). Formulaic Sequences as a Regulatory Mechanism for Cognitive Perturbations During the Achievement of Social Goals. Topics in Cognitive Science, 9(3), 569-587.
Additional analyzed materials
Asahi Shimbun: 300 Editorials and Opinion articles (31 Jul 2011 - 31 Dec 2011). 朝日新聞社説・意見300記事(2011年07月31日〜2011年12月31日)
Ishiguro, K. 石黒圭 (Ed.) (2020). Bijinesu bunsho no ōyōgengo-gaku-teki kenkyū — kuraudosōshingu o mochiita bijinesu nihongo no takaku-teki bunseki 『ビジネス文書の応用言語学的研究—クラウドソーシングを用いたビジネス日本語の多角的分析』 [Applied linguistic research of business documents: a multidimensional analysis of business Japanese using crowdsourcing data]. Tokyo: Hituzi Shobo ひつじ書房.
Directed graph application
Pajek: analysis and visualization of very large networks <http://mrvar.fdv.uni-lj.si/pajek/>, last accessed June 30, 2023
License
Copyright (c) 2023 Bekeš, Andrej; Hodošček, Bor; Nishina, Kikuko; Abekawa, Takeshi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors confirm that they are the authors of the submitted article, which will be published online in the journal Acta Linguistica Asiatica by Ljubljana University Press, Faculty of Arts (University of Ljubljana, Faculty of Arts, Aškerčeva 2, 1000 Ljubljana, Slovenia). The author’s name will appear in the article in the journal. All decisions regarding layout and distribution of the work are in the hands of the publisher.
- Authors guarantee that the work is their own original creation and does not infringe any statutory or common-law copyright or any proprietary right of any third party. In case of claims by third parties, authors commit themselves to defending the interests of the publisher and shall cover any potential costs.
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.