Examples of Corpus Data Visualization: Collocations in Chinese

DOI:

https://doi.org/10.4312/ala.12.2.9-26

Keywords:

visualization, corpus, Chinese, JavaScript library Vis.js, Python

Abstract

This article presents a practical procedure for visualizing language data. It builds loosely on our previous articles on the visualization of language data in language pedagogy. Using Legal Chinese and, to a lesser extent, Legal German as examples, we demonstrate how to retrieve language data (in our case from corpora), how to edit the data in a spreadsheet program, and finally how to visualize it. The visualization uses the JavaScript library Vis.js, accessed through Pyvis.
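The three steps named in the abstract (retrieve, edit, visualize) can be sketched in Python as follows. This is only an illustrative sketch, not the authors' actual script: the column names `node`, `collocate`, and `score`, the helper functions, the output file name, and the sample rows are all assumptions, and the scores are placeholders rather than corpus results.

```python
import csv
import io

# Hypothetical sample standing in for a CSV file exported from a corpus tool
# and tidied in a spreadsheet. Words and scores are placeholders, not results.
SAMPLE_CSV = """node,collocate,score
合同,签订,9.1
合同,履行,8.4
合同,解除,7.9
"""


def read_collocations(text):
    """Parse CSV text into (node, collocate, score) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["node"], row["collocate"], float(row["score"])) for row in reader]


def build_graph(rows):
    """Return the unique words in first-seen order and the weighted edges."""
    words = []
    for node, collocate, _score in rows:
        for word in (node, collocate):
            if word not in words:
                words.append(word)
    edges = [(node, collocate, score) for node, collocate, score in rows]
    return words, edges


def render_html(words, edges, path="collocations.html"):
    """Write an interactive Vis.js network via Pyvis (needs `pip install pyvis`)."""
    # Imported here so the parsing steps above work without Pyvis installed.
    from pyvis.network import Network

    net = Network(height="600px", width="100%")
    for word in words:
        net.add_node(word, label=word)
    for source, target, score in edges:
        # `value` scales the edge width; `title` appears as a hover tooltip.
        net.add_edge(source, target, value=score, title=str(score))
    net.save_graph(path)


rows = read_collocations(SAMPLE_CSV)
words, edges = build_graph(rows)
# render_html(words, edges)  # uncomment once Pyvis is installed
```

Calling `render_html(words, edges)` should produce a standalone HTML file that opens in a browser. Keeping the parsing separate from the rendering means the spreadsheet export can be checked before any graph is drawn.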

 

References

Benická, J. (2017). Archaizujúci jazyk v čínštine a jeho prekladanie do slovenčiny. (Historicizing Language in Chinese and its Translation into Slovak). In D. Veverková, I. Kolečáni Lenčová, M. Ľupták & Z. Danihelová (Eds.), Aplikované jazyky v univerzitnom kontexte IV (pp. 58-68). Technická univerzita.

CSV. (n.d.). Retrieved March 15, 2022, from https://www.w3.org/TR/tabular-data-primer/#tabular-data

Gajdoš, Ľ., Garabík, R., & Benická, J. (2016). The New Chinese Webcorpus Hanku—Origin, Parameters, Usage. Studia Orientalia Slovaca, 15(2), 21-33.

Gajdoš, Ľ. (2020). Verb Collocations in Chinese – Retrieving, Visualization and Analysis of Corpus Data. Studia Orientalia Slovaca, 19(1), 121-138.

Gajdoš, Ľ. (2022a). Vizualizácia jazykových dát ako didaktická pomôcka na príklade korpusu čínskych právnych textov (Visualisation of Linguistic Data as a Didactic Tool on the Example of the Corpus of Legal Chinese). In Kontexty súdneho prekladu (pp. 7-22).

Gajdoš, Ľ. (2022b). Praktická korpusová lingvistika – čínština (Practical Corpus Linguistics – Chinese Language). Univerzita Komenského.

Gajdošová, E., & Gajdoš, Ľ. (2018). Korpus nemeckého právneho textu COLEGE (Corpus of Legal German). In Kontexty súdneho prekladu a tlmočenia, 7, 40-47.

Gajdošová, E. (2022). Korpusbasierte Analyse von Rechtstexten in slowakischer und deutscher Sprache mit besonderem Augenmerk auf Verb-Nomen-Kollokationen [Unpublished doctoral dissertation]. Univerzita Komenského.

JSON. (n.d.). Retrieved March 15, 2022, from https://www.json.org/json-en.html

PANDAS. (n.d.). Retrieved March 14, 2022, from https://pandas.pydata.org

Petrovčič, M. (2022). Chinese Idioms: Stepping Into L2 Student’s Shoes. Acta Linguistica Asiatica, 12(1), 37-58. https://doi.org/10.4312/ala.12.1.37-58

Python. (n.d.). Retrieved March 15, 2022, from https://www.python.org

Pyvis. (n.d.). Retrieved March 16, 2022, from https://pyvis.readthedocs.io/en/latest/#

Rychlý, P. (2008). A Lexicographer-Friendly Association Score. In P. Sojka & A. Horák (Eds.), Recent Advances in Slavonic Natural Language Processing (pp. 6-9). Masaryk University.

VIS.JS. (n.d.). Retrieved March 16, 2022, from https://almende.github.io/vis/docs/network/

Published

30. 07. 2022

Section

Research articles

How to Cite

Gajdoš, Ľ., & Gajdošová, E. (2022). Examples of Corpus Data Visualization: Collocations in Chinese. Acta Linguistica Asiatica, 12(2), 9-26. https://doi.org/10.4312/ala.12.2.9-26