From verbal to adjectival

Evaluating the lexicalization of participles in an Estonian corpus

Authors

  • Geda Paulsen, Institute of the Estonian Language, Tallinn, Estonia; University of Uppsala, Sweden
  • Maria Tuulik, Institute of the Estonian Language, Tallinn, Estonia
  • Ahti Lohk, Institute of the Estonian Language, Tallinn; Tallinn University of Technology, Estonia
  • Ene Vainik, Institute of the Estonian Language, Tallinn, Estonia

DOI:

https://doi.org/10.4312/slo2.0.2022.1.65-97

Keywords:

corpus linguistics, lexicography, Estonian language, adjective, participle, deviation analysis

Abstract

This study addresses categorization issues related to adjective candidates in Estonian, focusing on the category of participles. The aim of the analysis was to assess the ranges of the prototypical adjective and to determine its degree of deviation on the prototypicality scale. The investigation was based on a group of validated adjectives – selected adjectives included in the Basic Estonian Dictionary – and two control groups of more and less lexicalized participles. We tested seven morphosyntactic corpus patterns characteristic of adjectives. The test patterns were based on the prototypical features of the adjective, as well as on observations made in the actual lexicographic analysis. To assess the sample words and determine the significance of the test patterns from the point of view of defining adjectivity, we used deviation analysis. The results of this study can be applied to establish a measure of adjectivity for lexicographic judgments when distinguishing, for instance, lexicalized participles from regular ones.
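The abstract does not spell out how the deviation analysis is computed, so the following is only a minimal sketch of one common way to measure how far a candidate word lies from a reference group. It assumes that each word is described by the relative frequency of each test pattern and that deviation is expressed in standard deviations of the validated adjective group. The function names, the pattern labels (`attributive`, `comparative`) and all numbers are invented for illustration and are not taken from the study.

```python
from statistics import mean, pstdev


def deviation_profile(reference, candidate):
    """Per-pattern deviation of a candidate word from a reference group.

    reference: list of dicts mapping pattern name -> relative frequency
               (hypothetical stand-in for the validated adjectives)
    candidate: dict with the same pattern names (e.g. a participle)
    """
    profile = {}
    for pattern in candidate:
        values = [word[pattern] for word in reference]
        centre = mean(values)
        spread = pstdev(values)
        # Standardised deviation; 0.0 when the reference group does not vary.
        profile[pattern] = (candidate[pattern] - centre) / spread if spread else 0.0
    return profile


def mean_abs_deviation(profile):
    """Collapse a deviation profile into one illustrative score."""
    return sum(abs(v) for v in profile.values()) / len(profile)


# Invented toy values, not corpus data.
reference = [
    {"attributive": 0.60, "comparative": 0.10},
    {"attributive": 0.70, "comparative": 0.20},
    {"attributive": 0.80, "comparative": 0.30},
]
candidate = {"attributive": 0.70, "comparative": 0.00}

profile = deviation_profile(reference, candidate)
score = mean_abs_deviation(profile)
print(profile, score)
```

In this toy setup the candidate matches the reference group on one pattern and falls well below it on the other, so the profile shows which pattern drives the deviation. That is the kind of information a lexicographer could use when weighing a lexicalized participle against a regular one.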

Published

21.12.2022

How to Cite

Paulsen, G., Tuulik, M., Lohk, A., & Vainik, E. (2022). From verbal to adjectival: Evaluating the lexicalization of participles in an Estonian corpus. Slovenščina 2.0: Empirical, Applied and Interdisciplinary Research, 10(1), 65–97. https://doi.org/10.4312/slo2.0.2022.1.65-97

Section

Articles