Dialectometry of Linguistic Varieties Common in the Distance between the South of Hamadan Province to the North of Khuzestan Province: Using Levenshtein Distance Approach

Document Type : Original Article

Authors

1 Ph.D. Student of Linguistics, Department of Linguistics, Faculty of Humanities, Tarbiat Modares University, Tehran, Iran.

2 Associate Professor of Linguistics, Department of Linguistics, Faculty of Humanities, Tarbiat Modares University, Tehran, Iran.

Abstract

Dialectometry is a computational, quantitative and statistical approach in which linguistic differences within a selected geographical area are examined using specific methods and techniques. In the present study, the linguistic distances between the varieties common in the area from the south of Hamadan province to the north of Khuzestan province, and their regional distribution, are studied using a novel dialectometric approach. These language varieties are mostly Laki and Lori. The study combines library research and fieldwork. The distances between the equivalents of 100 words in 80 locations are measured with the Levenshtein distance as implemented in the RuG/L04 software. After the linguistic distances are analyzed, the outputs are presented as interpretable maps, diagrams and statistical analyses. The main results of this study are: 1. the clustering of the language varieties and 2. the manner of their linguistic-geographical distribution along a linguistic continuum. The study of phonetic-lexical differences in the collected data also confirms the continuous nature of these varieties. Thus, the Laki and Lori Lorestani varieties are located at one end of this continuum and the Lori Bakhtiari varieties at the other.
Introduction
Language has a continuous nature, which makes it difficult, if not unfeasible, to establish clear-cut boundaries between language varieties. The traditional, qualitative approaches of dialectology often relied on subjective methods and are poorly suited to this continuity, which is why modern dialectometric methods should be employed.
     Quantitative methods in dialectology differ from traditional ones. Traditional approaches rely on native speakers' assumptions and on selected linguistic features to determine the distribution and spread of language varieties in a particular geographic region. Their disadvantages include the lack of consistent overlap between isoglosses and the role of personal preference and opinion in choosing bundles of isoglosses. Quantitative approaches, by contrast, offer digital classification of data, automatic measurement of distances and frequencies, digital mapping of outputs and statistical analysis of linguistic data, so that a large amount of linguistic data can be analyzed independently of the individual researcher's preferences.
     Consequently, this study involves the aggregate analysis of a vast amount of linguistic data through quantitative methods, specifically linguistic distance measurement.
     Outside Iran, the application of quantitative approaches in dialectology, including various dialectometric techniques and the measurement of linguistic distances, can be traced back to the works of Seguy (1973) and Goebl (1982). Over the past ten years, dialectometry has become a focal point for numerous researchers specializing in Iranian languages and dialects. These studies have been carried out on the language varieties common in regions such as East Azarbaijan, West Azarbaijan, Hamadan, Mazandaran, Gorgan, Yazd, Ilam, Chardaval and Talesh.
     The preservation of a society's identity and cultural heritage is closely intertwined with the study of its languages and dialects, so researchers and linguists must prioritize methodical studies in this field. Historically, studies on Iranian language varieties have been conducted in isolation, each focusing on a single linguistic aspect such as phonology, morphology or syntax. Numerous studies of this kind, employing both traditional and scientific methodologies, have explored different facets of the language varieties (Lori and Laki) common in the examined geographical region. These studies are descriptive and qualitative in nature and lack any comparative or quantitative component.
     Furthermore, few studies so far have compared different linguistic varieties comprehensively with the aim of establishing systematic relationships among them. Those that have done so focused on creating a linguistic atlas and developing a rigorous, well-organized classification of the varieties. Hence, a dialectometric investigation is needed that applies contemporary analytical-computational techniques to the language varieties prevalent in Lorestan province and its adjacent provinces, Hamadan and Khuzestan. This geographical scope was selected because both Lori and Laki are widely used within these territories. It is therefore valuable to determine the linguistic-geographical distribution of these varieties across the region without being restricted by geographical boundaries.
Methodology
This study is a synchronic descriptive-analytical investigation conducted with the RuG/L04 dialectometry and cartography software package. Initially, 100 lexical entries were gathered from 80 different locations. The research database was drawn from three national dialectology projects in Lorestan and Khuzestan provinces, as well as from the researchers' own fieldwork. The participants were both men and women aged 20 to 70, with an average educational attainment of a high school diploma. After the 8,000 lexical forms (100 entries at each of 80 locations) were transcribed, the geographical coordinates of each location were determined using Google Earth.
     Subsequently, to calculate the linguistic distance index, the Levenshtein distance algorithm was applied to the data as one of the aggregate analysis approaches. The resulting distances, collected in an 80 × 80 matrix, constitute a quantitative index within the range of natural numbers. In the subsequent phase of the study, various subprograms were employed to classify the language varieties. These classifications were then presented in the form of diagrams, tables, and maps.
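     The distance computation described above can be illustrated with a minimal Python sketch. It is not the RuG/L04 implementation: the function names (`levenshtein`, `site_distance`, `distance_matrix`), the choice to average over words, and the three toy sites are all assumptions made for illustration, and the transcriptions are invented placeholder strings rather than data from the study.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-symbol insertions, deletions and
    substitutions needed to turn string a into string b (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def site_distance(forms_a: dict, forms_b: dict) -> float:
    """Aggregate distance between two sites: the mean Levenshtein distance
    over the lexical entries recorded at both sites (an assumed aggregation)."""
    shared = [entry for entry in forms_a if entry in forms_b]
    if not shared:
        raise ValueError("the two sites share no lexical entries")
    total = sum(levenshtein(forms_a[e], forms_b[e]) for e in shared)
    return total / len(shared)


def distance_matrix(data: dict):
    """Return the sorted site names and a symmetric site-by-site matrix
    (nested dicts) with zeros on the diagonal."""
    sites = sorted(data)
    matrix = {s: {t: 0.0 for t in sites} for s in sites}
    for s, t in combinations(sites, 2):
        matrix[s][t] = matrix[t][s] = site_distance(data[s], data[t])
    return sites, matrix


# Invented placeholder transcriptions for three hypothetical sites.
toy_data = {
    "site_A": {"water": "aw", "bread": "nan", "hand": "das"},
    "site_B": {"water": "ow", "bread": "nun", "hand": "das"},
    "site_C": {"water": "ab", "bread": "nan", "hand": "dast"},
}

sites, matrix = distance_matrix(toy_data)
for s in sites:
    print(s, [round(matrix[s][t], 2) for t in sites])
```

     With 80 locations the same procedure would yield an 80 × 80 symmetric matrix. Averaging over words is only one possible aggregation, and it produces fractional values; the normalization actually applied by RuG/L04 is not specified in the text.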
Results
Analysis of the results showed that the linguistic varieties under investigation form a linguistic continuum without distinct boundaries, in contrast to the picture given by traditional dialectological approaches. This continuum runs from the Laki and Lori varieties of Lorestan in the southern region of Hamadan province to the Bakhtiari varieties in the northern part of Khuzestan province. The same continuity is evident in the phonetic differences and alternations observed among the three primary language varieties, namely Laki, Lori Bakhtiari, and Lori Lorestani.
Conclusion
The greater linguistic distance between the Laki and Lori varieties (equal to 2.5) confirms that these varieties belong to different language families. Likewise, because the two Lori varieties belong to a common language family (Southwestern Iranian), the linguistic distance between them is smaller (equal to 1.5) and their similarity greater. The Pearson correlation coefficient obtained for the linguistic varieties under study is r = 0.88, indicating a strong and statistically significant correlation. This validates the findings of the research and highlights the effectiveness of the Levenshtein-distance dialectometric approach in identifying the primary linguistic clusters of the examined varieties and in confirming the continuous nature of language.
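     For readers unfamiliar with the statistic, the following Python sketch shows how a Pearson coefficient can be computed over two sets of pairwise distances. The text does not state which two series were correlated to obtain r = 0.88, so pairing linguistic distances with a second distance measure (for instance geographic distance) is an assumption here, as are the helper names `upper_triangle` and `pearson_r` and the toy numbers.

```python
import math


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix (list of lists),
    so that each pair of sites contributes exactly one value."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Invented toy matrices for three hypothetical sites (not study data).
linguistic = [[0.0, 1.0, 2.5],
              [1.0, 0.0, 1.5],
              [2.5, 1.5, 0.0]]
second_measure = [[0.0, 40.0, 120.0],
                  [40.0, 0.0, 90.0],
                  [120.0, 90.0, 0.0]]

r = pearson_r(upper_triangle(linguistic), upper_triangle(second_measure))
print(round(r, 3))
```

     Because the entries of a distance matrix are not independent observations, significance for such a correlation is usually assessed with a permutation-based procedure such as the Mantel test rather than with the standard test for r.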


Amanolahi Baharvand, S. (2014). Qome Lor. Tehran: Agah. (In Persian)
Anonby, E. J. (2003). Update on Luri: How many languages? Journal of the Royal Asiatic Society, 13(2), 171-197. https://www.jstor.org/stable/25188361
Anonby, E. J. (2004-2005). Kurdish or Luri? Laki’s disputed identity in the Luristan Province of Iran. Kurdische Studien, 4-5, 7-22.
Arlotto, A. (2005). Introduction to historical linguistics. Tehran: Institute of Humanities and Cultural Studies. (In Persian)
Asadpour, H. (2012). The computer developed linguistic Atlas of Azerbaijan-e Qarbi: Notes on typological-perceptual approach in geolinguistics. M.A. Thesis in linguistics, Islamic Azad University. (In Persian)
Asatrian, G. S. (2009). Prolegomena to the study of the Kurds. Journal of Iran and Caucasus, 13(1), 1-59. http://dx.doi.org/10.1163/160984909X12476379007846
Blau, J. (2001). Personal correspondence.
Chambers, J. K., & Trudgill, P. (2004). Dialectology. Cambridge: Cambridge University Press.
Crystal, D. (2008). A dictionary of linguistics and phonetics. Oxford: Blackwell Publishing.
Deris Zayeri, M. (2015). International project of Iran dialectology no. 13 (20 villages in north of Lorestan province). M.A. Thesis in linguistics, Islamic Azad University. (In Persian)
Dinarvand, Z. (2013). International project of Iran dialectology no. 14 (20 villages in south of Lorestan province). M.A. Thesis in linguistics, Islamic Azad University. (In Persian)
Fattah, I. K. (2000). Les dialectes Kurdes Meridionaux: Etude Linguistique Dialectologique. Acta Iranica, 37, xxii-919.
Geravand, R., Karimi Doostan, Gh. H., Gholami, V., & Varzande, O. (2021). The study of phonology of Laki Kuhdasht. Journal of Iranian Dialects, 5(2), 295-314. (In Persian)
Ghesmatpour, B., Gholifamian, A. R., & Molaye Pashae, S. (2020). The computational dialectometry of phonetic differences of Talysh dialect in Gilan province. Journal of Farsi Language and Iranian Dialects, 5(1 (9)), 213-233. (In Persian) https://doi.org/10.22124/plid.2020.14995.1420
Goebl, H. (1982). Dialektometrie; Prinzipien und Methoden des Einsatzes der numerischen Taxonomie im Bereich der Dialektgeographie. Wien: Verlag der Öst.
Gundogdu, S. (2016). Remarks on vowel and consonants in Kurmanji. Journal of Social Sciences of Mus Alparslan University, 4(1), 57-69. http://dx.doi.org/10.18506/anemon.16606
Heeringa, W. J. (2004). Measuring dialect pronunciation using Levenshtein distance. Ph.D. Dissertation in linguistics, University of Groningen.
Heidarizad, M., & Modaresi Qavami, G. (2019). Dialect Atlas of Chardaval: Studying phonetic differences and measuring dialectal distances. Journal of Comparative Linguistics, 9(18), 45-64. (In Persian) https://doi.org/10.22084/rjhll.2018.17009.1854
Izadpanah, H. (1964). Farhange Lori. Tehran: Asatir. (In Persian)
Izadpanah, H. (1978). Farhange Laki. Tehran: Asatir. (In Persian)
Kessler, B. (1995). Computational dialectology in Irish Gaelic. In Proceedings of the 7th Conference of the European Chapter of the Association for Computational Linguistics, 60-67.
Kord Zafaranlu Kambuzia, A. (2006). Phonology: Rule-Based approaches. Tehran: Samt. (In Persian)
Lazard, G. (1992). Le dialecte Laki d'Aleshtar (Kurde Méridional). Journal of Studia Iranica, 21, 215-245.
Lorimer, D. L. R. (1922). The Phonology of the Bakhtiari, Badakhshani and Madaglashti dialects of Modern Persian. London: Royal Asiatic Society.
MacKenzie, D. N. (1961). The origins of Kurdish. Transactions of the Philological Society, 60(1), 68-86. https://doi.org/10.1111/j.1467-968X.1961.tb00987.x
MacKenzie, D. N. (2009). A Concise Pahlavi dictionary (M. Mirfakhrai, Trans.). Tehran: Institute of Humanities and Cultural Studies. (In Persian)
MacKinnon, C. (2011). Lori language, Lori dialect. Encyclopaedia Iranica.
Mann, O. (1904). Kurze Skizze der Lurdialekte. Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften, Berlin, 1173-93.
Minorsky, V. (1986a). Lak. The Encyclopaedia of Islam, 5, 616-7.
Molaye Pashae, S. (2014). Computational dialectometry of Northern slopes of the Central Alborz via Levenshtein algorithm: A linguistic Atlas. Ph.D. Dissertation in linguistics, Payame Noor University. (In Persian)
Najafian, A., Musavi, T., Roshan, B., & Molaye Pashae, S. (2016). Dialectometric recognition of Mazandarani language varieties located off the Gorgan Gulf through central Mazandaran. Journal of Language Related Research, 7(6 (34)), 445-469. (In Persian)
Nerbonne, J., & Heeringa, W. (1997). Measuring dialect distance phonetically. In Computational Phonology: Third Meeting of the ACL Special Interest Group in Computational Phonology, 11-18.
Rostambeik Tafreshi, A. (2015). Dialectal Atlas and measuring of dialectal distances in Yazd province. Journal of Comparative Linguistic Research, 5(10), 57-74. (In Persian)
Rostambeik Tafreshi, A. (2016). Dialectal Atlas and measuring of dialectal distances in Hamedan province. Journal of Language Related Research, 7(1 (29)), 59-80. (In Persian)
Saeidi, M. (2012). International project of Iran dialectology no. 11 (20 villages in north of Khuzestan province). M.A. Thesis in linguistics, Islamic Azad University. (In Persian)
Safinejad, J. (2002). Methods of tax collection in Lor nomadic areas of Iran. Journal of Anthropology Letter, 1(1), 17-31. (In Persian)
Sanaei, Y. (2016). Linguistic varieties of north Ilam province: A linguistic Atlas. M.A. Thesis in linguistics, Payame Noor University. (In Persian)
Seguy, J. (1973). La dialectométrie dans l'Atlas linguistique de la Gascogne. Revue de Linguistique Romane, 37, 1-24.
Serva, M., & Petroni, F. (2008). Indo-European languages tree by Levenshtein distance. Europhysics Letters, 81(6). https://doi.org/10.1209/0295-5075/81/68005
Tedesco, P. (1921). Dialektologie der mitteliranischen Turfantexte. Monde Oriental, 15, 184-258.
Valls, E., Nerbonne, J., Prokić, J., & Wieling, M. (2012). Applying the Levenshtein distance to Catalan dialects: A brief comparison of two dialectometric approaches. Journal of Universidad de Santiago de Compostela, 39, 35-61.
Van der Ark, R., Mennecier, P., Nerbonne, J., & Manni, F. (2007). Preliminary identification of language groups and loan words in Central Asia. In P. Osenova, E. Hinrichs & J. Nerbonne (Eds.), Proceedings of the RANLP workshop on computational phonology at the conference recent advances in natural language processing, 13-20.
Windfuhr, G. (1990). New West Iranian languages. In R. Schmitt (Ed.), Compendium Linguarum Iranicarum, (p. 248). Wiesbaden, Verlag.
Windfuhr, G. (2009). The Iranian languages. London and New York: Routledge.