Persian Author Identification based on Systemic Functional Grammar

Document Type: Original Article

Authors

1 Department of Linguistics, Faculty of Persian Literature and Foreign Languages, Allameh Tabataba'i University, Tehran, Iran.

2 Department of Computer, Faculty of Statistics, Mathematics and Computer, Allameh Tabataba’i University, Tehran, Iran.

3 Azad Eslami University

Abstract

Automated author identification is an important field in forensic linguistics. This study compared the effectiveness of systemic functional grammar features with that of function words in Persian authorship attribution. First, a corpus of documents written by seven contemporary Iranian authors was collected. Second, a list of function words was extracted from the corpus. In addition, the conjunction, modality and comment adjunct system networks were used to build a lexicon from linguistic resources. Then, the relative frequencies of the function words and of the systemic functional features were calculated for each document. A multilayer perceptron classifier, a type of neural network, was used in the learning phase and achieved a desirable accuracy in the evaluation phase. The results showed that function words alone outperform systemic functional features alone in Persian author identification; however, using the two feature sets together is more effective than using either alone.
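The feature extraction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenization, and the romanized placeholder words ("va", "az", "ke", "amma", "shayad") are assumptions made for illustration, and real systemic functional lexicon entries may span several words, which this sketch does not handle. Each document is mapped to a vector of relative frequencies, one value per lexicon entry, and the two feature sets are concatenated to represent their simultaneous use.

```python
from collections import Counter


def relative_frequencies(tokens, lexicon):
    """Return, for each lexicon entry, its count divided by the document length."""
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        # An empty document has no measurable frequencies.
        return [0.0 for _ in lexicon]
    return [counts[word] / total for word in lexicon]


def feature_vector(tokens, function_words, sfg_lexicon):
    """Concatenate function-word and systemic functional relative frequencies."""
    return (relative_frequencies(tokens, function_words)
            + relative_frequencies(tokens, sfg_lexicon))


# Hypothetical romanized placeholders, not the lists used in the study.
function_words = ["va", "az", "ke"]   # e.g. "and", "from", "that"
sfg_lexicon = ["amma", "shayad"]      # e.g. a conjunction and a modality marker

# A toy seven-token document, tokenized on whitespace (an assumption).
tokens = "u shayad az khane amad va raft".split()
vector = feature_vector(tokens, function_words, sfg_lexicon)
```

Vectors of this kind, one per document, could then be passed to any multilayer perceptron implementation for training and evaluation; the choice of library and network settings is left open here because the abstract does not specify them.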

Keywords