Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11889/6019
Title: Authorship attribution of Arabic articles
Authors: Hajja, Maha; Yahya, Ahmad; Yahya, Adnan
Keywords: Arabic authorship attribution;Arabic plagiarism detection;Writing style recognition;Arabic special features;Arabic text author identification
Issue Date: 17-Oct-2019
Publisher: Springer
Source: M. Hajja, A. Yahya and A. Yahya. "Author Attribution of Arabic Articles". 7th International Conference on Arabic Language Processing ICALP2019. Nancy, France, 16-17 October 2019.
Abstract: With the huge size and large diversity of web content and the appearance of more social media platforms and blog websites, more people are contributing content of varying quality. Many users prefer to remain anonymous when posting material to the web, so more texts (articles, blogs, essays and emails) are published under assumed identities or with no known author. This may raise copyright and other legal issues, hence the need for good authorship attribution systems. The problem may be more acute for Arabic texts due to restrictions, actual and perceived, on electronic content publication and the prevailing social norms. In this paper we study Arabic author attribution (AAA): identifying the author of an Arabic (MSA) article from among a given set of potential authors. Many features were considered for training and testing our AAA models. We studied the effects of features such as part-of-speech (PoS) tags, stylistic features such as punctuation mark usage and sentence characteristics, word types, and word diversity. In general, PoS features, word n-gram features and rare words proved to be the most informative for our task. We also investigated the effect of factors such as the number of potential authors, the number of articles per author, and the size of the text chunks used, and we report the results.
Description: A paper accepted to the 7th International Conference on Arabic Language Processing ICALP2019. Nancy, France, 16-17 October 2019.
URI: http://hdl.handle.net/20.500.11889/6019
Appears in Collections:Fulltext Publications (BZU Community)
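
Illustrative sketch (not from the paper): the abstract names punctuation usage, sentence characteristics, word diversity, rare words and word n-grams as features. The Python below shows one way such features might be computed for a single text chunk. The function names, the punctuation set, and the use of a hapax (once-only word) ratio as a stand-in for "rare words" are assumptions made for this example; PoS features are omitted because they need an external Arabic tagger.

```python
import re
from collections import Counter

# Arabic and Latin punctuation marks treated as style signals (assumed set).
PUNCTUATION = "،؛؟.!:,;?"


def tokenize(text):
    # \w+ is Unicode-aware in Python 3, so it matches Arabic letters.
    # Assumes undiacritized text: combining diacritics would split words.
    return re.findall(r"\w+", text)


def stylometric_features(text):
    """Return a few simple style features for one text chunk."""
    # Sentences end at ., ! or a Latin/Arabic question mark.
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]
    words = tokenize(text)
    counts = Counter(words)
    n_words = len(words) or 1  # guard against empty input
    features = {
        # Sentence characteristic: mean words per sentence.
        "avg_sentence_len": len(words) / (len(sentences) or 1),
        # Word diversity: distinct words over total words.
        "type_token_ratio": len(counts) / n_words,
        # Rare-word proxy: share of words that occur exactly once.
        "hapax_ratio": sum(1 for c in counts.values() if c == 1) / n_words,
    }
    # Punctuation usage: occurrences of each mark per word.
    for mark in PUNCTUATION:
        features["punct_" + mark] = text.count(mark) / n_words
    return features


def word_ngrams(text, n=2):
    """Count word n-grams (bigrams by default) in a text chunk."""
    words = tokenize(text)
    return Counter(zip(*(words[i:] for i in range(n))))
```

Feature dictionaries like these, one per chunk, could then be fed to any standard classifier; which classifier the authors used is not stated in this record.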

Files in This Item:
File: Paper14AAAMahaAhmadAdnanYahya.pdf
Description: Early Draft
Size: 505.38 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.