|Title:||Models for Arabic document quality assessment|
|Authors:||Yahya, Adnan|
|Keywords:||Information services - Quality control;Archives - Arab countries;Information retrieval - Arab countries;Wikipedia;Archives - Processing;Machine learning;Information retrieval - Reliability|
|Issue Date:||27-Aug-2020|
|Publisher:||Springer|
|Source:||Adnan Yahya and Afnan Ahmad and Alaa Assaf and Rawan Khater and Ali Salhi. Models for Arabic Document Quality Assessment. 3rd Workshop on Quality of Open Data (QOD 2020). June 8-10, 2020. Colorado Springs, USA.|
|Conference:||3rd Workshop on Quality of Open Data (QOD 2020), CO, USA|
|Abstract:||Digital content has been increasing rapidly. This content can be generated, accessed and used by anyone and thus the need for quality assessment of web content before usage becomes an important issue. Devising methods to assess quality of Arabic digital content is the focus of this paper. Our work was partially based on Wikipedia articles annotated into featured and good according to quality guidelines of the Wikipedia. Our analysis was directed at finding features that can serve as best quality indicators. Using the defined features we trained a high accuracy quality assessment model using machine-learning algorithms. Our work went beyond the Wikipedia documents to build a general model that can assess the quality of Arabic documents that lack Wikipedia metadata with acceptable accuracy. The model was trained and built using features from documents we collected from Arabic online news sites and blogs, and annotated in collaboration with university students.|
|URI:||http://hdl.handle.net/20.500.11889/6389|
|Appears in Collections:||Fulltext Publications|
Files in This Item:
|ArabicDocumentQualityAssessmentPaperAdnanYahyaQOD2020SemiFinal.pdf||Paper Text, Prefinal||956.93 kB||Adobe PDF|
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.