Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11889/4443
DC Field | Value | Language
dc.contributor.author | Yahya, Adnan |
dc.contributor.author | Salhi, Ali |
dc.date.accessioned | 2017-03-09T07:13:15Z |
dc.date.available | 2017-03-09T07:13:15Z |
dc.date.issued | 2014-02-01 |
dc.identifier.citation | Adnan Yahya and Ali Salhi. 2014. Arabic Text Categorization Based on Arabic Wikipedia. ACM Transactions on Asian Language Information Processing 13, 1, Article 4 (February 2014), 20 pages. DOI: http://dx.doi.org/10.1145/2537129 | en_US
dc.identifier.uri | http://hdl.handle.net/20.500.11889/4443 |
dc.description | ACM Transactions on Asian Language Information Processing. Vol. 13, No. 1, Article 4. February 2014 |
dc.description.abstract | This paper describes an algorithm for categorizing Arabic text, relying on highly categorized corpus-based data sets, obtained from the Arabic Wikipedia by using manual and automated processes to build and customize categories. The categorization algorithm was built by adopting a simple categorization idea, then moving forward to a more complex one. We applied tests and filtration criteria to end with the best and most efficient results that our algorithm can achieve. The categorization depends on the statistical relation between the input text and the reference (training) data supported by well-defined Wikipedia-based categories. Our algorithm supports two levels for categorizing Arabic text; categories are grouped into a hierarchy of main categories and subcategories. This introduces a challenge due to the correlation between certain subcategories and overlap between main categories. We argue that our algorithm achieved good performance compared to other methods reported in the literature. | en_US
dc.language.iso | en_US | en_US
dc.publisher | ACM: Association for Computing Machinery | en_US
dc.relation.ispartofseries | doi>10.1145/2540989; |
dc.subject | Natural language processing (Computer science) | en_US
dc.subject | Computer network resources - Arab countries | en_US
dc.subject | Computational linguistics | en_US
dc.subject | Linguistics - Databases | en_US
dc.subject | Text processing (Computer science) | en_US
dc.subject.lcsh | Wikipedia |
dc.title | Arabic text categorization based on Arabic Wikipedia | en_US
dc.type | Article | en_US
newfileds.department | Engineering and Technology | en_US
newfileds.custom-issue-date | Vol. 13, No. 1, Article 4. February 2014 | en_US
newfileds.conference | ACM Transactions on Asian Language Information Processing. Vol. 13, No. 1, Article 4. February 2014 | en_US
newfileds.item-access-type | bzu | en_US
newfileds.thesis-prog | none | en_US
newfileds.general-subject | Computers and Information Technology | الحاسوب وتكنولوجيا المعلومات | en_US
item.languageiso639-1 | other | -
item.fulltext | With Fulltext | -
item.grantfulltext | open | -
Appears in Collections: Fulltext Publications
Files in This Item:
File | Description | Size | Format
TALIP_PreFinalCopy.pdf | | 651.3 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.