Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11889/4394
DC Field | Value | Language
dc.contributor.author | Salhi, Ali |
dc.contributor.author | Yahya, Adnan |
dc.date.accessioned | 2017-03-06T10:37:22Z |
dc.date.available | 2017-03-06T10:37:22Z |
dc.date.issued | 2011-04-25 |
dc.identifier.uri | http://hdl.handle.net/20.500.11889/4394 |
dc.description.abstract | The Arabic web content is growing rapidly, and the need for its efficient management is gaining importance; the morphological complexity of Arabic raises many challenges in this regard. This paper reports on some of our work aimed at designing text mining and query pre-processing tools that are able to efficiently process and search large quantities of Arabic web data. In our research we try to address the challenges Arabic poses for natural language processing (NLP) and information retrieval, including root extraction, language detection, and Arabic query correction, suggestion and expansion. While not reported in detail here, we are also developing tools for automatic Arabic document categorization. Throughout, we employ a statistical, corpus-based approach based on data obtained from a variety of sources. Based on corpus statistics, we constructed databases of words and their frequencies as single, double and triple expressions and used them as the infrastructure for well-structured search aid tools that are able to handle the sophisticated nature of Arabic and are capable of being integrated into existing web search engines and document processing systems. We also utilize context analysis and spellchecking of user queries to enable a more complete and efficient search. While the results reported here are promising, they must be viewed as work in progress, still in need of testing, refining, integration and deployment in real-life settings. | en_US
dc.language.iso | en_US | en_US
dc.subject | Natural language processing (Computer science) | en_US
dc.subject | Information retrieval - Arabic countries | en_US
dc.subject | Information storage and retrieval systems | en_US
dc.subject | Computational linguistics | en_US
dc.subject | Arabic language - Roots | en_US
dc.title | Enhancement tools for Arabic web search : a statistical approach | en_US
dc.type | Article | en_US
newfileds.department | Engineering and Technology | en_US
newfileds.conference | Innovations in IT (7th : 2011 : Abu Dhabi, UAE) | en_US
newfileds.item-access-type | bzu | en_US
newfileds.thesis-prog | none | en_US
newfileds.general-subject | Biotechnology and Genetic Engineering | التكنولوجيا الحيوية والهندسة الوراثية | en_US
item.languageiso639-1 | other | -
item.fulltext | With Fulltext | -
item.grantfulltext | open | -
Appears in Collections: Fulltext Publications
Files in This Item:
File | Description | Size | Format
UAEInnovationsConferencePaperJanuary2011.pdf | | 332.03 kB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.