Document similarity for Arabic and cross-lingual Web content

Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11889/5350

Title:	Document similarity for Arabic and cross-lingual Web content
Authors:	Salhi, Ali; Yahya, Adnan
Keywords:	Cross-language information retrieval;Explicit Semantic Association;Similarity (Language learning);Information retrieval - Arab countries
Issue Date:	11-Nov-2017
Publisher:	Springer Verlag
Series/Report no.:	Communications in Computer and Information Science #782
Abstract:	Document similarity is fundamental to Information Retrieval. Cross-Lingual (CL) similarity is important for many data processing tasks, such as CL plagiarism detection and retrieval and document quality assessment. We study CL similarity based on Explicit Semantic Association (ESA) adapted to a cross-lingual setting, with a focus on Arabic. We compare how well CL similarity testing performs when one of the languages is Arabic against its monolingual counterpart, for various text chunk sizes. We describe the infrastructure used, report some of the test results, examine possible sources of the weaknesses encountered, and point to possible directions for improvement.
URI:	http://hdl.handle.net/20.500.11889/5350
ISSN:	1865-0937
Appears in Collections:	Fulltext Publications

File	Description	Size	Format
10.1007_978-3-319-73500-9_10.pdf		241.61 kB	Adobe PDF
