Clustering Arabic tweets for sentiment analysis

Abuaiadah, Diab; Rajendran, Dileep; Jarrar, Mustafa

Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11889/5231

DC Field	Value	Language
dc.contributor.author	Abuaiadah, Diab
dc.contributor.author	Rajendran, Dileep
dc.contributor.author	Jarrar, Mustafa
dc.date.accessioned	2017-11-16T09:24:05Z
dc.date.available	2017-11-16T09:24:05Z
dc.date.issued	2017-11-02
dc.identifier.uri	http://hdl.handle.net/20.500.11889/5231
dc.description.abstract	This study evaluates the impact of linguistic preprocessing and similarity functions on clustering Arabic tweets. The experiments apply an optimized version of the standard K-Means algorithm to assign tweets to positive and negative categories. The results show that root-based stemming has a significant advantage over light stemming in all settings. The Averaged Kullback-Leibler Divergence similarity function clearly outperforms the Cosine, Pearson Correlation, Jaccard Coefficient and Euclidean functions. The combination of the Averaged Kullback-Leibler Divergence and root-based stemming achieved the highest purity of 0.764, while the second-best purity was 0.719. These results are important because they run contrary to findings for normal-sized documents, where, in many information retrieval applications, light stemming performs better than root-based stemming and the Cosine function is commonly used.	en_US
dc.language.iso	en_US	en_US
dc.publisher	IEEE	en_US
dc.subject	Microblogs	en_US
dc.subject	Language and emotions - Arab countries	en_US
dc.subject	Similarity (Language learning)	en_US
dc.subject.lcsh	Online social networks
dc.subject.lcsh	Arabic language - Terms and phrases
dc.subject.lcsh	Cluster analysis
dc.subject.lcsh	Ontologies (Information retrieval)
dc.title	Clustering Arabic tweets for sentiment analysis	en_US
dc.type	Conference Proceedings	en_US
newfileds.department	Engineering and Technology	en_US
newfileds.conference	IEEE/ACS 14th International Conference on Computer Systems and Applications	en_US
newfileds.item-access-type	open_access	en_US
newfileds.thesis-prog	none	en_US
newfileds.general-subject	Computers and Information Technology | الحاسوب وتكنولوجيا المعلومات	en_US
item.grantfulltext	open	-
item.fulltext	With Fulltext	-
item.languageiso639-1	other	-
Appears in Collections:	Fulltext Publications
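
The abstract names two quantities without defining them: the Averaged Kullback-Leibler Divergence used to compare tweets, and the purity score used to evaluate the clusters. The sketch below is illustrative only and is not taken from the paper. It assumes one formulation of the averaged divergence found in the text-clustering literature (each term's two weights are compared against their own weighted average), and the standard definition of purity (the fraction of items that fall in the majority class of their cluster). The function names `averaged_kl` and `purity` and the toy inputs are hypothetical.

```python
import math
from collections import Counter


def averaged_kl(p, q):
    """Averaged KL divergence between two term-weight vectors of equal length.

    Assumed formulation (not confirmed by the abstract): for each term,
    pi1 = a / (a + b), pi2 = b / (a + b), m = pi1 * a + pi2 * b, and the
    term contributes pi1 * a * log(a / m) + pi2 * b * log(b / m).
    """
    total = 0.0
    for a, b in zip(p, q):
        if a + b == 0:
            continue  # term absent from both vectors contributes nothing
        pi1 = a / (a + b)
        pi2 = b / (a + b)
        m = pi1 * a + pi2 * b
        if a > 0:
            total += pi1 * a * math.log(a / m)
        if b > 0:
            total += pi2 * b * math.log(b / m)
    return total


def purity(cluster_ids, true_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    by_cluster = {}
    for c, t in zip(cluster_ids, true_labels):
        by_cluster.setdefault(c, Counter())[t] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(true_labels)


# Toy example: two clusters, six tweets with known sentiment labels.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["pos", "pos", "neg", "neg", "neg", "pos"]
score = purity(clusters, labels)  # 4 of 6 tweets match their cluster majority
```

Unlike plain KL divergence, this averaged form is symmetric in its two arguments and stays finite when a term is missing from one of the vectors, which is one reason it suits short, sparse texts such as tweets.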

Files in This Item:

File	Description	Size	Format
ARJ17.pdf		787.79 kB	Adobe PDF


