Please use this identifier to cite or link to this item:
http://hdl.handle.net/20.500.11889/6743
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Addabe', Yara | en_US |
dc.contributor.author | Abu Hammad, Yara | en_US |
dc.contributor.author | Ayyad, Nataly | en_US |
dc.contributor.author | Yahya, Adnan | en_US |
dc.date.accessioned | 2021-04-21T05:41:38Z | - |
dc.date.available | 2021-04-21T05:41:38Z | - |
dc.date.issued | 2021-02-11 | - |
dc.identifier.citation | Yara Addabe', Yara Abu Hammad, Nataly Ayyad and Adnan Yahya. A Dataset for Authorship Analysis of Short Modern Arabic Text. Graduation Project. Department of Electrical and Computer Engineering. Birzeit University. 2021 | en_US |
dc.identifier.uri | http://hdl.handle.net/20.500.11889/6743 | - |
dc.description.abstract | The collection has 71391 Arabic tweets written by 44 Arab authors from 13 Arab countries, plus tweets of Pope Francis in MSA (Modern Standard Arabic). Tweepy was used for tweet scraping. Tweet topics: politics, journalism, religion, and Arabic literature. The tweets were preprocessed: retweets and replies were removed; any tweet containing the author's name was removed; emojis, hyperlinks, and English hashtags were filtered out; numbers were normalized; and English and other non-Arabic characters were filtered out. Then stop words were removed, and tokens were lemmatized and normalized. The two files contain the training and testing data used in our project. | en_US |
dc.description.sponsorship | Birzeit University | en_US |
dc.language.iso | ar | en_US |
dc.publisher | BZU-ECE Department | en_US |
dc.subject | Tweets, Arabic | en_US |
dc.subject | Authorship attribution | en_US |
dc.subject | Authorship - Data processing | en_US |
dc.subject | Computational linguistics | en_US |
dc.subject | Authorship analysis | en_US |
dc.subject | Information storage and retrieval systems | en_US |
dc.subject | Tweets - Authorship analysis | en_US |
dc.title | A Dataset for Authorship Analysis of Short Modern Arabic Text | en_US |
dc.title.alternative | Authorship Attribution of Arabic Tweets Dataset | en_US |
dc.type | Dataset | en_US |
dcterms.available | September 2020 | en_US |
dcterms.creator | Yara Addabe', Yara Abu Hammad, Nataly Ayyad and Adnan Yahya | en_US |
dcterms.format | Excel | en_US |
dcterms.instructionalMethod | Automated collection from Twitter | en_US |
dcterms.source | | en_US |
newfileds.department | Engineering and Technology | en_US |
newfileds.item-access-type | archive | en_US |
newfileds.thesis-prog | none | en_US |
newfileds.general-subject | Computers and Information Technology / الحاسوب وتكنولوجيا المعلومات | en_US |
item.grantfulltext | open | - |
item.languageiso639-1 | other | - |
item.fulltext | With Fulltext | - |
Appears in Collections: | 6. BZU Dataset Collection |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
AuthorAttributionTweetsTrainingDataYara_2_Nataly.xlsx | Training Dataset Tweets | 4.14 MB | Microsoft Excel XML |
AuthorAttributionTweetsTestDataYara_2_Nataly.xlsx | Testing Dataset Tweets | 1.06 MB | Microsoft Excel XML |
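
The files listed above hold tweets that have already been through the preprocessing described in the abstract. For readers who want to apply comparable cleaning to new tweets, the following is a minimal sketch of the filtering steps only, not the authors' actual code. Everything in it is an assumption made for illustration: the function names, the regular expressions, the retweet/reply heuristic based on text prefixes, and the choice to normalize numbers by replacing them with a placeholder token (the record does not say how numbers were normalized). Stop-word removal and lemmatization are omitted because they depend on external resources the record does not name.

```python
import re

# Hypothetical patterns; the record does not specify the exact rules used.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
LATIN_HASHTAG_RE = re.compile(r"#[A-Za-z0-9_]+")       # English hashtags
NUMBER_RE = re.compile(r"[0-9\u0660-\u0669]+")         # Western and Arabic-Indic digits
NON_ARABIC_RE = re.compile(r"[^\u0621-\u064A\s]")      # keep Arabic letters and whitespace
NUM_TOKEN = "رقم"  # assumed placeholder ("number"); the real normalization is unknown


def is_retweet_or_reply(text):
    """Heuristic: retweets start with 'RT @', replies start with '@'."""
    return text.startswith("RT @") or text.startswith("@")


def clean_tweet(text):
    """Strip links, English hashtags, emojis, and non-Arabic characters."""
    text = URL_RE.sub(" ", text)
    text = LATIN_HASHTAG_RE.sub(" ", text)
    text = NUMBER_RE.sub(f" {NUM_TOKEN} ", text)   # normalize numbers first
    text = NON_ARABIC_RE.sub(" ", text)            # also drops emojis and '#'
    return " ".join(text.split())                  # collapse whitespace


def preprocess(tweets, author_name):
    """Drop retweets, replies, and tweets naming the author; clean the rest."""
    kept = []
    for tweet in tweets:
        if is_retweet_or_reply(tweet) or author_name in tweet:
            continue
        cleaned = clean_tweet(tweet)
        if cleaned:
            kept.append(cleaned)
    return kept
```

Calling `preprocess(list_of_tweet_texts, author_name)` returns the cleaned tweets that survive the filters. Replacing numbers before stripping non-Arabic characters keeps the placeholder token in the output; reversing the order would silently delete all digits instead.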
Page view(s): 538 (checked on Apr 14, 2024)
Download(s): 9,942 (checked on Apr 14, 2024)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.