Title: Efficient spam filtering based on informative features extracted from the header fields and the urls in the message
Authors: Qaroush, Aziz; Washaha, Mahdi; Khater, Ismail
Keywords: Computer security;Machine learning;Spam;Ham;Spam filtering (Electronic mail);Classification;Data mining
Issue Date: 2014
Publisher: Computer Systems Science and Engineering
Abstract: The dramatic increase in spam is regarded as one of the major problems afflicting Internet email service, as spammers try to defeat spam filters by modifying existing techniques and developing new ones to make their campaigns for advertising or phishing websites more effective. In this paper we present the results of our analysis of message header fields and the URLs in the message body, and propose informative and discriminative email spam detection features based on recent public email data sets. Furthermore, the Web of Trust (WOT) service was used to measure the reputation of the sender and of the URLs included in the message. Subsequently, several machine learning-based classifiers were applied to evaluate the performance of these features, including the reputation feature. The results demonstrate the power of the extracted features and establish that the Random Forest (RF) classifier performs best of all the classifiers used, achieving an accuracy of 99.69%, a precision of 99.70%, a recall of 99.90%, an F-measure of 99.8%, and a total cost ratio of 65.
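The abstract reports five evaluation measures without defining them. The following is a minimal sketch of how such measures are commonly computed from a confusion matrix; it is not the authors' code. It assumes that spam is the positive class and that total cost ratio (TCR) follows the definition widely used in the spam-filtering literature, TCR = N_spam / (lambda * FP + FN). The names `Confusion`, `count_outcomes`, `evaluate`, and the cost weight `lam` are hypothetical, and the value of lambda used in the paper is not stated in this record.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # spam correctly flagged as spam
    fp: int  # ham wrongly flagged as spam
    tn: int  # ham correctly passed as ham
    fn: int  # spam wrongly passed as ham


def count_outcomes(y_true, y_pred, spam=1):
    """Tally the confusion matrix, treating `spam` as the positive label (assumption)."""
    c = Confusion(0, 0, 0, 0)
    for truth, pred in zip(y_true, y_pred):
        if truth == spam and pred == spam:
            c.tp += 1
        elif truth != spam and pred == spam:
            c.fp += 1
        elif truth != spam and pred != spam:
            c.tn += 1
        else:
            c.fn += 1
    return c


def evaluate(c, lam=1.0):
    """Return accuracy, precision, recall, F-measure, and TCR.

    `lam` is the assumed relative cost of blocking a legitimate message
    versus letting a spam message through (hypothetical parameter).
    """
    total = c.tp + c.fp + c.tn + c.fn
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    pr_sum = precision + recall
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0
    weighted_errors = lam * c.fp + c.fn
    # TCR compares the cost of using no filter (all spam gets through)
    # with the weighted cost of the filter's own mistakes.
    tcr = (c.tp + c.fn) / weighted_errors if weighted_errors else float("inf")
    return {
        "accuracy": (c.tp + c.tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "tcr": tcr,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(evaluate(count_outcomes(y_true, y_pred)))
```

Under this definition, a TCR above 1 means the filter costs less than using no filter at all, and larger values are better.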
Appears in Collections: Fulltext Publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.