A domain-based feature generation and convolution neural network approach for extracting adverse drug reactions from social media posts

Odeh, Feras

Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11889/5536

DC Field	Value	Language
dc.contributor.author	Odeh, Feras	-
dc.date.accessioned	2018-05-07T09:44:58Z	-
dc.date.available	2018-05-07T09:44:58Z	-
dc.date.issued	2018	-
dc.identifier.uri	http://hdl.handle.net/20.500.11889/5536	-
dc.description.abstract	The recent popularity of social media networks, including forums, blogs, and micro-blogging networks, has changed the way patients share their health experiences and treatment options. Such forums offer valuable, unsolicited, uncensored information on drug safety and side effects directly from patients. However, it is very challenging to extract useful information from such forums due to several factors, such as grammatical and spelling errors, colloquial language, and post length limits. Furthermore, due to the sensitivity of the domain of adverse drug reaction (ADR) detection, it is more critical to identify ADRs correctly (i.e., to achieve higher classification precision) than to report imprecise ones. The aims of this thesis are: (i) to develop a new approach for ADR classification in Twitter posts called Semantic Vector (SemVec); (ii) to explore natural language processing (NLP) approaches for generating domain features from text and utilizing them for ADR detection; and (iii) to improve the precision of convolutional neural network (CNN) ADR classification by incorporating domain features. This thesis proposes a dynamic and pluggable model, named SemVec, for representing each word as a vector of both domain and morphological features. Depending on the problem domain, domain features can be added or removed to generate a word representation enriched with domain knowledge. SemVec represents each post as a matrix of word vectors, which is fed into a CNN. SemVec is scalable and can be applied to other domains by employing relevant natural language processing methods and domain lexicons. The proposed method was evaluated on a Twitter ADR dataset. Results show that SemVec improves the precision of ADR detection by 13.43% over other state-of-the-art deep learning methods with a comparable recall score.	en_US
dc.language.iso	en_US	en_US
dc.subject	Medicine - Data processing	en_US
dc.subject	Social media in medicine	en_US
dc.subject	Drugs - Side effects	en_US
dc.subject	Online social networks - Safety measures	en_US
dc.subject	Information storage and retrieval systems - Medical care	en_US
dc.subject	Semantic web	en_US
dc.title	A domain-based feature generation and convolution neural network approach for extracting adverse drug reactions from social media posts	en_US
dc.type	Thesis	en_US
newfileds.department	Graduate Studies	en_US
newfileds.item-access-type	open_access	en_US
newfileds.thesis-prog	Scientific Computation	en_US
newfileds.general-subject	none	en_US
item.grantfulltext	open	-
item.fulltext	With Fulltext	-
item.languageiso639-1	other	-
Appears in Collections:	Theses
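
The abstract describes SemVec as representing each word by domain and morphological features and each post as a matrix of such word vectors that is fed into a CNN. The following is a minimal sketch of that idea, not the thesis implementation: the toy lexicons, the five features, the function names, and the hand-written kernel are all hypothetical illustrations, and a real CNN would learn its kernels from data.

```python
# Minimal sketch of a feature-based word/post representation (assumed design,
# not the thesis code). Lexicons and features below are hypothetical.
ADR_LEXICON = {"headache", "nausea", "dizzy", "rash"}   # toy ADR terms
DRUG_LEXICON = {"ibuprofen", "aspirin"}                 # toy drug names
DIM = 5                                                 # features per word


def word_features(word):
    """Return a vector of domain and morphological features for one word."""
    w = word.lower()
    return [
        1.0 if w in ADR_LEXICON else 0.0,                # domain: ADR term
        1.0 if w in DRUG_LEXICON else 0.0,               # domain: drug name
        1.0 if word[:1].isupper() else 0.0,              # morphological: capitalized
        1.0 if any(c.isdigit() for c in word) else 0.0,  # morphological: has digit
        min(len(word), 10) / 10.0,                       # morphological: scaled length
    ]


def post_matrix(post, max_len=8):
    """Represent a post as a max_len x DIM matrix, zero-padded or truncated."""
    rows = [word_features(t) for t in post.split()[:max_len]]
    rows += [[0.0] * DIM for _ in range(max_len - len(rows))]
    return rows


def conv_max_pool(matrix, kernel):
    """Slide a k x DIM kernel over word positions, apply ReLU, then max-pool."""
    k = len(kernel)
    best = 0.0  # starting from 0.0 makes the running max act as ReLU
    for i in range(len(matrix) - k + 1):
        score = sum(
            matrix[i + r][c] * kernel[r][c]
            for r in range(k)
            for c in range(DIM)
        )
        best = max(best, score)
    return best


matrix = post_matrix("Aspirin gave me nausea")
# Hand-written width-1 kernel that responds to ADR or drug lexicon hits.
lexicon_kernel = [[1.0, 1.0, 0.0, 0.0, 0.0]]
activation = conv_max_pool(matrix, lexicon_kernel)
```

Adding or removing an entry in `word_features` corresponds to the "pluggable" domain features the abstract mentions; the matrix shape changes only in its second dimension.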

Files in This Item:

File	Description	Size	Format
Thesis(1).pdf		2.11 MB	Adobe PDF	View/Open

Page view(s): 142 (last week: 0; last month: 2), checked on Apr 14, 2024

Download(s): 102, checked on Apr 14, 2024