Topic Modeling with Word2Vec based Noun Expansion for Dark Web Marketplace Analysis

Keywords
Issue Date
2019-10-11
Language
en
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
A novel approach to document expansion is proposed to improve topic modeling on short texts. The enrichment expands nouns with information from custom (e.g. domain-specific) and pretrained Word2Vec models. The quality of the three conditions (original, custom, and pretrained) is evaluated through manual analysis of the resulting topics and through the classification performance of a Support Vector Machine trained on the output of an LDA system. Manual analysis did not show a striking improvement in the topics created from the enriched texts compared to the original text. The prediction models show improved performance only when the texts are enriched with information from the custom Word2Vec models. However, the extent of the improvement depends on the text domain.
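As a rough illustration of the noun-expansion idea, the sketch below appends the nearest embedding neighbours of each noun to a tokenized document before any topic modeling takes place. It is a minimal sketch under stated assumptions, not the method used in the thesis: the abstract does not specify how the Word2Vec information is added, so appending the top-n most similar words is an assumption, as are the toy vectors, the hand-supplied noun set (standing in for a part-of-speech tagger), and the helper names `cosine`, `most_similar`, and `expand_nouns`.

```python
import math

# Toy vectors standing in for a trained Word2Vec model.
# The values are invented for illustration only.
TOY_VECTORS = {
    "pills": [0.9, 0.1, 0.0],
    "tablets": [0.85, 0.15, 0.05],
    "capsules": [0.8, 0.2, 0.1],
    "shipping": [0.0, 0.9, 0.1],
    "delivery": [0.05, 0.85, 0.15],
    "fast": [0.1, 0.3, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(word, vectors, topn=2):
    """Return the topn words closest to `word`, or [] if it is out of vocabulary."""
    if word not in vectors:
        return []
    scored = [
        (other, cosine(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:topn]]


def expand_nouns(tokens, nouns, vectors, topn=2):
    """Append the nearest neighbours of every noun directly after that noun."""
    expanded = []
    for token in tokens:
        expanded.append(token)
        if token in nouns:  # hypothetical noun set; a POS tagger would supply this
            expanded.extend(most_similar(token, vectors, topn))
    return expanded


doc = ["fast", "shipping", "pills"]
nouns = {"shipping", "pills"}
expanded_doc = expand_nouns(doc, nouns, TOY_VECTORS, topn=1)
print(expanded_doc)
```

In a fuller pipeline the expanded documents, rather than the originals, would be passed to the LDA system, and the resulting topic distributions would serve as features for the Support Vector Machine. Swapping the toy dictionary for a custom or a pretrained embedding model corresponds to the custom and pretrained conditions compared in the abstract.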
Description
Citation
Faculty
Faculteit der Sociale Wetenschappen