Topic Modeling with Word2Vec based Noun Expansion for Dark Web Marketplace Analysis
Issue Date
2019-10-11
Language
en
Abstract
A novel approach for expanding documents is proposed to improve topic modeling on short
text. The enrichment is based on expanding nouns with information from custom (e.g.
domain-specific) and pretrained Word2Vec models. The quality of the different conditions
(original, custom and pretrained) is evaluated with manual analysis of the created topics and
with the classification performance of a Support Vector Machine trained on the output of an
LDA system. Manual analysis did not show a striking improvement of the created topics with
the enriched texts, compared to the original text. The prediction models
show improved performance only when enriched with information from the custom Word2Vec
models. However, the extent of the improvement depends on the text domain.
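
The noun expansion described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the toy vectors stand in for a trained Word2Vec model, the set of nouns is supplied by hand where a part-of-speech tagger would normally be used, and the helper names (cosine, nearest_words, expand_nouns) as well as the choice to append neighbours directly after each noun are assumptions made for illustration.

```python
import math

# Toy 3-dimensional embeddings standing in for a trained Word2Vec model.
# The words and values are hypothetical and chosen only for illustration.
TOY_VECTORS = {
    "pills": [0.9, 0.1, 0.0],
    "tablets": [0.85, 0.15, 0.05],
    "capsules": [0.8, 0.2, 0.1],
    "account": [0.0, 0.9, 0.1],
    "login": [0.05, 0.85, 0.2],
    "shipping": [0.1, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_words(word, vectors, topn):
    """Return the topn vocabulary words most similar to `word`."""
    if word not in vectors:
        return []  # out-of-vocabulary nouns are left unexpanded
    scored = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:topn]]


def expand_nouns(tokens, nouns, vectors, topn=2):
    """Append the nearest embedding neighbours after every noun token.

    `nouns` is assumed to come from a part-of-speech tagger; here it is
    passed in by hand to keep the sketch self-contained.
    """
    expanded = []
    for token in tokens:
        expanded.append(token)
        if token in nouns:
            expanded.extend(nearest_words(token, vectors, topn))
    return expanded


listing = ["cheap", "pills", "fast", "shipping"]
expanded = expand_nouns(listing, nouns={"pills"}, vectors=TOY_VECTORS, topn=2)
print(expanded)
```

The enriched token list would then replace the original short text as input to the topic model, so that each noun contributes additional related terms to the document's word counts.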
Faculty
Faculteit der Sociale Wetenschappen