Methods For Automatically Generating a Legal Thesaurus
Issue Date
2017-08-31
Language
en
Abstract
Automatic thesaurus generation is desirable because a thesaurus is a useful tool in
natural language processing (NLP), but building one manually is expensive and
time-consuming. In this thesis, the process of thesaurus generation is divided into
two parts: term extraction and relation extraction. Term extraction is the process
of automatically finding candidate terms for a legal thesaurus, and relation extraction
is the process of determining which pairs of terms stand in a hypernym relation.
For term extraction, three termhood measures are used: log-likelihood,
Kullback-Leibler divergence, and the measure assigned by the TExSIS tool. For
relation extraction, several classifiers are trained to predict whether two terms
stand in a hypernym relation. The thesis concludes that a system that autonomously
builds a thesaurus could not be built, and that in the short term it is better to
pursue a system that assists humans in building a thesaurus.
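
To make the two corpus-based termhood measures named in the abstract concrete, the following is a minimal sketch, not the thesis's implementation. It assumes a tokenized domain (legal) corpus and a tokenized general reference corpus, uses the common Dunning-style log-likelihood statistic, and scores a term by its pointwise contribution to the Kullback-Leibler divergence between the two corpora. The function names, the add-k smoothing, and the toy corpora are all illustrative assumptions.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Dunning-style log-likelihood (G2) of a term.

    a: term count in the domain corpus, c: domain corpus size (tokens)
    b: term count in the reference corpus, d: reference corpus size (tokens)
    """
    # Expected counts if the term were equally frequent in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2.0 * g2


def termhood_scores(domain_tokens, reference_tokens, k=1.0):
    """Score every domain term with both measures (assumed setup)."""
    dom, ref = Counter(domain_tokens), Counter(reference_tokens)
    n_dom, n_ref = sum(dom.values()), sum(ref.values())
    vocab_size = len(set(dom) | set(ref))
    scores = {}
    for term, a in dom.items():
        b = ref[term]
        # Add-k smoothing (an assumption) keeps q non-zero for unseen terms.
        p = (a + k) / (n_dom + k * vocab_size)
        q = (b + k) / (n_ref + k * vocab_size)
        scores[term] = {
            "ll": log_likelihood(a, b, n_dom, n_ref),
            # Pointwise contribution of this term to KL(domain || reference).
            "kl": p * math.log(p / q),
        }
    return scores


# Toy corpora, for illustration only.
domain = "the court ruled that the contract is void".split()
reference = "the cat sat on the mat and the dog ran".split()
scores = termhood_scores(domain, reference)
ranked = sorted(scores, key=lambda t: scores[t]["kl"], reverse=True)
```

In this sketch a domain-specific word such as "contract" scores higher than a function word such as "the" on both measures. Note that the log-likelihood statistic is two-sided: it is also large for terms that are unusually rare in the domain corpus, so a practical ranking would typically keep only terms whose relative frequency is higher in the domain corpus.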
Faculty
Faculteit der Letteren