Detecting Private Information: Using and comparing an Artificial Immune System to a rule-based algorithm

Issue Date
2019-12-04
Language
en
Abstract
This master's thesis seeks to develop a better way of separating private information from non-private information in official documents, with the aim of automating this process. To this end, two algorithms were created. The first is a rule-based algorithm that uses a fixed set of words to determine whether something is private or not; it serves as a baseline for comparison. The second is an Artificial Immune System, which tries to detect 'outside' information, in this case private information. The two algorithms were compared in terms of sensitivity and specificity during initial tests, and a Wilcoxon test was used for the final test. The hypothesis was that the Artificial Immune System would perform better because it learns the patterns itself, while the rule-based algorithm would have difficulty generalizing. The results showed that the two algorithms differ significantly in performance (p < 0.05), with indications that the Artificial Immune System performs better. However, neither the Artificial Immune System nor the rule-based algorithm could reliably detect private information (33% and 22% found, respectively). Other methods will be necessary to solve this problem.
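As an illustration of the evaluation described in the abstract, the following minimal Python sketch shows how a word-list detector could be scored with sensitivity and specificity. It is not the thesis's implementation: the word list `PRIVATE_WORDS`, the function names, and the four labelled example sentences are all hypothetical and chosen only to make the two metrics concrete.

```python
# Hypothetical word list; the thesis's actual word set is not given in the abstract.
PRIVATE_WORDS = {"address", "birthdate", "passport"}


def rule_based_is_private(text, private_words=PRIVATE_WORDS):
    """Flag a sentence as private if it contains any word from the list."""
    tokens = {tok.strip(".,;:!?()").lower() for tok in text.split()}
    return not tokens.isdisjoint(private_words)


def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for two equal-length lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # private, flagged
    fn = sum(1 for p, a in pairs if not p and a)      # private, missed
    tn = sum(1 for p, a in pairs if not p and not a)  # non-private, not flagged
    fp = sum(1 for p, a in pairs if p and not a)      # non-private, flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented examples: (sentence, is it actually private?)
samples = [
    ("Her passport number is listed below.", True),
    ("The meeting starts at noon.", False),
    ("He lives near the station.", True),         # private, but no listed word
    ("Please update the address field.", False),  # listed word, but not private
]

predicted = [rule_based_is_private(text) for text, _ in samples]
actual = [label for _, label in samples]
sens, spec = sensitivity_specificity(predicted, actual)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The third and fourth sentences show the generalization problem the hypothesis refers to: a fixed word list misses private content phrased without a listed word and flags harmless text that happens to contain one.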
Faculty
Faculteit der Sociale Wetenschappen