Detecting Private Information: Using and comparing an Artificial Immune System to a rule-based algorithm

Issue Date

2019-12-04

Language

en

Abstract

This master's thesis seeks to develop a better way of separating private from non-private information in official documents, with the aim of automating the process. Two algorithms were created for this purpose. The first is a rule-based algorithm that uses a fixed set of words to determine whether something is private; it serves as a baseline for comparison. The second is an Artificial Immune System, which tries to detect ‘outside’ information, in this case private information. The two algorithms were compared in terms of sensitivity and specificity during initial tests, and with a Wilcoxon test during the final test. The hypothesis was that the Artificial Immune System would perform better because it learns the patterns itself, while the rule-based algorithm would have difficulty generalizing. The results show that the two algorithms differ significantly in performance (p < 0.05), with indications that the Artificial Immune System performs better. However, neither the Artificial Immune System nor the rule-based algorithm could reliably detect private information (33% and 22% found, respectively). Other methods will be necessary to solve this problem.
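
As a rough illustration of the two metrics named in the abstract, the sketch below computes sensitivity and specificity for a binary private/non-private labelling. The function name `sensitivity_specificity` and the toy labels are assumptions made for this example only; they are not taken from the thesis and do not reproduce its data or results.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = private.

    Hypothetical helper for illustration; not the thesis implementation.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # private, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # private, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-private, passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-private, flagged

    # Sensitivity: share of truly private items that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of non-private items correctly left alone.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up labels, purely illustrative.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A detector can score high on specificity while still missing most private items, which is why the two metrics are reported together.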

Faculty

Faculteit der Sociale Wetenschappen