Clustering patients based on their HPO terms and corresponding graphs by use of different semantic similarity measures and graph similarity measures
Issue Date
2022-01-27
Language
en
Abstract
For many patients with intellectual disability (ID), no diagnosis is found
after genetic testing. Sometimes, however, a mutation is found whose effect
is not yet known: a variant of unknown significance (VUS). Classifying or
clustering patients by their Human Phenotype Ontology (HPO) terms, i.e.
phenotypic abnormalities, and the corresponding graphs might be very useful
for therapies or for recognizing (new) genetic disorders.
In this study, different semantic and graph similarity measures are applied
to the HPO terms, and their corresponding graphs, of patients with genetic
disorders. The measures examined are Resnik, Lin, graph edit distance (GED)
and maximum common subgraph (MCS). The resulting similarity matrices are
used as input for the spectral clustering algorithm. Finally, the adjusted
Rand index (ARI), a measure related to accuracy that can be used to assess
clusters, is used, among other measures, to analyze which measure or
combination of measures leads to the clustering of patients that
corresponds best to the actual disorders.
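To make the two semantic measures and the evaluation step concrete, here is a minimal, dependency-free Python sketch. It is an illustration under stated assumptions, not the study's implementation: the ontology `PARENTS`, the counts in `DIRECT_COUNTS`, and all function names are hypothetical, and the best-match average used to compare two patients is an assumed aggregation that the abstract does not specify. The spectral clustering step is left out; a library routine that accepts a precomputed similarity matrix could fill that role.

```python
import math
from collections import Counter

# Hypothetical toy ontology (term -> parents); these are NOT real HPO terms.
PARENTS = {"root": [], "A": ["root"], "B": ["root"],
           "A1": ["A"], "A2": ["A"], "B1": ["B"]}
# Hypothetical number of annotations made directly with each term.
DIRECT_COUNTS = {"A1": 2, "A2": 2, "B1": 4}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen


def information_content():
    """IC(t) = log(total / count(t)), where count(t) includes descendants."""
    cumulative = Counter()
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += count
    total = cumulative["root"]
    # Only terms with a non-zero cumulative count receive an IC value.
    return {t: math.log(total / c) for t, c in cumulative.items()}


IC = information_content()


def resnik(t1, t2):
    """Resnik: IC of the most informative common ancestor."""
    return max(IC[a] for a in ancestors(t1) & ancestors(t2))


def lin(t1, t2):
    """Lin: Resnik similarity normalised by the ICs of both terms."""
    denom = IC[t1] + IC[t2]
    return 2 * resnik(t1, t2) / denom if denom > 0 else 0.0


def patient_similarity(terms_a, terms_b, term_sim):
    """Best-match average over two term sets (an assumed aggregation)."""
    best_a = [max(term_sim(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(term_sim(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


def adjusted_rand_index(labels_true, labels_pred):
    """ARI computed from the contingency table of two labelings."""
    total_pairs = math.comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0
    cells = sum(math.comb(c, 2)
                for c in Counter(zip(labels_true, labels_pred)).values())
    rows = sum(math.comb(c, 2) for c in Counter(labels_true).values())
    cols = sum(math.comb(c, 2) for c in Counter(labels_pred).values())
    expected = rows * cols / total_pairs
    max_index = (rows + cols) / 2
    if max_index == expected:
        return 1.0
    return (cells - expected) / (max_index - expected)
```

In this toy setting, two sibling terms share only their parent, so their Resnik similarity equals the parent's information content, while terms under different branches share only the root and score zero. The ARI is 1 when the predicted clusters match the true disorders up to a renaming of labels, and close to 0 (possibly negative) for a clustering no better than chance.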
We find that semantic similarity measures individually perform better
than graph similarity measures. Neither combining measures nor using
evenly distributed data seems to lead to better results. A similar,
improved method may in the future help resolve VUSs.
Faculty
Faculteit der Sociale Wetenschappen