Clustering patients based on their HPO terms and corresponding graphs by use of different semantic similarity measures and graph similarity measures

Issue Date
2022-01-27
Language
en
Abstract
For many patients with intellectual disability (ID), genetic testing does not yield a diagnosis. Sometimes, however, a mutation is found whose effect is not yet known: a variant of unknown significance (VUS). Classifying or clustering patients by their Human Phenotype Ontology (HPO) terms, i.e. phenotypic abnormalities, and the corresponding graphs might be very useful for therapies or for recognizing (new) genetic disorders. In this study, different semantic and graph similarity measures are applied to the HPO terms, and their corresponding graphs, of patients with genetic disorders. The measures examined are Resnik, Lin, graph edit distance (GED) and maximum common subgraph (MCS). The resulting similarity matrices are used as input for the spectral clustering algorithm. Finally, the adjusted Rand index (ARI), a measure related to accuracy that can be used to assess clusters, is used alongside other metrics to analyze which measure or combination of measures leads to the clustering of patients that best corresponds to the actual disorders. We find that semantic similarity measures individually perform better than graph similarity measures. Also, neither combining measures nor using evenly distributed data seems to lead to better results. A similar but improved method may in the future be useful in resolving VUSs.
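As a rough illustration of two building blocks named in the abstract, the sketch below computes Resnik and Lin similarity between two terms and the adjusted Rand index between two labelings, using only the Python standard library. It is not the thesis code: the toy ontology `PARENTS`, the term names, and the frequencies in `FREQ` are hypothetical stand-ins for the HPO and for real annotation counts, and the handling of zero-information terms in `lin` is an assumption.

```python
from collections import Counter
from math import comb, log

# Hypothetical toy ontology (child -> parents). These are NOT real HPO terms.
PARENTS = {
    "root": [],
    "neuro": ["root"],
    "growth": ["root"],
    "seizure": ["neuro"],
    "id": ["neuro"],
    "short_stature": ["growth"],
}

# Hypothetical annotation frequencies p(t), used to derive information content.
FREQ = {"root": 1.0, "neuro": 0.5, "growth": 0.5,
        "seizure": 0.25, "id": 0.25, "short_stature": 0.5}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def ic(term):
    """Information content: IC(t) = -log p(t)."""
    return -log(FREQ[term])


def resnik(a, b):
    """Resnik similarity: IC of the most informative common ancestor."""
    return max(ic(t) for t in ancestors(a) & ancestors(b))


def lin(a, b):
    """Lin similarity: 2 * Resnik(a, b) / (IC(a) + IC(b))."""
    denom = ic(a) + ic(b)
    if denom == 0:
        return 1.0  # assumption: two zero-information terms count as identical
    return 2 * resnik(a, b) / denom


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(true_labels)
    if n < 2:
        return 1.0  # no pairs to compare
    cells = Counter(zip(true_labels, pred_labels))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. everything in a single cluster
    return (sum_cells - expected) / (max_index - expected)
```

In a full pipeline, term-level scores like these would still need to be aggregated into a patient-by-patient similarity matrix before spectral clustering; the abstract does not say how that aggregation is done, so it is left out here. The ARI is invariant to how cluster labels are named, which is why it suits comparing predicted clusters against known disorders.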
Faculty
Faculteit der Sociale Wetenschappen