Clustering patients based on their HPO terms and corresponding graphs by use of different semantic similarity measures and graph similarity measures

Issue Date

2022-01-27

Language

en

Abstract

For many patients with intellectual disability (ID), genetic testing yields no diagnosis. Sometimes, however, a mutation is found whose effect is not yet known: a variant of unknown significance (VUS). Classifying or clustering patients by their Human Phenotype Ontology (HPO) terms, i.e. their phenotypic abnormalities, and the corresponding graphs might be very useful for therapies or for recognizing (new) genetic disorders. In this study, different semantic and graph similarity measures are applied to the HPO terms, and their corresponding graphs, of patients with genetic disorders. The measures examined are Resnik, Lin, graph edit distance (GED) and maximum common subgraph (MCS). The resulting similarity matrices are used as input for the spectral clustering algorithm. Finally, the adjusted Rand index (ARI), among other metrics, is used to determine which measure or combination of measures yields the clustering of patients that corresponds most closely to the actual disorders. The ARI is a measure related to accuracy that can be used to assess clusters. We find that semantic similarity measures individually perform better than graph similarity measures. Moreover, neither combining measures nor using evenly distributed data seems to lead to better results. An improved method of this kind may in the future be useful in resolving VUSs.
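
To make the two semantic measures and the evaluation metric more concrete, here is a minimal Python sketch using only the standard library. It is not the code used in the study. The toy ontology, the term names and the annotation counts are hypothetical stand-ins for HPO data, and the function names are illustrative. The sketch uses the standard definitions: the Resnik similarity of two terms is the information content (IC) of their most informative common ancestor, Lin normalizes that value by the sum of the two terms' ICs, and the ARI is computed from the pair-counting formula. The spectral clustering step and the graph measures (GED, MCS) are omitted, since they would normally rely on a numerical or graph library.

```python
import math
from collections import Counter

# Hypothetical toy ontology (child -> parents); these are NOT real HPO identifiers.
PARENTS = {
    "root": [],
    "neuro": ["root"],
    "growth": ["root"],
    "seizure": ["neuro"],
    "intellectual_disability": ["neuro"],
    "short_stature": ["growth"],
}

# Hypothetical number of patients annotated directly with each term.
DIRECT_COUNTS = {
    "root": 0, "neuro": 2, "growth": 1,
    "seizure": 3, "intellectual_disability": 4, "short_stature": 2,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = log(N / n_t), where n_t counts annotations of t or any descendant."""
    total = sum(DIRECT_COUNTS.values())
    cumulative = Counter()
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += count
    return {t: math.log(total / cumulative[t]) for t in PARENTS}


IC = information_content()


def resnik(a, b):
    """IC of the most informative common ancestor of a and b."""
    return max(IC[t] for t in ancestors(a) & ancestors(b))


def lin(a, b):
    """Resnik similarity normalized by the ICs of both terms."""
    denom = IC[a] + IC[b]
    # Convention chosen for this sketch: return 0.0 when both terms carry no information.
    return 2 * resnik(a, b) / denom if denom > 0 else 0.0


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting ARI; assumes at least two samples."""
    def pairs(k):
        return k * (k - 1) // 2

    sum_cells = sum(pairs(c) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(pairs(c) for c in Counter(labels_true).values())
    sum_cols = sum(pairs(c) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / pairs(len(labels_true))
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster on both sides
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    print(resnik("seizure", "intellectual_disability"))
    print(lin("seizure", "intellectual_disability"))
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

In a full pipeline, term-level scores like these would first have to be aggregated into one similarity value per pair of patients before clustering. How that aggregation is done is a design choice this sketch leaves open.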

Faculty

Faculteit der Sociale Wetenschappen