On the explainability of case law recommendations using paragraph embeddings

Issue Date
2022-05-17
Language
en
Abstract
Word and paragraph embeddings are known to capture semantic properties of texts and may therefore be useful for recommending relevant legal cases based on automated text analysis. We investigate to what extent paragraph embeddings capture two types of juridically relevant information from case transcriptions. Firstly, recommendation and clustering experiments show that paragraph embeddings can capture the rhetorical and argumentative role of various sections in a case transcription. For these experiments, we constructed a new data set of Dutch criminal law cases, in which each case section is assigned a role label that indicates the type of legal information found in that section. Moreover, we developed a rudimentary form of automated argumentative zoning by showing that these section roles can be predicted using a text classification pipeline. Secondly, given that paragraph embeddings are useful for representing the characteristics of separate section roles, including case decisions, we ask whether they are also able to track the case outcome. We developed a pattern- and rule-based classifier to extract the case outcome automatically for arbitrarily large data sets. A particular innovation is that this classifier successfully extracts co-occurring punishments as well as their severity, as opposed to a binary verdict (guilty/not guilty). We were able to find interpretable clusters of co-occurring punishments, but our experiments showed that paragraph embeddings were not able to capture this type of legal information. Given the relative pros and cons of the machine learning models and the pattern- and rule-based methods used in this project, we reflect on the merits of hybrid AI in the legal domain. Because the legal domain is a high-risk domain for the use of AI, we pay special attention to issues concerning the explainability of embedding-based case recommendation.
We conclude by arguing that hybrid AI is a promising approach to explainability in the high-risk legal domain compared to post-hoc explanation methods on non-transparent machine learning models, such as embedding models.
Faculty
Faculteit der Sociale Wetenschappen