Faculteit der Sociale Wetenschappen

Recent Submissions

  • Item
    Mind the Linguistic Gap: Studying the learning of linguistic properties of continuous sign language videos in an isolated sign language recognition task
    (2023-06-08) Martínez Rodríguez, Javier; Larson, M.A.; Fitz, H.
    The research carried out in this thesis makes an initial investigation into whether a deep neural network, more concretely a 3D convolutional neural network (3D-CNN), is able to learn any aspect of continuous sign language (SL) linguistics in an isolated sign language recognition (ISLR) task. To do so, we use Dutch SL or Nederlandse Gebarentaal (NGT) data from the Corpus Nederlandse Gebarentaal (CNGT) and NGT Signbank. We define a Linguistic Gap (LG) as the difference between SL linguistics knowledge and the observable linguistic properties learnt by the classifier. We hypothesize the existence of an LG in the difference between the intrinsic dimension (ID) of the 1024-dimensional neural representations found in the last hidden layer of our classifier and the 21 theoretical dimensions of NGT we derive from linguistic specifications in NGT Signbank. To study the LG effectively, we design a new linguistically centered methodology in which the effect of linguistics on the classification is showcased. Given the isolated nature of the sign language recognition (SLR) task, we determine that phonology is the most straightforward linguistic aspect to study in this work. Thus, we use the phonological difference between pairs of signs to design and evaluate different experiments that approach the binary classification task from a linguistic perspective. We freeze all layers in the model except for the last hidden layer while fine-tuning on our SL data. This confines the potential linguistic knowledge acquired by the network to this last hidden layer, which allows us to study the ID in relation to linguistics. To the best of our knowledge, we present the first application of ID on video data and on representations learnt on SL data.
To extract the ID of the neural representations, we use the maximum likelihood estimation (MLE) and Two-Nearest Neighbours (TwoNN) algorithms, which are the only algorithms with recorded applications of ID estimation to image data and to neural representations of image data. We carry out three experiments, in which we compare the classification of minimal pairs, i.e., two signs with different meanings that differ in only one phoneme, with that of non-minimal pairs. The first experiment highlights the effect of phonological difference between pairs of signs on the LG when a maximum amount of data is available. We compare four classifiers trained on the two most frequent non-minimal pairs and the two most frequent minimal pairs in the dataset. The second experiment keeps the best-performing minimal pair and non-minimal pair to study the effect of input data resolution on the LG. In the last experiment, we expand on the concept of minimal pairs and introduce phonological distance, which gives us a measure of the phonological difference between non-minimal pairs. We study the effect of this distance between pairs of signs to gain further insight into how the network incorporates SL linguistic knowledge in the classification. In these experiments, we discover through the calculation of ID that the last hidden layer of our I3D model is capable of representing SL data in latent space as effectively as the representations made by linguists in NGT Signbank, albeit remaining highly over-represented with respect to the dimensionality of its feature vector. We also observe that the ID of the neural representations in this layer is not sensitive to the phonology of signs, but to other aspects such as the spatial and temporal resolution of the input data. These initial results suggest that, contrary to our initial hypothesis, the LG does not lie in the difference between the ID of the neural representations and the theoretical ID of NGT.
Finally, through the study of our phonological distance measure, we discover that the classification performance of the I3D model increases with increasing phonological distance between the classified pairs of signs, suggesting that knowledge captured by the network is related to phonology, among other visual aspects of the data. This research contributes to the field of interpretability of SL technologies through the study of phonological aspects of SL in the representations of the last hidden layer of a binary classifier in an ISLR task. We discuss the implications of understanding how a deep neural network performs classification to improve performance and interpretability of SL systems and encourage research to further study linguistics and its impact on them.
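The abstract above names the TwoNN algorithm as one of its two ID estimators. As an illustration only, the following is a minimal sketch of the maximum-likelihood form of the two-nearest-neighbour ratio estimator; it is not the thesis's implementation, and the function name `twonn_intrinsic_dimension` and the toy data are assumptions introduced here. For each point it takes the ratio of the distances to its second and first nearest neighbours and estimates the dimension as the number of points divided by the sum of the log-ratios.

```python
import numpy as np


def twonn_intrinsic_dimension(points):
    """Estimate intrinsic dimension from two-nearest-neighbour distance ratios.

    Hypothetical helper for illustration: uses the maximum-likelihood form
    d = N / sum(log(r2 / r1)), where r1 and r2 are each point's distances to
    its first and second nearest neighbours.
    """
    x = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances (fine for a few hundred points).
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point's distance to itself
    nearest = np.sort(dist, axis=1)[:, :2]
    r1, r2 = nearest[:, 0], nearest[:, 1]
    keep = r1 > 0  # drop exact duplicates, whose ratio is undefined
    mu = r2[keep] / r1[keep]
    return keep.sum() / np.log(mu).sum()


if __name__ == "__main__":
    # Toy data (an assumption): a flat 2-D sheet embedded in 10 dimensions.
    rng = np.random.default_rng(0)
    sheet = np.zeros((500, 10))
    sheet[:, :2] = rng.uniform(size=(500, 2))
    print(twonn_intrinsic_dimension(sheet))
```

On the toy sheet the estimate should land near 2 rather than near the 10 ambient dimensions, which mirrors the kind of comparison the abstract draws between the ID of the representations and the size of the feature vector.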
  • Item
    Neural Correlates of Emotion Inhibition in Recurrent Major Depressive Disorder
    (2023-02-23) Tjeerdsma, Sarah; Tyborowska, Anna
    The high disease burden of Major Depressive Disorder (MDD) can partially be attributed to its high recurrence rate. Since cognitive control deficits often remain after remission, emotion inhibition may be an important vulnerability factor in the recurrence of depression. The current study aims to investigate the behavioural and neural association between cognitive control, specifically emotion inhibition, and MDD, and its potential role as a risk factor for depressive relapse. Additionally, the modulatory effect of depressive brooding on emotion inhibition capacity was examined. Fifty-seven patients with remitted recurrent MDD (rrMDD) and 41 never-depressed matched controls performed a Cued Emotional Conflict Task (CECT) while undergoing fMRI, in order to measure their ability to inhibit a dominant response to positive or negative stimuli. Patients with rrMDD were followed up for 2.5 years to detect potential recurrence of depressive episodes. There were no differences in behavioural performance or neural activity in the CECT between patients and controls. Participants responded faster to happy than to sad stimuli, and were faster in trials where they could give in to a dominant response than in trials in which they had to inhibit a dominant response. Emotion inhibition was associated with increased dlPFC activity, as well as decreased ventral ACC (vACC) activity. However, these effects did not differ between patients and controls. Patients with rrMDD who experienced a relapse within the follow-up period showed increased vACC activity during emotion inhibition trials compared to patients without relapse. This might indicate a deficit in the regulation of this brain region in relapse patients during emotion inhibition. The absence of expected group differences on a behavioural and neural level might suggest that emotion inhibition is not a very informative factor in the recurrence of depression.
  • Item
    Even co-speech gestures’ early beginnings improve predictions about upcoming words
    (2022-12-22) Otterdijk, van, Lina; Bekke, ter, Marlijn; Drijvers, Linda; Holler, Judith; Terporten, René
    Human face-to-face conversation involves rapid turn-taking, likely due to predictive language processing. Moreover, the multimodal aspect of communication can enhance language processing. In this study we examined whether gestures facilitate predictive language processing, and specifically whether the very beginnings of co-speech gestures (i.e. the preparation phase), which have been deemed largely meaningless, may in fact help in predicting upcoming utterance content. Additionally, we asked whether empathy influences this effect. In a cloze task, participants saw video fragments from natural face-to-face conversations and filled in their predictions on how the speaker would continue after the fragment ended. These video fragments always ended prior to the ‘lexical affiliate’ (i.e. the lexical item(s) semantically most closely related to the gesture’s meaning). The video clips were presented in two conditions: (1) with the preparation phase visible, or (2) with the preparation phase blurred. Participants also filled out the Empathy Quotient questionnaire. Results demonstrated that predictions were not more accurate when the gesture preparation was visible, but predictions were more similar to the lexical affiliate’s meaning when the gestural preparation was shown. Additionally, predictions varied considerably across participants, and preparation visibility did not affect this variability. With regard to empathy, no influence on the effect of preparation visibility was found. This is the first study to show that even seeing the very early beginnings of co-speech gestures helps with predictive language processing, thus underlining the need for conceptualizing predictive processes during language comprehension in multimodal terms.
  • Item
    Learning under threat: threat-induced freezing and the adaptation to environmental volatility
    (2022-09-05) Carneiro de Andrade, Mariana; Livermore, James; Roelofs, Karin
    Threat often occurs in unpredictable, volatile situations. Adequate coping with acute threat requires one to flexibly adapt to environmental volatility. While stable environments require a low learning rate, rapidly changing volatile environments require a high learning rate. Although there is increasing knowledge from computational neuroscience on how our brains enable learning rate adaptation, it remains unclear how psychophysiological changes during acute threat affect this skill. Recent empirical and theoretical work proposed that the bradycardia and immobility observed during threat-induced freezing enhance perceptual sensitivity, subjective value integration, and action preparation. However, it is still unknown whether stronger freezing responses are also linked to stronger adaptation of learning rate to environmental volatility. The aim of the current thesis is to test this hypothesis, and to explore the neural correlates of such a potential effect. Fifty-two participants (22 females) performed a probabilistic reversal learning task featuring a stable and a volatile cue inside a magnetic resonance imaging scanner. The task entailed a go/no-go type of shooting simulation where errors were reinforced by a shock, in which participants found out through trial and error which target to shoot and which not to shoot. Freezing was qualified by significant threat-anticipatory heart rate deceleration. Our findings showed that in the volatile condition participants froze more and made more errors. A Pearce-Hall learning model featuring a flexible learning rate fitted the participants’ behavioural data best. Learning rates were higher for volatile cues, and trial-wise freezing was positively associated with trial-wise learning rate across cues. These effects are in line with our hypothesis that threat-induced freezing may facilitate the adaptation of learning to volatile environments.
Regarding neural correlates, the dorsal anterior cingulate cortex was more active for volatile than stable cues in the feedback phase of the task, but not during the pre-decision phase. I will end by discussing additional neuroimaging analyses that should be performed to further explore the neural underpinnings of the effect of freezing on learning. Keywords: freezing, learning, volatility, decision-making, threat, dACC
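The abstract above reports that a Pearce-Hall model with a flexible learning rate fitted behaviour best and that learning rates were higher for volatile cues. The sketch below illustrates that mechanism in a commonly used hybrid Pearce-Hall form; it is not the thesis's fitted model. The parameter names `kappa` and `eta`, their values, and the toy outcome sequences are assumptions introduced here. The associability `alpha` acts as the trial-wise learning rate and tracks the recent absolute prediction error.

```python
def pearce_hall(outcomes, kappa=0.5, eta=0.3, v0=0.5, alpha0=0.5):
    """Simulate a hybrid Pearce-Hall learner (illustrative parameters).

    Returns the value estimate and associability held *before* each outcome.
    """
    v, alpha = v0, alpha0
    values, alphas = [], []
    for outcome in outcomes:
        values.append(v)
        alphas.append(alpha)
        delta = outcome - v                           # prediction error
        v = v + kappa * alpha * delta                 # value update scaled by associability
        alpha = eta * abs(delta) + (1 - eta) * alpha  # associability tracks surprise
    return values, alphas


if __name__ == "__main__":
    # Toy sequences (assumptions): a cue that always pays off versus a cue
    # whose contingency reverses every five trials.
    stable = [1] * 40
    volatile = ([1] * 5 + [0] * 5) * 4
    _, stable_alphas = pearce_hall(stable)
    _, volatile_alphas = pearce_hall(volatile)
    print(sum(stable_alphas) / len(stable_alphas))
    print(sum(volatile_alphas) / len(volatile_alphas))
```

In this toy run the mean associability is higher for the reversing cue than for the stable one, because each reversal produces large prediction errors that push the learning rate back up.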
  • Item
    The role of monosynaptic pathway and hippocampal memory transience in rapid statistical learning of auditory word representations
    (2021-09-30) Schneider, Fabian; Spaak, Eelke; Janzen, Gabriele; McQueen, James
    Variability in the speech envelope between and even within speakers poses a challenge for word learning because, in traditional views such as complementary learning systems theory, abstraction of episodic memories may occur only after consolidation. As such, rapid learning of novel words that generalises across speakers cannot easily be explained in these models. In the present study, we investigated the functional distinctions between the mono- and trisynaptic pathways (rapid statistical learning and episodic memory formation, respectively) within the hippocampus as well as memory transience in rapid statistical learning of auditory words. Participants (N = 31) learned auditory words under low to high variability of speakers and were tested while MEG and behavioural measurements were taken. We manipulated novelty/familiarity of words/speakers at test such that episodic memory formation, statistical learning of words, statistical learning of speakers and multidimensional statistical learning (across both words and speakers) could be assessed. Results demonstrated the expected benefit of veridical episodic over statistical learning as well as statistical learning over control conditions. High-variability training was associated with a benefit in statistical learning over low-variability training, the latter presenting with stronger recruitment of the hippocampus in the delta-band, but no such difference emerged for veridical episodic learning. Findings are interpreted as tentative evidence of the functional distinction between mono- and trisynaptic pathways within the hippocampus, supporting rapid statistical learning and episodic memory formation, respectively. Further, results are interpreted to indicate that mono- and trisynaptic pathways may be recruited concurrently but that the degree to which rapid statistical learning occurs is dependent on the degree of variability in inputs.
A role of memory transience between dimensions of the learning space is tentatively suggested but remains elusive. Keywords: word learning, rapid statistical learning, memory transience, hippocampus, MEG