Cross-modal Scene Prediction with Adversarial Domain Uncertainty Alignment
Cross-modal machine learning integrates and transfers information across multiple data modalities to accomplish a given task, such as image classification. Here we consider the problem of scene classification on a cross-modal data set of places. Specifically, we investigate a zero-shot learning setting in which some of the modalities lack training data for some scene categories. We approach this problem with a recently proposed method that aligns predicted class probabilities across domains via adversarial learning. The original method performs unsupervised domain adaptation on features extracted by a deep neural network; we adapt it for supervised training to make efficient use of any labeled training data available in the target modalities. We then evaluate our method on the cross-modal scenes data set. Our experiments show that aligning class prediction uncertainty benefits scene classification in a zero-shot setting. The results highlight that knowledge of class distributions in one modality can improve classification accuracy in a different but related modality. These findings motivate further study of cross-modal knowledge transfer as a way to address zero-shot learning.
Faculteit der Sociale Wetenschappen