Wav2vec 2.0: Contrastive Analyses in end-to-end Transformer Models
Issue Date
2023-10-25
Language
en
Document type
Abstract
End-to-end ASR systems with a transformer architecture, such as wav2vec 2.0, have become widely available
as a means of automatic speech transcription. These systems take speech, in the form of audio files, as input
and convert it into written text, usually in the form of grapheme sequences. Because the input is continuous
audio and the output is discrete text, the system must handle the transition from one to the other. We
investigate this transition in the transformer block of one such ASR system, wav2vec 2.0, using probing
classifiers. With these classifiers, we trace the gradual conversion through probability vectors and through
the information stored in individual layers of the transformer block. These insights show how the system
handles the transition, and we visualize how grapheme activation differs across the ASR system.
Faculty
Faculteit der Sociale Wetenschappen