Decoding Speech From Human Brain Activity Using Diffusion Models

dc.contributor.advisor: Berezutskaya, Julia
dc.contributor.advisor: Ambrogioni, Luca
dc.contributor.author: Schroder, Pascal
dc.date.issued: 2023-05-15
dc.description.abstract: This thesis investigated the use of diffusion models for decoding speech from brain activity, which can enable the development of brain-computer interfaces (BCIs) to restore communication in severely paralyzed individuals. Although the field has seen significant progress, existing approaches have several limitations: they often require recordings of isolated utterances with multiple repetitions, or they use brain data from multiple individuals to generate intelligible speech. To address these limitations, we employed a two-stage training framework: first, we pre-trained a diffusion-based speech generator on a large speech corpus; we then used this speech generator to develop models that generate speech from brain activity. We worked with brain data recorded from a single subject during a book-reading task and trained our models to generate speech from single instances of words in the brain data. Our results showed that our models can generate naturalistic, intelligible speech by mapping brain data to speech fragments from the pre-training dataset. We conclude that diffusion models are a promising choice for generating speech from brain activity and are robust enough to work on the brain activity of a single subject, without repetitions of utterances. This has the potential to advance the field of speech BCIs for severely paralyzed individuals.
dc.identifier.uri: https://theses.ubn.ru.nl/handle/123456789/16420
dc.language.iso: en
dc.thesis.faculty: Faculteit der Sociale Wetenschappen
dc.thesis.specialisation: specialisations::Faculteit der Sociale Wetenschappen::Artificial Intelligence::Master Artificial Intelligence
dc.thesis.studyprogramme: studyprogrammes::Faculteit der Sociale Wetenschappen::Artificial Intelligence
dc.thesis.type: Master
dc.title: Decoding Speech From Human Brain Activity Using Diffusion Models

Files

Original bundle
Name: Schroder, P. s-1062138-MSc-MKI92-Thesis-2023.pdf
Size: 8.87 MB
Format: Adobe Portable Document Format