Decoding Speech From Human Brain Activity Using Diffusion Models
| dc.contributor.advisor | Berezutskaya, Julia | |
| dc.contributor.advisor | Ambrogioni, Luca | |
| dc.contributor.author | Schroder, Pascal | |
| dc.date.issued | 2023-05-15 | |
| dc.description.abstract | This thesis investigated the use of diffusion models for decoding speech from brain activity, which can enable the development of brain-computer interfaces to restore communication in severely paralyzed individuals. Although the field has seen significant progress, existing approaches display several limitations: they often require recordings of isolated utterances with multiple repetitions, or use brain data from multiple individuals to generate intelligible speech. To address this, we employed a two-stage training framework: first, we pre-trained a diffusion-based speech generator on a large speech corpus, and then utilized the speech generator to develop models that generate speech from brain activity. We worked on brain data recorded from a single subject during a book-reading task and trained our models to generate speech from single instances of words in the brain data. Our results showed that our models can generate naturalistic, intelligible speech by mapping brain data to speech fragments from the pre-training dataset. We conclude that diffusion models are a promising choice for generating speech from brain activity, and are robust enough to work on the brain activity of a single subject, without repetitions of utterances. This has the potential to advance the field of speech BCIs for severely paralyzed individuals. | |
| dc.identifier.uri | https://theses.ubn.ru.nl/handle/123456789/16420 | |
| dc.language.iso | en | |
| dc.thesis.faculty | Faculteit der Sociale Wetenschappen | |
| dc.thesis.specialisation | specialisations::Faculteit der Sociale Wetenschappen::Artificial Intelligence::Master Artificial Intelligence | |
| dc.thesis.studyprogramme | studyprogrammes::Faculteit der Sociale Wetenschappen::Artificial Intelligence | |
| dc.thesis.type | Master | |
| dc.title | Decoding Speech From Human Brain Activity Using Diffusion Models |
Files
Original bundle
- Name: Schroder, P. s-1062138-MSc-MKI92-Thesis-2023.pdf
- Size: 8.87 MB
- Format: Adobe Portable Document Format
