Visualizing Subjective Experiences of Visual Cortical Implant Users with AtoPI: an AI-driven Audio-to-Phosphene-Image Generator
Issue Date
2024-08-24
Language
en
Abstract
Visual neuroprosthesis technologies aim to evoke artificial visual percepts for visually
impaired people. The visual pathway can be damaged at various locations, each calling for a
different neuroprosthetic solution: a bionic eye can benefit people suffering from eye
diseases, whereas a visual cortical implant is suitable when the entire pathway from the eye
to the brain is damaged. Given the emergent nature of visual neuroprosthesis technology,
research in this field still faces several challenges and limitations. In clinical trials, there is a
need for automated visualization of what volunteer implantees are seeing, both to make data
processing more efficient and to identify volunteer-specific perception features. The current
research project investigates the possibility of using Artificial Intelligence to visualize blind
volunteers' phosphene percepts, based on audio-recorded descriptions provided by the
volunteers, in order to improve the testing process in human trials of visual cortical
prostheses. The research pipeline was divided into three phases: audio-to-text transcription,
feature extraction, and visual percept generation. Audio descriptions recorded during
experiment sessions were obtained from .ns5 files and transcribed with the automatic
speech recognition model Whisper. Phosphene features were then extracted using natural
language processing methods. Eight phosphene feature categories were identified as
fundamental to phosphene image generation: shape, size, color, brightness, filled, location,
additional, and data-driven. Using the extracted feature information, a volunteer-specific,
AI-driven Audio-to-Phosphene-Image (AtoPI) generator was designed to visualize on a
computer screen what the volunteer was perceiving. Through this pipeline, images
representing the phosphene percepts of the current volunteer in the CORTIVIS clinical trials
were generated. Furthermore, to ensure successful and accurate visualization in future
experiments, additional research directions were proposed to the experimenters, based on
the limitations identified in the current experimental procedure.
Keywords: visual neuroprostheses, AI, NLP, phosphene, audio-to-text transcription,
feature extraction, image generator
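To make the three-phase pipeline concrete, the sketch below illustrates the last two phases (feature extraction and percept rendering) in minimal, standard-library Python. The feature vocabularies, the text-grid renderer, and all function names are illustrative assumptions, not the actual AtoPI implementation; in the described pipeline the transcript would come from Whisper rather than a hard-coded string.

```python
import re

# Hypothetical vocabularies for a subset of the eight feature categories
# (shape, size, color, brightness, location); terms are assumptions.
FEATURES = {
    "shape": ["circle", "line", "dot", "blob"],
    "size": ["small", "medium", "large"],
    "color": ["white", "yellow", "blue"],
    "brightness": ["dim", "bright"],
    "location": ["upper left", "upper right", "lower left", "lower right", "center"],
}

def extract_features(transcript: str) -> dict:
    """Keyword-match each feature category in a transcribed description."""
    text = transcript.lower()
    found = {}
    for category, vocab in FEATURES.items():
        for term in vocab:
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                found[category] = term
                break
    return found

def render_phosphene(features: dict, grid: int = 9) -> list:
    """Draw a single phosphene at the described location on a text grid."""
    offsets = {
        "upper left": (grid // 4, grid // 4),
        "upper right": (grid // 4, 3 * grid // 4),
        "lower left": (3 * grid // 4, grid // 4),
        "lower right": (3 * grid // 4, 3 * grid // 4),
        "center": (grid // 2, grid // 2),
    }
    row, col = offsets.get(features.get("location", "center"),
                           (grid // 2, grid // 2))
    # Brighter percepts get a heavier glyph; a real renderer would map
    # brightness and color to pixel intensity instead.
    symbol = "#" if features.get("brightness") == "bright" else "+"
    canvas = [["." for _ in range(grid)] for _ in range(grid)]
    canvas[row][col] = symbol
    return ["".join(r) for r in canvas]

# Stand-in for a Whisper transcript of a volunteer's description.
transcript = "I see a small bright white dot in the upper left"
feats = extract_features(transcript)
image = render_phosphene(feats)
```

A production version would replace the keyword matching with the NLP feature extraction described above and render onto an actual image canvas rather than a character grid.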
Faculty
Faculteit der Sociale Wetenschappen
