Visualizing Subjective Experiences of Visual Cortical Implant Users with AtoPI: an AI-driven Audio-to-Phosphene-Image Generator


Issue Date

2024-08-24

Language

en

Abstract

Visual neuroprosthesis technologies aim to evoke artificial visual percepts in visually impaired people. The visual pathway can be damaged at various locations, so different neuroprosthetic solutions are needed: a bionic eye can benefit people with eye diseases, whereas a visual cortical implant is suitable when the entire pathway from the eye to the brain is damaged. Because visual neuroprosthesis technology is still emergent, research in this field faces several challenges and limitations. In clinical trials, automated visualization of what volunteer implantees are seeing is needed to make data processing more efficient and to identify volunteer-specific perception features. This research project investigates the use of artificial intelligence to visualize blind volunteers' phosphene percepts, based on audio-recorded descriptions provided by the volunteers, in order to improve the testing process in human trials of visual cortical prostheses. The research pipeline was divided into three phases: audio-to-text transcription, feature extraction, and visual perception generation. Audio descriptions recorded during experiment sessions were extracted from .ns5 files and transcribed with the automatic speech recognition model Whisper. Phosphene features were then extracted using natural language processing methods. Eight phosphene feature categories were identified as fundamental to phosphene image generation: shape, size, color, brightness, filled, location, additional, and data-driven. Using the extracted feature information, a volunteer-specific, AI-driven Audio-to-Phosphene-Image (AtoPI) generator was designed to visualize on a computer screen what the volunteer was perceiving. Through this pipeline, images representing the phosphene percepts of the current volunteer in the CORTIVIS clinical trials were generated.
Furthermore, to ensure successful and accurate visualization in future experiments, additional research directions were proposed to the scientists based on the limitations identified in the current experimental procedure.

Keywords: visual neuroprostheses, AI, NLP, phosphene, audio-to-text transcription, feature extraction, image generator
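To make the feature-extraction phase concrete, the following is a minimal sketch of how a transcribed description could be mapped onto the feature categories named in the abstract. The categories (shape, size, color, brightness, filled, location) come from the abstract itself; the keyword vocabularies, the rule-based matching, and the function name `extract_features` are illustrative assumptions, not the thesis's actual NLP method.

```python
import re

# Hypothetical vocabularies per feature category; the category names follow
# the abstract, but these word lists are assumptions for illustration only.
FEATURE_VOCAB = {
    "shape": ["dot", "circle", "line", "blob", "star"],
    "size": ["tiny", "small", "medium", "large"],
    "color": ["white", "yellow", "blue", "green"],
    "brightness": ["dim", "faint", "bright", "very bright"],
    "filled": ["filled", "hollow", "outline"],
    "location": ["upper left", "upper right", "lower left",
                 "lower right", "center"],
}

def extract_features(transcript: str) -> dict:
    """Map a transcribed phosphene description to feature categories
    by simple keyword matching; the longest matching term per category wins."""
    text = transcript.lower()
    features = {}
    for category, terms in FEATURE_VOCAB.items():
        # Try longer terms first so "upper left" beats a bare "left", etc.
        hits = [t for t in sorted(terms, key=len, reverse=True)
                if re.search(r"\b" + re.escape(t) + r"\b", text)]
        if hits:
            features[category] = hits[0]
    return features

print(extract_features(
    "I see a small bright white dot in the upper left, kind of hollow."))
```

In the actual pipeline such a dictionary would then feed the AtoPI generator, which renders the described phosphene on screen; a real system would also need to handle the "additional" and "data-driven" categories, negations, and volunteer-specific phrasing, which this rule-based sketch does not attempt.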

Faculty

Faculteit der Sociale Wetenschappen