The cocktail party problem and Cochlear Implants, comparing a segmentation and a separation model.
In this research, two models are proposed that are intended to improve main speaker identification in noisy settings, e.g. the cocktail party problem, for cochlear implant (CI) users. The aim was not to replicate the attention process involved in the cocktail party problem. The assumption is made that CI users' speech perception performance will be improved by attending to the main speaker instead of the background noise. The goal is to find which model is more capable of removing the noise from the signal and is thus better able to solve the simplified cocktail party problem for CI users. The first approach uses a model that segments the audio samples, while the second approach uses a deep-clustering neural network to separate the signal from the noise. The models were tested on two data sets: the first consisted of samples of two overlapping speakers, and the second contained samples of two to five overlapping speakers. The models were trained only on the training subset of the first data set but were tested on both data sets. The results were evaluated using quantitative metrics such as the signal-to-distortion ratio, the signal-to-noise ratio, and the short-time objective intelligibility (STOI) measure. The separation model significantly outperformed the segmentation model and is therefore, based on these quantitative metrics, the better approach to solving the simplified cocktail party problem for CI users.
Faculteit der Sociale Wetenschappen