Speech Separation Via Harmonic Suppression in Multi-Speaker Conversations to Assist Individuals with Hearing Loss
Kai Ito, Yasuaki Ishikawa, Taku Itami
Abstract
Deaf and hard-of-hearing people have difficulty obtaining information by ear, particularly in group conversations where multiple speakers overlap. In this study, we propose a speech separation and recognition system that does not rely on a deep neural network and instead focuses on removing harmonic components. Specifically, we propose a method that extracts the frequency components of one speaker's voice from a mixed-gender audio signal by removing the harmonics of the other speaker's voice. We evaluate the system by separating each individual voice from the mixed signal and measuring recognition accuracy with an automatic speech recognition (ASR) system. We discuss the proposed method and the validation results in terms of both speech separation and recognition accuracy.
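The general idea of suppressing one speaker's harmonics can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the fundamental frequency is estimated or how the harmonics are removed, so the sketch assumes the fundamental frequency `f0` of the unwanted voice is already known and constant, and it uses a hard binary mask on the FFT spectrum. The function names `suppress_harmonics` and `band_energy`, the 10 Hz notch half-width, and the synthetic 110 Hz and 205 Hz test voices are all hypothetical choices made for illustration.

```python
import numpy as np


def suppress_harmonics(signal, sample_rate, f0, n_harmonics=10, half_width_hz=10.0):
    """Zero the FFT bins around each harmonic k * f0 (assumed known and constant)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    mask = np.ones_like(freqs)
    for k in range(1, n_harmonics + 1):
        # Notch out a narrow band centred on the k-th harmonic.
        mask[np.abs(freqs - k * f0) <= half_width_hz] = 0.0
    return np.fft.irfft(spectrum * mask, n=len(signal))


def band_energy(signal, sample_rate, freq, half_width_hz=10.0):
    """Spectral energy within half_width_hz of freq."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    band = np.abs(freqs - freq) <= half_width_hz
    return float(np.sum(np.abs(spectrum[band]) ** 2))


# Synthetic stand-ins for two voices: harmonic tones with different fundamentals.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate  # one second of samples
low = sum(np.sin(2 * np.pi * 110.0 * k * t) / k for k in range(1, 5))
high = sum(np.sin(2 * np.pi * 205.0 * k * t) / k for k in range(1, 5))
mixed = low + high

# Remove the harmonics of the lower-pitched voice to keep the higher-pitched one.
out = suppress_harmonics(mixed, sample_rate, f0=110.0)
print("110 Hz band energy before:", band_energy(mixed, sample_rate, 110.0))
print("110 Hz band energy after: ", band_energy(out, sample_rate, 110.0))
```

In this toy case the harmonics of the two tones do not overlap, so the mask removes the lower voice cleanly. Real speech has a time-varying fundamental and overlapping harmonics, so a practical system would need frame-by-frame pitch tracking and a more careful treatment of shared frequency bins.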