TY - GEN
T1 - Blind speech separation in a meeting situation with maximum SNR beamformers
AU - Araki, Shoko
AU - Sawada, Hiroshi
AU - Makino, Shoji
PY - 2007
Y1 - 2007
N2 - We propose a speech separation method for a meeting situation, where each speaker speaks only intermittently and the number of active speakers changes from moment to moment. Many source separation methods have already been proposed; however, they assume that all the speakers keep speaking, which is not always true in a real meeting. In such cases, in addition to separation, speech detection and the classification of the detected speech according to speaker become important issues. For that purpose, we propose a method that employs a maximum signal-to-noise ratio (MaxSNR) beamformer combined with a voice activity detector and online clustering. We also discuss the scaling ambiguity problem of the MaxSNR beamformer and provide solutions to it. We report some encouraging results for a real meeting in a room with a reverberation time of about 350 ms.
AB - We propose a speech separation method for a meeting situation, where each speaker speaks only intermittently and the number of active speakers changes from moment to moment. Many source separation methods have already been proposed; however, they assume that all the speakers keep speaking, which is not always true in a real meeting. In such cases, in addition to separation, speech detection and the classification of the detected speech according to speaker become important issues. For that purpose, we propose a method that employs a maximum signal-to-noise ratio (MaxSNR) beamformer combined with a voice activity detector and online clustering. We also discuss the scaling ambiguity problem of the MaxSNR beamformer and provide solutions to it. We report some encouraging results for a real meeting in a room with a reverberation time of about 350 ms.
KW - Ambiguity
KW - Maximum SNR beamformer
KW - Online clustering
KW - Scaling
KW - Speech separation
KW - Voice activity detector
UR - http://www.scopus.com/inward/record.url?scp=34547498831&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34547498831&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2007.366611
DO - 10.1109/ICASSP.2007.366611
M3 - Conference contribution
AN - SCOPUS:34547498831
SN - 1424407281
SN - 9781424407286
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - I41-I44
BT - 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
T2 - 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
Y2 - 15 April 2007 through 20 April 2007
ER -