TY - JOUR
T1 - Associative Memory Model-Based Linear Filtering and Its Application to Tandem Connectionist Blind Source Separation
AU - Omachi, Motoi
AU - Ogawa, Tetsuji
AU - Kobayashi, Tetsunori
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2017/3
Y1 - 2017/3
AB - We propose a blind source separation method that yields high-quality speech with low distortion. Time-frequency (TF) masking reduces interference effectively but introduces nonlinear distortion. By contrast, linear filtering with a separation matrix, as in independent vector analysis (IVA), avoids nonlinear distortion, but its separation performance degrades under reverberant conditions. The tandem connectionist approach, which combines several separation methods, is frequently used to compensate for the disadvantages of the individual methods. In this study, we propose associative memory model (AMM)-based linear filtering and a tandem connectionist framework that applies TF masking followed by linear filtering. By using an AMM trained on speech spectra to optimize the separation matrix, the proposed linear filtering method accounts for properties of speech that IVA does not consider explicitly, such as the harmonic components of spectra. In the proposed tandem connectionist framework, TF masking reduces unwanted components that hinder the optimization of the separation matrix, and the masking is then approximated by a linear separation matrix to reduce nonlinear distortion. Simultaneous speech separation experiments demonstrate that the proposed linear filtering method increases the signal-to-distortion ratio (SDR) and signal-to-interference ratio (SIR) compared with IVA, and that the proposed tandem connectionist framework obtains greater increases in SDR and SIR and reduces the phoneme error rate more than the proposed linear filtering method.
KW - Blind source separation
KW - independent vector analysis
KW - neural network
KW - speech recognition
UR - http://www.scopus.com/inward/record.url?scp=85013054724&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85013054724&partnerID=8YFLogxK
U2 - 10.1109/TASLP.2017.2653941
DO - 10.1109/TASLP.2017.2653941
M3 - Article
AN - SCOPUS:85013054724
SN - 2329-9290
VL - 25
SP - 637
EP - 650
JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing
JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing
IS - 3
ER -