A multiresolution non-negative tensor factorization approach for single channel sound source separation

Kirbiz, S.; GUNSEL, B.

doi:10.1016/j.sigpro.2014.05.019

Kirbiz S., GUNSEL B.

SIGNAL PROCESSING, vol.105, pp.56-69, 2014 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 105
Publication Date: 2014
DOI: 10.1016/j.sigpro.2014.05.019
Journal Name: SIGNAL PROCESSING
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: pp.56-69
Istanbul Technical University Affiliated: No

Abstract

We propose a single channel audio source separation method to alleviate the smearing effects caused by the fixed time-frequency (TF) resolution of the Short-Time Fourier Transform (STFT). We introduce a multiresolution representation based on Non-negative Tensor Factorization (NTF) where each layer of the tensor represents the mixture signal at a different time-frequency resolution. In order to fuse the information at different layers, the source separation is modeled as a joint optimization problem where the optimal solution is derived based on the Kullback-Leibler (KL) divergence. Resynthesis is performed through an additional adaptive weighted fusion procedure which combines the sources separated at different scales by maximizing energy concentration. Numerical results over a large sound database indicate that the proposed joint optimization scheme enhances the quality of the separated sources in terms of both conventional and perceptual distortion measures. (c) 2014 Elsevier B.V. All rights reserved.
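As a rough illustration of the KL-divergence factorization the abstract refers to, the sketch below applies the standard multiplicative updates for non-negative matrix factorization to a single time-frequency layer. It is not the authors' method: the joint multiresolution tensor model and the adaptive weighted fusion step are not specified in the abstract and are not reproduced here. The function names (`kl_divergence`, `kl_nmf`), the toy data, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence D(V || V_hat), summed over all entries."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))

def kl_nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factorize a nonnegative matrix V (freq x time) as W @ H using the
    standard multiplicative updates for the generalized KL divergence."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral bases (hypothetical init)
    H = rng.random((rank, n_time)) + eps   # time activations (hypothetical init)
    ones = np.ones_like(V)
    history = []
    for _ in range(n_iter):
        # Update activations H with W held fixed.
        H *= (W.T @ (V / (W @ H + eps))) / (W.T @ ones + eps)
        # Update spectral bases W with H held fixed.
        W *= ((V / (W @ H + eps)) @ H.T) / (ones @ H.T + eps)
        history.append(kl_divergence(V, W @ H, eps))
    return W, H, history

# Toy stand-in for a magnitude spectrogram: two sources with fixed spectra.
rng = np.random.default_rng(1)
V = rng.random((64, 2)) @ rng.random((2, 100))

W, H, history = kl_nmf(V, rank=2)

# Soft masks assign each time-frequency bin to the components in proportion
# to their share of the model W @ H; the masked layers sum back to V.
model = W @ H + 1e-12
estimates = [(W[:, [k]] @ H[[k], :]) / model * V for k in range(W.shape[1])]
```

In a multiresolution setting one would presumably run a coupled version of this factorization over several spectrograms computed with different window lengths; how the layers are tied together and fused is exactly what the paper contributes, so it is left out of this sketch.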