A Novel Expectation-Maximization Framework for Speech Enhancement

  • In this project, improved speech enhancement algorithms were developed based on the sparsity of speech in the cepstral domain.
  • A new expectation-maximization framework was also developed to provide a theoretical basis for the iterative enhancement process.
  • The new algorithms perform well across a range of colored-noise environments.
  • A paper from this project received the best paper award at the 2015 IEEE International Conference on Digital Signal Processing.
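
The cepstral-domain sparsity mentioned above can be illustrated with a minimal Python sketch. It is not the project's algorithm: the frame is synthetic (a harmonic series with an assumed 125 Hz pitch at 16 kHz), and the names real_cepstrum and energy_fraction_in_top_k are hypothetical helpers introduced only for this illustration. It computes the real cepstrum of one frame and measures how much of the cepstral energy falls in the largest few coefficients.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.rfft(frame)
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)  # small floor avoids log(0)
    return np.fft.irfft(log_magnitude, n=len(frame))

def energy_fraction_in_top_k(coeffs, k):
    """Fraction of total energy held by the k largest-magnitude coefficients."""
    energy = np.sort(coeffs ** 2)[::-1]
    return energy[:k].sum() / energy.sum()

# Synthetic "voiced" frame (assumption: 16 kHz sampling, 125 Hz pitch).
fs = 16000
n = 512
t = np.arange(n) / fs
f0 = 125.0
frame = np.zeros(n)
for h in range(1, 64):  # harmonics up to 7875 Hz, below Nyquist
    frame += (1.0 / h) * np.cos(2 * np.pi * f0 * h * t)
frame *= np.hanning(n)

cep = real_cepstrum(frame)
frac = energy_fraction_in_top_k(cep, 32)

# The pitch shows up as a peak near fs / f0 samples of quefrency.
search = slice(64, 200)
pitch_quefrency = int(np.argmax(np.abs(cep[search]))) + search.start

print(f"energy fraction in 32 largest of {n} cepstral coefficients: {frac:.3f}")
print(f"pitch peak at quefrency {pitch_quefrency} samples")
```

Because the frame is synthetic, the printed numbers only illustrate the measurement and say nothing about the results reported in the papers.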

The figure on the right shows the magnitude spectrum of a noisy speech frame. The proposed LogMMSE_L1_EM algorithm improves the estimate with each iteration. The black line is the ground truth, the red line is the noisy spectrum, and the yellow line is the result after 3 iterations.

 

Resources for download:

Paper

Matlab program (It is a p-file. It can be called using the following command in Matlab: Logmmse_L1_EM('noisy speech wave file.wav','output wave file.wav'). The input speech is assumed to be sampled at 16 kHz.)
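
Since the program assumes 16 kHz input, it may help to check a file's sampling rate before passing it in. The following is a minimal sketch in Python using only the standard-library wave module. It is separate from the Matlab program, and the helper names wav_sample_rate, is_16k and write_silence are hypothetical, introduced only for this example.

```python
import os
import tempfile
import wave

EXPECTED_RATE_HZ = 16000  # sampling rate the Matlab program is stated to assume

def wav_sample_rate(path):
    """Return the sampling rate (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()

def is_16k(path):
    """True if the WAV file is sampled at the expected 16 kHz."""
    return wav_sample_rate(path) == EXPECTED_RATE_HZ

def write_silence(path, rate_hz, n_frames=160):
    """Write a short mono 16-bit silent WAV file (used only for the demo)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(rate_hz)
        wav_file.writeframes(b"\x00\x00" * n_frames)

# Demo on two temporary files with different sampling rates.
demo_dir = tempfile.mkdtemp()
for rate in (16000, 8000):
    demo_path = os.path.join(demo_dir, f"demo_{rate}.wav")
    write_silence(demo_path, rate)
    status = "ok" if is_16k(demo_path) else "resample to 16 kHz first"
    print(f"{os.path.basename(demo_path)}: {wav_sample_rate(demo_path)} Hz, {status}")
```

A file that fails the check would need to be resampled to 16 kHz with a tool of your choice before being enhanced.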

 

Key references:
  • T.W. Shen and Daniel P.K. Lun, “A Speech Enhancement Method Based on Sparse Reconstruction on Log-Spectra”, HKIE Transactions, Vol.24, Issue 1, pp.24-34, January 2017.
  • Daniel P.K. Lun, Tak-Wai Shen and K.C. Ho, “A novel expectation-maximization framework for speech enhancement in non-stationary noise environments”, IEEE Transactions on Audio, Speech and Language Processing, Vol. 22, Issue 2, pp.335-346, Feb 2014.
  • Daniel Pak-Kong Lun, Tak-Wai Shen, Tai-Chiu Hsung and Dominic K.C. Ho, “Wavelet Based Speech Presence Probability Estimator for Speech Enhancement”, Digital Signal Processing, Vol.22, Issue 6, pp.1161-1173, December 2012.
  • Tingtian Li, Daniel P.K. Lun and T.W. Shen, “Improved Expectation-Maximization Framework for Speech Enhancement Based on Iterative Noise Estimation”, Proceedings, 2015 IEEE International Conference on Digital Signal Processing, Singapore, pp.287-291, 2015. (Best paper award)
  • T.W. Shen and Daniel P.K. Lun, “Speech Enhancement Based on L1 Regularization in the Cepstral Domain”, Proceedings, 2014 IEEE International Symposium on Circuits and Systems (ISCAS), Melbourne, Australia, pp.121-123, 2014.
  • Tak-Wai Shen, Daniel P.K. Lun, and Tai-Chiu Hsung, “Speech Enhancement Using Harmonic Regeneration with Improved Wavelet Based A-Priori Signal to Noise Ratio Estimator”, Proceedings, 2010 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS’2010), Chengdu, China, December 2010, pp.1-4.
  • Daniel P.K. Lun and Tai-Chiu Hsung, “Improved Wavelet Based A-Priori SNR Estimation for Speech Enhancement”, Proceedings, 2010 International Symposium on Circuits and Systems (ISCAS’2010), Paris, France, May 2010, pp.2382-2385.
  • Tai-Chiu Hsung, Daniel P.K. Lun, and H.K. Kwan, “Speech Enhancement Based on Adaptive Wavelet Denoising on Multitaper Spectrum”, Proceedings, IEEE International Symposium on Circuits and Systems (ISCAS’2008), Seattle, U.S.A., May 2008, pp.1700-1703.