Bayesian estimation employing a phase-sensitive observation model for noise and reverberation robust automatic speech recognition / Dipl.-Ing. Volker Sebastian Leutnant ; First reviewer: Prof. Dr.-Ing. Reinhold Häb-Umbach, second reviewer: Prof. Bhiksha Raj. Paderborn, 2016
Contents
- Introduction
- Contribution and Organization
- Statistical Framework of Automatic Speech Recognition
- Feature Extraction
- Acoustic Modeling
- Language Modeling
- Decoding
- Evaluation Metrics
- Environmental Robustness
- Bayesian Estimation of the Speech Feature Posterior
- Conceptually Optimal Solution
- A Priori Model
- Observation Models
- From a Deterministic Relation in the Short-Time Discrete-Time Fourier Domain to a Stochastic Relation in the Logarithmic Mel Power Spectral Domain
- Deterministic Relation in the Short-Time Discrete-Time Fourier Domain
- Stochastic Relation in the Logarithmic Mel Power Spectral Domain
- Presence of Reverberation and Absence of Background Noise
- Presence of Reverberation and Background Noise
- Absence of Reverberation and Presence of Background Noise
- Overview of Observation Models
- AIR Model
- Observation Models – AIR Model Applied
- Overview of Non-Recursive Observation Models
- Recursive Observation Model in the Presence of Reverberation and the Absence of Background Noise
- Recursive Observation Model in the Presence of Reverberation and Background Noise
- Overview of Recursive Observation Models
- Vector of Phase Factors
- General Properties
- Empirical Distribution
- Parametric Approximation to its Distribution
- Analytic Solution to its Central Moments
- Observation Errors
- Presence of Reverberation and Absence of Background Noise
- Presence of Reverberation and Background Noise
- Absence of Reverberation and Presence of Background Noise
- Inference
- Approximate Multi-Model Inference
- Approximate Model-Specific Inference
- The Non-Recursive Observation Model in the Presence of Reverberation
- The Non-Recursive Observation Model in the Presence of Reverberation and Background Noise
- The (Non-Recursive) Observation Model in the Absence of Reverberation and the Presence of Background Noise
- The Recursive Observation Model in the Presence of Reverberation and the Absence of Background Noise
- The Recursive Observation Model in the Presence of Reverberation and Background Noise
- Evaluation
- AURORA 2 task
- AURORA 2 Database Description
- Recognizer Setup
- Baseline Results
- Bayesian Feature Enhancement Setup
- Results with Bayesian Feature Enhancement
- AURORA 4 task
- AURORA 4 Database Description
- Recognizer Setup
- Baseline Results
- Bayesian Feature Enhancement Setup
- Results with Bayesian Feature Enhancement
- AURORA 5 task
- AURORA 5 Database Description
- Recognizer Setup
- Baseline Results
- Bayesian Feature Enhancement Setup
- Results with Bayesian Feature Enhancement
- MC-WSJ-AV task
- Conclusion
- Appendix
- Properties of Gaussian distributions
- Alternative Formulation of the Equivalent Mean
- Derivation of (4.115)
- Derivation of (4.139)
- Mean Vector and Covariance Matrix of the Observation Error in the Presence of Reverberation and Noise
- Frequency Dependent Power Compensation Constant
- Derivation of the AIR Representation in the Mel Power Spectral Domain
- MMSE Estimate of the MPSC Feature Vector of Reverberant Speech
- Moments of the Phase Factor
- Moments of the Transformed Phase Factor
- Multivariate Normal and Log-Normal Distribution
- Vector-Taylor Series Expansion
- Acronyms
- Notation
- List of Figures
- List of Tables
- Bibliography
- Own publications
