Files in this item



ECE499-Sp2015-sivaraman.pdf (application/pdf, 566 kB). Restricted to U of Illinois.
No description provided.


Title: Quantization Error Tolerance in Hashed Audio Spectra
Author(s): Sivaraman, Aswin
Contributor(s): Smaragdis, Paris
Subject(s): Locality sensitive hashing; winner take all hashing; source reconstruction; hierarchical clustering
Abstract: Matching an input spectrum with a learned dictionary of spectral frames is common in audio signal processing, especially for speech de-noising and one-word speech recognition. For large disordered spectral dictionaries, exhaustively searching for nearest-neighbor spectra is computationally expensive. The proposed methodology utilizes hierarchical clustering of winner-take-all (WTA) semantic hashes of the spectral frames in the dictionary. We define a custom Hamming distance metric between hash codes that is analogous to the original error (cross entropy). After clustering the training data, we evaluate the functionality of this framework by assessing signal-to-noise ratio (SNR) for test signal reconstruction, exploring the quantization effects of truncating the hierarchical clustering tree (dendrogram). By defining a tolerance level for noise, we seek to considerably reduce the search space for spectral frames and significantly improve spectrogram-matching speed. An extended application of this work is reduced power consumption for active listening devices ("Hey Siri", "Ok Google", etc.), as well as increased transmission quality without forsaking device talk time. The proposed framework proved to be subpar, but possible improvements to this research are discussed.
Issue Date: 2015-05
Date Available in IDEALS: 2015-09-28
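
Illustrative note: the abstract describes hashing spectral frames with winner-take-all (WTA) codes and comparing the codes with a Hamming-style distance. The record does not give the thesis's actual implementation, so the following is only a minimal sketch of generic WTA hashing as commonly described: each hash entry is the position of the largest value among the first k elements of a randomly permuted frame. The function names, the parameters (dim, num_hashes, k, seed), and the plain mismatch-count distance are assumptions made for illustration; in particular, the distance shown here is not the custom metric the abstract mentions.

```python
import random


def make_permutations(dim, num_hashes, seed=0):
    """Draw one random permutation of range(dim) per hash entry (hypothetical helper)."""
    rng = random.Random(seed)
    perms = []
    for _ in range(num_hashes):
        p = list(range(dim))
        rng.shuffle(p)
        perms.append(p)
    return perms


def wta_hash(frame, perms, k):
    """Generic WTA code: for each permutation, look at the first k permuted
    values and record the position (0..k-1) of the largest one."""
    code = []
    for p in perms:
        window = [frame[i] for i in p[:k]]
        code.append(max(range(k), key=window.__getitem__))
    return code


def hamming(a, b):
    """Plain mismatch count between two equal-length codes.
    Assumption: a stand-in only, not the thesis's custom metric."""
    return sum(x != y for x, y in zip(a, b))


# Usage: a toy 8-bin "spectral frame" and a louder copy of the same frame.
perms = make_permutations(dim=8, num_hashes=4, seed=0)
frame = [0.1, 0.9, 0.3, 0.0, 0.5, 0.2, 0.7, 0.4]
code = wta_hash(frame, perms, k=3)
louder = wta_hash([2.0 * x for x in frame], perms, k=3)
assert hamming(code, louder) == 0  # rank order is unchanged by positive scaling
```

Because a WTA code depends only on the rank order of the values, scaling a frame by a positive constant leaves its code unchanged, which is one reason such codes are attractive for matching spectra of differing loudness. Clustering the resulting codes hierarchically, as the abstract describes, would be a separate step that is not shown here.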
