Files in this item

ECE499-Sp2018-chen-Yikuan.pdf (application/pdf, 451 kB, restricted to U of Illinois)

Title:Exploration into rare sound detection using LSTM-RNN
Author(s):Chen, Yikuan
Contributor(s):Chen, Deming
Subject(s):recurrent neural network; scream detection; gunshot detection; audio event detection
Abstract:Rare Audio Event Detection (AED) plays a crucial role in domestic and public security applications. The goal of this research is to recognize key acoustic events using classifiers based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs). We compared existing methods for rare sound recognition, such as the Gaussian Mixture Model (GMM), the Hidden Markov Model (HMM), the zero-phase signal method, and neural networks. Specifically, we investigated different neural network architectures, including feedforward DNNs, RNNs, LSTM-RNNs, and bi-directional RNNs. After experimenting with different network structures and acoustic features, we propose a mixed neural network consisting of multiple subnets, each dedicated to recognizing one type of sound. Each subnet contains multiple input layers, feedforward layers, an LSTM-RNN layer, and output smoothing units; the final classification is produced from the outputs of all subnets. Different acoustic features are fed into the network at different input layers to improve efficiency. Our model exceeds the baseline performance of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 competition. However, a performance gap remains between our model and the current best model, and we are analyzing the advantages and drawbacks of our model relative to the top-ranking model.
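The subnet described in the abstract (multiple feature inputs, feedforward fusion, an LSTM-RNN layer, and output smoothing) could be sketched as follows. This is a hypothetical PyTorch illustration, not the thesis implementation: the layer sizes, feature dimensions, and the moving-average smoother are all assumptions.

```python
import torch
import torch.nn as nn


class EventSubnet(nn.Module):
    """Sketch of one per-event subnet, assuming two acoustic feature
    streams (e.g. MFCCs plus another feature set). All dimensions are
    illustrative, not taken from the thesis."""

    def __init__(self, n_feat_a=40, n_feat_b=64, hidden=128, smooth_win=5):
        super().__init__()
        # One input layer per acoustic feature type ("multiple input layers").
        self.branch_a = nn.Sequential(nn.Linear(n_feat_a, 64), nn.ReLU())
        self.branch_b = nn.Sequential(nn.Linear(n_feat_b, 64), nn.ReLU())
        # Feedforward layer over the fused feature branches.
        self.fuse = nn.Sequential(nn.Linear(128, 128), nn.ReLU())
        # LSTM-RNN layer over the frame sequence.
        self.lstm = nn.LSTM(128, hidden, batch_first=True)
        # Per-frame event score for this subnet's sound class.
        self.out = nn.Linear(hidden, 1)
        # Output smoothing unit: moving average over smooth_win frames.
        self.smooth = nn.AvgPool1d(smooth_win, stride=1,
                                   padding=smooth_win // 2,
                                   count_include_pad=False)

    def forward(self, feat_a, feat_b):
        # feat_a: (batch, frames, n_feat_a); feat_b: (batch, frames, n_feat_b)
        x = torch.cat([self.branch_a(feat_a), self.branch_b(feat_b)], dim=-1)
        x, _ = self.lstm(self.fuse(x))
        scores = torch.sigmoid(self.out(x)).squeeze(-1)      # (batch, frames)
        # AvgPool1d expects (batch, channels, frames); smooth and restore shape.
        return self.smooth(scores.unsqueeze(1)).squeeze(1)


# The mixed network would run one such subnet per sound class (e.g. scream,
# gunshot) and combine their smoothed scores into the final classification.
net = EventSubnet()
smoothed = net(torch.zeros(2, 100, 40), torch.zeros(2, 100, 64))
```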
Issue Date:2018-05
Date Available in IDEALS:2018-05-22
