Files in this item



File: ECE499-Sp2019-morshed.pdf
Format: application/pdf (PDF), 568kB
Access: Restricted to U of Illinois
Description: (no description provided)


Title: Spectral representations in end-to-end Bengali articulatory feature identification
Author(s): Morshed, Mahir
Contributor(s): Hasegawa-Johnson, Mark
Subject(s): articulatory feature identification
Bengali speech recognition
connectionist temporal classification
discrete wavelet coefficients
Abstract: The use of end-to-end neural network architectures for speech recognition applications has brought a transition from using mappings of a speech signal's frequency spectra as inputs for a model to using the frequency spectra themselves as inputs. Such architectures, however, may attain different levels of recognition accuracy for certain tasks when presented with alternate representations of training data, such as rescaled and transformed spectra. This thesis presents the findings of an investigation into using such transformed representations to develop a model for identifying different articulatory feature classes in read Bengali speech using connectionist temporal classification on a gated recurrent unit-based network setup. Audio from a variety of speakers was used to train such a setup to discern places or manners of articulation of individual speech sounds within a given utterance. The results of error rate comparisons when given transformed inputs under consistent network configurations suggest that certain signal representations provide better performance in identifying different articulatory feature classes.
Issue Date: 2019-05
Date Available in IDEALS: 2019-06-17
