Understanding modern deep learning techniques for audio applications and beyond
Phan, Duc Huy
Permalink
https://hdl.handle.net/2142/125528
Description
- Title
- Understanding modern deep learning techniques for audio applications and beyond
- Author(s)
- Phan, Duc Huy
- Issue Date
- 2024-07-12
- Director of Research (if dissertation) or Advisor (if thesis)
- Jones, Douglas L
- Doctoral Committee Chair(s)
- Jones, Douglas L
- Committee Member(s)
- Choudhury, Romit Roy
- Smaragdis, Paris
- Do, Minh N
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Machine Learning Training
- Training by Pairing Samples
- Audio Classification and Detection
- Deep ReLU Network
- Abstract
- Deep neural networks (DNNs) have been widely applied in different application domains. DNNs were first studied intensively in vision applications before being adapted to other fields. When DNN solutions are migrated from the vision domain to another application domain, they may benefit from new structures or improved training processes. This thesis introduces several enhancements of the DNN in both model architectures and training processes for acoustic classification and detection applications. First, in acoustic scene classification applications, the introduction of time-frequency separable convolutions reduces the model size by a factor of 8 to 10 without loss of performance. In addition, our experiments with an alternative DNN architecture, a predictor feed-forward network for anomaly detection in machine audio data, show a 5% to 20% improvement in Area Under the Curve (AUC). Finally, for training process modification, we propose a pairing technique that simultaneously optimizes the performance of the models on pairs of training samples during the training process. A pair includes an original training example and a corresponding modified version. Our pairing techniques adapt the similarity part of the contrastive loss as an additional regularization term for the loss function of a given machine learning task. The pairing techniques show at least a 1% improvement in network accuracy on top of mix-up augmentation for the CIFAR10 dataset and a 2% increase in accuracy for DCASE 2020 Task 1A data. We show that the proposed training by pairs provides parameter regularization for ReLU deep networks. As a result, the technique can potentially be applied to many other machine learning applications.
- Graduation Semester
- 2024-08
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/125528
- Copyright and License Information
- Copyright 2024 Duc Huy Phan
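The pairing idea summarized in the abstract (a task loss plus the similarity part of the contrastive loss, computed on an original sample and its modified version) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes that the similarity term is half the squared Euclidean distance between the two network outputs, that it is applied to the output logits, that the task loss is cross-entropy summed over both members of the pair, and that the weight `weight=0.1` is arbitrary. The function names `similarity_term` and `paired_loss` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample given its integer class label."""
    return -math.log(softmax(logits)[label])


def similarity_term(out_a, out_b):
    """Similar-pair part of the contrastive loss.

    Assumption: half the squared Euclidean distance between outputs.
    """
    return 0.5 * sum((a - b) ** 2 for a, b in zip(out_a, out_b))


def paired_loss(logits_orig, logits_mod, label, weight=0.1):
    """Task loss on both pair members plus a weighted similarity penalty.

    Assumptions (not from the source): the penalty acts on logits,
    both members share the same label, and `weight` is a free
    hyperparameter.
    """
    task = cross_entropy(logits_orig, label) + cross_entropy(logits_mod, label)
    return task + weight * similarity_term(logits_orig, logits_mod)


if __name__ == "__main__":
    original = [2.0, 0.5, -1.0]   # hypothetical logits for the original sample
    modified = [1.5, 0.7, -0.8]   # hypothetical logits for its modified version
    print(paired_loss(original, modified, label=0))
```

In this sketch the penalty vanishes when the network produces identical outputs for both members of a pair, so it only adds cost when the outputs for the original and modified samples diverge.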
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois