Files in this item

File: GU-DISSERTATION-2020.pdf (26MB)
Description: (no description provided)
Format: PDF (application/pdf)
Access: Restricted to U of Illinois

Description

Title: A journey to photo-realistic facial animation synthesis
Author(s): Gu, Kuangxiao
Director of Research: Hasegawa-Johnson, Mark
Doctoral Committee Chair(s): Hasegawa-Johnson, Mark
Doctoral Committee Member(s): Huang, Thomas S.; Morrow, Daniel G.; Liang, Zhi-Pei; Shi, Honghui
Department / Program: Electrical & Computer Engineering
Discipline: Electrical & Computer Engineering
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): facial animation synthesis
talking head
patient portal
deep learning
neural network
audio-driven facial animation synthesis
Abstract: This dissertation presents preliminary work in facial animation generation with applications in educational psychology. In the first part of this dissertation, we describe two psychology studies as well as the computer vision techniques and platforms used. Both studies investigate using conversational agents (CAs) to deliver medical messages to patients. By incorporating CAs into the system, both semantic and emotional information can be delivered, which helps patients, especially those with low health and numerical literacy, to better understand their test results and medical instructions. Human studies were conducted to test the effectiveness of the CA. In addition, the whole system was integrated with speech recognition and natural language processing modules to enable the CA's teach-back capability: when a user gives a wrong answer, the CA provides the correct one, helping the user better understand the medical messages being delivered. The second part of this dissertation documents the details of a proposed neural-network-based facial animation synthesis method. By unifying appearance-based and warping-based methods in an end-to-end training process, the proposed system is able to generate vivid facial animation with highly preserved details. We show both qualitatively and quantitatively that the proposed system achieves higher performance than baseline methods. In addition, visualization and ablation studies were conducted to further justify the effectiveness of the proposed system. In the third part, the facial animation synthesis system was integrated with an audio speech processing system. The final system takes a speech signal and sample face images as input and generates the corresponding talking-head animation as output. Comparison with the previous state-of-the-art method shows that the proposed system achieves better performance.
Issue Date: 2020-07-08
Type: Thesis
URI: http://hdl.handle.net/2142/108579
Rights Information: Copyright 2020 Kuangxiao Gu
Date Available in IDEALS: 2020-10-07
Date Deposited: 2020-08

